Audit-Ready AI: The Step-by-Step Guide to Passing a Conformity Assessment

Jasper Claes


✅ Key Takeaways — Before You Read Further:

  • A conformity assessment is the mandatory gate your high-risk AI system must pass before it can legally operate in the EU market — not a post-launch formality.
  • Most Annex III high-risk AI systems qualify for internal self-assessment (no third party required) — but the documentation standard is identical to a Notified Body assessment.
  • Notified Body assessments — required for biometric identification and certain other systems — are in high demand and have lead times of 3 to 6 months in 2026. Book early.
  • This guide gives you the complete step-by-step process: the two assessment routes, what assessors actually check, the pre-submission checklist, how simulation tools accelerate readiness, and the most common reasons systems fail.

Every engineering team I work with arrives at the conformity assessment stage with some version of the same question: “We think we’ve done everything right — how do we actually prove it?”

That question is exactly right. Conformity assessment is the structured, documented process through which you prove — to a standard that a regulator can independently verify — that your high-risk AI system meets every applicable requirement of the EU AI Act. It is not a conversation or a self-declaration of good intentions. It is an evidence-based evaluation against a specific legal checklist, resulting in either a confirmed conformity finding or a list of non-conformities that must be remediated before market placement.

Getting through it successfully requires three things: complete documentation, a clear understanding of what assessors actually look for, and — ideally — a simulation run that catches your gaps before the real assessment does. This guide covers all three in detail.

For the technical documentation that underpins the conformity assessment, start with our Article 11 & Annex IV Technical File pillar guide. For the governance framework that surrounds the process, see our AI Governance Framework guide. For the full regulatory context, read our Ultimate Guide to EU AI Act Compliance.

What Is an AI Conformity Assessment Under the EU AI Act?

A conformity assessment is the process by which a provider of a high-risk AI system verifies that the system complies with all the requirements set out in Chapter III of the EU AI Act — specifically Articles 8 through 15, covering risk management, data governance, technical documentation, logging, transparency, human oversight, accuracy, and cybersecurity.

The assessment is required under Article 43 and must be completed before the system is placed on the EU market or put into service. Once completed successfully, it produces two legal instruments:

  1. The EU Declaration of Conformity (Article 47) — a formal legal statement signed by the provider confirming compliance
  2. The right to affix the CE marking (Article 48) — the conformity mark that signals to market surveillance authorities and customers that the system has passed assessment

Conformity assessment is not a one-time event. If you make a substantial modification to the system after initial assessment — defined under Article 3(23) as a change that affects compliance or alters the intended purpose — you must repeat the conformity assessment for the modified system. Build this into your product change management process from the start.
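If your release pipeline is code-driven, this review can be wired in as a gate. Below is a minimal sketch, assuming hypothetical field names and trigger criteria — Article 3(23) leaves "substantial modification" to case-by-case judgment, so the final call still belongs to your compliance team:

```python
from dataclasses import dataclass

@dataclass
class ProposedChange:
    # Illustrative criteria only -- not definitions from the Act.
    description: str
    alters_intended_purpose: bool  # e.g. new user population or use case
    affects_compliance: bool       # e.g. retraining, new data source, new model
    planned_by_provider: bool      # changes pre-declared in the Technical File

def requires_new_conformity_assessment(change: ProposedChange) -> bool:
    """Flag changes that plausibly meet the Article 3(23) threshold."""
    if change.planned_by_provider:
        return False  # pre-determined changes documented at initial assessment
    return change.alters_intended_purpose or change.affects_compliance

change = ProposedChange(
    description="Swap scoring model v2 -> v3, retrained on new data",
    alters_intended_purpose=False,
    affects_compliance=True,
    planned_by_provider=False,
)
if requires_new_conformity_assessment(change):
    print(f"Hold release for compliance review: {change.description}")
```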

The Two AI Conformity Assessment Routes: Self-Assessment vs. Notified Body

The EU AI Act establishes two distinct assessment routes under Article 43. Which route applies to your system depends on its category — not your preference.

Route 1: Internal Conformity Assessment (Self-Certification)

The majority of high-risk AI systems under Annex III are eligible for internal conformity assessment — meaning the provider conducts the assessment themselves, without involving a third-party Notified Body. This is analogous to the CE self-declaration route available for many product categories under EU product safety law.

“Internal” does not mean informal or undemanding. The documentation standard for self-assessment is identical to that required for a Notified Body assessment. The difference is that a qualified, independent third party is not checking your work — which places the full evidentiary burden on your own documentation quality.

Systems eligible for internal assessment: All Annex III high-risk AI systems except those in the categories listed below that require third-party assessment.

The internal assessment procedure (Annex VI) requires:

  1. Verification that the quality management system under Article 17 is implemented and effective
  2. Examination of the Technical File to confirm it is complete and demonstrates compliance
  3. A documented conformity finding signed by an authorised representative
  4. Issuance of the EU Declaration of Conformity
  5. Application of the CE marking
  6. Registration in the EU AI database (Article 71)

Route 2: Third-Party Notified Body Assessment

Certain high-risk AI system categories require assessment by a Notified Body — an independent conformity assessment organisation designated by an EU member state authority and notified to the European Commission. Notified Bodies must meet strict competence, impartiality, and operational requirements under Article 31.

Systems requiring Notified Body assessment:

  • AI systems intended to be used as safety components of products subject to third-party conformity assessment under Annex I legislation (e.g., medical devices Class IIa and above, machinery where a Notified Body is already involved)
  • Remote biometric identification systems other than those used to verify a person’s identity with their prior explicit consent

For all other Annex III systems — employment AI, credit scoring AI, education AI, critical infrastructure AI — internal self-assessment is the applicable route.

| Assessment Route | Who Conducts It | Applicable Systems | Typical Timeline | Cost Range |
| --- | --- | --- | --- | --- |
| Internal Assessment (Annex VI) | Provider’s own qualified team | Most Annex III systems (employment, credit, education, infrastructure, justice) | 4–12 weeks depending on documentation completeness | Internal resource cost only |
| Notified Body Assessment (Annex VII) | Designated third-party Notified Body | Biometric identification systems; AI in Annex I regulated products (Class IIa+) | 3–6 months (demand-driven in 2026) | €15,000–€80,000+ depending on system complexity |

What a Notified Body Actually Looks for in an AI Act Assessment

Whether you are self-certifying or going through a Notified Body, the substantive evaluation is the same — the only difference is who conducts it, and how independently. Understanding what assessors actually examine gives you a precise target for your preparation.

Having advised companies going through early conformity assessments in 2025 and 2026, and having worked alongside assessment teams, here is what actually gets scrutinised most closely:

1. The Completeness and Internal Consistency of the Technical File

The first thing any assessor does is check whether the Technical File exists, is complete against the Annex IV checklist, and is internally consistent — meaning the system described in Section 1 (intended purpose) matches the system tested in Section 6 (testing results) matches the system monitored in Section 7 (logging and oversight).

Internal inconsistency is one of the most common assessment findings. A system described in Section 1 as “for use with candidates for senior management roles” that has performance data in Section 6 based entirely on entry-level role datasets has an internal consistency problem. An architecture diagram in Section 2 that shows a human review step that doesn’t appear in the Section 7 operational instructions is another classic example.

Assessors are specifically trained to cross-reference sections. They are not reading the Technical File linearly — they are checking whether Section 1’s claims about intended purpose are supported by Section 3’s training data composition, whether Section 4’s claimed performance levels are validated by Section 6’s test results, and whether Section 5’s cybersecurity claims are backed by actual test evidence.
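Much of this cross-referencing can be rehearsed internally before an assessor does it for real. A toy sketch, assuming a Technical File summarised as a plain dictionary — the section keys and field names are hypothetical:

```python
# Illustrative consistency lint over a Technical File summary.
tech_file = {
    "section_1_intended_purpose": {"population": "senior management candidates"},
    "section_3_data":             {"population": "entry-level role applicants"},
    "section_6_testing":          {"population": "entry-level role applicants"},
}

def lint_population_consistency(tf: dict) -> list[str]:
    """Flag sections whose stated population diverges from Section 1."""
    declared = tf["section_1_intended_purpose"]["population"]
    return [
        f"{section}: '{claims['population']}' does not match intended "
        f"purpose population '{declared}'"
        for section, claims in tf.items()
        if section != "section_1_intended_purpose"
        and claims["population"] != declared
    ]

for finding in lint_population_consistency(tech_file):
    print("INCONSISTENCY:", finding)
```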

2. The Article 9 Risk Management System — Is It Real or Cosmetic?

Article 9 requires a continuous, iterative risk management system — not a risk assessment that was completed once during development and never revisited. Assessors probe the authenticity of risk management documentation by looking for:

  • Dated risk logs that show risk identification happening across the development lifecycle, not all at once immediately before assessment
  • Evidence of risk mitigation in the system design — where a risk was identified, is there a corresponding design decision that addresses it?
  • Residual risk acknowledgment — does the documentation honestly disclose risks that remain after mitigation, or does it claim zero residual risk? Zero residual risk claims are a red flag for assessors.
  • A process for ongoing risk identification — how will new risks identified in production be captured and addressed?

A risk management document that was clearly written in the week before the assessment — no historical version information, no dated entries, no evidence of evolution — will trigger detailed follow-up questions. A team with a genuine, ongoing risk management process answers them easily; a team without one cannot.
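In practice, the easiest way to generate that history is to keep the risk log as structured, dated records from day one rather than as a single narrative document. A minimal sketch with an assumed schema:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class RiskEntry:
    # Hypothetical schema: what matters to an assessor is the dated
    # history and the link from each risk to a concrete mitigation.
    risk_id: str
    identified_on: date
    description: str
    mitigation: str       # the design decision that addresses the risk
    residual_risk: str    # an honest statement, never "none"
    review_history: list[date] = field(default_factory=list)

log = [
    RiskEntry(
        risk_id="R-014",
        identified_on=date(2025, 3, 2),
        description="Lower precision for applicants with career gaps",
        mitigation="Gap-aware features; mandatory human review below 0.7 score",
        residual_risk="~2pp precision gap remains for gaps over 5 years",
        review_history=[date(2025, 6, 10), date(2025, 11, 4)],
    ),
]

# A log whose entries all share one date is exactly the "day before"
# pattern assessors probe for.
events = {e.identified_on for e in log} | {d for e in log for d in e.review_history}
print(f"Risk activity spans {min(events)} to {max(events)} ({len(events)} dated events)")
```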

3. Training Data Quality and Bias Evidence

Assessors look very carefully at Section 3 (data governance) because this is where the most consequential compliance decisions are hidden. The specific areas of scrutiny:

  • Representativeness claims: Is the claim that the training data is representative of the deployment population actually supported by demographic breakdown data?
  • Bias detection methodology: Was a recognised bias detection methodology applied? The NIST AI RMF and the Google PAIR Guidebook provide widely recognised methodologies. Assessors look for alignment with established practice, not proprietary approaches that cannot be independently evaluated.
  • Bias mitigation evidence: Where bias was found, what was done about it? Is there evidence that the mitigation actually worked — i.e., post-mitigation performance data?
  • Honest disclosure of residual bias: Just as with risk management, a claim of zero bias is implausible for most real-world systems and will be challenged. Honest disclosure of known limitations — with a documented rationale for why the residual bias is acceptable in the specific use case context — is the expected standard.

4. Human Oversight: Technical Reality vs. Documented Claims

Article 14 requires that human operators can understand, monitor, and override AI system outputs. Assessors verify this is a technical reality, not just a policy statement. They look for:

  • Demonstrable override mechanisms — can an assessor actually see a human override function in the system interface or API?
  • Deployment instructions that clearly explain to deployers what oversight they must maintain and how to exercise override controls
  • Training documentation confirming that human operators have received the Article 4 AI literacy training necessary to exercise meaningful oversight
  • Evidence against automation bias — is the system designed to present AI outputs in a way that encourages critical human evaluation rather than passive acceptance?

For the detailed implementation requirements for Article 14 human oversight controls, see our post on designing human oversight to meet Article 14 standards.

5. Post-Market Monitoring: A Plan That Could Actually Work

Every assessor reviews the post-market monitoring plan (Article 72) — and experienced assessors can quickly distinguish between a plan that was written to satisfy the documentation requirement and one that is operationally realistic.

A credible post-market monitoring plan specifies:

  • The specific performance metrics being monitored and the data collection mechanism
  • The threshold values that trigger review, escalation, or system shutdown
  • The frequency of performance reviews and who conducts them
  • The process for users and deployers to report problems
  • The incident classification system (what constitutes a serious incident under Article 73)
  • The retention and analysis of operational logs

A plan that says “we will monitor the system’s performance” without specifying metrics, thresholds, or responsible parties will receive a non-conformity finding. For the operational methodology behind post-market monitoring, see our post on managing model drift and post-market monitoring requirements.
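To make the distinction concrete, here is a sketch of a monitoring plan expressed as machine-checkable data rather than prose — the metric names, thresholds, and contact addresses are assumptions, not prescribed values:

```python
MONITORING_PLAN = {
    "metrics": {
        "accuracy":            {"floor": 0.90, "action": "escalate", "owner": "ml-ops@example.com"},
        "false_positive_rate": {"ceiling": 0.05, "action": "review", "owner": "compliance@example.com"},
    },
    "review_cadence_days": 30,
    "incident_contact": "incidents@example.com",  # Article 73 reporting route
}

def planned_action(metric: str, observed: float) -> str | None:
    """Return the planned action if an observed value breaches its threshold."""
    spec = MONITORING_PLAN["metrics"][metric]
    breached = ("floor" in spec and observed < spec["floor"]) or \
               ("ceiling" in spec and observed > spec["ceiling"])
    return f"{spec['action']} -> {spec['owner']}" if breached else None

print(planned_action("accuracy", 0.87))             # escalate -> ml-ops@example.com
print(planned_action("false_positive_rate", 0.03))  # None: within threshold
```

The same structure can double as the configuration for your alerting pipeline, which keeps the documented plan and the operational infrastructure from drifting apart.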

How to Use AI Audit Simulation Tools to Accelerate Conformity Assessment Readiness

The most effective preparation strategy for a conformity assessment is running a structured simulation — a systematic check of your documentation and system design against every assessable requirement — before the real assessment begins. This is what mock audits do, and in 2026 they are no longer optional best practice for well-resourced teams: they are standard operating procedure for any organisation that cannot afford to fail an assessment and then wait 3–6 months for a rescheduled Notified Body slot.

What an AI Act Audit Simulation Actually Tests

An effective audit simulation is not a checklist review. It is a structured adversarial exercise that tests your documentation the way an assessor tests it — by trying to find the gaps, inconsistencies, and unsupported claims that will generate non-conformity findings in a real assessment.

A well-designed simulation covers five evaluation dimensions:

| Simulation Dimension | What It Evaluates | AI Act Reference | Common Finding |
| --- | --- | --- | --- |
| Documentation Completeness | Are all 8 Annex IV sections present and substantively complete? | Article 11, Annex IV | Sections 3 (data) and 5 (cybersecurity) are most commonly incomplete |
| Internal Consistency | Do claims in one section contradict or lack support from another? | Annex IV cross-section | Intended purpose scope vs. training data population mismatch |
| Evidence Quality | Are compliance claims backed by actual evidence, or are they assertions? | Articles 9, 10, 15 | Bias claims unsupported by subgroup performance data |
| Process Authenticity | Does the risk management and testing documentation show genuine process history? | Article 9, Article 17 | Risk logs with no historical version data or dated entries |
| Operational Realism | Are post-market monitoring and oversight plans operationally credible? | Articles 14, 72 | Monitoring plans without specific metrics, thresholds, or owners |

Simulation vs. Self-Checklist: Why the Distinction Matters

Many teams run a self-checklist — working through the Annex IV requirements and checking each off as “done.” This is necessary but not sufficient. A checklist review tells you whether sections exist. A simulation tells you whether the sections would survive scrutiny.

The key difference is the adversarial posture. A checklist review is conducted by the same team that wrote the documentation — who are naturally inclined to interpret ambiguous sections charitably. A simulation applies the same sceptical scrutiny a trained assessor would bring: testing whether claims are supported, whether processes are genuine, and whether the system as documented is the system as built.

Unorma’s Audit Simulation tool (F08) runs exactly this kind of structured adversarial check — testing documentation completeness, internal consistency, evidence quality, and operational realism across all Article 9–15 requirements. It surfaces specific non-conformity findings with the Article reference, the severity assessment, and the recommended remediation — giving your team a prioritised fix list before your real assessment begins. For the full methodology behind mock audits, see our post: What Is a Mock Audit? Using Simulation to Detect AI Compliance Gaps.

The Pre-Submission Conformity Assessment Checklist

This is the checklist your team should complete before triggering either your internal assessment or submitting to a Notified Body. It is organised by the sequence in which assessors typically work through the evaluation.

Phase 1: Technical File Completeness Check

| Pre-Submission Item | Annex IV Section | Common Gap |
| --- | --- | --- |
| Intended purpose documented with specific use case, user population, and negative scope | Section 1 | Out-of-scope uses not explicitly excluded |
| System architecture diagram shows full data flow, model components, human decision points, and external dependencies | Section 2 | Failure states and fallback behaviours not shown |
| Training dataset provenance documented to source level with terms of use references | Section 3 | Intermediate processing steps not documented |
| Bias assessment conducted with subgroup performance data across all relevant protected characteristics | Section 3 | Aggregate metrics only; no demographic breakdown |
| Model card complete including known limitations and conditions of degraded performance | Section 4 | Known failure modes omitted or understated |
| Adversarial robustness testing conducted and results documented (attack types tested, results, mitigations) | Section 5 | Only general IT security documented; no AI-specific testing |
| Test results include pre-determined performance thresholds with evidence they were set before testing | Section 6 | Thresholds match results suspiciously perfectly; no pre-test plan on file |
| Logging specification documented with retention period and data fields captured | Section 7 | Logging described in general terms without technical specification |
| Post-market monitoring plan specifies metrics, thresholds, responsible parties, and review cadence | Section 7 | Generic monitoring statement with no operational specifics |
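Much of this phase can be gated automatically before submission. A toy sketch over the section list above — the word-count floor is an arbitrary stand-in for "substantively complete", not an assessor's criterion:

```python
# Section titles paraphrased from the checklist above; drafts are dummies.
ANNEX_IV_SECTIONS = [
    "1 intended purpose", "2 architecture", "3 data governance",
    "4 performance", "5 cybersecurity", "6 testing",
    "7 logging and monitoring", "8 standards and declarations",
]

drafts = {
    "1 intended purpose": "The system screens applications for ...",
    "3 data governance": "",
}

def completeness_gaps(sections: list[str], file_drafts: dict[str, str],
                      min_words: int = 200) -> list[str]:
    """Sections that are missing or too thin to be substantively complete."""
    return [s for s in sections
            if len(file_drafts.get(s, "").split()) < min_words]

for gap in completeness_gaps(ANNEX_IV_SECTIONS, drafts):
    print("INCOMPLETE:", gap)
```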

Phase 2: Quality Management System (Article 17) Check

| QMS Item | Article |
| --- | --- |
| Quality management system documented and operational — covers compliance strategy, design processes, data governance, risk management, post-market monitoring | Article 17 |
| Change management procedure documented — defines what constitutes a substantial modification and requires compliance team notification | Article 3(23) |
| Incident reporting procedure documented — defines serious incident classification and notification timeline to national authority | Article 73 |
| AI literacy training records available for all staff working with the system | Article 4 |

Phase 3: Legal Instruments Check

| Legal Instrument Item | Article |
| --- | --- |
| EU Declaration of Conformity drafted with all required fields — provider name/address, system identity, conformity assessment procedure, harmonised standards referenced, authorised signatory | Article 47 |
| CE marking application reviewed — correct format and placement identified for system and accompanying documentation | Article 48 |
| EU AI database registration fields prepared — system description, intended purpose, conformity assessment reference, provider contact details | Article 71, Annex VIII |
| 10-year documentation retention plan confirmed — Technical File, Declaration of Conformity, and all supporting evidence will be retained and reconstructable | Article 18 |

The Most Common Reasons AI Systems Fail Conformity Assessment

Based on early assessment experience across Europe in 2025–2026, the following are the most frequent reasons systems receive non-conformity findings — and what to do about each.

Failure Reason 1: The “Day Before” Technical File

The Technical File was assembled in the final weeks before assessment — which means it lacks the historical depth that demonstrates a genuine compliance process. Risk logs have no evolution. Dataset documentation doesn’t reference specific dataset versions. Test plans have no timestamps preceding test execution.

Fix: Start your Technical File at the beginning of development, not at the end. Even rough initial drafts of each section — dated and version-controlled — demonstrate that the documentation evolved alongside the system. The ISO/IEC 42001 AI Management System standard provides the quality management system framework that, when implemented from the start, naturally generates the documented process history assessors are looking for.

Failure Reason 2: Aggregate-Only Performance Metrics

The system achieved excellent overall accuracy metrics — but has no performance breakdown by demographic subgroup. When an assessor asks “how does this perform for women over 50 vs. men under 30?” or “what is the false positive rate for applicants from non-English-speaking countries?” there is no answer in the documentation.

Fix: Build subgroup evaluation into your test pipeline from the start, not as a post-hoc analysis. The Google PAIR Guidebook’s fairness evaluation section provides practical methodology for structuring demographic performance analysis. Every high-risk AI system that makes decisions about people needs this data.
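A minimal sketch of that pipeline step, using pandas with made-up data and a generic "group" column standing in for whichever protected characteristics apply to your system:

```python
import pandas as pd

df = pd.DataFrame({
    "group":     ["A", "A", "A", "B", "B", "B", "B"],
    "label":     [1, 0, 1, 1, 0, 0, 1],
    "predicted": [1, 0, 0, 1, 1, 0, 1],
})

def subgroup_metrics(frame: pd.DataFrame) -> pd.DataFrame:
    """Per-group performance breakdown for the Technical File."""
    def metrics(g: pd.DataFrame) -> pd.Series:
        tp = ((g.predicted == 1) & (g.label == 1)).sum()
        fp = ((g.predicted == 1) & (g.label == 0)).sum()
        fn = ((g.predicted == 0) & (g.label == 1)).sum()
        return pd.Series({
            "n": len(g),
            "accuracy": (g.predicted == g.label).mean(),
            "precision": tp / (tp + fp) if (tp + fp) else float("nan"),
            "recall": tp / (tp + fn) if (tp + fn) else float("nan"),
        })
    return frame.groupby("group")[["label", "predicted"]].apply(metrics)

print(subgroup_metrics(df))
```

Run on your real holdout data, the same table answers the assessor's "how does this perform for X vs. Y?" question directly from the Technical File.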

Failure Reason 3: Human Oversight That Exists Only on Paper

The Technical File describes human oversight mechanisms — review workflows, override controls, escalation procedures. But when an assessor asks to see the override function, or to review records of when human operators actually exercised oversight, there is no evidence it happens in practice.

Fix: Design audit trails for human oversight from day one. Every human review action, every override decision, every escalation should be logged. This log is both an Article 12 compliance requirement and your evidence of functional human oversight under Article 14. See our post on creating an immutable audit trail for the technical architecture.
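One common design for such a trail is a hash-chained, append-only log, so that deleted or altered oversight records are detectable. A sketch with an assumed record schema:

```python
import hashlib
import json
import time

def append_event(log: list[dict], event: dict) -> None:
    """Append a record that chains the hash of its predecessor."""
    prev_hash = log[-1]["hash"] if log else "genesis"
    record = {"ts": time.time(), "prev": prev_hash, **event}
    record["hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    log.append(record)

oversight_log: list[dict] = []
append_event(oversight_log, {
    "actor": "reviewer_17",
    "action": "override",           # human rejected the AI recommendation
    "system_output": "reject",
    "human_decision": "advance",
    "reason": "candidate experience misclassified by parser",
})
print(json.dumps(oversight_log[-1], indent=2))
```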

Failure Reason 4: GPAI Dependency Undocumented

The system relies on a third-party GPAI model API. The Technical File doesn’t document this dependency, what version of the model is used, what the model provider’s compliance status is, or how changes to the underlying model are monitored and managed.

Fix: Every third-party AI component in your system must appear in the Technical File — with the provider name, the model version or API version, the terms of use, and the monitoring plan for upstream changes. See our post on GPAI model compliance requirements for the full supplier documentation framework.
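A lightweight way to keep that supplier information assessable is a machine-readable manifest kept alongside the Technical File. The fields below are illustrative assumptions, not a prescribed format:

```python
GPAI_DEPENDENCIES = [
    {
        "component": "resume-summariser",
        "provider": "ExampleAI Ltd",                   # placeholder provider
        "model": "example-llm",
        "model_version": "2026-01-15",                 # pinned, never "latest"
        "api_version": "v2",
        "terms_of_use": "https://example.com/terms",   # placeholder URL
        "provider_compliance": "GPAI provider; Article 53 documentation on file",
        "upstream_monitoring": "weekly changelog diff + regression suite on bump",
    },
]

def unpinned(deps: list[dict]) -> list[str]:
    """Components floating on 'latest' -- an assessment red flag."""
    return [d["component"] for d in deps
            if d["model_version"] in ("latest", None, "")]

assert unpinned(GPAI_DEPENDENCIES) == [], "pin every upstream model version"
```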

Failure Reason 5: Post-Market Monitoring Plan With No Operational Infrastructure

The post-market monitoring plan is a well-written document. But when the assessor asks “show me the monitoring dashboard” or “what alert would fire if model accuracy dropped below your threshold?”, the infrastructure doesn’t exist yet.

Fix: Your post-market monitoring plan must be operationally live — not just documented — at the point of assessment. The monitoring infrastructure (logging pipeline, performance dashboards, alert thresholds) must be built and tested before you submit for assessment. For the technical implementation guide, see our post on managing model drift and post-market monitoring.

Working With Notified Bodies: A Practical Guide for 2026

If your system requires Notified Body assessment — specifically biometric identification systems or AI embedded in higher-risk regulated products — here is what to expect from the engagement.

How to Select a Notified Body for AI Act Assessment

Notified Bodies for the EU AI Act are designated by national authorities and listed in the NANDO (New Approach Notified and Designated Organisations) database, maintained by the European Commission. In 2026, the number of designated AI Act Notified Bodies is still relatively limited — demand significantly exceeds capacity for complex systems, and lead times reflect this.

Selection criteria beyond availability:

  • Technical domain expertise: Does the Notified Body have assessors with genuine AI and ML expertise, or is their background primarily in traditional product safety assessment? Ask specifically about the qualifications of the team that will assess your system.
  • Sector experience: Notified Bodies with experience in your specific sector (healthcare AI, financial AI, HR AI) will assess your system’s context more accurately.
  • Pre-assessment services: Many Notified Bodies offer pre-assessment consultations or readiness reviews before formal assessment. These are valuable investments — they tell you what the assessor will focus on before it counts.
  • Language and jurisdiction: Practical considerations matter. An assessor who can conduct the assessment in your working language, with familiarity with your national authority’s enforcement approach, simplifies the process.

The Notified Body Assessment Process: Stage by Stage

A typical Notified Body assessment for an AI Act high-risk system follows this sequence:

  1. Application and scoping (2–4 weeks): You submit an application describing the system and the assessment scope. The Notified Body reviews it, confirms they have the technical competence to assess, and issues a scoping document and fee estimate.
  2. Technical documentation review (4–8 weeks): You submit the full Technical File. The Notified Body conducts a desk review, identifying any initial gaps or questions. Expect a detailed query list — this is normal and does not indicate the system will fail.
  3. System demonstration and technical assessment (1–3 weeks): The Notified Body’s technical assessors interact with the system — reviewing logs, testing override mechanisms, running queries against the system, and verifying that the documented architecture matches the deployed reality.
  4. Findings and response (2–4 weeks): The Notified Body issues a formal assessment report with any non-conformity findings. You address findings with remediation evidence. Minor non-conformities can typically be resolved within the assessment process; major non-conformities may require the assessment to be paused while remediation is completed.
  5. Certificate issuance: Once all findings are resolved, the Notified Body issues its certificate — for systems assessed under Annex VII, a Union technical documentation assessment certificate. You issue the Declaration of Conformity and register in the EU AI database.

Post-Assessment: Maintaining Conformity After Market Placement

Passing your conformity assessment is not the end of the compliance obligation — it is the beginning of ongoing conformity maintenance. Three ongoing requirements demand operational attention after market placement:

Triggering Reassessment: The Substantial Modification Watch

Build a formal substantial modification review into every product change process. When a model update, dataset change, or new deployment context is proposed, the first compliance question should be: “does this constitute a substantial modification under Article 3(23)?” If yes, a new conformity assessment is required before the modified system can operate.

For Notified Body-assessed systems, the Notified Body must be notified of planned substantial modifications before they are implemented — not after. Failure to do so can invalidate the existing certificate.

Serious Incident Reporting Under Articles 73–75

Your incident reporting procedure should already be in place as part of the conformity assessment — but it becomes operationally critical after launch. Serious incidents (where the AI system caused or contributed to death, serious harm to health, or significant infrastructure disruption) must be reported to your national market surveillance authority immediately after a causal link is established — and in any event no later than 15 days after you become aware of the incident, shortened to 10 days in the event of a death and 2 days for a widespread infringement or serious disruption of critical infrastructure.

Establish clear, documented incident classification criteria before market placement. Your team should be able to classify an incident without needing a legal opinion in the moment.
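Those criteria can even be encoded so that the on-call engineer gets a deadline rather than a legal question. A sketch using the Article 73 timelines summarised above — the incident attributes are hypothetical and the mapping should be validated by counsel:

```python
from dataclasses import dataclass

@dataclass
class Incident:
    caused_death: bool = False
    serious_harm_to_health: bool = False
    critical_infrastructure_disruption: bool = False
    widespread_infringement: bool = False

def reporting_deadline_days(incident: Incident) -> int | None:
    """Days within which the national authority must be notified, or None."""
    if incident.widespread_infringement or incident.critical_infrastructure_disruption:
        return 2
    if incident.caused_death:
        return 10
    if incident.serious_harm_to_health:
        return 15
    return None  # not a serious incident under Article 73

print(reporting_deadline_days(Incident(serious_harm_to_health=True)))  # 15
```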

Annual Compliance Review

Even without a substantial modification, conduct a formal annual review of your Technical File and compliance programme. AI systems evolve in use — new edge cases emerge, user behaviour shifts, model performance changes subtly over time. An annual review catches drift from the original conformity finding before it becomes a gap that a market surveillance authority finds first.

Unorma’s Audit Simulation tool is designed to be run on a recurring basis — quarterly or semi-annually — to surface emerging compliance gaps against the current state of your documentation and system. The AI System Inventory tracks all your systems, their conformity status, and their next review dates in a single auditable register.

Frequently Asked Questions

Does our conformity assessment cover all EU member states, or do we need separate assessments per country?

A conformity assessment conducted under the EU AI Act is valid across all 27 EU member states. The CE marking and Declaration of Conformity signal compliance with the regulation as a whole — there are no national variations requiring separate assessment (unlike some product safety regimes where national specifications differ). However, national market surveillance authorities may conduct their own post-market audits of systems operating in their jurisdiction — and their documentation access rights under Article 74 mean your Technical File must be available in a language accessible to them, typically English plus the official language of your primary market.

Can we use an AI audit simulation tool as a substitute for the formal conformity assessment?

No. An audit simulation is a preparation tool — a structured exercise that identifies gaps in your documentation and system design before formal assessment. It does not produce a legally valid Declaration of Conformity or a Notified Body certificate. Its value is in significantly reducing the probability of non-conformity findings in the real assessment, and in giving your team a structured, prioritised remediation list. Think of it as the dress rehearsal, not the performance.

What happens if a system fails its conformity assessment?

For internal self-assessment, a “failure” means your own review identifies non-conformities — gaps in documentation or system design that must be remediated before you can sign the Declaration of Conformity. You simply address the gaps and conduct the assessment again. For Notified Body assessment, non-conformity findings are formal and categorised: minor non-conformities can typically be resolved within the assessment process with documentary evidence; major non-conformities require the assessment to be suspended while remediation is completed and verified. There is no regulatory penalty for a failed assessment — only for placing a non-compliant system on the market without completing conformity assessment.

How long does an EU AI Act conformity assessment take?

Internal self-assessment: the timeline is entirely within your control and depends on documentation completeness. With a complete Technical File and quality management system in place, an internal assessment can be conducted in 4 to 8 weeks. Notified Body assessment: expect 3 to 6 months from application to certificate in 2026, reflecting current demand and limited capacity. For systems requiring Notified Body assessment, start the engagement no later than 6 months before your target market placement date.

Which AI systems require a Notified Body for their conformity assessment?

Under Article 43, Notified Body assessment is required for: (1) AI systems that constitute safety components of products subject to third-party conformity assessment under Annex I legislation where a Notified Body is already involved — primarily Class IIa and above medical devices, certain machinery, and aviation products; and (2) remote biometric identification systems other than those used to verify identity with prior explicit consent. All other Annex III high-risk AI systems — including employment AI, credit scoring AI, education AI, and critical infrastructure AI — are eligible for internal self-assessment.

What is an AI conformity assessment under the EU AI Act?

A conformity assessment is the mandatory, evidence-based process through which a provider of a high-risk AI system verifies that the system complies with all applicable requirements under Chapter III of the EU AI Act (Articles 8–15). It must be completed before the system is placed on the EU market. Successful assessment produces an EU Declaration of Conformity and the right to apply the CE marking. Two routes are available: internal self-assessment (most Annex III systems) and third-party Notified Body assessment (biometric identification systems and AI in higher-risk regulated products).

Find out how audit-ready you actually are — before an assessor does.

Unorma’s Audit Simulation runs a structured adversarial check across all Article 9–15 requirements — testing documentation completeness, internal consistency, evidence quality, and operational realism. Get a compliance score and a prioritised fix list in under 30 minutes.
Run Your Free Audit Simulation →
Build Your Technical File First — Article 11 & Annex IV Guide →
← Back to the Ultimate Guide to EU AI Act Compliance
