⚡ TL;DR
- AI compliance record keeping is not optional housekeeping — it is a direct legal obligation under Articles 12, 17, 18, 19, and 72 of the EU AI Act, with specific retention minimums and access requirements.
- An “evidence vault” is the structured, tamper-evident repository that holds every compliance artefact your organisation needs to demonstrate: that a risk was identified, that a mitigation was applied, that a test was run before results were known, and that a human reviewed the output when they were supposed to.
- The difference between a compliant evidence vault and a shared folder of compliance documents is immutability, traceability, and accessibility on demand — and market surveillance authorities are increasingly asking to see all three.
There is a version of AI compliance that looks thorough but isn’t: a team creates all the required documents, stores them in a SharePoint folder, and considers the record-keeping obligation met. Then an enforcement situation arises — a regulator requests evidence, a deployer raises a liability question, or a serious incident triggers a market surveillance investigation. At that point, the question isn’t whether the documents exist. It’s whether they can prove when they were created, who approved them, which version of the system they apply to, and whether they were actually followed in practice.
That is a very different standard from “we have a folder of compliance files.” Meeting it requires deliberate infrastructure — an evidence vault designed from the start for regulatory access, tamper-evidence, and long-term retention. This post explains precisely what the EU AI Act requires from your compliance record keeping, what an evidence vault must contain to meet those requirements, and how to build one that serves both your internal governance needs and your regulatory obligations.
For the broader audit-readiness framework this record-keeping infrastructure supports, see our pillar guide: Audit-Ready AI: The Step-by-Step Guide to Passing a Conformity Assessment. For the Technical File those records feed into, see our Article 11 & Annex IV guide.
What the EU AI Act Actually Requires: The Legal Landscape for Record Keeping
AI compliance record keeping obligations are scattered across five Articles of the EU AI Act, each targeting a different category of record. Understanding which Article governs which record type prevents the common error of satisfying some obligations while inadvertently neglecting others.
| Article | Record Type | Who Keeps It | Minimum Retention | Key Requirement |
|---|---|---|---|---|
| Article 12 | Operational logs — AI system events, inputs, outputs, decisions | Provider (by design); Deployer (operationally) | 6 months minimum (retention set by Articles 19 and 26); longer for regulated sectors | Automatic logging capability built into the system; logs must capture enough context to reconstruct decisions |
| Article 17 | Quality Management System records — policies, procedures, decisions, audits | Provider | 10 years after the system is placed on the market or put into service | QMS documentation must reflect actual practice, not aspirational policy; records must demonstrate the QMS operates as documented |
| Article 18 | Technical File — the complete Annex IV documentation set, plus QMS documentation, notified body decisions, and the EU Declaration of Conformity | Provider | 10 years after the system is placed on the market or put into service | All historical versions must be retained, not just the current version; available to authorities on request within days |
| Article 19 | Automatically generated logs — the provider-held copies of the Article 12 logs | Provider | At least 6 months, unless other Union or national law requires longer | Logs under the provider's control must be kept and made available to market surveillance authorities on request |
| Article 72 | Post-market monitoring records — performance data, incident reports, corrective action records | Provider | Duration of post-market monitoring plan; serious incident records indefinitely | Monitoring data must be systematic and traceable; corrective actions must be linked to the triggering incident |
The common thread across all five Articles: records must not only exist but must be demonstrably authentic, attributable, and accessible. A PDF created after the fact, backdated to look contemporaneous, fails on all three counts.
The Four Properties of an Immutable Evidence Vault
An evidence vault is not a document management system with a compliance label. It has four specific properties that distinguish it from general document storage and that make it capable of satisfying regulatory requests.
Property 1: Tamper-Evidence
Every record in the vault must carry cryptographic or equivalent tamper-evidence — a mechanism that makes any retroactive modification detectable. In practice, this means append-only storage architectures where records are written once and cannot be overwritten; cryptographic hashing of documents at creation time with the hash stored separately; or blockchain-based timestamping that creates an externally verifiable creation record.
The tamper-evidence requirement is most critical for records whose value depends on proving temporal sequence: test plans that must predate test results; risk assessments that must predate design decisions; threshold documentation that must predate performance evaluations. Without tamper-evidence, a regulator examining those records has no way to verify the claimed sequence rather than a post-hoc reconstruction.
Open-source tooling: Git-based repositories with signed commits provide basic append-only evidence for document version history. For production-grade tamper-evidence, dedicated audit log storage systems such as immudb (open-source immutable database) or cloud-native audit log services (AWS CloudTrail with integrity validation, Google Cloud Audit Logs) provide cryptographic immutability at scale.
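As a minimal sketch of the hash-at-creation approach, the following Python registers a document's SHA-256 digest in a separate ledger and recomputes it on access. The function names, file paths, and in-memory ledger are illustrative stand-ins for a genuinely append-only store, not a production design:

```python
import hashlib
from datetime import datetime, timezone

def register_record(path, ledger):
    """Hash a compliance document at creation time and append the
    digest, with a UTC timestamp, to a separate ledger (an in-memory
    list here, standing in for a real append-only store)."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    entry = {
        "path": path,
        "sha256": digest,
        "registered_at": datetime.now(timezone.utc).isoformat(),
    }
    ledger.append(entry)
    return entry

def verify_record(path, entry):
    """Recompute the hash on access; a mismatch means the document
    was modified after registration."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest() == entry["sha256"]
```

The key design point is that the ledger lives apart from the documents themselves, under different access control, so that modifying a document and its hash together requires compromising two systems.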
Property 2: Traceability
Every record in the vault must be traceable — linked to the specific system version it applies to, the specific person who created or approved it, and the specific compliance requirement it satisfies. Without traceability, records become an unstructured evidence pile that takes weeks to navigate during a regulatory request or litigation discovery process.
Traceability requires a metadata schema applied consistently to every record: system name and version, record type (risk assessment, test result, incident report, etc.), Article and Annex IV section reference, author, approver, creation date, and links to related records (the incident report linked to the corrective action; the test result linked to the pre-registered test plan). Dublin Core metadata terms provide a lightweight standard schema that can be applied to compliance documents without proprietary tooling.
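A traceability record built on that schema might look like the following sketch. The `dcterms:` fields are genuine Dublin Core terms; the `aiact:` and `vault:` fields, and all values, are illustrative extensions rather than a published standard:

```python
# Hypothetical metadata record for one vault entry. Dublin Core terms
# (dcterms:) carry the generic fields; the aiact: and vault: prefixes
# are illustrative compliance-specific extensions.
record_metadata = {
    "dcterms:title": "Bias evaluation report - credit scoring model",
    "dcterms:creator": "jane.doe@example.com",              # author
    "dcterms:contributor": "approver: j.smith@example.com",  # approver
    "dcterms:created": "2025-03-14T10:22:00Z",
    "dcterms:type": "test-result",
    "dcterms:isVersionOf": "system:credit-scorer/v2.3.1",    # system + version
    "aiact:articleReference": "Article 10",
    "aiact:annexIVSection": "2(g)",
    "vault:relatedRecords": ["test-plan-0042"],  # the pre-registered test plan
}
```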
Property 3: Completeness over Time
The vault must contain the complete compliance history of every AI system — not just current records, but the full version lineage. A market surveillance authority investigating a serious incident in 2027 needs to be able to examine the Technical File, risk assessment, and test results that applied to the version of the system operating at the time of the incident — not the current version.
This means: every Technical File update creates a new version record linked to the preceding version; every model retrain creates a performance record linked to the corresponding model version; every risk log update preserves the complete previous log rather than overwriting entries. Version control systems like Git handle this naturally for text documents; for binary artefacts (model weights, evaluation datasets), use content-addressed storage with immutable references.
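Content addressing for binary artefacts can be sketched in a few lines: the artefact is stored under its own SHA-256 digest, so a reference can never silently point at changed bytes and superseded versions remain retrievable. The directory layout and function names below are assumptions for illustration:

```python
import hashlib
from pathlib import Path

def store_artefact(data, vault_dir):
    """Store a binary artefact (e.g. model weights) under its own
    SHA-256 digest. Changing the bytes changes the address, so an
    existing reference always resolves to the original content."""
    digest = hashlib.sha256(data).hexdigest()
    path = Path(vault_dir) / digest[:2] / digest  # fan out by prefix
    path.parent.mkdir(parents=True, exist_ok=True)
    if not path.exists():  # write once; never overwrite
        path.write_bytes(data)
    return digest  # the immutable reference

def load_artefact(digest, vault_dir):
    """Retrieve an artefact and verify its integrity on read."""
    data = (Path(vault_dir) / digest[:2] / digest).read_bytes()
    if hashlib.sha256(data).hexdigest() != digest:
        raise ValueError("stored artefact does not match its digest")
    return data
```

Storing a retrained model this way yields a new digest, while the digest recorded in the 2026 Technical File still resolves to the 2026 weights.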
Property 4: Accessibility on Demand
Article 64 gives market surveillance authorities the right to request access to source code, logs, and documentation. Article 18 requires the Technical File to be available on request. The practical requirement is that you can produce a complete, organised compliance package for a specific system at a specific point in time within days — not weeks of archive hunting.
This requires: an index that allows querying by system name, version, time period, and record type; export functionality that produces a structured package from that index; and access control that allows designated persons to respond to regulatory requests without requiring the entire technical team to participate. The ENISA AI Threat Landscape also recommends audit log accessibility as a core AI security control — making the evidence vault simultaneously a compliance artefact and a security control.
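A minimal index query over the traceability metadata might look like this sketch, where the field names (`system`, `version`, `type`, `created`) are illustrative rather than a fixed schema:

```python
def query_index(index, system, version=None, record_type=None, period=None):
    """Filter a vault index (a list of metadata dicts) by system name,
    version, record type and time period: the dimensions a regulatory
    request is typically scoped to."""
    results = []
    for rec in index:
        if rec["system"] != system:
            continue
        if version is not None and rec["version"] != version:
            continue
        if record_type is not None and rec["type"] != record_type:
            continue
        if period is not None:
            start, end = period  # inclusive bounds
            if not (start <= rec["created"] <= end):
                continue
        results.append(rec)
    return results
```

In practice this sits behind the export functionality: the same filter that answers "all test results for credit-scorer v2.3.1 in Q1 2025" drives the packaging of a regulatory response.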
What Your Evidence Vault Must Contain: The Complete Record Inventory
Across the five Article obligations, a complete evidence vault for a high-risk AI system contains six categories of records:
Category 1: Design and Development Records
- Architecture decision records (ADRs) with dates showing when design decisions were made
- Third-party component compliance records — documentation from each upstream provider
- Data governance records: dataset registry entries, DVC pipeline lineage, preprocessing documentation, licence records
- Bias evaluation records: test methodology, tool configuration, results by demographic subgroup, acceptability assessment
Category 2: Risk Management Records
- Risk register with timestamped entries — risk identification date, assessed severity, assigned mitigation, mitigation completion date, residual risk assessment
- Risk review meeting records showing periodic review activity across the system lifecycle
- Evidence linking each identified risk to a design decision or control — demonstrating the risk management system influences actual product decisions
Category 3: Testing and Validation Records
- Test plans with creation timestamps predating test execution
- Pre-registered performance thresholds with timestamps predating evaluation runs
- Complete evaluation results including per-subgroup metrics and failure case analysis
- Adversarial robustness testing records — attack types tested, methodology, results
- Cybersecurity testing records — penetration test reports, SAST/DAST results
Category 4: Operational Logs
- System event logs: every AI decision made, with input summary, output, confidence score, timestamp, and user/context identifier
- Human override logs: every instance where a human operator reviewed and modified or rejected an AI output
- System error and anomaly logs: every instance of unexpected behaviour, with context and resolution
- Access logs: who accessed which capabilities of the system, when
Category 5: Post-Market Monitoring Records
- Performance monitoring reports — regular snapshots of production performance metrics against documented thresholds
- Incident reports — every serious incident, with timeline, affected users, root cause, and corrective action
- Corrective action records — linked to the triggering incident, with completion dates
- Model update records — for each model version update, the performance comparison and documentation update record
Category 6: Conformity and Approval Records
- Conformity assessment procedure record — the completed Annex VI self-assessment or Notified Body assessment report
- Declaration of Conformity — signed, dated, with authorised signatory details
- EU AI database registration confirmation
- All Technical File versions with change history
The Logging Architecture: Building Article 12 Compliance In
Article 12’s operational logging requirement is the most technically specific record-keeping obligation — and the one most often addressed at too low a level of granularity. The requirement is not just that the system produces logs; it is that the logs capture enough information to reconstruct AI-assisted decisions after the fact, support the post-market monitoring required by Article 72, and allow the investigating authority to understand what the system did in any specific incident.
The minimum logging specification for a high-risk AI system’s operational logs:
- Decision record: Unique decision ID; timestamp (UTC, millisecond precision); model name and version; input features (or a hash of inputs where raw data is sensitive); output (recommendation, score, classification); confidence or uncertainty indicator
- Context record: Deployer identifier; operator identifier; use context; whether human override was applied and what the override decision was
- System state record: Model version; configuration version; feature flags active; upstream data source version (e.g., RAG knowledge base version)
Implement logs using append-only storage — Apache Kafka with log compaction disabled is a common enterprise choice for high-volume AI decision logging that is both append-only and queryable. For lower-volume systems, a write-once database table with a cryptographic chain (each record includes the hash of the previous record) provides tamper-evidence without enterprise infrastructure overhead.
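The hash-chained, write-once approach can be sketched as follows. Field names follow the minimum specification above, but this is an illustrative outline under assumed names, not a production logging implementation:

```python
import hashlib
import json
from datetime import datetime, timezone

class DecisionLog:
    """Append-only AI decision log with a cryptographic chain: each
    entry records the hash of the previous entry, so any retroactive
    edit breaks verification from that point onward."""

    def __init__(self):
        self.entries = []
        self._last_hash = "0" * 64  # genesis value for the first entry

    def append(self, decision_id, model_version, input_hash, output, confidence):
        entry = {
            "decision_id": decision_id,
            "timestamp": datetime.now(timezone.utc).isoformat(timespec="milliseconds"),
            "model_version": model_version,
            "input_hash": input_hash,  # hash inputs where raw data is sensitive
            "output": output,
            "confidence": confidence,
            "prev_hash": self._last_hash,
        }
        # Hash the canonical JSON form of the entry (sorted keys).
        self._last_hash = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()
        ).hexdigest()
        sealed = {**entry, "entry_hash": self._last_hash}
        self.entries.append(sealed)
        return sealed

    def verify(self):
        """Walk the chain, recomputing every hash; returns False if any
        entry was altered or reordered after being written."""
        prev = "0" * 64
        for e in self.entries:
            body = {k: v for k, v in e.items() if k != "entry_hash"}
            if body["prev_hash"] != prev:
                return False
            if hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest() != e["entry_hash"]:
                return False
            prev = e["entry_hash"]
        return True
```

In a real deployment the chain head would be periodically anchored outside the writing system (for example, published to a separate audit service), so that truncating the tail of the log is also detectable.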
Separate AI decision logs from general application logs in storage, access control, and retention policy. Decision logs are regulated compliance artefacts; general application logs are operational infrastructure. Conflating them creates both access control risks (compliance-sensitive data accessible to too many people) and retention risks (compliance records deleted under general log rotation policies).
Frequently Asked Questions
What is the minimum record retention period under the EU AI Act?
The minimum retention periods vary by record type. Operational logs must be retained for at least six months: by providers under Article 19 and by deployers under Article 26(6), while Article 12 itself requires the system to be designed with the logging capability. The Technical File, Quality Management System records, and the conformity documentation listed in Article 18 (including the EU Declaration of Conformity) must be retained for ten years after the AI system is placed on the market or put into service. Post-market monitoring records (Article 72) must be retained for the duration of the monitoring programme, and serious incident records should be retained indefinitely given their potential use in future liability proceedings. Note that GDPR data minimisation obligations create a competing constraint for records that contain personal data — reconcile the retention obligation with your GDPR retention policy by retaining anonymised or pseudonymised versions after the GDPR retention period expires.
What does “immutable” mean in the context of AI compliance record keeping?
In this context, immutability means that once a compliance record is created and stored, it cannot be modified retroactively without detection. This is distinct from “read-only” (which can still be replaced). True immutability is typically achieved through append-only storage architectures where existing entries cannot be overwritten; cryptographic hashing where a hash of each document is stored at creation and recomputed on access to detect modification; or blockchain timestamping where external verification of creation time is available independently of the storing organisation’s systems. The practical purpose is to give regulators confidence that compliance records reflect what actually happened rather than a post-hoc reconstruction.
How do GDPR data subject rights interact with the AI Act’s record retention obligations?
There is a genuine tension here. GDPR Article 17 (right to erasure) could in principle require deletion of operational log records containing an individual’s data, while the AI Act requires those same records to be retained for at least six months (Article 19 for providers; Article 26(6) for deployers). The resolution recognised by data protection authorities is that the AI Act retention obligation constitutes a legitimate legal basis for retaining the records beyond what would otherwise be permissible under GDPR — provided the records are retained only for the minimum period specified in the AI Act, access is restricted to persons with a genuine compliance or legal need, and the records are deleted promptly when the retention period expires. Document this reconciliation in your Records of Processing Activities. See our post on the DPO’s role in AI governance for more on GDPR/AI Act interactions.
Can we use standard cloud storage (S3, SharePoint, Google Drive) for our evidence vault?
These platforms can form part of the evidence vault infrastructure, but they require specific configurations to meet the immutability and access requirements. S3 with Object Lock in Compliance mode provides WORM (write-once, read-many) storage with legally enforceable retention policies. SharePoint with versioning and access logging can provide traceability but not tamper-evidence without additional controls. Google Drive alone provides neither tamper-evidence nor the granular access auditing that a regulatory request would require. Whatever platform you use, the four properties — tamper-evidence, traceability, completeness over time, and accessibility on demand — must be explicitly configured, not assumed to exist by default.
Who within our organisation should own the evidence vault?
Ownership should be joint between compliance and engineering, with the DPO in an oversight role. Compliance owns the record retention policy, access controls, and regulatory response process. Engineering owns the technical infrastructure, logging architecture, and automated feeds into the vault. The DPO owns the reconciliation between AI Act retention obligations and GDPR data protection requirements, and the response to any data subject access requests or erasure requests that affect vault records. A clear RACI — with a single named person accountable for producing a complete compliance package within 72 hours of a regulatory request — is the operational test of whether vault ownership is genuinely established.
Build your compliance evidence vault before you need it.
Unorma’s AI System Inventory includes an integrated evidence vault — append-only record storage with cryptographic timestamps, full Annex IV version history, and one-click compliance package export for regulatory requests. Explore the Unorma Evidence Vault →

Jasper Claes is a Compliance Manager and consultant specializing in AI governance for high-scale technology companies operating in regulated markets. He advises product and legal teams on implementing practical compliance frameworks aligned with evolving regulations such as the EU AI Act. Through his writing, Jasper focuses on translating complex regulatory requirements into clear, actionable guidance for teams building and deploying AI systems.