
Building an AI Data-Governance Program Under Article 10

Guide · 23 May 2026 · 16 min read · 3,117 words

How to implement Article 10 of the EU AI Act in practice: dataset provenance, bias assessment, the Article 10(5) exception, GDPR overlap, and Annex IV documentation.

Article 10 of Regulation (EU) 2024/1689 requires providers of high-risk AI systems to subject their training, validation, and testing datasets to documented governance practices. This is not a general data-protection rule; that is GDPR's territory. Article 10 is specifically about the quality, provenance, representativeness, and lineage of the datasets that shape a high-risk model's behaviour. If your system falls under Annex III (recruitment screening, credit scoring, biometric identification, and the other five high-risk areas), Article 10 binds you as a provider. Deployers have a lighter but real role too, as discussed below.

This page is about how to implement an Article 10 data-governance program in practice. The legal text of Article 10 itself — what it says and why it exists — is covered on the Article 10 reference page. Read that page alongside this one.

The high-risk deadline has moved. Under the Digital Omnibus agreed in May 2026, stand-alone Annex III systems must comply from 2 December 2027 (Annex I embedded systems from 2 August 2028), pushing back the original August 2026 date. That is lead time, not a reason to defer — building a defensible data-governance record takes several months even for a well-organised team.


What Article 10 Actually Covers

Article 10 applies to three dataset types: training data (used to fit the model), validation data (used to tune hyperparameters and select architecture during development), and test data (held back for the final pre-deployment evaluation). All three must be governed. Validation data is frequently overlooked because it feels internal to the development process — regulators examining a conformity assessment file will look for it.

For each dataset type, Article 10 imposes four substantive requirements:

  1. Relevance and representativeness — datasets must reflect the intended purpose and deployment population. A credit-scoring model trained on applicants from a single region is not representative of a pan-EU deployment, and your technical file must either justify that limitation or document how you compensated for it.
  2. Sufficient quantity — enough data to meet the statistical reliability the use case demands. There is no fixed threshold; documented rationale is the standard.
  3. Completeness and freedom from errors — quality-assurance processes must identify missing values, inconsistencies, and duplicates. You do not need zero errors; you need documented procedures and remediation records.
  4. Bias assessment and the duty to mitigate — Article 10(2) requires you to examine datasets for biases that could lead to discriminatory outcomes or other risks. Where bias is found, you must take measures to address it. This duty to mitigate, not merely document, is often understated.

One provision is worth calling out on its own: Article 10(5). Ordinarily, GDPR Article 9 prohibits processing special-category data (racial or ethnic origin, health, sexual orientation, and the rest). Article 10(5) carves out a narrow exception: providers may process special-category data — subject to appropriate safeguards — specifically and solely for the purpose of detecting and correcting bias in high-risk AI systems. The exception does not extend to other purposes. If you invoke it, document the safeguards explicitly; the justification must sit in your technical file.


The GDPR Intersection

Data governance for AI does not operate in isolation from GDPR. The two frameworks overlap in important ways.

GDPR Article 5 (data-processing principles) requires that personal data be accurate, adequate, relevant, and limited to what is necessary. These principles run in parallel with Article 10's representativeness and completeness requirements. A dataset that fails Article 5's accuracy principle likely fails Article 10's completeness standard too — but the obligations are independent, and each needs its own documented compliance trail.

GDPR Article 9 (special-category data) intersects directly with the Article 10(5) bias-detection exception. If you process health, biometric, or origin data to audit your model for bias, you need both a valid legal basis under GDPR and a documented invocation of the Article 10(5) exception under the AI Act. Neither alone is sufficient.

GDPR Article 35 (Data Protection Impact Assessment) and AI Act Article 27 (Fundamental Rights Impact Assessment) are distinct instruments. A DPIA examines personal-data processing risks. A FRIA examines fundamental-rights impact and is required for deployers that are public bodies or private entities providing public services, and for deployers of Annex III point 5(b) and 5(c) systems (creditworthiness assessment, and risk assessment and pricing for life and health insurance). Article 27(4) lets the FRIA complement an existing DPIA where the DPIA already covers part of the same ground. They are not interchangeable.

If your legal team is maintaining a data-processing register under GDPR Article 30, the dataset inventory required by Article 10 can be developed alongside it — but the Article 10 record needs detail that a GDPR register typically does not capture: provenance at the field level, preprocessing transformations, bias-test results, version history.


Provenance and Collection: Where Data Comes From Matters

A data-governance program starts before a single row enters a model. Article 10 requires you to document how each dataset was collected and where it originated. For training data assembled from historical records, the questions are: who generated the data, under what conditions, at what point in time, and subject to what selection criteria? For data sourced from third parties or public datasets, you need the provider's documentation of collection methodology, and if that documentation does not exist or is inadequate, you bear the risk.

Practically, this means maintaining a dataset register that captures for each dataset: the name and version, the source (internal system, third-party supplier, public corpus), the collection date range, the collection method, any licensing or consent terms, the intended purpose, and the preprocessing applied before use. This is not optional record-keeping — it feeds directly into Annex IV (the technical documentation content areas required by Article 11), specifically the section on training methodologies and datasets.
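A register entry of this kind can be sketched as a small typed record. The following is a minimal illustration in Python, not a prescribed format: the class name `DatasetRecord`, its field names, and the sample values are all hypothetical and would need to match whatever structure your technical file actually uses.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date


@dataclass(frozen=True)
class DatasetRecord:
    """One row of a hypothetical dataset register (field names are illustrative)."""
    name: str
    version: str
    source: str                 # e.g. "internal system", "third-party supplier", "public corpus"
    collected_from: date
    collected_to: date
    collection_method: str
    licence_or_consent: str
    intended_purpose: str       # e.g. "training", "validation", "testing"
    preprocessing: tuple = field(default_factory=tuple)

    def __post_init__(self):
        # Reject an obviously broken entry before it reaches the register.
        if self.collected_from > self.collected_to:
            raise ValueError("collection date range is inverted")

    def to_json(self) -> str:
        # Dates are written as ISO strings so the entry can be archived as plain text.
        return json.dumps(asdict(self), default=str, indent=2)


if __name__ == "__main__":
    entry = DatasetRecord(
        name="cv-screening-train",          # hypothetical dataset name
        version="1.0.0",
        source="internal system",
        collected_from=date(2019, 1, 1),
        collected_to=date(2023, 12, 31),
        collection_method="export of historical hiring decisions",
        licence_or_consent="internal records",
        intended_purpose="training",
        preprocessing=("deduplicated", "free-text fields removed"),
    )
    print(entry.to_json())
```

Making the record immutable (`frozen=True`) is a design choice: a change to a dataset then produces a new versioned entry instead of silently overwriting the old one.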

For a 40-person HR-tech company building a CV-screening tool (Annex III, point 4(a)), the dataset register might show that training data came from the company's own historical hiring decisions over five years. That provenance immediately raises a question Article 10 forces you to answer: do those historical decisions reflect past discriminatory patterns? The answer shapes the bias-assessment work ahead.


Preparation: Labelling, Cleaning, and the Paper Trail

Data preparation — cleaning, labelling, annotation — must also be documented. If you are using labelled data, the labelling protocol matters: who labelled it, what instructions were given, what inter-annotator agreement rate was achieved, and how disagreements were resolved. If the labelling instructions encoded a contested category (for example, "qualified candidate" defined by criteria that themselves contain bias), the model inherits that bias regardless of how carefully the data is cleaned.
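Inter-annotator agreement is often reported as Cohen's kappa, which corrects raw agreement between two annotators for the agreement expected by chance. Article 10 does not prescribe a metric, so treat the following as one possible sketch; the function name and the sample labels are illustrative.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)

    # Observed agreement: share of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability of agreeing if each annotator labelled
    # at random according to their own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    annotator_1 = ["qualified", "qualified", "unqualified", "unqualified"]
    annotator_2 = ["qualified", "unqualified", "unqualified", "unqualified"]
    print(cohens_kappa(annotator_1, annotator_2))
```

A kappa value says how consistently the protocol was applied, not whether the protocol itself is unbiased. Two annotators can agree perfectly on a biased definition of "qualified candidate".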

Cleaning procedures must be logged. If you imputed missing values, document the imputation method and the proportion of records affected. If you excluded records, document the exclusion criteria and the volume removed. If you standardised formats across data sources, document the transformation rules. Regulators do not expect a pristine dataset; they expect evidence that you took quality seriously and made documented decisions.
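One way to make that logging automatic is to have each cleaning step return a log entry alongside the cleaned data. The sketch below assumes records held as a list of dictionaries and uses median imputation purely as an example; the function name, the `max_valid` ceiling, and the log field names are hypothetical.

```python
from statistics import median


def clean_numeric_field(rows, field, max_valid):
    """Exclude out-of-range records, impute missing values, and log both steps."""
    if not rows:
        raise ValueError("no records to clean")
    log = []

    # Step 1: exclusion. Records above the documented ceiling are dropped.
    kept = [r for r in rows if r.get(field) is None or r[field] <= max_valid]
    removed = len(rows) - len(kept)
    log.append({
        "step": "exclude",
        "rule": f"{field} > {max_valid}",
        "records_affected": removed,
        "share": removed / len(rows),
    })

    # Step 2: imputation. Missing values are replaced by the median of known values.
    known = [r[field] for r in kept if r.get(field) is not None]
    fill = median(known) if known else None
    missing = sum(r.get(field) is None for r in kept)
    cleaned = [
        {**r, field: fill if r.get(field) is None else r[field]} for r in kept
    ]
    log.append({
        "step": "impute",
        "rule": f"median of known {field} values",
        "fill_value": fill,
        "records_affected": missing,
        "share": missing / len(kept) if kept else 0.0,
    })
    return cleaned, log


if __name__ == "__main__":
    sample = [{"x": 10}, {"x": None}, {"x": 30}, {"x": 5000}]
    cleaned, log = clean_numeric_field(sample, "x", max_valid=100)
    for entry in log:
        print(entry)
```

Because the log records both the rule and the proportion of records affected, it answers the two questions a reviewer is most likely to ask about any cleaning step.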

Version-control your datasets. If you retrain on an updated dataset six months after initial deployment, the new version needs its own governance record. The conformity assessment conducted under Article 43 covers the system at a point in time; if the training data materially changes, the assessment may need to be revisited.
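A content hash is one lightweight way to tie a governance record to an exact dataset version. The following sketch assumes the dataset can be iterated as JSON-serialisable rows; the function name is illustrative, and a dedicated data-versioning tool may be a better fit for large datasets.

```python
import hashlib
import json


def dataset_fingerprint(rows):
    """Return a SHA-256 hex digest of the rows in their given order."""
    digest = hashlib.sha256()
    for row in rows:
        # sort_keys gives a canonical form, so key order inside a row does not matter.
        canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
        digest.update(canonical.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


if __name__ == "__main__":
    v1 = [{"id": 1, "label": "hire"}, {"id": 2, "label": "reject"}]
    v2 = [{"id": 1, "label": "hire"}, {"id": 2, "label": "hire"}]
    print(dataset_fingerprint(v1))
    print(dataset_fingerprint(v1) == dataset_fingerprint(v2))
```

The fingerprint is sensitive to row order by design. If row order carries no meaning in your pipeline, sort the rows by a stable key before hashing so that a reshuffle does not register as a new version.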


Representativeness and the Deployment Population

Article 10's representativeness requirement has a practical test: does your dataset reflect the population the system will encounter in production? A credit-scoring model trained on data from one member state is not representative if it will be deployed across the EU. A recruitment tool trained on applications from one industry sector will behave differently when applied to another.

Representativeness is not the same as demographic balance. You do not need equal numbers from every subgroup. What you need is documented evidence that the distribution in your dataset is appropriate for the intended deployment context, and where gaps exist, a reasoned assessment of whether those gaps create material bias risk.

One common mistake: conflating representativeness with size. A large dataset that systematically excludes a population is less representative than a smaller, well-stratified one. Document both size and composition.
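Composition can be checked by comparing each subgroup's share of the dataset with its share of the intended deployment population. The sketch below is illustrative only: the reference shares, the member-state codes, and the 5-percentage-point tolerance are assumptions, and an appropriate tolerance depends on the use case.

```python
from collections import Counter


def share_gaps(sample_values, reference_shares):
    """Dataset share minus reference share for each group in the reference population."""
    if not sample_values:
        raise ValueError("sample is empty")
    counts = Counter(sample_values)
    n = len(sample_values)
    return {
        group: counts.get(group, 0) / n - ref
        for group, ref in reference_shares.items()
    }


def flag_underrepresented(gaps, tolerance=0.05):
    """Groups whose dataset share falls short of the reference by more than the tolerance."""
    return sorted(group for group, gap in gaps.items() if gap < -tolerance)


if __name__ == "__main__":
    # Hypothetical figures: applicant country in the dataset vs. intended deployment.
    sample = ["DE"] * 8 + ["FR"] * 2
    reference = {"DE": 0.5, "FR": 0.3, "PL": 0.2}
    gaps = share_gaps(sample, reference)
    print(gaps)
    print(flag_underrepresented(gaps))
```

A flagged group is a prompt for the reasoned assessment described above, not an automatic failure. The gap may be justified, compensated for elsewhere, or immaterial to the system's purpose.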


Bias Assessment and the Mitigation Duty

This is the most technically demanding part of Article 10, and the part most often misread. The obligation is not merely to detect bias — it is to detect and then take measures to address it. Documentation of a bias without action does not satisfy Article 10.

A practical bias-assessment process for a high-risk system involves:

  1. Identifying the demographic characteristics relevant to the use case (gender, age, national origin, and others protected under the EU Charter of Fundamental Rights and applicable national law).
  2. Disaggregating model performance metrics (accuracy, false-positive rate, false-negative rate) by those characteristics.
  3. Documenting the observed disparities and their magnitude.
  4. Assessing whether the disparities are acceptable given the system's purpose and risk level, or whether they indicate a problem that must be addressed.
  5. If mitigation is required: implementing it (rebalancing data, adjusting thresholds, retraining, excluding biased features) and documenting what was done.
  6. Retesting after mitigation and documenting the results.

You are not required to achieve identical performance across all groups — that is often impossible and can itself introduce distortions. You are required to have examined the question, documented your findings, and taken proportionate action. The test dataset is the primary vehicle for this; it should be designed to enable disaggregated analysis, which means it needs to contain enough examples from each relevant subgroup to support statistically meaningful comparisons.
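The disaggregation step can be sketched for a binary classifier as follows. This is a minimal illustration, assuming each test record carries a group label, a true outcome, and a predicted outcome encoded as 0 or 1; the function names are hypothetical, and the metrics shown are the three named in the process above.

```python
from collections import defaultdict


def rates_by_group(records):
    """Accuracy, false-positive rate, and false-negative rate per group.

    `records` is an iterable of (group, y_true, y_pred) with labels in {0, 1}.
    """
    cells = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, y_true, y_pred in records:
        if y_true not in (0, 1) or y_pred not in (0, 1):
            raise ValueError("labels must be 0 or 1")
        # "t"/"f" says whether the prediction was right; "p"/"n" is the predicted class.
        key = ("t" if y_true == y_pred else "f") + ("p" if y_pred == 1 else "n")
        cells[group][key] += 1

    rates = {}
    for group, c in cells.items():
        negatives = c["fp"] + c["tn"]   # items whose true label is 0
        positives = c["tp"] + c["fn"]   # items whose true label is 1
        rates[group] = {
            "n": negatives + positives,
            "accuracy": (c["tp"] + c["tn"]) / (negatives + positives),
            "fpr": c["fp"] / negatives if negatives else None,
            "fnr": c["fn"] / positives if positives else None,
        }
    return rates


def max_gap(rates, metric):
    """Largest difference in a metric between any two groups, ignoring undefined values."""
    values = [r[metric] for r in rates.values() if r[metric] is not None]
    return max(values) - min(values) if values else None


if __name__ == "__main__":
    test_set = [
        ("A", 1, 1), ("A", 0, 0), ("A", 0, 1), ("A", 1, 0),
        ("B", 1, 1), ("B", 1, 1), ("B", 0, 0), ("B", 0, 0),
    ]
    rates = rates_by_group(test_set)
    print(rates)
    print(max_gap(rates, "fpr"))
```

Reporting `n` next to each rate matters: a disparity measured on a handful of examples is weak evidence either way, which is why the test set needs enough records per subgroup.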

If demographic information is not available in your dataset, document that limitation. Use proxy analysis where appropriate, and document the methodology. If the gap cannot be closed, document compensatory measures — for example, enhanced human oversight under Article 14 at deployment, or post-market monitoring under Article 72 with bias-detection checks once real-world outputs accumulate.


Data Quality Gates

A quality gate is a checkpoint at which data is evaluated against defined criteria before it progresses to the next stage of development. Building quality gates into your data pipeline operationalises Article 10; it is also more defensible than a retrospective quality review conducted only when regulators ask.

A minimum set of gates for a high-risk AI development process:

  • Ingestion gate: automated checks for completeness (required fields populated), format consistency (dates in expected format, numeric values within expected range), and absence of known corrupt record patterns.
  • Pre-training gate: review of dataset composition against the representativeness requirements documented for this system; confirmation that training and validation sets are separate.
  • Pre-test gate: confirmation that test data has not been used in any prior development phase; review of test dataset composition for bias-assessment coverage.
  • Post-bias-assessment gate: sign-off that bias analysis has been conducted, results documented, and any required mitigation completed before the system advances to conformity assessment.

Each gate produces a record — a log entry, a signed-off checklist, an automated report. Those records feed the Article 11 technical file.
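Two of these gates lend themselves to simple automation. The sketch below shows an ingestion check and a check that dataset splits share no record IDs, each returning a timestamped record. The function names, record fields, and sample thresholds are assumptions for illustration, not a required schema.

```python
from datetime import datetime, timezone


def gate_record(gate, failures, checked):
    """Build the record a gate leaves behind, whether it passed or failed."""
    return {
        "gate": gate,
        "checked": checked,
        "passed": not failures,
        "failures": failures,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def ingestion_gate(rows, required_fields, numeric_ranges):
    """Check that required fields are populated and numeric values are in range."""
    failures = []
    for index, row in enumerate(rows):
        for name in required_fields:
            if row.get(name) in (None, ""):
                failures.append((index, name, "missing"))
        for name, (low, high) in numeric_ranges.items():
            value = row.get(name)
            if isinstance(value, (int, float)) and not low <= value <= high:
                failures.append((index, name, "out of range"))
    return gate_record("ingestion", failures, len(rows))


def split_gate(splits):
    """Check that no record ID appears in more than one split.

    `splits` maps a split name (e.g. "train") to an iterable of record IDs.
    """
    names = sorted(splits)
    failures = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            shared = set(splits[first]) & set(splits[second])
            if shared:
                failures.append((first, second, sorted(shared)))
    checked = sum(len(set(ids)) for ids in splits.values())
    return gate_record("split-separation", failures, checked)


if __name__ == "__main__":
    rows = [{"id": 1, "age": 30}, {"id": 2, "age": None}, {"id": 3, "age": 200}]
    print(ingestion_gate(rows, ["id", "age"], {"age": (16, 100)}))
    print(split_gate({"train": [1, 2], "validation": [3], "test": [2, 4]}))
```

The composition review and bias sign-off gates involve human judgement and are better captured as signed checklists, but they can reuse the same record shape so that all four gates archive consistently.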


Documentation That Feeds Annex IV

Article 11 requires providers of high-risk AI systems to maintain technical documentation as specified in Annex IV. Annex IV lists nine content areas. Most of the Article 10 evidence sits under point 2, the detailed description of the system's elements and development process:

  • Annex IV, point 2(a): the methods and steps performed to develop the system, including any recourse to pre-trained systems or tools provided by third parties.
  • Annex IV, point 2(d): the data requirements, set out as datasheets describing the training methodologies and techniques and the training datasets used, including their provenance, scope, and main characteristics, how the data was obtained and selected, and the labelling and cleaning procedures applied.
  • Annex IV, point 2(g): the validation and testing procedures used, including information about the validation and testing data and their main characteristics.

Your Article 10 dataset register, quality-gate records, bias-assessment reports, and mitigation decisions are the raw material that populates these Annex IV sections. If you treat data governance as a development activity and technical documentation as a separate compliance activity, you will duplicate effort and create inconsistencies. The more efficient approach: design your data-governance records so that they can be directly incorporated into the Annex IV technical file.

The technical documentation must be compiled before the conformity assessment under Article 43, retained for ten years after the system is placed on the market (Article 18), and made available to competent authorities on request.


Roles and Lineage

Providers carry the primary Article 10 burden. If you develop a high-risk AI system and place it on the market or put it into service under your own name or trademark, you are a provider within the meaning of Article 3(3), and Article 16 makes you responsible for meeting Article 10 in full. This includes SaaS companies shipping AI-driven products and companies integrating third-party models into systems they label and sell.

Deployers (organisations that use a high-risk system developed by someone else) are not directly bound by Article 10's development-phase requirements. But deployers are not passive. Article 26 requires deployers to use systems in accordance with the provider's instructions, and where a deployer exercises control over the input data, Article 26(4) requires that data to be relevant and sufficiently representative for the intended purpose. Deployers must also monitor the system's operation under Article 26(5) and report risks to the provider. That feedback supports the provider's post-market monitoring under Article 72, including the detection of bias drift as real-world outputs accumulate.

Role-shift under Article 25 is worth noting explicitly. A deployer that substantially modifies a high-risk system, or that places it on the market under its own name, becomes a provider for the modified or rebranded system. The Article 10 obligations then apply in full to whatever the original provider's governance records did not cover.

Data lineage — the audit trail showing where data came from, how it was transformed, and how it was used — serves two Article 10 functions. First, it enables the competent authority to verify your governance claims. Second, it enables you to trace a bias finding back to its source: if a post-market bias detection reveals a problem, lineage documentation tells you whether the issue originates in training data composition, in preprocessing decisions, or in deployment-context drift.
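Tracing a finding back to its source amounts to walking the lineage record from the affected artifact to its root datasets. The sketch below assumes lineage is stored as a mapping from each artifact to its parent artifacts and the transformation that produced it; the structure, function name, and artifact names are all hypothetical.

```python
def trace_sources(lineage, artifact):
    """Walk parent links from an artifact back to its root sources.

    `lineage` maps an artifact name to {"parents": [...], "transform": "..."}.
    Returns (sorted root sources, list of (artifact, transform) steps visited).
    """
    roots, steps = [], []
    stack, seen = [artifact], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # shared ancestors are visited once
        seen.add(node)
        entry = lineage.get(node)
        if entry is None or not entry["parents"]:
            roots.append(node)  # nothing upstream is recorded: treat as a source
            continue
        steps.append((node, entry["transform"]))
        stack.extend(entry["parents"])
    return sorted(roots), steps


if __name__ == "__main__":
    lineage = {
        "model-v2": {"parents": ["train-v2"], "transform": "fit"},
        "train-v2": {
            "parents": ["hr-export-2023", "vendor-feed"],
            "transform": "merge + deduplicate",
        },
    }
    sources, steps = trace_sources(lineage, "model-v2")
    print(sources)
    print(steps)
```

The returned steps list every transformation between the sources and the artifact, which is where a preprocessing decision that introduced or amplified a bias would show up.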


How Confir Helps

Confir's AITR module structures the data-governance evidence your technical file needs. The intake questions map to Article 10's requirements — dataset provenance, preparation procedures, representativeness assessment, bias-detection methodology, and mitigation documentation. Because the logic is rule-based and deterministic, the same inputs produce the same structured output every time; there is no inference gap between what you enter and what appears in the generated Annex IV documentation pack.

The generated technical documentation includes the Article 11 dataset sections pre-populated from your AITR responses, ready for review and sign-off before the Article 43 conformity assessment.


Frequently Asked Questions

Does Article 10 apply to AI systems that are not high-risk?

No. Article 10 data-governance obligations apply only to high-risk AI systems as classified under Article 6, by reference to Annex III or, for product-safety components, Annex I. Limited-risk systems (Article 50) and minimal-risk systems have no Article 10 obligations. If you are unsure whether your system qualifies, start with the Article 6(3) filter: a system that falls within an Annex III area can be excluded if it poses no significant risk of harm to health, safety, or fundamental rights. A system that profiles natural persons is always high-risk, and you must document your classification reasoning either way.

What is the Article 10(5) special-category-data exception and when can I use it?

Article 10(5) permits providers to process GDPR special-category data (health data, biometric data, racial or ethnic origin, and others listed in GDPR Article 9) for the specific purpose of detecting and correcting bias in high-risk AI datasets, subject to appropriate safeguards. The exception is narrow: it covers bias detection and correction, not general model training. You must document the safeguards applied, maintain a GDPR-compliant legal basis alongside the Article 10(5) invocation, and include the justification in your technical file. Invoking the exception for purposes beyond bias work voids its protection.

Who bears Article 10 responsibility when the provider uses a third-party base model?

The provider of the high-risk system is responsible for Article 10 compliance for the whole system — including any third-party base model incorporated into it. If the base-model vendor supplies data-governance documentation (training data characteristics, bias assessments), you should obtain and incorporate it. If that documentation is unavailable or inadequate, you must conduct your own evaluation of the base model's behaviour in your deployment context and document the findings. You cannot discharge your Article 10 obligation simply by pointing to a third party.

What does "representativeness" mean in practice for a company deploying across multiple EU member states?

Representativeness is assessed relative to the intended deployment population and use context. If your high-risk system will make consequential decisions about people across multiple member states, your training data should reflect the distribution of characteristics those people bring — including geographic, linguistic, socioeconomic, and demographic variation. You do not need exactly proportional sampling, but you need a documented assessment of whether material subgroups are represented at a level sufficient to support reliable model performance, and where they are not, a documented rationale or compensatory measure.

How does Article 10 interact with GDPR's data-minimisation principle?

GDPR Article 5(1)(c) requires personal data to be limited to what is necessary for the purpose (data minimisation). Article 10 requires datasets to be relevant, representative, and sufficient. These can pull in opposite directions: you may need more data from certain subgroups to satisfy representativeness while GDPR disfavours collecting data beyond necessity. The resolution is purpose specificity and documented proportionality: special-category data processed for bias detection and correction is permitted under Article 10(5) where it is strictly necessary and the safeguards are in place; data collected broadly "in case it helps" is not. Design your data-collection protocols to serve defined purposes and document those purposes before collection.

What penalty applies if Article 10 requirements are not met?

Non-compliance with Article 10 is non-compliance with a high-risk AI requirement, which falls under Article 99(4) of the Act: fines up to €15,000,000 or 3% of total worldwide annual turnover, whichever is higher. For SMEs and start-ups, Article 99(6) caps the fine at the lower of the percentage or the fixed amount — a genuine proportionality protection. There is no separate "Article 84" category, and figures such as "€30 million or 6%" do not exist in the Regulation.
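The arithmetic behind that ceiling is simple enough to sketch. The function below computes only the statutory maximum described above, not the fine an authority would actually impose; the function name and the turnover figures are illustrative.

```python
def article_99_4_ceiling(worldwide_turnover_eur, is_sme=False):
    """Maximum fine in euros under the Article 99(4) rule described above.

    Uses integer arithmetic; turnover is expected in whole euros.
    """
    fixed_amount = 15_000_000
    percentage_based = worldwide_turnover_eur * 3 // 100
    if is_sme:
        # Article 99(6): SMEs and start-ups face the lower of the two figures.
        return min(fixed_amount, percentage_based)
    return max(fixed_amount, percentage_based)


if __name__ == "__main__":
    print(article_99_4_ceiling(1_000_000_000))           # large company
    print(article_99_4_ceiling(20_000_000))              # smaller non-SME
    print(article_99_4_ceiling(20_000_000, is_sme=True)) # SME
```

For a company with €1 billion in turnover, 3% exceeds the fixed amount, so the percentage sets the ceiling. For an SME with €20 million in turnover, the lower figure applies, which is the 3% amount.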

Does documentation need to be in a specific format or language?

The Regulation does not prescribe a format. Annex IV sets out the required content for technical documentation; how you organise and present that content is your choice. Competent authorities in each member state operate in their own language, so if you anticipate an audit in Germany or France, having documentation available in that language is practical risk management even if not a strict legal requirement. Automated tools that generate structured Annex IV output in exportable form help significantly here.

