
Training Data, Validation Data and Testing Data Under the EU AI Act

Definition · 16 March 2026 · 10 min read

EU AI Act definitions of training, validation and testing data (Article 3) and data governance requirements for high-risk AI under Article 10.

Under Regulation (EU) 2024/1689, training data is data used to train an AI system by fitting its learnable parameters. It is one of three interrelated data categories the Act defines and regulates, alongside validation data and testing data, because the quality of those datasets strongly influences whether a high-risk AI system will behave safely, fairly, and as intended.

For providers of high-risk AI, this is not a theoretical concern. Article 10 imposes concrete data governance requirements — quality criteria, bias examination, representativeness checks, and documentation obligations — that must be satisfied before a system is placed on the market or put into service.


The EU AI Act definitions

Article 3 of Regulation (EU) 2024/1689 defines all three data categories:

Training data (Article 3, point 29): data used for training an AI system through fitting its learnable parameters. These are the parameters — weights, coefficients — that the system adjusts during the learning process. The training dataset shapes the model's behaviour across its full operational range.

Validation data (Article 3, point 30): data used for evaluating a trained AI system and for tuning its non-learnable parameters and its learning process, including hyperparameters and other configuration settings. The purpose of validation is to catch overfitting, where a model performs well on training data but poorly on unseen inputs, and underfitting, where the model has not captured the underlying patterns. Validation therefore happens during development, typically repeatedly as training progresses, rather than as a single final step. Article 3, point 31 separately defines the validation data set as either a separate data set or part of the training data set, as a fixed or variable split.

Testing data (Article 3, point 32): data used for an independent evaluation of an AI system to confirm that its expected performance holds before it is placed on the market or put into service. For that evaluation to be independent, testing data has to be kept separate from the data used in training and validation; if test records also appear in the earlier stages, the measured performance is inflated and cannot be trusted.

The distinction matters in practice because each dataset plays a different role in the development lifecycle, and Article 10 applies obligations to all three — not just the training corpus.
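The separation between the three datasets can be sketched in a few lines of Python. This is a minimal illustration, not a method prescribed by the Act: the function names, the 15% fractions, and the use of integer record IDs are all assumptions made for the example.

```python
import random


def three_way_split(record_ids, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle record IDs once and cut them into disjoint train/validation/test sets.

    The fractions and the seed are illustrative defaults, not values from the Act.
    """
    ids = list(record_ids)
    if len(set(ids)) != len(ids):
        raise ValueError("record IDs must be unique before splitting")
    rng = random.Random(seed)  # fixed seed so the split can be reproduced and documented
    rng.shuffle(ids)
    n_test = int(len(ids) * test_fraction)
    n_val = int(len(ids) * val_fraction)
    test = set(ids[:n_test])
    val = set(ids[n_test:n_test + n_val])
    train = set(ids[n_test + n_val:])
    return train, val, test


def assert_disjoint(train, val, test):
    """Fail loudly if any record appears in more than one split."""
    overlap = (train & val) | (train & test) | (val & test)
    if overlap:
        raise ValueError(f"{len(overlap)} record(s) appear in more than one split")


# Hypothetical usage with 1,000 synthetic record IDs.
train, val, test = three_way_split(range(1000))
assert_disjoint(train, val, test)
print(len(train), len(val), len(test))
```

Recording the seed and the resulting split sizes alongside the dataset description is one way to make the split reproducible for anyone reviewing the technical file.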


Data governance for high-risk AI (Article 10)

Article 10 is the data governance article for high-risk AI systems. It does not apply to the full population of AI systems — only to those that fall within the high-risk classification under Article 6 and Annex III (or Annex I for safety components of regulated products). If your system is not high-risk, Article 10 does not bind you. If it is, Article 10 requirements apply alongside the rest of the high-risk obligation stack (Articles 9–15).

Quality criteria

Article 10(3) and 10(4) together set out the quality criteria that training, validation and testing datasets must meet. The datasets must be:

  • Relevant: appropriate for the intended purpose of the system. A credit-scoring model trained on data drawn from a different lending market, or a different period, may not be relevant to the setting in which it will operate.
  • Sufficiently representative: the data should capture the population, scenarios, and conditions the system will encounter, with appropriate statistical properties for the persons or groups on whom the system will be used. A recruitment screening tool trained predominantly on applications from one demographic is not representative of the applicant pool it will assess.
  • Free of errors and complete: to the best extent possible in view of the intended purpose. The Act does not demand perfection; it demands a good-faith, documented effort to identify and correct errors.
  • Appropriate to the geographical, contextual, behavioural or functional setting in which the system is intended to be used (this criterion comes from Article 10(4)). A system trained on behavioural data from one country may perform differently when applied in another, even for nominally similar tasks.

These criteria apply jointly to all three datasets. A provider cannot satisfy Article 10 by investing heavily in training data quality while ignoring the validation and testing sets.
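One way to make the representativeness criterion concrete is to compare each group's share of a dataset with its share of the population the system is meant to serve. The sketch below assumes a hypothetical recruitment dataset tagged by country and hypothetical reference shares; the 5-percentage-point tolerance is an arbitrary illustration, since the Act sets no numeric threshold.

```python
from collections import Counter


def share_gaps(dataset_groups, reference_shares):
    """Return (dataset share - reference share) for every group in the reference."""
    counts = Counter(dataset_groups)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("dataset is empty")
    return {
        group: counts.get(group, 0) / total - expected
        for group, expected in reference_shares.items()
    }


def flag_underrepresented(gaps, tolerance=0.05):
    """List groups whose dataset share falls short of the reference by more than tolerance."""
    return sorted(group for group, gap in gaps.items() if gap < -tolerance)


# Hypothetical data: country of applicant for each record in a training set.
dataset_groups = ["DE"] * 70 + ["FR"] * 27 + ["PL"] * 3
# Hypothetical reference: expected applicant mix in the intended deployment.
reference_shares = {"DE": 0.40, "FR": 0.30, "PL": 0.30}

gaps = share_gaps(dataset_groups, reference_shares)
print(flag_underrepresented(gaps))  # ['PL']
```

The same comparison can be run on the validation and testing sets, since the criteria apply to all three.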

Bias examination

Article 10(2)(f) requires that training, validation, and testing data be examined for possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights, or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations. Article 10(2)(g) adds a duty to take appropriate measures to detect, prevent, and mitigate the biases identified.

This obligation is prospective, not reactive. Providers must examine datasets for bias before deployment, document the results, and take corrective steps where biases are found. The examination should consider whether the dataset underrepresents particular groups, whether historical decisions embedded in the data reflect past discrimination, and whether the geographic or contextual settings of data collection introduced systematic distortions.

Bias in training data propagates into model outputs. An employment system trained on historical hiring decisions made by a biased human process will reproduce that bias at scale. Article 10(2)(f) and (g) are the Act's mechanism for breaking that cycle at the data layer, before it becomes an operational and legal problem.
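As a rough illustration of what examining historical decisions can look like, the sketch below compares the rate of positive labels across groups in a labelled dataset. The Act does not prescribe any particular metric; the record layout, the field names "group" and "hired", and the choice of a simple rate ratio are assumptions made for this example.

```python
from collections import defaultdict


def positive_rates(records, group_key, label_key):
    """Share of records with a truthy label, computed separately for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for record in records:
        group = record[group_key]
        totals[group] += 1
        if record[label_key]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def rate_ratio(rates):
    """Lowest group rate divided by the highest; 1.0 means all groups are equal."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group has any positive labels, so the rates are equal
    return min(rates.values()) / highest


# Hypothetical historical hiring records with a group attribute.
records = (
    [{"group": "A", "hired": True}] * 40
    + [{"group": "A", "hired": False}] * 60
    + [{"group": "B", "hired": True}] * 20
    + [{"group": "B", "hired": False}] * 80
)

rates = positive_rates(records, "group", "hired")
print(rates)              # {'A': 0.4, 'B': 0.2}
print(rate_ratio(rates))  # 0.5
```

A low ratio does not prove discrimination by itself, but it is the kind of finding that would need to be investigated, documented, and addressed before deployment.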

Processing special-category data for bias detection

Article 10(5) creates a narrow exception. Where strictly necessary to detect and correct biases that could affect the health, safety, or fundamental rights of persons, providers may process special-category data within the meaning of GDPR Article 9 — data such as racial or ethnic origin, health data, or data about sexual orientation. This processing must be subject to appropriate safeguards and may not be used for any purpose other than bias detection and correction.

This is a deliberate policy choice: making a high-risk system genuinely representative may require understanding the demographic composition of the data and correcting imbalances. Article 10(5) permits that while keeping the processing strictly bounded.

Representativeness and the intended purpose link

Across Article 10, the concept of "intended purpose" (defined in Article 3, point 12) operates as the anchor for all quality assessments. What counts as relevant, representative, or complete is evaluated against the stated intended purpose, meaning the use and context for which the provider designed the system. A provider who later expands a system's use to cover a different population or geography may find that the original data no longer meets Article 10 criteria for the new use.


Documentation

Annex IV technical documentation

Every high-risk AI system must be accompanied by technical documentation drawn up before the system is placed on the market or put into service, and kept up to date. The content requirements are set by Article 11 and laid out in Annex IV.

Annex IV point 2(d) requires that, where relevant, the technical documentation include datasheets describing the training methodologies and techniques and the training data sets used, including their provenance, scope and main characteristics, how the data was obtained and selected, and the labelling and cleaning procedures applied. Point 2(g) covers the validation and testing procedures, including information about the validation and testing data used. This is where the Article 10 data governance work becomes a documentation artefact: the decisions made about data selection, quality checks, and bias examination must be written up and available for scrutiny.

The Annex IV technical documentation is part of the conformity assessment file reviewed under Article 43, and it forms the basis for the EU declaration of conformity issued under Article 47. Without adequate data documentation, neither the conformity assessment nor the declaration can be properly substantiated.

Data provenance

Provenance (knowing where data came from, under what conditions it was collected, and how it was processed) does not have an article of its own, but it is named in Article 10(2)(b), which covers data collection processes and the origin of data, and in Annex IV point 2(d), and it underpins the Article 10 quality criteria. A provider who cannot describe the origin and processing history of their training data cannot credibly assert that the data is relevant, representative, and free of known errors. Regulators and notified bodies examining the technical file will expect this information.

Provenance documentation should cover: the source of the data (internal collection, licensed datasets, publicly available sources); the period of collection; any pre-processing steps applied; how access and use rights were verified (particularly relevant for personal data); and the steps taken to assess and address data quality issues.
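The items listed above map naturally onto a structured record. The following is a minimal sketch of such a record, assuming a hypothetical ProvenanceRecord class whose field names are invented for illustration; neither the Act nor Annex IV prescribes this schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ProvenanceRecord:
    """One provenance entry per dataset; all field names are illustrative."""
    dataset_name: str
    role: str                 # "training", "validation" or "testing"
    source: str               # e.g. internal collection, licensed, public
    collection_start: str     # ISO 8601 date
    collection_end: str       # ISO 8601 date
    preprocessing_steps: list = field(default_factory=list)
    rights_verification: str = ""
    quality_actions: list = field(default_factory=list)

    def missing_fields(self):
        """Names of fields still empty, so gaps are visible before the file is reviewed."""
        return [name for name, value in asdict(self).items() if value in ("", [], None)]


# Hypothetical entry for a licensed training dataset.
record = ProvenanceRecord(
    dataset_name="applications-2021-2023",
    role="training",
    source="licensed dataset",
    collection_start="2021-01-01",
    collection_end="2023-12-31",
    preprocessing_steps=["deduplication", "removal of free-text fields"],
    quality_actions=["manual review of a labelled sample"],
)

print(json.dumps(asdict(record), indent=2))
print(record.missing_fields())  # ['rights_verification']
```

Keeping one such entry per dataset, and checking for empty fields before sign-off, makes it harder for a gap in the provenance trail to go unnoticed.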

For providers using pre-trained models as a foundation, the data governance question extends to the upstream training data used by the model developer. That aspect intersects with the GPAI framework under Articles 53 and 55 (obligations on GPAI model providers to summarise training data), but for the downstream system provider, the relevant obligation remains Article 10 applied to any fine-tuning or additional training datasets the provider controls.


How Confir maps to Article 10

Confir's AITR module — Data & Technical Robustness — structures the Article 10 assessment as a set of plain-English controls. Each control maps to a specific requirement: data relevance for the intended purpose; representativeness across relevant population segments; a documented bias examination; the Article 10(5) special-category data log if applicable; and the data-provenance entries required for the Annex IV technical file.

The output feeds directly into the technical documentation pack Confir generates — a print-ready Annex IV datasheet that includes the data-governance section alongside the system description, risk management overview, and accuracy metrics. The engine is rule-based and deterministic: the same answers produce the same documentation, with no variation and no hallucination.


Frequently Asked Questions

Does Article 10 apply to all AI systems, or only high-risk ones?

Article 10 applies only to high-risk AI systems as classified under Article 6 and Annex III (or Annex I for product safety components). Providers of minimal-risk or limited-risk systems are not subject to Article 10's data governance requirements, though good data practice remains advisable for any system that affects people. If you are uncertain whether your system is high-risk, the starting point is the Article 6 classification rules and the Annex III use-case list.

What is the difference between validation data and testing data?

Both are used to evaluate an AI system, but at different stages and for different purposes. Validation data is used during development to tune hyperparameters and guide training decisions; it is part of the iterative development process. Testing data is held back and used only once, as an independent check of the final system's performance before market release. Mixing them, or using training data in either role, undermines the independence that makes the evaluation meaningful. Article 3 (points 30 and 32) defines both separately, reflecting this distinction.

How representative does training data need to be?

The Act requires data to be "sufficiently representative" — a standard assessed against the system's intended purpose and operational setting. There is no quantitative threshold in the text. In practice, providers should document the demographic, geographic, and contextual composition of their datasets, identify gaps, and take steps to address them. For a recruitment screening tool operating across multiple EU member states, that means examining whether the training data reflects the actual applicant population in each country, not just the jurisdiction where the data was originally collected.

Can a provider use sensitive personal data to check for bias?

Yes, under Article 10(5), but only within strict limits. Processing of special-category personal data — data revealing racial or ethnic origin, health data, sexual orientation, and other categories defined in GDPR Article 9 — is permitted where strictly necessary for detecting and correcting biases that could harm health, safety, or fundamental rights. The processing must be subject to appropriate safeguards and may not be repurposed. Providers should document this processing separately and ensure it is covered in the data-governance section of their Annex IV technical file.

When does the Article 10 obligation apply?

Article 10 is part of the high-risk obligations that apply to stand-alone Annex III systems from 2 December 2027 under the Digital Omnibus political agreement reached in May 2026. High-risk AI embedded in regulated products under Annex I follows on 2 August 2028. These dates replaced the original August 2026 deadline, which was deferred. Providers who develop high-risk systems now need to ensure their data governance practices are documented and defensible by those dates — the technical file must be ready before the system is placed on the market.

Does Article 10 require the training data itself to be retained?

The Act does not mandate indefinite retention of raw training data, but Article 10's requirements are operationally meaningless without records of what data was used, how it was selected, and how quality was assessed. The Annex IV documentation must describe these elements. Technical documentation must be kept for ten years after the system is placed on the market (Article 18). How long the underlying datasets must be retained will also be shaped by GDPR obligations where the data includes personal data.

