Agentic AI and the EU AI Act: Classification, Oversight, and Compliance
Agentic AI is not a risk category under the EU AI Act. Classification follows use case and Annex III. Covers Article 14 oversight, GPAI roles, and 2027 deadlines.
An agentic AI system is one that acts autonomously over sequences of steps — planning, executing tool calls, retrieving information, and producing outputs that feed back into the next action — with limited human intervention between steps. A customer-service agent that searches databases, drafts responses, and sends emails without a human approving each step is agentic. So is a software-development assistant that reads code, writes tests, executes them, and commits the results.
Agentic behaviour is an architectural pattern, not a risk category. The EU AI Act (Regulation (EU) 2024/1689) does not name "agentic AI" as a distinct classification. Compliance obligations follow from what the system does and to whom: its use case, its output type, and whether it falls within the Article 5 prohibited list or the Article 6 / Annex III high-risk classification. A multi-step agent that helps draft internal emails is minimal-risk. An agent that allocates tasks to workers based on their individual behaviour or personal traits (Annex III, point 4(b)) is high-risk regardless of how many autonomous steps it takes to reach its output.
Understanding how agentic AI maps to the Act's existing framework is increasingly urgent. GPAI models from providers such as OpenAI, Mistral, and Google are the typical foundation for agentic systems. The agent layer — the orchestration logic, the tool integrations, the memory and context management — is usually built by the deploying organisation. That layer's compliance status is determined by the Act's role and risk-tier logic, not by the novelty of the architecture.
Classification Comes First: Risk Tier and Role Determine Obligations
The starting question for any agentic deployment is the same as for any other AI system: does it fall in Article 5 (prohibited), Article 6 / Annex III (high-risk), Article 50 (limited-risk transparency), or the minimal-risk default?
Prohibited uses remain prohibited regardless of architecture. An agent that performs real-time remote biometric identification in publicly accessible spaces for law-enforcement purposes (Art 5(1)(h)), or that infers sensitive characteristics from biometric data for categorisation purposes (Art 5(1)(g)), is prohibited whether or not it operates autonomously. The architecture does not create exemptions.
Annex III high-risk uses are high-risk whether the system is agentic or not. An agentic system that makes or materially influences recruitment decisions (Annex III, point 4(a)), evaluates creditworthiness (point 5(b)), prioritises emergency-service calls (point 5(d)), or assists judicial authorities in researching and applying law (point 8(a)) is high-risk. The full Article 8 requirement stack (Articles 9 through 15) applies.
Classify the output and affected population, not the architecture. Agentic systems often combine multiple steps, only some of which touch a regulated use. A legal research agent that retrieves case law and summarises it for a human lawyer is a different risk profile from an agent that autonomously drafts and files documents on behalf of natural persons before a tribunal. Analyse each distinct output type and downstream effect.
Article 6(3) filter. Even where an agent's output touches an Annex III area, the Article 6(3) exemption may apply if the system does not pose a significant risk of harm, for example because it performs a narrow procedural task, improves the result of a previously completed human activity, detects decision-making patterns without replacing or influencing a prior human assessment, or performs a task that is merely preparatory to an assessment. The exemption is never available where the system profiles natural persons. Providers claiming it must document their assessment and still register the system under Article 49.
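The "classify each output, then take the strictest result" approach described above can be sketched in Python. This is a minimal illustration and not a legal determination: `Tier`, `OutputProfile`, `classify_output`, and `classify_agent` are hypothetical names, and every boolean field stands for a documented human legal judgment that the code merely combines.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional, Sequence


class Tier(IntEnum):
    # Ordered so that max() returns the strictest tier.
    MINIMAL = 0
    LIMITED = 1
    HIGH = 2
    PROHIBITED = 3


@dataclass(frozen=True)
class OutputProfile:
    """One distinct output type of an agent (all fields are illustrative)."""
    name: str
    prohibited_practice: bool = False      # matches an Article 5 practice
    annex_iii_point: Optional[str] = None  # e.g. "4(a)", or None if no match
    art_6_3_documented: bool = False       # documented Article 6(3) assessment
    profiles_persons: bool = False         # profiling rules out Article 6(3)
    interacts_with_people: bool = False    # triggers Article 50 transparency


def classify_output(o: OutputProfile) -> Tier:
    if o.prohibited_practice:
        return Tier.PROHIBITED
    if o.annex_iii_point is not None:
        exempt = o.art_6_3_documented and not o.profiles_persons
        if not exempt:
            return Tier.HIGH
    if o.interacts_with_people:
        return Tier.LIMITED
    return Tier.MINIMAL


def classify_agent(outputs: Sequence[OutputProfile]) -> Tier:
    # The agent as a whole takes the strictest tier of any of its outputs.
    return max((classify_output(o) for o in outputs), default=Tier.MINIMAL)


agent_outputs = [
    OutputProfile("draft internal email"),
    OutputProfile("rank job applicants", annex_iii_point="4(a)"),
]
print(classify_agent(agent_outputs).name)  # HIGH
```

The number of steps the agent takes never appears as an input, which mirrors the point that architecture does not drive the tier.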
GPAI as the Foundation Layer
Most commercially deployed agentic systems are built on general-purpose AI models (GPAI models under Chapter V of the Act). The GPAI model provider (OpenAI, Mistral, Google, Meta, and others) has its own set of obligations under Article 53 (all GPAI providers) and Article 55 (providers of GPAI models with systemic risk, which Article 51 presumes where cumulative training compute exceeds 10²⁵ floating-point operations).
Those GPAI obligations sit with the model provider, not with the organisation that builds an agent on top of the model. The company deploying an agentic system does not inherit Article 53 or Article 55 obligations simply by using an API. What it does inherit — or take on — depends on what it builds and ships.
Role clarification via Article 25. If an organisation accesses a GPAI model through an API and builds an agentic product that it places on the market under its own name, it is the provider of that AI system under Article 16, classifiable against Annex III by what the system does. The GPAI model vendor retains its own GPAI-tier obligations; the agent builder takes on provider obligations for the system it ships. This role shift is governed by Article 25.
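If the organisation uses the agentic system internally, for its own processes and not sold or licensed to third parties, it is the deployer under Article 26, with lighter obligations: follow instructions for use, maintain appropriate oversight, keep logs, and (for certain deployers and high-risk categories) conduct a Fundamental Rights Impact Assessment under Article 27. An organisation that develops a high-risk system itself and puts it into service for its own use can also meet the Article 3 definition of a provider, so in-house builds should be checked against both roles.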
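The role logic in this section can be written down as a small lookup. The sketch below follows the simplified rule stated above (ship under your own name: provider; use in your own operations: deployer; GPAI duties only if you place a GPAI model on the market yourself). `Role` and `roles_for` are hypothetical names, and real role analysis needs legal review of the specific facts.

```python
from enum import Enum
from typing import Set


class Role(Enum):
    SYSTEM_PROVIDER = "provider of the AI system (Article 16)"
    DEPLOYER = "deployer (Article 26)"
    GPAI_PROVIDER = "GPAI model provider (Article 53, plus 55 if systemic risk)"


def roles_for(ships_under_own_name: bool,
              uses_in_own_operations: bool,
              places_own_gpai_model_on_market: bool = False) -> Set[Role]:
    """Return the roles an organisation holds for one agentic system."""
    roles: Set[Role] = set()
    if ships_under_own_name:
        roles.add(Role.SYSTEM_PROVIDER)
    if uses_in_own_operations:
        roles.add(Role.DEPLOYER)
    # Calling a vendor's API does not trigger this branch.
    if places_own_gpai_model_on_market:
        roles.add(Role.GPAI_PROVIDER)
    return roles


# An agent built on a vendor API and sold under the builder's own brand:
print(roles_for(ships_under_own_name=True, uses_in_own_operations=False))
```

Returning a set rather than a single value reflects that one organisation can hold several roles for the same system.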
The Human Oversight Problem in Agentic Systems
Article 14 requires that high-risk AI systems be designed to allow designated natural persons to understand capabilities and limitations, detect failures, and — critically — decide not to use or disregard the system's output. This is the requirement that agentic architectures put most directly under pressure.
When an agent executes a five-step plan — retrieve data, evaluate options, select a course, act, report — the human may receive only the final output. The intermediate steps are often invisible in real time. For high-risk applications this is not acceptable: the oversight mechanism must be designed so a human can intervene at meaningful points, not merely observe the result after the fact.
Practical implications for Article 14 compliance in agentic deployments:
- Checkpoints for consequential actions. High-stakes steps — sending communications on behalf of a person, committing funds, making a system configuration change — should require a human confirmation, even if lower-stakes steps are fully automated.
- Interpretability of reasoning. The system should be able to explain which information it relied on and why it chose a particular action, in terms a non-expert user can assess. This is the Article 14(4) requirement that the system allow oversight persons to "correctly interpret" outputs.
- Override capability. There must be a mechanism to halt, correct, or reverse an agent's action at any stage. For systems that execute in external environments (APIs, databases, file systems), this means logging every action with enough information to reverse it.
- Competence of oversight persons. Article 26(2) requires deployers to assign oversight to natural persons who have the necessary competence, training, and authority. For agentic systems operating in technical domains, this may mean the oversight designee must have domain expertise, not just access to a dashboard.
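The checkpoint and override points in the list above can be sketched as a thin wrapper around each agent step. This is one possible design under stated assumptions, not a prescribed implementation: `OverseenAgent`, `ActionRecord`, and the step names are hypothetical, and the `approve` callback stands in for whatever review interface a real deployment uses.

```python
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, List, Optional


@dataclass
class ActionRecord:
    step: str
    detail: str
    undo: Optional[Callable[[], None]]  # how to reverse the step, if possible
    approved_by: Optional[str]          # None for fully automated steps


@dataclass
class OverseenAgent:
    # approve(step, detail) returns the approver's id, or None to decline.
    approve: Callable[[str, str], Optional[str]]
    consequential: FrozenSet[str] = frozenset(
        {"send_email", "commit_funds", "change_config"})
    log: List[ActionRecord] = field(default_factory=list)
    halted: bool = False

    def run_step(self, step: str, detail: str,
                 do: Callable[[], None],
                 undo: Optional[Callable[[], None]] = None) -> bool:
        """Run one step; return False if it was blocked."""
        if self.halted:
            return False
        approver = None
        if step in self.consequential:
            approver = self.approve(step, detail)
            if approver is None:
                return False  # human declined, so nothing is executed
        do()
        self.log.append(ActionRecord(step, detail, undo, approver))
        return True

    def halt(self) -> None:
        self.halted = True

    def rollback(self) -> int:
        """Undo logged steps, newest first; return how many were reversed."""
        reversed_count = 0
        while self.log:
            record = self.log.pop()
            if record.undo is not None:
                record.undo()
                reversed_count += 1
        return reversed_count
```

Logging the `undo` callable alongside each action is what makes reversal possible later; steps with no `undo` are exactly the ones that most need a checkpoint beforehand.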
Multi-Agent Architectures: Who Is the Provider?
Many production agentic deployments involve multiple models — an orchestrator agent that calls specialist sub-agents, each of which may itself call external tools or APIs. This creates an attribution question: if a bias event or harmful output arises from a sub-agent's decision, which entity bears the Article 73 incident-reporting duty?
The answer turns on Article 25 and the role each entity plays. The entity that places the final agentic system on the market under its own name is the provider of that system. If it builds the orchestrator by connecting proprietary sub-agents and third-party APIs, it takes on provider obligations for the whole assembled system. The third-party sub-agent vendors are themselves providers of their component systems — each subject to their own compliance stack if those components are placed on the market.
This has several practical consequences:
- The provider of the assembled system must ensure it has sufficiently detailed technical documentation under Article 11 to cover the complete system, including the behaviour of third-party components in ways that affect the system's outputs.
- Contractual arrangements with sub-component providers should include clauses on incident notification timelines, because the top-level provider's Article 73 reporting clock runs from its own awareness — and it needs to know fast if a sub-component caused a serious incident.
- The risk management system under Article 9 must account for failure modes that arise from component interactions, not just from each sub-component in isolation.
Agentic AI in Non-High-Risk Deployments
Not every agentic system is high-risk. An internal knowledge-retrieval agent that answers employee questions using company documentation is minimal-risk if it does not affect natural persons in ways that touch an Annex III use case. A customer-facing chatbot agent that can answer questions and initiate a return process is likely limited-risk (Article 50) — the customer must know they are interacting with an AI system.
For limited-risk agentic deployments, the obligation is Article 50 transparency: inform people that they are interacting with an AI system and, for content-generating agents, mark or label AI-generated material appropriately. Offering escalation to a human is good practice but is not itself an Article 50 requirement. The Article 50 obligations apply from 2 August 2026.
For minimal-risk deployments, the Act imposes no risk-tier-specific obligations, though voluntary codes of conduct (Article 95) and internal governance policies remain best practice.
Key Article References for Agentic Deployments
| Compliance area | Relevant article |
|---|---|
| Risk classification | Art 5 (prohibited), Art 6 + Annex III (high-risk) |
| Provider obligations (if you build and ship) | Art 16, Art 25 |
| Deployer obligations (if you use internally) | Art 26 |
| Risk management system | Art 9 |
| Technical documentation | Art 11 / Annex IV |
| Human oversight by design | Art 14 |
| Accuracy, robustness, cybersecurity | Art 15 |
| Limited-risk transparency | Art 50 |
| GPAI model obligations (model vendor) | Art 53, Art 55 |
| Incident reporting (high-risk providers) | Art 73 |
| Post-market monitoring | Art 72 |
| Fundamental Rights Impact Assessment | Art 27 |
Internal Governance for Agentic AI Deployments
Beyond the statutory obligations, organisations deploying agentic AI internally face a practical governance gap: the usual controls — review a model's output before it goes anywhere — do not work when the model is already acting.
A workable internal governance model for agentic AI has four elements.
An AI inventory that captures agent behaviour, not just model identity. The inventory entry for an agentic system should record which external systems the agent can call (APIs, databases, email, file systems), what actions it can take (read-only vs write vs execute), and the maximum autonomous chain length before a human checkpoint is required. A registry that only logs "we use GPT-4" is not adequate for an agentic deployment.
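As an illustration, an inventory record along these lines could look like the following. The field names, the `Access` levels, and the `gaps` helper are assumptions made for the sketch rather than a required schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Access(Enum):
    READ = "read-only"
    WRITE = "write"
    EXECUTE = "execute"


@dataclass
class AgentInventoryEntry:
    agent_name: str
    base_model: str
    # System name mapped to the strongest access the agent has on it.
    # An empty mapping is treated here as "not yet recorded".
    external_systems: Dict[str, Access] = field(default_factory=dict)
    # Steps the agent may chain before a human checkpoint is required.
    max_autonomous_steps: Optional[int] = None

    def gaps(self) -> List[str]:
        """List the agent-behaviour facts this entry is still missing."""
        missing = []
        if not self.external_systems:
            missing.append("external systems and access levels")
        if self.max_autonomous_steps is None:
            missing.append("maximum autonomous chain length")
        return missing


model_only = AgentInventoryEntry("support-agent", "GPT-4")
print(model_only.gaps())

complete = AgentInventoryEntry(
    "support-agent", "GPT-4",
    external_systems={"crm": Access.READ, "email": Access.WRITE},
    max_autonomous_steps=5,
)
print(complete.gaps())  # []
```

The first entry is the "we use GPT-4" registry line from the text; `gaps()` makes its shortfalls explicit.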
Scope limits defined before deployment, not discovered after. Each agent should have an explicit list of permitted actions, a maximum spend or transaction limit where relevant, and a clear boundary on what it cannot do without explicit human authorisation. These limits should be enforced at the infrastructure level, not only in the prompt. Prompt-level guardrails are not reliable enough for high-stakes automated action.
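A minimal sketch of enforcement outside the prompt: every tool call passes through a policy object that the model cannot talk its way around. `ScopePolicy`, `ScopeViolation`, and the action names are hypothetical; in production this check would typically sit in an API gateway or the tool-execution layer.

```python
from dataclasses import dataclass
from typing import FrozenSet


class ScopeViolation(Exception):
    """Raised when the agent requests something outside its approved scope."""


@dataclass
class ScopePolicy:
    permitted_actions: FrozenSet[str]
    spend_limit_eur: int
    needs_human: FrozenSet[str] = frozenset()
    spent_eur: int = 0

    def authorise(self, action: str, cost_eur: int = 0,
                  human_ok: bool = False) -> None:
        """Raise ScopeViolation unless the action is within scope."""
        if action not in self.permitted_actions:
            raise ScopeViolation(f"action not permitted: {action}")
        if action in self.needs_human and not human_ok:
            raise ScopeViolation(f"human authorisation required: {action}")
        if self.spent_eur + cost_eur > self.spend_limit_eur:
            raise ScopeViolation(f"spend limit exceeded by: {action}")
        # Only record spend once every check has passed.
        self.spent_eur += cost_eur


policy = ScopePolicy(
    permitted_actions=frozenset({"lookup_order", "issue_refund"}),
    spend_limit_eur=500,
    needs_human=frozenset({"issue_refund"}),
)
policy.authorise("lookup_order")
policy.authorise("issue_refund", cost_eur=200, human_ok=True)
print(policy.spent_eur)  # 200
```

Because the tool layer calls `authorise` before executing anything, a prompt injection that convinces the model to attempt an unlisted action still fails at this boundary.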
Incident detection tuned to autonomous failures. Standard application monitoring flags crashes. Agentic failures are subtler — the agent completes successfully but takes a sequence of individually plausible steps that lead to an unintended outcome. Monitoring should include semantic auditing of action sequences, not just error-rate tracking.
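A very simple stand-in for sequence-level auditing is to look for combinations of individually permitted actions that should not occur together in one run. The sketch below is rule-based rather than truly semantic, and `RISKY_PAIRS`, `audit_sequence`, and the action names are illustrative assumptions.

```python
from typing import List, Tuple

# (earlier action, later action) pairs that warrant review when both
# occur in that order within a single agent run. Illustrative only.
RISKY_PAIRS: List[Tuple[str, str]] = [
    ("read_customer_record", "send_external_email"),
    ("export_report", "delete_source_data"),
]


def audit_sequence(actions: List[str],
                   risky_pairs: List[Tuple[str, str]] = RISKY_PAIRS
                   ) -> List[Tuple[str, str]]:
    """Return every risky pair that appears, in order, in the action trace."""
    findings = []
    for earlier, later in risky_pairs:
        if earlier in actions:
            first = actions.index(earlier)
            # Only flag the pair if the later action follows the earlier one.
            if later in actions[first + 1:]:
                findings.append((earlier, later))
    return findings


trace = ["read_customer_record", "summarise", "send_external_email"]
print(audit_sequence(trace))
# [('read_customer_record', 'send_external_email')]
```

None of the three steps in `trace` would trip an error-rate monitor, which is the gap this kind of check is meant to cover.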
A clear escalation path. When an agent encounters a situation outside its scope, the default should be to stop and notify a human — not to attempt an approximation. For high-risk applications this is an Article 14 requirement; for all applications it is sound practice.
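The stop-and-notify default can be expressed as a dispatcher that has no fallback branch. In this sketch `Escalation`, `handle`, and the request types are hypothetical, and `notify` stands in for whatever paging or ticketing mechanism is actually used.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


@dataclass(frozen=True)
class Escalation:
    payload: str
    reason: str


def handle(request_type: str, payload: str,
           handlers: Dict[str, Callable[[str], str]],
           notify: Callable[[Escalation], None]) -> Union[str, Escalation]:
    """Run the matching handler, or stop and escalate if there is none."""
    handler = handlers.get(request_type)
    if handler is None:
        # Out of scope: do not guess at a nearby handler; hand over instead.
        escalation = Escalation(
            payload, f"no handler for request type {request_type!r}")
        notify(escalation)
        return escalation
    return handler(payload)


inbox = []
handlers = {"order_status": lambda order_id: f"status for {order_id}"}

print(handle("order_status", "A-17", handlers, inbox.append))
result = handle("legal_complaint", "letter text", handlers, inbox.append)
print(isinstance(result, Escalation), len(inbox))  # True 1
```

Returning the `Escalation` object, instead of a best-effort answer, makes the "agent stopped" outcome explicit to the calling code.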
How Confir Handles Agentic AI Classification
Because "agentic" is an architecture, not a compliance category, Confir's classification workflow focuses on what the system does rather than how it does it. The plain-English intake process asks about the system's output type, who is affected, whether the output influences decisions about natural persons, and which Annex III use-case areas the output touches. A multi-step agent that screens job applicants at step five of its pipeline is classified as high-risk under Annex III point 4(a) — the same as a single-step classifier that does the same thing.
The rule-based, deterministic logic produces the same classification for the same intake regardless of the number of automated steps involved. This consistency is deliberate: compliance defensibility requires reproducible findings.
Frequently Asked Questions
Is agentic AI a separate risk category under the EU AI Act?
No. The Act does not define "agentic AI" or treat it as a distinct risk tier. Classification follows the standard framework: prohibited (Article 5), high-risk (Article 6 and Annex III), limited-risk (Article 50), or minimal-risk. An agent's autonomy and multi-step behaviour do not change its risk classification — the use case and affected population do.
If I build an agent on top of GPT-4 or another GPAI model, do I inherit the GPAI provider's obligations?
No. The GPAI model provider's Article 53 and Article 55 obligations stay with that provider. When you build an agentic system on a GPAI model and place it on the market under your name, you become the provider of that system under Articles 16 and 25, with obligations that flow from what the system does — its risk tier and intended use. You do not take on GPAI-tier obligations unless you are also developing and placing a GPAI model on the market yourself.
Which article governs human oversight in agentic AI?
Article 14. It requires high-risk AI systems to be designed so that designated oversight persons can understand capabilities and limitations, detect failures, and decide not to use outputs. For agentic systems, this means building checkpoints, interpretable reasoning traces, and override mechanisms into the architecture — not merely observing final outputs.
What are the penalties if an agentic high-risk AI system fails a requirement?
Non-compliance with the Articles 9–15 requirements (activated by Article 8) falls under Article 99(4): up to €15,000,000 or 3% of total worldwide annual turnover, whichever is higher. For SMEs and start-ups, Article 99(6) caps the fine at the lower of the percentage or the fixed amount.
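The ceiling described above is simple arithmetic, shown here as a short Python sketch. The function and constant names are hypothetical, amounts are whole euros, and the result is the statutory maximum only, not a prediction of any actual fine.

```python
FIXED_CAP_EUR = 15_000_000
TURNOVER_PERCENT = 3


def max_fine_eur(worldwide_turnover_eur: int, is_sme: bool = False) -> int:
    """Upper bound under Article 99(4), with the Article 99(6) SME rule."""
    percent_amount = worldwide_turnover_eur * TURNOVER_PERCENT // 100
    if is_sme:
        # SMEs and start-ups: whichever of the two amounts is lower.
        return min(FIXED_CAP_EUR, percent_amount)
    # Everyone else: whichever is higher.
    return max(FIXED_CAP_EUR, percent_amount)


print(max_fine_eur(2_000_000_000))            # 60000000
print(max_fine_eur(100_000_000))              # 15000000
print(max_fine_eur(10_000_000, is_sme=True))  # 300000
```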
When do the high-risk obligations apply to agentic systems?
The same dates apply as for any high-risk AI system. For stand-alone Annex III systems: 2 December 2027 (under the Digital Omnibus agreed May 2026). For systems that are safety components of Annex I products: 2 August 2028. Article 50 limited-risk transparency obligations apply from 2 August 2026 for customer-facing agentic deployments.
Do multi-agent systems require separate compliance documentation for each agent?
It depends on how the components are placed on the market. If a single entity assembles and ships the full multi-agent system under its name, that entity is the provider of the assembled system and the Article 11 technical documentation covers the whole. If third-party sub-agents are independently placed on the market by other entities, those entities are themselves providers with their own documentation obligations for their components.