
Annex IV Technical Documentation Template for Credit-Scoring AI

Template · 23 May 2026 · 13 min read · 2,553 words

Worked Annex IV template for credit-scoring AI under EU AI Act Annex III point 5(b). All 9 content areas with credit-scoring examples. Deadline 2 Dec 2027.

Credit scoring falls squarely in the high-risk tier. Annex III, point 5(b) of Regulation (EU) 2024/1689 covers AI systems that evaluate the creditworthiness of natural persons or establish their credit scores — with one carved-out exception: pure fraud detection. If your system assesses whether a person can repay a loan, it is high-risk regardless of accuracy or deployment scale.

That classification triggers the full obligation stack under Articles 9 through 15 and, most concretely, the requirement under Article 11 to compile technical documentation in the form prescribed by Annex IV. Under Article 18, you must retain that documentation, including all substantial revisions, for ten years from the date the system is placed on the market or put into service. The conformity assessment route for Annex III 5(b) is the internal self-assessment under Annex VI (Article 43(2)); no notified body is required, unlike biometric systems under Annex III point 1, which need one where harmonised standards are not applied.

The deadline for stand-alone high-risk AI systems under Annex III is 2 December 2027, deferred from the original August 2026 date under the Digital Omnibus agreed in May 2026. Assembling a defensible Annex IV pack for a credit-scoring system takes months; providers who treat the deadline as distant will be rushing in late 2027.

This article walks through each of the nine Annex IV content areas, with credit-scoring-specific guidance and fill-in prompts for each.


Section 1: General Description

What it covers. Intended purpose, provider identity, version, and how the system is described to deployers in the instructions for use (Article 13).

Template fields.

| Field | Credit-scoring example |
| --- | --- |
| Intended purpose | Assess the creditworthiness of natural persons applying for unsecured consumer credit (€500–€25,000, 6–84 months) in [member states]. Output: score 0–1,000 and a binary approve/refer recommendation at a configurable threshold. |
| Provider | [Legal entity, registered address, VAT/company registration] |
| Authorised representative | [Required under Article 22 if provider is non-EU] |
| Version | v2.4.1, released [DD/MM/YYYY] |
| Hardware/software environment | [Cloud provider, region; on-premise minimum specs] |

Precision point. "Creditworthiness of natural persons" is the Annex III 5(b) scope. If your system also scores legal entities, document that function separately so the high-risk compliance perimeter is unambiguous.


Section 2: Detailed System Description

What it covers. Design choices, model architecture, decision logic, development process, and the human-oversight interface (Article 14).

Architecture narrative (worked example).

ScoreCore v2.4.1 is a gradient-boosted ensemble (XGBoost, 300 estimators, max depth 5) calibrated with isotonic regression to produce probability-of-default outputs. The model uses 42 input features from three sources: credit-bureau trade-line data (55%), application-level variables (30%), and existing-relationship behavioural data (15%). Features were selected by information-value filtering (IV > 0.02). No variables directly encoding gender, ethnicity, religion, or disability are used; postcode at NUTS-3 level was identified as a weak ethnicity proxy in v2.2 and removed in v2.3.0.

Decision logic. The default approval threshold is a score of 620 (probability of default ≤ 7.4%). Deployers may adjust within a band of 580–680; any threshold outside that band requires a documented justification. Scores between 580 and 640 are flagged for mandatory human review — the system does not auto-reject below the threshold; it issues a referral.
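
The decision logic described above can be sketched in a few lines. This is a minimal illustration, not ScoreCore's actual code: the function and class names are hypothetical, and it assumes that a score equal to the threshold is approved and that both review-band boundaries are inclusive, neither of which the worked example specifies.

```python
from dataclasses import dataclass

# Values from the worked example; all names here are illustrative.
DEFAULT_THRESHOLD = 620
THRESHOLD_MIN, THRESHOLD_MAX = 580, 680   # deployer-adjustable band
REVIEW_MIN, REVIEW_MAX = 580, 640         # mandatory human-review band


@dataclass(frozen=True)
class Decision:
    recommendation: str      # "approve" or "refer"; never an automatic rejection
    mandatory_review: bool   # True for near-threshold cases


def recommend(score: int, threshold: int = DEFAULT_THRESHOLD) -> Decision:
    """Map a 0-1,000 score to an approve/refer recommendation."""
    if not 0 <= score <= 1000:
        raise ValueError("score must be between 0 and 1,000")
    if not THRESHOLD_MIN <= threshold <= THRESHOLD_MAX:
        # Assumption: out-of-band thresholds are blocked here and handled
        # through a separate, documented justification process.
        raise ValueError("threshold outside 580-680 needs documented justification")
    # Assumption: a score equal to the threshold counts as an approval.
    recommendation = "approve" if score >= threshold else "refer"
    mandatory_review = REVIEW_MIN <= score <= REVIEW_MAX
    return Decision(recommendation, mandatory_review)
```

For example, `recommend(600)` returns a referral with the review flag set, while `recommend(630)` returns an approval that still carries the review flag because the score sits inside the 580–640 band.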

Human-oversight interface. Document the interface through which Article 14 oversight is exercised: at minimum, it must surface the score, the top three contributing factors, and a mandatory-review flag for near-threshold cases.


Section 3: Monitoring, Functioning, and Control

What it covers. Logging, performance drift thresholds, and the mechanisms for overriding or halting the system.

Logging. Under Article 12, high-risk systems must automatically generate logs. Each inference event should record: timestamp, input-feature hash (or values where GDPR minimisation permits), output score, recommendation, and human-override status. Providers retain logs they control for at least six months (Article 19); the full technical file is retained for ten years under Article 18.
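
One way to structure such a log record is sketched below. The field names, the `model_version` field, and the choice of SHA-256 over a canonical JSON encoding are assumptions made for illustration; the Act does not prescribe a record format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


def feature_hash(features: dict) -> str:
    """Hash the input features; sorted keys make the hash order-independent."""
    canonical = json.dumps(features, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class InferenceLogRecord:
    timestamp: str                  # UTC, ISO 8601
    model_version: str              # assumption: not in the list above, but useful for audits
    input_feature_hash: str
    output_score: int
    recommendation: str
    human_override: Optional[str]   # None until a reviewer records an outcome


def build_log_record(features: dict, score: int, recommendation: str,
                     model_version: str,
                     human_override: Optional[str] = None) -> dict:
    """Assemble one inference event as a plain dict ready for serialisation."""
    record = InferenceLogRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        model_version=model_version,
        input_feature_hash=feature_hash(features),
        output_score=score,
        recommendation=recommendation,
        human_override=human_override,
    )
    return asdict(record)
```

Hashing a canonical encoding means the same applicant features always produce the same digest, so a logged decision can later be matched to its inputs without storing the raw values in the log.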

Drift monitoring thresholds (example).

| Metric | Alert threshold | Action |
| --- | --- | --- |
| Approval rate | ±5 pp over 30-day rolling window | Notify deployer; provider review within 14 days |
| AUC-ROC | Drop below 0.70 | Mandatory retraining evaluation |
| Disparate impact ratio (gender) | Below 0.80 | Escalate to provider risk committee |
| Population Stability Index | PSI > 0.20 | Population stability investigation |
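
The Population Stability Index in the last row is commonly computed as the sum over score bins of (actual share − expected share) × ln(actual share / expected share). The sketch below assumes the scores have already been binned and that a small floor `eps` is used to avoid taking the logarithm of zero; both the function name and the floor value are illustrative choices.

```python
import math
from typing import Sequence


def population_stability_index(expected: Sequence[float],
                               actual: Sequence[float],
                               eps: float = 1e-6) -> float:
    """PSI between a baseline and a current distribution over the same bins.

    `expected` and `actual` hold counts (or shares) per score bin.
    """
    if len(expected) != len(actual) or len(expected) == 0:
        raise ValueError("both distributions need the same, non-zero number of bins")
    expected_total, actual_total = sum(expected), sum(actual)
    psi = 0.0
    for e, a in zip(expected, actual):
        # Floor each share so empty bins do not produce log(0) or division by zero.
        e_share = max(e / expected_total, eps)
        a_share = max(a / actual_total, eps)
        psi += (a_share - e_share) * math.log(a_share / e_share)
    return psi


ALERT_THRESHOLD = 0.20  # from the table above

baseline = [250, 250, 250, 250]   # hypothetical training-period bin counts
current = [400, 300, 200, 100]    # hypothetical last-30-days bin counts
needs_investigation = population_stability_index(baseline, current) > ALERT_THRESHOLD
```

With these made-up counts the index comes out at roughly 0.23, which would cross the 0.20 alert threshold.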

System halt. Document the mechanism — API kill-switch, deployer-controlled flag — and the fallback process when the system is suspended.


Section 4: Data and Data Governance

What it covers. Article 10 governs data and data governance. Document training, validation, and test datasets; representativeness across protected groups; and bias analysis and mitigation measures.

Dataset summary.

| Dataset | Size | Period | Scope |
| --- | --- | --- | --- |
| Training | 480,000 loans | 2017–2023 | DE, AT, NL, PL |
| Validation | 120,000 loans | 2022–2023 (held-out temporal) | Same |
| Test | 60,000 loans | 2024 (out-of-sample) | Same + prospective markets |

Representativeness (Article 10(3)). Disaggregate by gender, age cohort, and geography. Document any group that is under-represented relative to the deployer's applicant population (thin-file borrowers, recent immigrants, applicants under 25 with no credit history) and what you did to address it: stratified resampling, supplemental sourcing, or a separate sub-model.

Bias analysis and mitigation (Article 10(2)(f) and (g)). Compute and document: disparate impact ratio (approval rate of protected group / reference group); equalised odds (true-positive and false-positive rates across groups); calibration by group. Mitigation measures must include before/after metrics, not just a narrative of the intervention. Article 10(5) separately governs when special categories of personal data may be processed for bias detection and correction.
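
The first two metrics can be computed directly from decision and outcome vectors. The sketch below is illustrative: the function names are hypothetical, and it assumes the convention that a "positive" prediction is an approval and a "positive" outcome is a loan that was repaid.

```python
from typing import Sequence, Tuple


def approval_rate(approved: Sequence[bool]) -> float:
    if len(approved) == 0:
        raise ValueError("group is empty")
    return sum(approved) / len(approved)


def disparate_impact_ratio(protected_approved: Sequence[bool],
                           reference_approved: Sequence[bool]) -> float:
    """Approval rate of the protected group divided by that of the reference group."""
    reference_rate = approval_rate(reference_approved)
    if reference_rate == 0:
        raise ValueError("reference group has no approvals")
    return approval_rate(protected_approved) / reference_rate


def tpr_fpr(repaid: Sequence[bool], approved: Sequence[bool]) -> Tuple[float, float]:
    """True-positive and false-positive rates for one group.

    TPR: share of repaying applicants who were approved.
    FPR: share of defaulting applicants who were approved.
    """
    tp = sum(1 for r, a in zip(repaid, approved) if r and a)
    fp = sum(1 for r, a in zip(repaid, approved) if not r and a)
    positives = sum(1 for r in repaid if r)
    negatives = len(repaid) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("group needs both repaid and defaulted outcomes")
    return tp / positives, fp / negatives


def equalised_odds_gaps(group_a: Tuple[Sequence[bool], Sequence[bool]],
                        group_b: Tuple[Sequence[bool], Sequence[bool]]) -> Tuple[float, float]:
    """Absolute TPR and FPR differences between two (repaid, approved) groups."""
    tpr_a, fpr_a = tpr_fpr(*group_a)
    tpr_b, fpr_b = tpr_fpr(*group_b)
    return abs(tpr_a - tpr_b), abs(fpr_a - fpr_b)
```

Running the same functions on the pre-mitigation and post-mitigation decisions gives the before/after figures the paragraph above asks for.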


Section 5: Accuracy, Robustness, and Cybersecurity

What it covers. Article 15 governs accuracy, robustness, and cybersecurity. Declare the performance levels and the testing methodology that established them.

Declared accuracy metrics (example).

| Metric | Value | Test set |
| --- | --- | --- |
| AUC-ROC | 0.742 | Out-of-sample 2024, n = 60,000 |
| Gini coefficient | 0.484 | Same |
| KS statistic | 0.396 | Same |
| Approval rate at threshold | 41.2% | Same |
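
The Gini coefficient in this table follows from the AUC through the standard identity Gini = 2 × AUC − 1, so 2 × 0.742 − 1 = 0.484. The sketch below shows that identity together with simple reference implementations of AUC and the KS statistic. The function names are illustrative, the pairwise AUC loop is quadratic and meant for small samples only, and a higher score is assumed to mean better credit quality.

```python
from typing import Sequence


def gini_from_auc(auc: float) -> float:
    return 2 * auc - 1


def auc_roc(good_scores: Sequence[float], bad_scores: Sequence[float]) -> float:
    """Probability that a random repaying applicant outscores a random defaulter.

    Ties count as half a win.
    """
    wins = 0.0
    for g in good_scores:
        for b in bad_scores:
            if g > b:
                wins += 1.0
            elif g == b:
                wins += 0.5
    return wins / (len(good_scores) * len(bad_scores))


def ks_statistic(good_scores: Sequence[float], bad_scores: Sequence[float]) -> float:
    """Largest gap between the empirical score CDFs of defaulters and repayers."""
    best = 0.0
    for t in sorted(set(good_scores) | set(bad_scores)):
        cdf_good = sum(1 for s in good_scores if s <= t) / len(good_scores)
        cdf_bad = sum(1 for s in bad_scores if s <= t) / len(bad_scores)
        best = max(best, abs(cdf_bad - cdf_good))
    return best


# Consistency check on the declared figures in the table above.
declared_gini = gini_from_auc(0.742)
```

Checking that the declared Gini and AUC satisfy the identity is a cheap way to catch transcription errors before the figures reach the technical file.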

Disaggregate by gender and age cohort. If AUC varies by geography, disclose it: a 4-point AUC gap (0.04) between national sub-populations is material and must be surfaced to deployers.

Robustness testing. Document: input perturbation (±10% declared income — does the score shift cliff-edge or proportionally?); distribution shift (performance on 2020–2021 downturn data vs. 2017–2019 training period); missing data degradation (bureau data unavailable for 5%/10%/20% of features).
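
The income-perturbation test can be sketched as below. Everything here is illustrative: `score_fn` stands in for whatever callable wraps the real model, the feature key `declared_income` is a hypothetical name, and the tolerance that separates a proportional shift from a cliff edge is a value the provider has to choose and justify.

```python
from typing import Callable, Dict, Mapping


def income_perturbation_shift(score_fn: Callable[[Mapping[str, float]], float],
                              applicant: Mapping[str, float],
                              income_key: str = "declared_income",
                              rel_change: float = 0.10) -> Dict[str, float]:
    """Score change when declared income moves by -rel_change and +rel_change."""
    base_score = score_fn(applicant)
    shifts = {}
    for label, factor in (("minus", 1 - rel_change), ("plus", 1 + rel_change)):
        perturbed = dict(applicant)            # copy so the input is not mutated
        perturbed[income_key] = applicant[income_key] * factor
        shifts[label] = score_fn(perturbed) - base_score
    return shifts


def is_cliff_edge(shifts: Mapping[str, float], max_abs_shift: float) -> bool:
    """True if any perturbation moves the score by more than the tolerance."""
    return any(abs(shift) > max_abs_shift for shift in shifts.values())


def toy_score(applicant: Mapping[str, float]) -> float:
    """Stand-in scoring function for demonstration only; not a real model."""
    return min(1000.0, 400.0 + applicant["declared_income"] / 100.0)


example_shifts = income_perturbation_shift(toy_score, {"declared_income": 30000.0})
```

Run over a sample of real applicants, the distribution of these shifts is what goes into the technical file; a handful of cases with very large shifts points to a threshold effect in one of the income-derived features.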

Cybersecurity. Document access controls on the inference API, model-artefact encryption at rest, and the process for detecting a model-extraction or feature-poisoning attack.


Section 6: Risk Management System Outputs

What it covers. Article 9 mandates a continuous risk management system. Section 6 documents its outputs: identified risks, mitigation measures, and residual risks that deployers are informed of.

Risk register (extract).

| Risk | Probability | Severity | Mitigation | Residual |
| --- | --- | --- | --- | --- |
| Higher false-negative rate for age 65+ denies creditworthy older applicants | Medium | High | Mandatory human-review flag; separate performance monitoring for cohort | Low; documented in deployer instructions |
| Model drift in economic downturn | Medium | Medium | PSI monitoring; retraining protocol | Deployers must implement manual-override escalation path |
| Threshold misconfiguration by deployer | Low | High | API hard limits on threshold parameter; audit log of all changes | Low |
| Score manipulation via misrepresented income | Low | Medium | Income verification requirement in deployer instructions; bureau-vs-declared anomaly detection | Residual; deployer's due-diligence obligation |

Article 9(2)(b) requires you to account for reasonably foreseeable misuse, for example a deployer using a consumer-credit model to screen mortgage applicants. Document out-of-scope uses and the contractual and technical controls that prevent them.


Section 7: Lifecycle Changes

What it covers. Changes to the system over its lifetime, including retraining events, material modifications, and their impact on compliance.

Substantial modification is defined in Article 3(23): a change made after the system is placed on the market or put into service that was not foreseen in the provider's initial conformity assessment and that either affects the system's compliance with the Act's requirements or changes its intended purpose. For a credit-scoring model, this typically includes retraining on a materially different dataset, adding or removing features that affect the fairness profile, or deploying in a new member state with a materially different applicant population.

Version log (extract).

| Version | Change | Impact | New assessment needed? |
| --- | --- | --- | --- |
| v2.3.0 | Removed NUTS-3 postcode; added bureau utilisation ratio | Bias risk reduced; AUC unchanged | No |
| v2.4.0 | Retrained on 2022–2023 data; added CZ/SK markets | New geographic scope requires separate validation | Yes: Annex IV updated; Annex VI re-run |
| v2.4.1 | Bug fix: logging timestamp error | No impact on model performance | No |

Log every change, even minor ones. A clean version history is itself evidence of a functioning Article 9 risk management process.


Section 8: Standards Applied

What it covers. Harmonised standards, common specifications, or other technical standards applied in development and validation.

| Standard | Relevance |
| --- | --- |
| ISO/IEC 42001:2023 | AI management system; 38 Annex A controls map to Article 17 (QMS) and Article 9 evidence. Certification is voluntary and does not substitute for the Article 43 conformity assessment. |
| ISO/IEC 23894:2023 | AI risk management guidance; supports Article 9 documentation |
| ISO/IEC 24029 | Assessment of the robustness of neural networks; scoped to neural networks, so cite it only as methodological guidance for Article 15 robustness testing of a tree ensemble |
| NIST AI RMF 1.0 | Not an EU standard; cross-reference explicitly to EU AI Act articles if cited |
| EBA GL/2021/05 (internal governance) | Where the deployer is an EBA-regulated institution; anticipate the intersection in your technical file |

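As of mid-2026, CEN/CENELEC has not published harmonised standards under the EU AI Act. Once such standards are published and their references are cited in the Official Journal, applying them creates a presumption of conformity with the requirements they cover (Article 40). Monitor the Commission's published list.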


Section 9: EU Declaration of Conformity

What it covers. Article 47 requires a written declaration of conformity before market placement. Annex V sets out its content. The declaration is a separate document referenced in the technical file.

Template text (Annex V structure).


EU DECLARATION OF CONFORMITY

Provider: [Legal entity name, registered address]

AI System: [System name], version [X.X.X]

Intended purpose: Assessing the creditworthiness of natural persons in connection with consumer credit products as described in the technical documentation referenced below.

High-risk classification: Annex III, point 5(b) of Regulation (EU) 2024/1689 — AI systems used to evaluate the creditworthiness of natural persons or establish their credit scores, excluding fraud detection.

This declaration is issued under the sole responsibility of the provider named above.

The AI system described above has been assessed and found to conform with Regulation (EU) 2024/1689, in particular Articles 9, 10, 11, 12, 13, 14, 15, 16, and 17.

Conformity assessment procedure applied: Annex VI (internal self-assessment), pursuant to Article 43(2).

Technical documentation reference: [Document reference / version / date]

Standards applied: [List from Section 8]

Date of issue: [DD/MM/YYYY]

Signed for and on behalf of [Provider name]:

Name: ________ Position: ________ Signature: ________


Retain the declaration for ten years from market placement (Article 18). Register the system in the EU database under Article 49 before placing it on the market.


Deployer Note: FRIA and Article 26

The Annex IV file is the provider's obligation. Deployers of Annex III 5(b) systems have two specific duties worth flagging:

Article 27 Fundamental Rights Impact Assessment (FRIA). Creditworthiness-scoring deployers are among those explicitly required to conduct a FRIA. This is one of the few cases where a private-sector deployer — not just a public body — must run the assessment. Under Article 27(4), the FRIA may build on an existing GDPR DPIA (GDPR Article 35), but the two are distinct: the DPIA addresses data-protection risk; the FRIA addresses fundamental-rights impact.

Article 26 obligations. Deployers must use the system per the provider's instructions, implement the human-oversight measures in Section 3, retain logs for at least six months, and flag serious incidents to the provider.


How Confir Helps

Confir's rule-based engine generates the Annex IV technical documentation pack in under two hours. You answer a structured intake (intended purpose, model architecture, data sources, performance metrics, risk controls) and Confir's deterministic logic maps your answers to the nine Annex IV content areas, flags documentation gaps, and produces a print-ready technical file. The same workflow generates the Article 47 / Annex V Declaration of Conformity and, for deploying institutions, the Article 27 FRIA. Same inputs, same outputs, every rule traceable: the reproducibility a regulator expects.


Frequently Asked Questions

Which Annex III point covers credit scoring, and does it include fraud detection?

Credit scoring falls under Annex III, point 5(b): AI systems that evaluate the creditworthiness of natural persons or establish their credit scores. Fraud detection is carved out — a system whose sole function is detecting fraud is not high-risk under this point. If fraud-detection logic is integrated into your creditworthiness output, the system as a whole falls under 5(b). Separate the functions architecturally if you want to exclude fraud components from the Annex IV scope.

Does credit scoring require a notified body for conformity assessment?

No. Point 5(b) uses the internal self-assessment route under Annex VI, pursuant to Article 43(2). You conduct the assessment and issue the declaration yourself. A notified body is required for biometric systems under Annex III point 1 (where harmonised standards are not applied), but not for credit scoring. External auditors may be engaged voluntarily.

What is the compliance deadline for a credit-scoring system?

Stand-alone high-risk AI systems under Annex III must comply by 2 December 2027, under the Digital Omnibus agreed in May 2026. Assembling a complete Annex IV file, running bias analysis across protected groups, and completing the Annex VI self-assessment takes several months. Starting in 2026 is the right pace.

What happens to the technical file when the model is retrained?

If the retraining is a substantial modification under Article 3(23) — new intended purpose, materially different dataset, new geographic scope — the Annex IV file must be updated and a new Annex VI conformity assessment completed before deployment. Minor updates that do not affect compliance do not require a new assessment but must be logged in Section 7. The ten-year retention clock runs from the date each version is placed on the market.

Does the FRIA obligation fall on the provider or the deployer?

The Article 27 FRIA falls on the deployer. It applies to deployers of creditworthiness systems (Annex III 5(b)), one of the few cases where a private-sector deployer, not only a public body, must run the assessment. Providers should document the system's fundamental-rights implications in the Article 13 instructions for use to assist deployers in completing their FRIA.

What are the penalties for missing or deficient technical documentation?

Failing to comply with Article 11 and Annex IV breaches Chapter III obligations. Under Article 99(4), the maximum fine is €15,000,000 or 3% of total worldwide annual turnover, whichever is higher. For SMEs and start-ups, Article 99(6) caps the fine at the lower of the percentage and the fixed amount. Supplying incorrect or incomplete information to a competent authority triggers the lower tier under Article 99(5): up to €7,500,000 or 1%.
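
The ceiling arithmetic described above can be sketched as follows. The function name and signature are illustrative, and the result is only the statutory maximum for a given tier, not a prediction of the fine an authority would actually impose.

```python
def fine_ceiling_eur(turnover_eur: float,
                     fixed_amount_eur: float,
                     turnover_pct: float,
                     is_sme: bool = False) -> float:
    """Maximum fine for one Article 99 tier.

    turnover_pct is given in percent (3 for Article 99(4), 1 for Article 99(5)).
    Standard rule: the higher of the fixed amount and the turnover percentage.
    SME / start-up rule (Article 99(6)): the lower of the two.
    """
    if turnover_eur < 0:
        raise ValueError("turnover cannot be negative")
    pct_amount = turnover_eur * turnover_pct / 100
    if is_sme:
        return min(fixed_amount_eur, pct_amount)
    return max(fixed_amount_eur, pct_amount)


# Hypothetical providers, for illustration only.
large_provider = fine_ceiling_eur(1_000_000_000, 15_000_000, 3)
small_provider = fine_ceiling_eur(10_000_000, 15_000_000, 3, is_sme=True)
```

For the hypothetical provider with €1 billion turnover the Article 99(4) ceiling is €30 million, because 3% of turnover exceeds the fixed amount; for the hypothetical SME with €10 million turnover it is €300,000, because the lower of the two figures applies.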

