Document the Data Resources Used by Your AI System
Your organisation must keep a written record of the data resources used by each artificial intelligence (AI) system.
Plain language
Artificial intelligence (AI) systems run on data. They learn from data, are tested with data and produce results from data. This control says that when your organisation identifies the resources needed to build and run an AI system, you must also write down clear information about the data those systems use. In plain terms, you keep a record that answers questions like: where did the data come from, what does it contain, who is allowed to use it, and how good or reliable is it.

Documenting your data resources matters because data quietly drives everything an AI system does. If the data is out of date, biased, low quality or used without permission, the AI's results will be flawed too, and you may not realise why. Having a written record means anyone reviewing the AI later (your own staff, an auditor or a regulator) can see exactly what data feeds it.

This is part of an AI management system (AIMS), which is the set of policies, roles and records your organisation uses to run AI responsibly. The control is about keeping good records of your data, not about judging whether the AI's decisions are fair.
Framework
ISO/IEC 42001:2023
Control effect
Preventative
Classifications
N/A
Official last update
01 Dec 2023
Control Stack last updated
18 June 2026
Maturity levels
N/A
Official control statement
As part of resource identification, the organisation shall document information about the data resources utilised for the AI system.
Why it matters
Without a record of what data feeds your AI, you cannot trace flawed or unauthorised data, leaving results unreliable and use rights unclear.
Operational notes
Update the data resources register whenever a dataset is added or changed, or an AI system is retrained, not only at the annual review.
Implementation tips
- The AI or data lead should create a data resources register that lists every dataset used to train, test and run each AI system, recording its name, purpose and the AI system it supports.
- For each dataset, the data owner should record its source (for example purchased, collected from customers, or publicly available) and the licence, contract or consent that gives your organisation the right to use it.
- The data lead should capture key details about each dataset, including its size, format, the time period it covers, how current it is, and any known quality limitations or gaps.
- Compliance staff should note whether a dataset contains personal or sensitive information and link the record to the relevant privacy approval, so data protection obligations are visible alongside the data itself.
- The AIMS owner should set a simple rule that the data resources register is updated whenever a new dataset is added or an AI system is retrained, and reviewed at planned intervals to keep it accurate.
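The register described in the tips above could be sketched as a simple data structure. This is a minimal illustration in Python, not a format prescribed by ISO/IEC 42001. The `DatasetRecord` class, its field names, the `records_for_system` helper and all sample values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class DatasetRecord:
    """One entry in a hypothetical data resources register."""
    name: str
    ai_system: str                 # the AI system this dataset supports
    purpose: str                   # "training", "testing" or "operation"
    source: str                    # e.g. "purchased", "collected from customers"
    usage_right: str               # licence, contract or consent reference
    owner: str                     # named data owner
    size: str
    data_format: str
    period_covered: str
    last_refreshed: date           # how current the data is
    known_limitations: List[str] = field(default_factory=list)
    contains_personal_data: bool = False
    privacy_approval_ref: Optional[str] = None


def records_for_system(register: List[DatasetRecord], ai_system: str) -> List[DatasetRecord]:
    """Return every dataset recorded against one AI system."""
    return [record for record in register if record.ai_system == ai_system]


# Illustrative entries only; all names and values are made up.
register = [
    DatasetRecord(
        name="support-tickets-2022",
        ai_system="ticket-triage-model",
        purpose="training",
        source="collected from customers",
        usage_right="customer terms of service",
        owner="Head of Customer Support",
        size="about 120,000 records",
        data_format="CSV",
        period_covered="Jan 2022 to Dec 2022",
        last_refreshed=date(2023, 1, 15),
        known_limitations=["English-language tickets only"],
        contains_personal_data=True,
        privacy_approval_ref="PIA-0042",
    ),
    DatasetRecord(
        name="public-product-catalogue",
        ai_system="ticket-triage-model",
        purpose="operation",
        source="publicly available",
        usage_right="open licence",
        owner="Product Manager",
        size="about 3,000 records",
        data_format="JSON",
        period_covered="current catalogue",
        last_refreshed=date(2023, 3, 1),
    ),
]

triage_datasets = records_for_system(register, "ticket-triage-model")
print([record.name for record in triage_datasets])
```

A spreadsheet with the same columns would serve equally well. The point is that every dataset has one row and that the fields for source, usage right, quality and privacy are never left implicit.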
Audit / evidence tips
- Ask for the data resources register or equivalent documentation, and check that it covers each AI system the organisation operates rather than just one or two.
- Look at whether each dataset entry records its source and the licence, contract or consent permitting its use, confirming the organisation has the right to use the data.
- Ask how data quality is described. Good evidence shows recorded details such as size, coverage, currency and known limitations for each dataset.
- Look at whether datasets containing personal or sensitive information are flagged and linked to a privacy approval, showing data protection is considered in the records.
- Ask when the documentation was last updated. Good practice is a register kept current when datasets change or AI systems are retrained, with a named owner responsible.
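Some of the checks above can be partly automated if the register is kept in a structured form. The following is a rough sketch under that assumption. The dictionary keys, the `audit_findings` function and the sample data are hypothetical, and an automated check does not replace an auditor's judgement about whether the recorded details are accurate.

```python
REQUIRED_FIELDS = ("source", "usage_right", "owner", "last_updated")


def audit_findings(register, ai_systems):
    """List gaps in a register given the AI systems the organisation operates."""
    findings = []

    # Coverage: every operated AI system should have at least one dataset.
    covered = {entry.get("ai_system") for entry in register}
    for system in ai_systems:
        if system not in covered:
            findings.append(f"{system}: no datasets recorded")

    for entry in register:
        name = entry.get("name", "<unnamed>")
        # Completeness: source, usage right, owner and update date are present.
        for field_name in REQUIRED_FIELDS:
            if not entry.get(field_name):
                findings.append(f"{name}: missing {field_name}")
        # Privacy: flagged personal data should link to a privacy approval.
        if entry.get("contains_personal_data") and not entry.get("privacy_approval_ref"):
            findings.append(f"{name}: personal data flagged but no privacy approval linked")

    return findings


# Illustrative data only; all names and values are made up.
register = [
    {
        "name": "support-tickets-2022",
        "ai_system": "ticket-triage-model",
        "source": "collected from customers",
        "usage_right": "customer terms of service",
        "owner": "Head of Customer Support",
        "last_updated": "2023-01-15",
        "contains_personal_data": True,
        "privacy_approval_ref": None,
    },
    {
        "name": "public-product-catalogue",
        "ai_system": "ticket-triage-model",
        "source": "publicly available",
        "usage_right": "",
        "owner": "Product Manager",
        "last_updated": "2023-03-01",
        "contains_personal_data": False,
        "privacy_approval_ref": None,
    },
]
ai_systems = ["ticket-triage-model", "fraud-scoring-model"]

for finding in audit_findings(register, ai_systems):
    print(finding)
```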
Cross-framework mappings
How Annex A 4.3 relates to controls across ISO/IEC 27001, ISO/IEC 42001, Essential Eight, and ASD ISM.
ISO 27001
| Control | Notes | Details |
|---|---|---|
| Partially overlaps (2) | | |
| Annex A 5.9 | Annex A 4.3 requires the organisation to document information about the data resources utilised for an AI system as part of resource iden... | |
| Annex A 5.34 | Annex A 4.3 requires documenting the AI system’s data resources, which often includes identifying whether datasets contain personal infor... | |
| Supports (4) | | |
| Annex A 5.12 | Annex A 4.3 requires documenting the data resources used by an AI system to understand what data underpins the system across its lifecycle | |
| Annex A 5.13 | Annex A 4.3 requires documentation of AI data resources | |
| Annex A 5.22 | Annex A 4.3 requires documentation of AI data resources | |
| Annex A 5.33 | Annex A 4.3 requires the organisation to document information about data resources used by AI systems | |
ASD ISM
These mappings show relationships between controls across frameworks. They do not imply full equivalence or certification.