Annex A 7.5, ISO/IEC 42001:2023

Data Provenance

Organisations must have a process to document the source and usage of data in AI systems over time.

Plain language

This control means your business should keep track of where all the data used in your AI systems comes from and how it's used over time. It matters because if you don't know the history of your data, your AI may make incorrect decisions, such as recommending the wrong products to customers because the data is outdated or inaccurate, or it may breach privacy laws.

Framework

ISO/IEC 42001:2023

Control effect

Preventative

Classifications

N/A

Official last update

01 Dec 2023

Control Stack last updated

19 May 2026

Maturity levels

N/A

Official control statement

The organisation shall define and document a process for recording the provenance of data used in its AI systems over the life cycles of the data and the AI system.
ISO/IEC 42001:2023 Annex A 7.5

Why it matters

If you don't track where your AI's data comes from, it might make bad choices - like recommending products based on incorrect data - leading to upset customers and legal issues.

Operational notes

Keep your data provenance log current - update it whenever you get new data or modify existing data, not just once a year.
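
One way to keep the log current is to write an entry at the moment data is ingested or modified. The following is a minimal Python sketch, not a prescribed implementation: the field names, the `make_record` and `append_record` helpers, and the JSON Lines log file are all assumptions made for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def make_record(dataset: str, source: str, action: str, reason: str,
                content: bytes, when: Optional[datetime] = None) -> dict:
    """Build one provenance entry for a dataset change (hypothetical schema)."""
    when = when or datetime.now(timezone.utc)
    return {
        "dataset": dataset,
        "source": source,        # where the data came from
        "action": action,        # e.g. "ingest" or "modify"
        "reason": reason,        # why the change was made
        "sha256": hashlib.sha256(content).hexdigest(),  # fingerprint of the data
        "recorded_at": when.isoformat(),
    }


def append_record(log_path: Path, record: dict) -> None:
    """Append the entry as one JSON line; earlier entries are never rewritten."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
```

Calling `append_record` from the same job that loads or transforms the data means the log is updated as a side effect of the change itself, rather than relying on a periodic manual review.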

Implementation tips

  • The data steward should set up a simple system, like a spreadsheet or a basic database, to log where all the AI data comes from and any changes made to it. It could include details like the original source and how the data was altered, which can help quickly identify issues if something goes wrong.
  • The AI lead should train their team on the importance of data history (provenance) and its impact on AI system decisions. Use an example, like how outdated data caused wrong customer suggestions, to illustrate the benefits of knowing data history.
  • Procurement should ensure that when purchasing data sets, all vendor contracts mandate data provenance information. They could include a clause requiring the vendor to provide a data origin report, so the business always knows where its AI training data originates and how reliable it is.
  • The product owner should regularly verify that the AI model isn’t using outdated or invalid data. This can involve setting up alerts or checks in the system whenever there are updates or new data inputs to prevent unexpected AI behaviours.
  • The head of risk should assess the potential risks of using data of unknown origin in AI systems. This involves creating a risk profile for data sources and understanding potential legal or ethical issues, so that any problems are caught early before they impact customers.
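
The check for outdated data mentioned in the tips above could be automated against the provenance log. This is a hedged Python sketch: it assumes each log entry is a dictionary with a `dataset` name and an ISO 8601 `recorded_at` timestamp, and the `stale_datasets` function and the 90-day threshold are hypothetical choices, not requirements of the control.

```python
from datetime import datetime, timedelta, timezone


def stale_datasets(log: list[dict], now: datetime, max_age: timedelta) -> list[str]:
    """Return datasets whose most recent provenance entry is older than max_age."""
    latest: dict[str, datetime] = {}
    for entry in log:
        recorded = datetime.fromisoformat(entry["recorded_at"])
        name = entry["dataset"]
        # Keep only the newest entry seen for each dataset.
        if name not in latest or recorded > latest[name]:
            latest[name] = recorded
    return sorted(name for name, seen in latest.items() if now - seen > max_age)


# Illustrative data only; these dataset names are made up.
log = [
    {"dataset": "catalogue", "recorded_at": "2024-01-10T00:00:00+00:00"},
    {"dataset": "orders", "recorded_at": "2024-05-01T00:00:00+00:00"},
]
now = datetime(2024, 5, 15, tzinfo=timezone.utc)
print(stale_datasets(log, now, timedelta(days=90)))  # ['catalogue']
```

A result like this could feed an alert to the product owner, who then decides whether the dataset needs refreshing before the model uses it again.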

Audit / evidence tips

  • Ask: Request the organisation's data provenance log for AI systems. Good: The log is up to date, with clear details on data origin and modification.
  • Ask: Ask the AI development team's lead about their process for tracking data sources. Good: The AI development lead provides a clear process that matches documented procedures.
  • Ask: Check procurement contracts for data sets used by AI. Good: Contracts contain clear data provenance clauses signed by both parties.
  • Ask: Examine recent changes in AI training data. Good: Each change in training data is documented, listing source and reason, in the system log.
  • Ask: Look at the minutes from recent risk management meetings. Good: The minutes show regular discussion of data-related risks in AI, with action items recorded.
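
An auditor or data steward could pre-screen a provenance log for gaps before a review. The sketch below is illustrative only: the required field names in `REQUIRED_FIELDS` and the `incomplete_entries` helper are assumptions about how a log might be structured, not part of the standard.

```python
# Assumed minimum fields for each log entry (hypothetical schema).
REQUIRED_FIELDS = ("dataset", "source", "reason", "recorded_at")


def incomplete_entries(log: list[dict]) -> list[tuple[int, list[str]]]:
    """Return (index, missing_fields) for entries lacking a required field."""
    findings = []
    for index, entry in enumerate(log):
        # Treat absent, None, and blank values as missing.
        missing = [f for f in REQUIRED_FIELDS
                   if not str(entry.get(f) or "").strip()]
        if missing:
            findings.append((index, missing))
    return findings
```

An empty result does not prove the entries are accurate, only that none of the expected fields were left blank, so it complements rather than replaces the sampling described above.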

Cross-framework mappings

How Annex A 7.5 relates to controls across ISO/IEC 27001, ISO/IEC 42001, Essential Eight, and ASD ISM.

ISO 27001

Supports (2)

  • Annex A 5.33: Annex A 7.5 requires a defined and documented process for recording data provenance for AI systems over time, creating provenance records...
  • Annex A 7.10: Annex A 7.5 requires the organisation to record provenance of data used by AI systems throughout the data and AI system life cycles, incl...

ASD ISM

Partially overlaps (1)

  • ISM-2087: Annex A 7.5 requires a documented, end-to-end process for recording the provenance of data used in AI systems across the full life cycle

Supports (1)

  • ISM-2103: ISM-2103 requires informed and explicit consent from data owners before organisational data from AI applications is used for training, fi...

These mappings show relationships between controls across frameworks. They do not imply full equivalence or certification.
