
ISO 42001:2023 - A.6.2.6 AI System Verification and Validation

This article provides guidance on how to implement ISO 42001:2023 Control A.6.2.6, AI System Verification and Validation.

ISO 42001 Control Description

The organisation shall conduct verification and validation activities for AI systems to confirm that the system meets its documented requirements and that it performs as intended in conditions representative of its operational context, prior to deployment and following significant changes.

Control Objective

To provide assurance that AI systems function correctly against their specification, perform adequately in the intended operational environment, and do not exhibit unacceptable behaviours — including unsafe, unfair, or unreliable outputs — before being placed into service or materially modified.


Purpose

Verification and validation are the primary technical means by which the organisation confirms that an AI system is fit for its intended purpose. Verification activities establish that the system has been built correctly in accordance with its specification; validation activities establish that the correct system has been built — that is, that it will perform as needed in the context in which it will actually be used.

For AI systems, verification and validation present distinctive challenges. Unlike conventional software, where behaviour can often be exhaustively specified and tested, AI systems may exhibit complex, context-dependent, and emergent behaviours that are difficult to anticipate fully from the requirements specification alone. This makes the design of verification and validation activities — including the selection of test cases, evaluation metrics, and adversarial test conditions — a matter of significant professional judgement and technical care.

Robust verification and validation processes also serve important accountability functions: they generate the evidence base that allows the organisation to substantiate claims about system performance to regulators, customers, affected individuals, and other stakeholders, and to demonstrate that risks identified during assessment have been addressed effectively in the completed system.


Guidance on Implementation

Verification and Validation Planning

The organisation shall develop a verification and validation plan for each AI system before testing activities commence. The plan shall define the scope of verification and validation activities, the methods and criteria to be applied, the datasets to be used, the responsibilities of personnel involved, and the criteria for concluding that the system is ready for deployment.

Verification and validation plans shall be aligned with the requirements specification established under A.6.2.2 and shall address the risk profile documented in the AI risk assessment. Systems subject to higher risk shall receive more extensive and rigorous testing.
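By way of illustration, the elements of a verification and validation plan can be captured as a structured record, with the documented risk tier driving the minimum set of testing activities. The field names, risk-tier values, and activity sets below are illustrative assumptions, not terms defined by the standard:

```python
from dataclasses import dataclass, field

@dataclass
class VVPlan:
    """Illustrative V&V plan record; all field names are assumptions."""
    system_name: str
    risk_tier: str                                   # e.g. "low", "medium", "high", from the AI risk assessment
    scope: list = field(default_factory=list)        # activities in scope
    methods: list = field(default_factory=list)      # test methods and evaluation metrics
    datasets: list = field(default_factory=list)     # held-out datasets to be used
    responsibilities: dict = field(default_factory=dict)   # role -> named owner
    readiness_criteria: list = field(default_factory=list)

    def minimum_activities(self) -> set:
        """Higher-risk systems receive more extensive and rigorous testing."""
        base = {"functional_verification", "performance_validation"}
        if self.risk_tier in ("medium", "high"):
            base |= {"fairness_testing"}
        if self.risk_tier == "high":
            base |= {"robustness_testing", "adversarial_testing"}
        return base

plan = VVPlan(
    system_name="credit-scoring-model",
    risk_tier="high",
    readiness_criteria=["all acceptance thresholds met",
                        "pre-deployment review approved"],
)
print(sorted(plan.minimum_activities()))
```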

Functional Verification

Functional verification shall confirm that the AI system operates in accordance with its documented functional requirements. This includes confirming that the system processes inputs correctly, produces outputs within the specified range, handles edge cases and boundary conditions appropriately, and behaves as specified across the full scope of intended use scenarios.

Verification activities shall produce documented evidence, including test records and results, that is retained as part of the AI system documentation.
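The checks described above can be sketched as a small automated verification harness whose results are retained as test evidence. The `risk_score` function below is a hypothetical stand-in for the system under test, and the specified output range of [0, 1] is an assumed requirement:

```python
# Hypothetical scoring function standing in for the AI system under test.
def risk_score(features: dict) -> float:
    income = max(features.get("income", 0.0), 0.0)
    debts = max(features.get("debts", 0.0), 0.0)
    ratio = debts / (income + 1.0)
    return min(max(ratio, 0.0), 1.0)  # assumed spec: output must lie in [0, 1]

def verify_functional_requirements() -> list:
    """Run each documented test case and record (name, passed) pairs."""
    records = []
    # Nominal case within the intended use scenario
    records.append(("nominal_in_range",
                    0.0 <= risk_score({"income": 50_000, "debts": 10_000}) <= 1.0))
    # Edge and boundary conditions
    records.append(("zero_income",
                    0.0 <= risk_score({"income": 0, "debts": 5_000}) <= 1.0))
    records.append(("missing_fields",
                    0.0 <= risk_score({}) <= 1.0))
    records.append(("negative_input_clamped",
                    0.0 <= risk_score({"income": -1, "debts": -1}) <= 1.0))
    return records

results = verify_functional_requirements()
# The result list is the documented evidence retained with the system documentation.
assert all(ok for _, ok in results), results
```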

Performance Validation

Performance validation shall assess whether the AI system achieves the performance criteria specified in the requirements documentation. Evaluation shall be conducted using held-out test datasets that were not used during development, ensuring that results reflect genuine generalisation performance rather than performance on data seen during training.

Performance results shall be evaluated against defined acceptance thresholds, with clear criteria for passing or failing the validation. Where results fall short of acceptance thresholds, the implications for deployment shall be assessed and addressed before the system proceeds to deployment.
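As a minimal sketch of threshold-based acceptance, the following evaluates a toy held-out set against a documented accuracy threshold. The data and the 0.85 threshold are illustrative assumptions, not values prescribed by the standard:

```python
# Toy held-out evaluation: labels and predictions were not used during development,
# so the measured accuracy reflects generalisation rather than memorisation.
holdout_labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
holdout_preds  = [1, 0, 1, 0, 0, 1, 0, 0, 1, 1]

accuracy = sum(p == y for p, y in zip(holdout_preds, holdout_labels)) / len(holdout_labels)

# Acceptance thresholds come from the requirements specification (A.6.2.2);
# the 0.85 figure here is illustrative only.
ACCEPTANCE = {"accuracy": 0.85}
verdict = {"accuracy": (accuracy, ACCEPTANCE["accuracy"],
                        accuracy >= ACCEPTANCE["accuracy"])}

for metric, (observed, threshold, passed) in verdict.items():
    print(f"{metric}: observed={observed:.2f} "
          f"threshold={threshold:.2f} -> {'PASS' if passed else 'FAIL'}")
```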

Fairness and Non-Discrimination Testing

Where the AI system makes or informs decisions that affect individuals, validation activities shall include testing for fairness and the absence of discriminatory bias. Tests shall assess system performance disaggregated by relevant population subgroups, enabling the identification of differential performance that may indicate bias.

The fairness criteria applied in testing shall be consistent with the requirements established in the system specification and the impact assessment conducted under A.6.1.1. Identified fairness concerns shall be documented and assessed for their implications for deployment.
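Disaggregated testing of this kind can be sketched as follows. The subgroup data are fabricated for illustration, and the disparate-impact ratio with its four-fifths bound is one common convention, not a criterion mandated by ISO 42001:

```python
from collections import defaultdict

# Fabricated outcomes, disaggregated by subgroup: (subgroup, predicted_positive).
records = [
    ("group_a", 1), ("group_a", 1), ("group_a", 0), ("group_a", 1),
    ("group_b", 1), ("group_b", 0), ("group_b", 0), ("group_b", 1),
]

counts = defaultdict(lambda: [0, 0])  # subgroup -> [positives, total]
for group, pred in records:
    counts[group][0] += pred
    counts[group][1] += 1

selection_rates = {g: pos / total for g, (pos, total) in counts.items()}

# Disparate-impact ratio: minimum selection rate over maximum selection rate.
ratio = min(selection_rates.values()) / max(selection_rates.values())

FOUR_FIFTHS = 0.8  # one common convention for flagging differential performance
within_bound = ratio >= FOUR_FIFTHS
# A ratio below the bound is a fairness concern to be documented and assessed
# for its implications for deployment, per the control text.
print(selection_rates, round(ratio, 2), within_bound)
```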

Robustness and Adversarial Testing

Validation shall include testing of system robustness, including testing under distributional shift conditions, with degraded or incomplete inputs, and, where relevant to the intended use, under adversarial input conditions. The scope of robustness testing shall be calibrated to the risk profile of the system and the plausibility of the tested conditions in the operational environment.

Results shall inform decisions about appropriate operational constraints, monitoring requirements, and safeguards for the deployed system.
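A minimal robustness probe, assuming a stand-in model and an illustrative drift tolerance, might perturb inputs and confirm that outputs remain within the specified range and close to their unperturbed baselines:

```python
import random

# Stand-in model; robustness is probed by perturbing inputs and checking that
# outputs stay valid and do not swing excessively relative to the baseline.
def score(x: float) -> float:
    return max(0.0, min(1.0, 0.5 + 0.3 * x))

random.seed(0)
base_inputs = [-1.0, -0.5, 0.0, 0.5, 1.0]
max_drift = 0.0
for x in base_inputs:
    baseline = score(x)
    for _ in range(100):
        noisy = x + random.gauss(0.0, 0.05)   # distributional-shift style perturbation
        out = score(noisy)
        assert 0.0 <= out <= 1.0              # output must stay within the specified range
        max_drift = max(max_drift, abs(out - baseline))

TOLERANCE = 0.1  # illustrative acceptance bound taken from the V&V plan
robust = max_drift <= TOLERANCE
print(f"max drift under perturbation: {max_drift:.3f} (pass={robust})")
```

The measured drift and its pass/fail outcome would then feed the decisions the control requires: operational constraints, monitoring requirements, and safeguards.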

Pre-Deployment Review

Prior to deployment, the organisation shall conduct a structured pre-deployment review that considers the totality of verification and validation results and determines whether the system is acceptable for deployment. The review shall include consideration of any outstanding issues, known limitations, and conditions or constraints that shall be applied to the system in operation.

The pre-deployment review shall be documented, and deployment shall not proceed without documented authorisation from an appropriate level of management responsibility.
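The review gate can be sketched as a simple aggregation of documented outcomes, with deployment blocked unless every gate passes and a named approver has signed off. The gate names and approver role below are assumptions for illustration:

```python
# Illustrative pre-deployment review record; outstanding issues are documented
# and carried into operation rather than silently dropped.
review = {
    "functional_verification": True,
    "performance_validation": True,
    "fairness_testing": True,
    "robustness_testing": True,
    "outstanding_issues": ["monitor drift on segment X"],  # known limitations
    "approver": "Head of AI Governance",                   # role name is an assumption
}

def deployment_authorised(r: dict) -> bool:
    """Deployment requires every gate to pass plus documented authorisation."""
    gates = ("functional_verification", "performance_validation",
             "fairness_testing", "robustness_testing")
    return all(r.get(g, False) for g in gates) and bool(r.get("approver"))

print(deployment_authorised(review))
```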


Related Controls

  • A.6.2.2 – AI System Requirements and Specification: Verification and validation activities shall be planned against the requirements and acceptance criteria documented in the specification.
  • A.6.2.3 – Data for Development and Testing of AI Systems: Test datasets used in validation shall be managed in accordance with the data governance requirements established under A.6.2.3.
  • A.6.2.7 – AI System Deployment: Deployment shall be conditional on the satisfactory completion of verification and validation activities and an approved pre-deployment review.
  • A.6.2.8 – AI System Documentation: Verification and validation plans, test records, and pre-deployment review outcomes form part of the AI system documentation.
  • A.6.1.1 – AI System Impact Assessment: Fairness and impact-related testing criteria shall be informed by the outcomes of the impact assessment.