
ISO 42001:2023 - A.8.2 AI System Incident Management

This article provides guidance on implementing ISO 42001:2023 control A.8.2, AI System Incident Management.

ISO 42001 Control Description

The organisation shall establish an incident management process for AI systems that enables the timely identification, reporting, assessment, containment, investigation, and resolution of AI-related incidents, including incidents involving unexpected system behaviour, adverse outcomes for individuals, and failures to operate within intended parameters.

Control Objective

To ensure that incidents involving AI systems are managed in a structured, accountable, and effective manner that minimises harm, preserves evidence, supports root cause analysis, and enables the organisation to learn from incidents and implement improvements that reduce the likelihood of recurrence.

Purpose

AI systems can fail in complex and sometimes unexpected ways. Incidents may arise from model performance degradation, data quality failures, unexpected interactions between system components, adversarial inputs, misuse of the system beyond its intended scope, or broader system integration failures. Some AI incidents may cause or contribute to harm to individuals — through erroneous decisions, biased outputs, or the inappropriate disclosure of personal information — creating legal, regulatory, and reputational risks for the organisation.

Effective incident management for AI systems requires capabilities that go beyond standard IT incident management. The statistical and probabilistic nature of AI outputs means that identifying the boundary between acceptable variation and a genuine incident requires careful judgement. Root cause analysis for AI incidents must consider not only technical failure modes but also dataset characteristics, deployment conditions, and the adequacy of pre-deployment risk management. The potential for AI incidents to have diffuse impacts across a population — rather than the clearly bounded impacts typical of conventional software incidents — creates particular challenges for scoping and assessing incidents.

This control provides the framework within which the organisation identifies, manages, and learns from AI-related incidents, supporting its obligation to operate AI systems responsibly and to be accountable for their effects.


Guidance on Implementation

Incident Identification and Reporting

The organisation shall establish clear criteria for what constitutes an AI system incident, including: system errors, failures, or unexpected behaviours; outputs that fall below defined performance thresholds; instances of the system operating outside its intended use scope; outputs that cause or may have caused harm or adverse outcomes for individuals; reports of discriminatory or unfair outputs; and security events affecting the AI system or its data.

Reporting mechanisms shall enable operational personnel, users, and affected individuals to report potential incidents efficiently. Personnel responsible for AI system operation shall be trained to recognise incident indicators and to initiate the reporting process promptly.

Incident Assessment and Classification

Reported incidents shall be assessed to determine their nature, scope, and severity. Assessment shall consider the actual or potential impact on individuals and organisational processes; whether the incident is isolated or may indicate a systemic issue; whether regulatory notification or other external reporting obligations are triggered; and the urgency of containment and remediation actions.

Incidents shall be classified according to severity, with classification determining the escalation pathway and the resources to be mobilised for response.
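A severity classification scheme of this kind can be expressed as a simple decision rule mapping assessment factors to an escalation pathway. The severity levels, role names, and response times below are hypothetical placeholders; each organisation would define its own.

```python
from enum import Enum

class Severity(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4

# Hypothetical mapping from severity to (escalation pathway, response target).
ESCALATION = {
    Severity.LOW: ("team_lead", "next business day"),
    Severity.MEDIUM: ("ai_system_owner", "within 24 hours"),
    Severity.HIGH: ("incident_response_team", "within 4 hours"),
    Severity.CRITICAL: ("crisis_management_board", "immediate"),
}

def classify(harm_to_individuals: bool, systemic: bool,
             regulatory_trigger: bool) -> Severity:
    """Toy rule combining the assessment factors named in the guidance above."""
    if harm_to_individuals and (systemic or regulatory_trigger):
        return Severity.CRITICAL
    if harm_to_individuals or regulatory_trigger:
        return Severity.HIGH
    if systemic:
        return Severity.MEDIUM
    return Severity.LOW
```

Encoding the rule explicitly makes the classification auditable: the same inputs always produce the same severity and escalation pathway, which supports the consistency that assessors and auditors expect.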

Containment and Immediate Response

Where an incident involves ongoing risk of harm or significant operational disruption, immediate containment actions shall be taken. Containment measures may include restricting the scope of system operation, increasing human oversight thresholds, suspending automated processing for affected use cases, or taking the system offline pending investigation.

Containment decisions shall be documented, including the rationale for the measures taken and the authorisation for any operational restrictions.
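The documentation requirement can be met with a minimal, append-only record of each containment decision. The field names below are assumptions chosen for illustration; any record capturing the measure, rationale, and authorisation would satisfy the same intent.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)  # frozen: containment records should not be mutated after the fact
class ContainmentDecision:
    incident_id: str
    measure: str          # e.g. "suspend automated processing for affected use case"
    rationale: str
    authorised_by: str
    decided_at: datetime

def record_containment(log: list, incident_id: str, measure: str,
                       rationale: str, authorised_by: str) -> ContainmentDecision:
    """Append an auditable containment decision to the incident log and return it."""
    decision = ContainmentDecision(incident_id, measure, rationale, authorised_by,
                                   datetime.now(timezone.utc))
    log.append(decision)
    return decision
```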

Investigation and Root Cause Analysis

Incidents shall be investigated to identify their root cause and contributing factors. AI incident investigations shall examine relevant system logs, monitoring data, input records, and output records; the characteristics of the data in use at the time of the incident; potential design, development, or deployment factors that may have contributed; and the adequacy of the pre-deployment assessment and monitoring processes.

Investigations shall be conducted by personnel with appropriate technical competence and documented in an incident report that records findings, conclusions, and recommended corrective actions.
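The required investigation record can be sketched as a structured report whose completeness is checkable before closure. The structure below is a hypothetical template mirroring the evidence categories listed above, not a prescribed format.

```python
from dataclasses import dataclass, field

@dataclass
class InvestigationReport:
    incident_id: str
    evidence_reviewed: list[str]      # system logs, monitoring data, input/output records
    data_characteristics: str         # state of the data in use at the time of the incident
    contributing_factors: list[str]   # design, development, or deployment factors
    root_cause: str = ""
    findings: str = ""
    corrective_actions: list[str] = field(default_factory=list)

    def is_complete(self) -> bool:
        """Ready for review only once cause, findings, and actions are all recorded."""
        return bool(self.root_cause and self.findings and self.corrective_actions)
```

Gating closure on `is_complete()` gives a simple, mechanical check that no incident is resolved without a documented root cause and recommended corrective actions.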

Corrective Action and Resolution

Incidents shall be resolved through the implementation of corrective actions addressing the root cause and any contributing factors. Corrective actions shall be assessed for their effectiveness before the incident is formally closed. Where corrective actions involve changes to the AI system, these shall be managed through the change management process established under A.7.6.

For incidents that resulted in harm to individuals, the organisation shall consider what remedial action, if any, is owed to those affected and shall act in accordance with applicable legal obligations and ethical commitments.

Regulatory and Contractual Notification

The organisation shall maintain awareness of its regulatory and contractual obligations to notify external parties — including regulators, customers, and affected individuals — following AI system incidents. Notification processes and responsibilities shall be defined and documented, and notifications shall be made within applicable timeframes.
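Tracking applicable timeframes can be as simple as computing a deadline per external party from the detection time. The notification windows below are invented for illustration; the actual timeframes depend entirely on the regulations and contracts that apply to the organisation.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical notification windows, per external party. Real values must be
# taken from the applicable regulatory and contractual obligations.
NOTIFICATION_WINDOWS = {
    "regulator": timedelta(hours=72),
    "affected_customers": timedelta(days=5),
}

def notification_deadlines(detected_at: datetime) -> dict[str, datetime]:
    """Latest permissible notification time for each external party."""
    return {party: detected_at + window
            for party, window in NOTIFICATION_WINDOWS.items()}

def overdue(deadlines: dict[str, datetime], now: datetime) -> list[str]:
    """Parties whose notification deadline has already passed."""
    return [party for party, deadline in deadlines.items() if now > deadline]
```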

Learning and Improvement

Incident records shall be reviewed periodically to identify patterns, systemic issues, and opportunities for improvement across the AI governance framework. Lessons learned shall be fed back into risk assessment processes, design practices, operational procedures, and monitoring frameworks.
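The periodic review described above can be supported by simple aggregation over incident records, for example flagging root causes that recur across incidents as candidate systemic issues. The record field name and recurrence threshold below are assumptions for illustration.

```python
from collections import Counter

def recurring_root_causes(incidents: list[dict], threshold: int = 2) -> list[str]:
    """Flag root causes appearing in `threshold` or more incident records.

    Each record is assumed to carry a 'root_cause' field; the field name
    is illustrative, not a prescribed schema.
    """
    counts = Counter(i["root_cause"] for i in incidents)
    return [cause for cause, n in counts.items() if n >= threshold]
```

A recurring root cause is exactly the kind of pattern that should be fed back into risk assessment, design practice, and monitoring thresholds rather than fixed incident-by-incident.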


Related Controls

  • A.7.5 – AI System Monitoring: Monitoring is a primary mechanism for detecting conditions that indicate or lead to incidents, and monitoring alerts shall feed into the incident reporting process.
  • A.7.6 – AI System Change Management: Corrective actions arising from incident investigation that involve changes to the AI system shall be managed through the change management process.
  • A.6.1.2 – AI Risk Assessment: Significant incidents shall trigger a review of the AI risk assessment to ensure that it reflects current operational experience and emerging risks.
  • A.6.2.8 – AI System Documentation: Incident records, investigation reports, and corrective action documentation form part of the AI system documentation maintained throughout the lifecycle.
  • A.5.2 – AI Policy: Incident patterns and systemic issues identified through incident management shall inform updates to the organisation's AI policy and governance framework.