Monitoring

Monitoring refers to the ongoing process of observing an AI system’s behaviour, performance, and interactions in real-world operation. It is a cornerstone of post-deployment AI assurance, ensuring that systems continue to perform safely, accurately, and within acceptable risk thresholds throughout their lifecycle.

Unlike pre-deployment testing, monitoring addresses the reality that AI systems can encounter new, unexpected, or adversarial conditions once they are in the field. It enables detection of emerging issues such as:

  • Performance degradation or drift (a simple drift check is sketched after this list)

  • Bias emergence or behaviour shifts

  • Security vulnerabilities or data poisoning

  • Misalignment with operational context or policy changes
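
As a rough illustration of the first bullet, drift can be detected by comparing live score or input distributions against a reference distribution captured at training time. The sketch below uses the population stability index (PSI) as the comparison statistic; the bin count, the 0.2 alert threshold, and the stand-in data are illustrative assumptions rather than fixed requirements.

```python
import numpy as np

def population_stability_index(reference, live, bins=10):
    """Compare a live score distribution against a reference distribution;
    larger values indicate stronger drift."""
    # Bin edges come from quantiles of the reference (training-time) scores.
    edges = np.quantile(reference, np.linspace(0.0, 1.0, bins + 1))

    ref_counts, _ = np.histogram(reference, bins=edges)
    # Clip live values into the reference range so out-of-range points still count.
    live_counts, _ = np.histogram(np.clip(live, edges[0], edges[-1]), bins=edges)

    # Convert counts to proportions; a small epsilon avoids log(0) and division by zero.
    eps = 1e-6
    ref_pct = ref_counts / ref_counts.sum() + eps
    live_pct = live_counts / live_counts.sum() + eps

    return float(np.sum((live_pct - ref_pct) * np.log(live_pct / ref_pct)))

# Stand-in data: a shifted, widened live distribution relative to the reference.
rng = np.random.default_rng(0)
reference_scores = rng.normal(0.0, 1.0, 10_000)
live_scores = rng.normal(1.0, 1.5, 1_000)

psi = population_stability_index(reference_scores, live_scores)
if psi > 0.2:   # 0.2 is a common rule-of-thumb threshold for significant drift
    print(f"Drift alert: PSI = {psi:.3f}")
```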

Monitoring can include both automated and human-in-the-loop components. Key techniques include the following; a minimal sketch combining logging and threshold-based alerting follows the list:

  • Telemetry and logging to capture inputs, outputs, and model states

  • Health checks and anomaly detection to flag irregular behaviours

  • User feedback loops to gather input from operators or end users

  • Dashboards and alerts to support oversight teams in real time
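
As a rough illustration of how telemetry, health checks, and alerts fit together, the sketch below logs one structured record per model call and raises alerts when simple thresholds are crossed. The record fields, model name, and threshold values are hypothetical; a production system would route these logs to its own observability and alerting stack.

```python
import json
import logging
import time
from dataclasses import dataclass, asdict

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("model_telemetry")

@dataclass
class InferenceRecord:
    """One telemetry record per model call: what went in, what came out, how long it took."""
    timestamp: float
    model_version: str
    input_summary: str      # e.g. a hash or truncated form of the input, not raw data
    prediction: float
    confidence: float
    latency_ms: float

def log_inference(record: InferenceRecord) -> None:
    # Structured (JSON) logs are easy to ship to dashboards and alerting pipelines.
    logger.info(json.dumps(asdict(record)))

def check_health(record: InferenceRecord,
                 max_latency_ms: float = 200.0,
                 min_confidence: float = 0.5) -> list:
    """Return a list of alert messages for any thresholds this record violates."""
    alerts = []
    if record.latency_ms > max_latency_ms:
        alerts.append(f"latency {record.latency_ms:.0f} ms exceeds {max_latency_ms:.0f} ms")
    if record.confidence < min_confidence:
        alerts.append(f"confidence {record.confidence:.2f} below {min_confidence:.2f}")
    return alerts

# Example usage around a (hypothetical) model call.
record = InferenceRecord(time.time(), "risk-model-v3", "sha256:ab12f0",
                         prediction=0.87, confidence=0.42, latency_ms=310.0)
log_inference(record)
for alert in check_health(record):
    logger.warning("ALERT: %s", alert)
```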

For mission-critical or safety-sensitive AI systems, monitoring helps maintain operational control and enables a rapid response to anomalies. For example, an AI-enabled drone navigation system might be monitored for decision latency, sensor errors, or unexpected path deviations.
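
A minimal sketch of such a monitor is shown below. The latency and deviation limits, the cross-track-error calculation, and the escalation step are illustrative assumptions, not values taken from any particular platform's safety case.

```python
import math

# Illustrative thresholds; real values would come from the system's safety case.
MAX_DECISION_LATENCY_S = 0.1     # navigation decisions expected within 100 ms
MAX_PATH_DEVIATION_M = 5.0       # deviation from plan before an alert is raised

def path_deviation(position, planned_waypoint):
    """Straight-line distance between actual position and the planned waypoint (metres)."""
    return math.dist(position, planned_waypoint)

def monitor_step(decision_latency_s, position, planned_waypoint, sensor_ok):
    """Return a list of anomalies for one navigation cycle; an empty list means nominal."""
    anomalies = []
    if decision_latency_s > MAX_DECISION_LATENCY_S:
        anomalies.append(f"decision latency {decision_latency_s * 1000:.0f} ms over budget")
    if not sensor_ok:
        anomalies.append("sensor self-check failed")
    deviation = path_deviation(position, planned_waypoint)
    if deviation > MAX_PATH_DEVIATION_M:
        anomalies.append(f"path deviation {deviation:.1f} m exceeds limit")
    return anomalies

# One monitoring cycle: an unexpected deviation triggers escalation to a human operator.
anomalies = monitor_step(decision_latency_s=0.04,
                         position=(120.0, 48.0),
                         planned_waypoint=(112.0, 45.0),
                         sensor_ok=True)
if anomalies:
    print("Escalating to human operator:", "; ".join(anomalies))
```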

AI assurance frameworks emphasise monitoring as an ongoing responsibility. It complements risk assessments, certification, and impact evaluations by ensuring that assurances made before deployment hold up under live conditions. Monitoring also supports accountability, allowing for post-incident analysis and continuous improvement.

Regulatory and standards guidance reinforces this responsibility: the EU AI Act explicitly requires post-market monitoring of high-risk systems, and the NIST AI Risk Management Framework recommends continuous monitoring as part of managing AI risk. These frameworks call for logging, documentation, and mechanisms for human intervention when performance thresholds are exceeded.