ISM-2085 ASD Information Security Manual (ISM)

Prevent Exposure of AI Model Confidence Scores

Ensure AI model confidence scores are not shown to users or in API outputs.

Plain language

This control keeps exact AI model confidence scores out of user interfaces and out of any output users can access, including API responses. Displaying these scores can lead users to overtrust the system's predictions or to dismiss them entirely, and either reaction can result in poor decision-making and security risk.

Framework

ASD Information Security Manual (ISM)

Control effect

Preventative

Classifications

NC, OS, P, S, TS

ISM last updated

Nov 2025

Control Stack last updated

19 Mar 2026

E8 maturity levels

N/A

Official control statement

The exposure of exact artificial intelligence model confidence scores in API responses or user interfaces is prevented.

Why it matters

Exposing exact model confidence scores can enable gaming and calibration attacks, and it can push users to overtrust or reject outputs. Both effects increase security risk.

Operational notes

Block confidence score fields in API responses and user interfaces, return only coarse bands where a confidence signal is needed, and test logs and telemetry to confirm that exact scores never leak.
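
The operational notes can be illustrated with a minimal Python sketch. It is not part of the ISM and is only one possible approach. The field names in SCORE_FIELDS, the band thresholds in to_band, the confidence_band output key, and the log pattern are all hypothetical and would need to match the real response schema and log format.

```python
import re
from typing import Any, Iterable, List

# Hypothetical names of fields that carry exact scores; adjust to the real schema.
SCORE_FIELDS = {"confidence", "score", "probability", "logprobs"}

# Hypothetical pattern for exact scores in log or telemetry lines,
# e.g. confidence=0.9312 or "probability": 0.07
SCORE_PATTERN = re.compile(
    r'"?(confidence|score|probability)"?\s*[:=]\s*-?\d*\.\d+',
    re.IGNORECASE,
)


def to_band(score: float) -> str:
    """Map an exact score to a coarse band (thresholds are assumptions)."""
    if score >= 0.8:
        return "high"
    if score >= 0.5:
        return "medium"
    return "low"


def sanitize(payload: Any, band_field: str = "confidence") -> Any:
    """Recursively drop score fields; optionally keep one as a coarse band."""
    if isinstance(payload, dict):
        clean = {}
        for key, value in payload.items():
            if key in SCORE_FIELDS:
                is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
                if key == band_field and is_number:
                    clean["confidence_band"] = to_band(float(value))
                continue  # the exact value is never copied to the output
            clean[key] = sanitize(value, band_field)
        return clean
    if isinstance(payload, list):
        return [sanitize(item, band_field) for item in payload]
    return payload


def find_leaks(log_lines: Iterable[str]) -> List[str]:
    """Return the log lines that appear to contain an exact score."""
    return [line for line in log_lines if SCORE_PATTERN.search(line)]


if __name__ == "__main__":
    raw = {
        "label": "spam",
        "confidence": 0.9312,
        "alternatives": [{"label": "ham", "probability": 0.0688}],
    }
    print(sanitize(raw))
    print(find_leaks([
        "INFO prediction label=spam confidence=0.9312",
        "INFO prediction label=spam confidence_band=high",
    ]))
```

Applying sanitize at the last point before a response leaves the service, rather than in each handler, reduces the chance that a new endpoint bypasses it. A check like find_leaks could run in a test suite against captured logs and telemetry samples.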
