ISM-2085, ASD Information Security Manual (ISM)

Prevent Exposure of AI Model Confidence Scores

Ensure exact AI model confidence scores are not shown to users or returned in API responses.

Plain language

This control ensures that exact confidence scores from AI models are not shown to users or included in any output they can access. Displaying these scores can lead users to overtrust the system's predictions or to dismiss them entirely. Either outcome can result in poor decisions and increased security risk.

Framework

ASD Information Security Manual (ISM)

Control effect

Preventative

Classifications

NC, OS, P, S, TS

ISM last updated

Nov 2025

Control Stack last updated

18 May 2026

E8 maturity levels

N/A

Official control statement

The exposure of exact artificial intelligence model confidence scores in API responses or user interfaces is prevented.
ASD Information Security Manual (ISM), ISM-2085

Why it matters

Exact confidence scores give anyone probing the model a precise feedback signal, which makes it easier to game its decisions and to run calibration attacks. They can also push users to overtrust or reject outputs. Both effects increase security risk.

Operational notes

Strip confidence score fields from API responses and user interfaces, return only coarse bands (for example low, medium, high) if an indication is needed, and check logs and telemetry to confirm exact scores never leak.
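The note above can be illustrated with a minimal sketch of a response filter that removes exact score fields at any depth and substitutes a coarse band. The field names in `SCORE_FIELDS`, the `confidence_band` key, and the band thresholds are all assumptions chosen for illustration; they are not taken from the ISM or from any particular API product.

```python
from typing import Any

# Hypothetical field names; adjust to whatever the model service actually emits.
SCORE_FIELDS = {"confidence", "score", "probability", "logits"}


def to_band(score: float) -> str:
    """Map an exact score in [0, 1] to a coarse band (thresholds are illustrative)."""
    if score >= 0.8:
        return "high"
    if score >= 0.5:
        return "medium"
    return "low"


def sanitise(payload: Any) -> Any:
    """Return a copy of payload with exact score fields removed at any depth."""
    if isinstance(payload, dict):
        cleaned = {}
        for key, value in payload.items():
            if key.lower() in SCORE_FIELDS:
                # Keep only a coarse band for the primary confidence value.
                if key.lower() == "confidence" and isinstance(value, (int, float)):
                    cleaned["confidence_band"] = to_band(float(value))
                continue  # the exact value is never copied to the output
            cleaned[key] = sanitise(value)
        return cleaned
    if isinstance(payload, list):
        return [sanitise(item) for item in payload]
    return payload


# Made-up response body for demonstration.
raw = {
    "label": "phishing",
    "confidence": 0.9312,
    "alternatives": [{"label": "spam", "score": 0.0611}],
}
print(sanitise(raw))
```

Filtering at a single choke point (for example an API gateway or a response serialiser) is one possible design; it keeps the rule in one place instead of relying on every endpoint to omit the field.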

Implementation tips

  • The IT team should configure APIs so that confidence scores are excluded from responses, for example by adding a filter or response transformation in the API management tool that removes these fields before data reaches users.
  • Software developers should update user interface code to remove any element or section that displays confidence scores.
  • Product managers should review AI model integration documentation to identify where confidence scores might be exposed, and work with developers to make sure they are excluded before release.
  • Data security officers should regularly audit AI outputs for exposed confidence scores by testing a range of outputs and confirming that no scores reach user-accessible data.
  • Training staff should educate users about the nature and limits of AI predictions, for example in sessions explaining that AI output is a helpful aid that should be used alongside human judgement, without relying on confidence metrics the user cannot see.
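For the audit tip above, a small script can help check captured API outputs for numeric score-like fields. The following is a sketch only: the key patterns in `SUSPECT_KEY` and the sample payload are hypothetical, and a real audit would need patterns matched to the organisation's own response schemas.

```python
import re
from typing import Any, Iterator

# Assumed naming patterns for score-like keys; extend for your own schemas.
SUSPECT_KEY = re.compile(r"confidence|probab|logit|score", re.IGNORECASE)


def find_score_leaks(payload: Any, path: str = "$") -> Iterator[str]:
    """Yield the path of every key that looks like it carries a numeric score."""
    if isinstance(payload, dict):
        for key, value in payload.items():
            child = f"{path}.{key}"
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if SUSPECT_KEY.search(str(key)) and is_number:
                yield child
            # Recurse so nested objects and debug blocks are also checked.
            yield from find_score_leaks(value, child)
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            yield from find_score_leaks(item, f"{path}[{index}]")


# Made-up captured response: the band is acceptable, the nested numeric score is not.
sample = {
    "result": {"label": "benign", "confidence_band": "high"},
    "debug": [{"top_score": 0.8731}],
}
leaks = list(find_score_leaks(sample))
print(leaks)
```

Only numeric values are flagged, so a coarse text band such as `"high"` passes while an exact number under a score-like key is reported with its path.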
Audit / evidence tips

  • Ask: The documented API configurations. Request a configuration file or document detailing API settings. Good: Shows that fields related to confidence scores are excluded.
  • Ask: Design blueprints or specifications for relevant user interfaces. Good: Includes screenshots or design mockups without any confidence score display.
  • Ask: Test results of AI outputs. Request a report or summary of tests performed on the system. Good: Includes test results showing scores are not exposed.
  • Ask: The content used in staff training sessions. Good: Includes presentation slides or brochures addressing this point.
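Test evidence for logs and telemetry could be produced with a simple scan such as the sketch below. The `LEAK` pattern and the sample log lines are assumptions made up for illustration; a pattern this loose will produce false positives and misses, so treat its output as a prompt for manual review, not as proof of compliance.

```python
import re

# Assumed pattern: a score-like keyword followed shortly by a decimal between 0 and 1.
LEAK = re.compile(r"(confidence|probability|score)\W{1,3}(0?\.\d+|1\.0+)", re.IGNORECASE)


def scan_log(lines):
    """Return (line_number, line) pairs that appear to contain an exact score."""
    return [
        (number, line.strip())
        for number, line in enumerate(lines, start=1)
        if LEAK.search(line)
    ]


# Made-up log excerpt for demonstration.
log = [
    "INFO request served label=phishing band=high",
    'DEBUG model output {"label": "phishing", "confidence": 0.9312}',
    "INFO latency_ms=41",
]
hits = scan_log(log)
for number, line in hits:
    print(f"line {number}: {line}")
```

The list of flagged line numbers, or an empty result, can be attached to the test report requested above.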
Cross-framework mappings

How ISM-2085 relates to controls across ISO/IEC 27001, ISO/IEC 42001, Essential Eight, and ASD ISM.

ISO 27001

Partially meets (1)

Control: Annex A 8.28
Notes: ISM-2085 requires organisations to prevent exposing exact AI model confidence scores in APIs and user interfaces.

These mappings show relationships between controls across frameworks. They do not imply full equivalence or certification.
