Prevent Exposure of AI Model Confidence Scores
Ensure AI model confidence scores are not shown to users or in API outputs.
Plain language
This control ensures that the confidence scores from AI models are not shared with users or included in any outputs they can access. This is important because displaying these scores might lead users to overtrust the system's predictions or, alternatively, dismiss them entirely, both of which can result in poor decision-making and security risks.
Framework
ASD Information Security Manual (ISM)
Control effect
Preventative
Classifications
NC, OS, P, S, TS
ISM last updated
Nov 2025
Control Stack last updated
19 Mar 2026
E8 maturity levels
N/A
Guideline
Guidelines for software development
Official control statement
The exposure of exact artificial intelligence model confidence scores in API responses or user interfaces is prevented.
Why it matters
Exposing exact model confidence scores can enable gaming and calibration attacks, driving overtrust or rejection of outputs and increasing security risk.
Operational notes
Block confidence score fields in APIs/UIs, return only coarse bands if needed, and test logs/telemetry to ensure scores are never leaked.
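The note above can be sketched as a response sanitiser. This is a minimal illustration, not part of the ISM control text: the field names (`confidence`, `score`, `probability`, `logits`) and the band thresholds are assumptions you would replace with your own API schema and risk appetite.

```python
# Hypothetical sanitiser for a JSON-style API response held as a Python dict.
# Field names and band cut-offs below are illustrative assumptions.

SCORE_FIELDS = {"confidence", "score", "probability", "logits"}

def to_band(score: float) -> str:
    """Map an exact confidence score to a coarse band."""
    if score >= 0.9:
        return "high"
    if score >= 0.6:
        return "medium"
    return "low"

def sanitise(response: dict) -> dict:
    """Strip exact score fields; expose at most a coarse band."""
    clean = {}
    for key, value in response.items():
        if key in SCORE_FIELDS:
            continue  # never pass exact scores through to the caller
        if isinstance(value, dict):
            value = sanitise(value)  # recurse into nested objects
        clean[key] = value
    if "confidence" in response:
        clean["confidence_band"] = to_band(response["confidence"])
    return clean

raw = {"label": "fraud", "confidence": 0.9731, "meta": {"score": 0.97}}
print(sanitise(raw))  # {'label': 'fraud', 'meta': {}, 'confidence_band': 'high'}
```

Applying the filter at a single egress point (for example, API gateway middleware) is generally easier to audit than scattering field removals across individual endpoints.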
Implementation tips
- IT teams should configure APIs to exclude confidence scores from outputs, for example by applying response filters in the API management tool so the scores never reach users.
- Software developers should update user interfaces so confidence scores are never displayed, removing any UI elements or sections that render them.
- Product managers should review AI model integration documentation to identify where confidence scores might be exposed, and work with developers to remove that exposure before release.
- Data security officers should regularly audit AI outputs, including logs and telemetry, to confirm confidence scores are not leaking into user-accessible data.
- Training staff should educate users on the nature and limits of AI predictions, explaining that outputs should be weighed with human judgment rather than relying on unseen confidence metrics.
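The auditing tip above can be partially automated with a simple leak check over serialised outputs. This is a sketch under stated assumptions: it treats API responses and log lines as JSON strings, and the field-name pattern is an illustrative guess at what score fields might be called in your system.

```python
# Hypothetical leak check scanning serialised payloads (responses, log lines)
# for score-like JSON keys. The key names in the pattern are assumptions.
import json
import re

SCORE_PATTERN = re.compile(r'"(confidence|score|probability)"\s*:', re.IGNORECASE)

def leaks_scores(payload: str) -> bool:
    """Return True if a serialised payload appears to contain an exact score field."""
    return bool(SCORE_PATTERN.search(payload))

samples = [
    json.dumps({"label": "spam", "confidence_band": "low"}),  # coarse band only
    json.dumps({"label": "spam", "confidence": 0.42}),        # exact score leaks
]
for s in samples:
    print(leaks_scores(s))  # False, then True
```

A check like this can run in CI against recorded responses and log samples, giving repeatable evidence for the audit tips below rather than relying on ad-hoc manual inspection.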
Audit / evidence tips
- Ask: the documented API configurations, i.e. a configuration file or document detailing API settings. Good: excludes fields related to confidence scores.
- Ask: design blueprints or specifications for the relevant user interfaces. Good: includes screenshots or design mockups without any confidence score display.
- Ask: test results of AI outputs, i.e. a report or summary of tests performed on the system. Good: includes test results showing scores are not exposed.
- Ask: the content used in staff training sessions. Good: includes presentation slides or brochures addressing this point.
Cross-framework mappings
How ISM-2085 relates to controls across ISO/IEC 27001, Essential Eight, and ASD ISM.
ISO 27001
| Control | Relationship | Notes |
|---|---|---|
| Annex A 8.28 | Partially meets | ISM-2085 requires organisations to prevent exposing exact AI model confidence scores in APIs and user interfaces |
These mappings show relationships between controls across frameworks. They do not imply full equivalence or certification.