Preventing Prompt Injection in AI Applications
AI applications must validate user inputs to prevent the generation of harmful or unintended content.
Plain language
AI applications need to validate user inputs to avoid generating content that is harmful or contrary to the application's purpose. If inputs are not checked properly, the application could produce misleading or hazardous information, with serious consequences for users and for the organisation's reputation.
Framework
ASD Information Security Manual (ISM)
Control effect
Preventative
Classifications
NC, OS, P, S, TS
ISM last updated
Aug 2025
Control Stack last updated
19 Mar 2026
E8 maturity levels
N/A
Guideline
Guidelines for software development
Topic
Prompt Injection
Official control statement
Generative artificial intelligence applications evaluate user prompts to detect and mitigate adversarial inputs or suffixes designed to elicit unintended behaviour or assist in the generation of sensitive or harmful content.
Why it matters
Failure to evaluate prompts for injection may allow adversarial inputs to bypass safeguards, generating sensitive/harmful content and causing legal and reputational harm.
Operational notes
Test prompts against known prompt-injection patterns (instruction overrides, jailbreak suffixes, data-exfiltration cues) and tune detectors/filters; log and review bypass attempts regularly.
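The screening step described above can be sketched as a simple pattern-based filter. This is a minimal illustration, not a production detector: the regex patterns, the `screen_prompt` and `handle_prompt` names, and the in-memory audit log are all hypothetical, and real deployments would combine a maintained ruleset with model-based classifiers and proper logging infrastructure.

```python
import re

# Hypothetical injection patterns for illustration only; a real
# deployment would use a maintained, regularly tuned ruleset.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all|any|previous|prior) instructions", re.I),
    re.compile(r"you are now in developer mode", re.I),
    re.compile(r"repeat (the|your) (system prompt|hidden instructions)", re.I),
]

def screen_prompt(prompt: str) -> list[str]:
    """Return the patterns a prompt matches; an empty list means no hit."""
    return [p.pattern for p in INJECTION_PATTERNS if p.search(prompt)]

def handle_prompt(prompt: str, audit_log: list) -> bool:
    """Block and log prompts that match a known injection pattern.

    Returns True when the prompt may be passed to the model,
    False when it was blocked and recorded for later review.
    """
    hits = screen_prompt(prompt)
    if hits:
        # Bypass attempts and near-misses should be reviewed regularly.
        audit_log.append({"prompt": prompt, "matched": hits})
        return False
    return True
```

Keeping detection (`screen_prompt`) separate from policy (`handle_prompt`) makes it easier to tune the ruleset and to replay logged prompts against updated detectors, as the operational notes suggest.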
Implementation tips
- System developers should establish a policy for handling user inputs, for example by creating a checklist of common harmful inputs and working with security experts to design ways to identify and block them.
- IT teams should implement regular input monitoring. They can use software tools that flag unusual patterns in user inputs and provide alerts for potential risks.
- AI application managers should train staff on recognising harmful prompts. Hold workshops where employees learn about potential risks and report suspicious input attempts.
- Product managers should coordinate with developers to test the app's response to various inputs. Run simulations with different kinds of user prompts to see how the system reacts and adjust the system to close any loopholes.
- Organisational leaders should ensure there is a review process in place. Schedule regular reviews with internal or external auditors to assess the effectiveness of input handling policies and procedures.
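The input-monitoring tip above can be sketched as a small alerting helper that watches for unusual patterns, here, a burst of flagged prompts from one user. The `InputMonitor` class, its window/threshold parameters, and the alerting rule are illustrative assumptions, not a prescribed design.

```python
from collections import Counter, deque

class InputMonitor:
    """Hypothetical sketch: raise an alert when a single user submits
    `threshold` or more flagged prompts within the most recent
    `window` recorded events."""

    def __init__(self, window: int = 100, threshold: int = 5):
        self.events = deque(maxlen=window)  # sliding window of (user, flagged)
        self.threshold = threshold

    def record(self, user_id: str, flagged: bool) -> bool:
        """Record one input event; return True if an alert should fire."""
        self.events.append((user_id, flagged))
        flagged_counts = Counter(u for u, f in self.events if f)
        return flagged_counts[user_id] >= self.threshold
```

In practice the alert would feed the review process the tips describe, giving IT teams a concrete trigger for investigating potential probing or jailbreak attempts.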
Audit / evidence tips
- Ask: the input validation policy document. Request the written policy that outlines how user inputs are handled.
  Good: shows a comprehensive policy with defined protocols for blocking or handling risky inputs.
- Ask: input monitoring reports. Review records of how flagged inputs were detected and handled.
  Good: includes regular, detailed reports showing how flagged inputs were managed.
- Ask: training materials or records related to harmful prompt recognition. Review the content and attendance records of the training sessions.
  Good: has up-to-date materials and attendance logs showing staff participation.
- Ask: records of testing the application against varied user prompts. Review how the system's responses were assessed.
  Good: includes evidence of thorough testing and subsequent system adjustments.
- Ask: records of reviews or audits conducted on input handling processes. Review the findings of these audits.
  Good: shows clear outcomes of audits and any actions taken following reviews, with documented changes.
Cross-framework mappings
How ISM-1924 relates to controls across ISO/IEC 27001, Essential Eight, and ASD ISM.
ISO 27001
| Control | Notes | Details |
|---|---|---|
| Supports (2) | | |
| Annex A 8.12 | ISM-1924 focuses on preventing prompt injection so the AI does not generate or disclose sensitive or harmful content due to adversarial p... | |
| Annex A 8.16 | ISM-1924 requires generative AI applications to evaluate user prompts to detect and mitigate prompt injection attempts that could cause u... | |
| Depends on (2) | | |
| Annex A 8.26 | ISM-1924 requires generative AI solutions to evaluate user prompts and mitigate adversarial inputs intended to elicit unintended behaviou... | |
| Annex A 8.28 | ISM-1924 requires organisations to build AI applications that can identify and mitigate adversarial prompt content (e.g | |
These mappings show relationships between controls across frameworks. They do not imply full equivalence or certification.