ISM-2090 ASD Information Security Manual (ISM)

Rate Limiting for AI Model Inference Queries

Limit how often AI queries are run to prevent system overuse and improve efficiency.

Plain language

Rate limiting means setting limits on how often AI systems are allowed to process requests. This is important because without limits, the system could become overloaded, slow down, or even crash, causing disruptions and potentially leading to mistakes in important tasks.

Framework

ASD Information Security Manual (ISM)

Control effect

Preventative

Classifications

NC, OS, P, S, TS

ISM last updated

Nov 2025

Control Stack last updated

19 Mar 2026

E8 maturity levels

N/A

Official control statement

Rate limiting is applied to inference queries for artificial intelligence models.
ASD Information Security Manual (ISM), ISM-2090

Why it matters

Without rate limiting, AI inference APIs can be abused, driving up compute costs and causing service degradation or denial of service for legitimate users.

Operational notes

Monitor inference request rates and tune limits per client/model; log and review HTTP 429 events to detect abuse and adjust thresholds without blocking legitimate use.
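
One way to review HTTP 429 events is to compute, per client and model, the share of requests that were throttled. The sketch below is illustrative only: the `RequestRecord` fields, the `throttle_ratios` and `flag_for_review` helpers, and the 20% review threshold are all assumptions, not part of ISM-2090 or any particular gateway's log format.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class RequestRecord(NamedTuple):
    """One parsed log entry. Field names are hypothetical."""
    client_id: str  # e.g. an API key label
    model: str      # model the request was routed to
    status: int     # HTTP status code returned to the caller


def throttle_ratios(records: Iterable[RequestRecord]) -> dict[tuple[str, str], float]:
    """Return the fraction of requests answered with HTTP 429 per (client, model)."""
    totals: Counter = Counter()
    throttled: Counter = Counter()
    for rec in records:
        key = (rec.client_id, rec.model)
        totals[key] += 1
        if rec.status == 429:
            throttled[key] += 1
    return {key: throttled[key] / totals[key] for key in totals}


def flag_for_review(ratios: dict[tuple[str, str], float],
                    threshold: float = 0.2) -> list[tuple[str, str]]:
    """List (client, model) pairs whose throttled share meets the assumed threshold."""
    return sorted(key for key, ratio in ratios.items() if ratio >= threshold)
```

A high throttled share can mean either abuse or a limit set too low for a legitimate workload, so flagged pairs are a prompt for human review rather than an automatic block.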

Implementation tips

  • IT team should establish query limits: Determine the maximum number of requests that the AI system can handle without slowdowns. Analyse system performance data to set these limits effectively.
  • Procurement should coordinate with service providers: Engage with AI service suppliers to ensure they support rate limiting features. Confirm these features are included in contracts and service agreements.
  • Developers should integrate rate limiting in application code: Include features that monitor and cap requests. For example, allow each client a fixed number of requests per time window, or enforce a minimum interval between requests, and reject anything over the limit with an HTTP 429 response.
  • System owners should monitor usage patterns: Continuously observe how often and for what purposes the AI is accessed. Adjust limits based on these insights to balance efficiency and accessibility.
  • Managers should educate staff: Conduct training sessions on why rate limits are in place. Use relatable examples to explain how these limits maintain system reliability and ensure fairness in access.
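
The developer tip above can be sketched with a token bucket, one common rate-limiting technique (the control itself does not prescribe an algorithm). Everything named here is hypothetical: the `TokenBucket` and `PerClientLimiter` classes, the `check` method, and the rate and burst values are illustrative assumptions. The sketch is single-threaded; a production service would need locking or a shared store.

```python
import time
from typing import Callable


class TokenBucket:
    """Refills at `rate_per_sec` tokens per second, holding at most `burst` tokens."""

    def __init__(self, rate_per_sec: float, burst: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate_per_sec
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.clock = clock
        self.updated = clock()

    def allow(self) -> bool:
        """Spend one token if available; return False when the bucket is empty."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerClientLimiter:
    """Keeps one bucket per client so a single caller cannot starve the others."""

    def __init__(self, rate_per_sec: float, burst: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate_per_sec
        self.burst = burst
        self.clock = clock
        self._buckets: dict[str, TokenBucket] = {}

    def check(self, client_id: str) -> int:
        """Return the HTTP status to send: 200 to proceed, 429 when throttled."""
        bucket = self._buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.burst, self.clock)
            self._buckets[client_id] = bucket
        return 200 if bucket.allow() else 429
```

In use, the inference endpoint would call `check(client_id)` before queuing a query and return early on 429. The injectable `clock` is there so the limiter can be tested without sleeping.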

Audit / evidence tips

  • Ask: System logs showing query volume. Request records that display the number of AI queries over a specific period. Good: Would show consistent adherence to set limits.
  • Ask: Documentation on rate limit settings. Request the file or record detailing the current rate limits in use. Good: Includes both the numbers themselves and the rationale for their setting.
  • Ask: Contracts with service providers. Request copies of agreements with AI service suppliers. Good: Is a legally binding document with relevant terms clearly outlined.
  • Ask: Incident reports related to overload. Request any logs detailing system overload or failures due to excessive queries. Good: Would show prompt response and adjustments where necessary.
  • Ask: Training materials given to staff. Request copies of any presentations or guides on rate limiting. Good: Includes materials with clear explanations and examples that resonate with everyday tasks.
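
For the query-volume evidence, an auditor could compare the busiest minute in the logs against the documented limit. This is a rough sketch under assumptions: the function names `peak_requests_per_minute` and `within_limit` are hypothetical, and it presumes request timestamps have already been parsed into `datetime` objects for requests that were actually served.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable


def peak_requests_per_minute(timestamps: Iterable[datetime]) -> int:
    """Count served requests in each calendar minute and return the busiest count."""
    per_minute = Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)
    return max(per_minute.values(), default=0)


def within_limit(timestamps: Iterable[datetime], documented_limit: int) -> bool:
    """True when no calendar minute exceeds the documented per-minute limit."""
    return peak_requests_per_minute(timestamps) <= documented_limit
```

Fixed calendar minutes can understate a burst that straddles a minute boundary, so a result close to the limit deserves a closer look at the raw logs.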

Cross-framework mappings

How ISM-2090 relates to controls across ISO/IEC 27001, ISO/IEC 42001, Essential Eight, and ASD ISM.

ISO 27001

Supports (1)

Control: Annex A 8.6
Notes: Annex A 8.6 requires monitoring and adjustment of resource use to prevent performance degradation or failures due to capacity shortfalls.

These mappings show relationships between controls across frameworks. They do not imply full equivalence or certification.
