Monitor spend and usage of models and tools, and alert on anomalous consumption indicative of Denial of Wallet or runaway agentic loops.
AI System Alerting and Denial of Wallet Monitoring
CCC.MARefArc.CN18 · DET
Related Capabilities
| ID | Title | Description |
|---|---|---|
| CCC.MARefArc.CP16 | Model-interaction zero-trust guardrails | Enforces authentication and authorization for every inference request and applies input validation against prompt injection, output filtering and redaction, access control, rate limits, and cost management before and after model execution. |
| CCC.MARefArc.CP06 | Agent collaboration and orchestration patterns | Supports supervisor/worker decomposition, skills-based routing, and agent-as-a-tool handoff for decomposing and executing complex tasks across multiple agents. |
| CCC.MARefArc.CP15 | LLM inference gateway routing | Validates inference requests and routes each to the correct model instance, abstracting model hosting behind a consistent interface. |
| CCC.MARefArc.CP14 | Approved-model registry and lifecycle | Catalog of approved models with metadata, version information, configuration parameters, and usage constraints, ensuring agents access only models meeting organizational, regulatory, and security standards. |
Related Threats
| ID | Title | Description |
|---|---|---|
| CCC.MARefArc.TH08 | Denial of Wallet via token-expensive or unthrottled agentic calls | Token-expensive prompts, large-document chunking, or poorly throttled agentic loops drive excessive model and tool invocations, exhausting token budgets, triggering throttling, or inflating cost beyond capacity planning. |
| CCC.MARefArc.TH09 | Technology service provider outage or degradation | Tight coupling to a specific external model provider with limited failover leaves the system exposed to provider outages or performance degradation under load, violating business-continuity expectations. |
| CCC.MARefArc.TH10 | VRAM exhaustion on model-serving infrastructure | Configuration changes, aggressive caching, or memory leaks in model-serving libraries behind the LLM gateway exhaust GPU VRAM, degrading responsiveness or crashing model serving. |
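The "anomalous consumption" that TH08 describes can be flagged by comparing a consumer's current spend against its own recent baseline. The sketch below is illustrative only and not part of the control: the function name `is_anomalous`, the z-score approach, the default threshold of 3.0, and the minimum sample count are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def is_anomalous(
    history: list[float],
    current: float,
    z_threshold: float = 3.0,  # assumed default, tune per workload
    min_samples: int = 5,      # assumed minimum baseline size
) -> bool:
    """Return True when `current` spend is a spike relative to `history`.

    `history` holds spend for earlier intervals (for example, hourly
    cost for one consumer); `current` is the interval being checked.
    """
    # Too little history to form a baseline: do not flag.
    if len(history) < min_samples:
        return False
    mu = mean(history)
    sigma = pstdev(history)
    # A perfectly flat baseline has no spread, so any increase counts.
    if sigma == 0:
        return current > mu
    # One-sided test: only upward deviations indicate Denial of Wallet.
    return (current - mu) / sigma > z_threshold
```

The test is deliberately one-sided, since a drop in spend is not a Denial of Wallet signal. A production detector would likely also account for seasonality and planned load changes, which this sketch ignores.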
Assessment Requirements
| ID | Text | Applicability |
|---|---|---|
| CCC.MARefArc.CN18.AR01 | Model and tool consumption MUST be metered per consumer and monitored against budget and rate thresholds. | tlp-clear, tlp-green, tlp-amber, tlp-red |
| CCC.MARefArc.CN18.AR02 | Anomalous spend or call volume MUST raise alerts and MUST be able to trigger throttling or suspension. | tlp-clear, tlp-green, tlp-amber, tlp-red |
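As a minimal sketch of how AR01 and AR02 might fit together, the example below meters cost and call volume per consumer and maps threshold breaches to an action. Every name here (`UsageMeter`, `Limits`, `Action`), the sliding-window rate check, and the precedence of suspend over throttle over alert are assumptions made for illustration; the control itself does not prescribe an implementation.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    ALERT = "alert"
    THROTTLE = "throttle"
    SUSPEND = "suspend"


@dataclass
class Limits:
    budget_usd: float          # spend ceiling per consumer
    alert_fraction: float      # fraction of budget that raises an alert
    max_calls_per_window: int  # rate threshold per consumer
    window_seconds: float      # length of the sliding rate window


@dataclass
class ConsumerState:
    spent_usd: float = 0.0
    call_times: deque = field(default_factory=deque)


class UsageMeter:
    """Hypothetical per-consumer meter for model and tool calls."""

    def __init__(self, limits: Limits):
        self.limits = limits
        self.state = defaultdict(ConsumerState)

    def record(self, consumer: str, cost_usd: float, now: float) -> Action:
        """Record one call and return the action its totals warrant."""
        s = self.state[consumer]
        s.spent_usd += cost_usd
        s.call_times.append(now)
        # Drop calls that have aged out of the sliding window.
        while s.call_times and now - s.call_times[0] >= self.limits.window_seconds:
            s.call_times.popleft()
        # Most severe condition first: budget exhausted.
        if s.spent_usd >= self.limits.budget_usd:
            return Action.SUSPEND
        # Call volume above the rate threshold.
        if len(s.call_times) > self.limits.max_calls_per_window:
            return Action.THROTTLE
        # Approaching the budget: alert but keep serving.
        if s.spent_usd >= self.limits.alert_fraction * self.limits.budget_usd:
            return Action.ALERT
        return Action.ALLOW
```

For example, with `Limits(budget_usd=10.0, alert_fraction=0.8, max_calls_per_window=3, window_seconds=60.0)`, a fourth call from the same consumer inside one minute returns `Action.THROTTLE`, while other consumers are unaffected because state is keyed per consumer. In practice the caller (for instance, an inference gateway) would pass a monotonic clock reading as `now` and act on the returned value.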
Guideline Mappings
| Framework | ID | Remarks |
|---|---|---|
| finos-air | AIR-DET-009 | |