Reputational harm from offensive or misleading outputs

CCC.MARefArc.TH15

The system generates offensive, misleading, or inappropriate outputs, either on its own or because it is manipulated into doing so, and those outputs are attributed to the organization. When output filtering and human review are insufficient, the result is reputational and regulatory impact.

Related Capabilities

| ID | Title | Description |
| --- | --- | --- |
| CCC.MARefArc.CP16 | Model-interaction zero-trust guardrails | Enforces authentication and authorization for every inference request and applies input validation against prompt injection, output filtering and redaction, access control, rate limits, and cost management before and after model execution. |
| CCC.MARefArc.CP22 | Runtime protection | Monitors agent actions and model outputs during execution to detect unsafe, non-compliant, or anomalous behavior, enforcing constraints, blocking disallowed actions, or triggering escalation. |
| CCC.MARefArc.CP02 | Human-in-the-loop output review | Application-embedded controls that allow users to review, approve, or modify agent outputs before they are executed or shared. |
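
To make the interplay between output filtering (CP16) and human-in-the-loop review (CP02) concrete, here is a minimal sketch, not taken from the reference architecture itself. The names `filter_output`, `release`, `Verdict`, and the pattern lists are hypothetical, and the regex rules stand in for whatever classifier or moderation service a real deployment would use.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Optional


class Verdict(Enum):
    ALLOW = "allow"    # safe to release as-is
    REVIEW = "review"  # needs human approval before release
    BLOCK = "block"    # never released


# Hypothetical policy rules (assumptions for illustration only).
BLOCKED_PATTERNS = [re.compile(r"\bidiot\b", re.IGNORECASE)]
REVIEW_PATTERNS = [re.compile(r"\bguaranteed returns?\b", re.IGNORECASE)]
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass
class FilterResult:
    verdict: Verdict
    text: str
    reasons: list = field(default_factory=list)


def filter_output(text: str) -> FilterResult:
    """Redact e-mail addresses, then classify the output as allow/review/block."""
    redacted = EMAIL_PATTERN.sub("[REDACTED]", text)

    blocked = [f"blocked:{p.pattern}" for p in BLOCKED_PATTERNS if p.search(redacted)]
    if blocked:
        # Blocked outputs carry no text, so they cannot be released by mistake.
        return FilterResult(Verdict.BLOCK, "", blocked)

    flagged = [f"review:{p.pattern}" for p in REVIEW_PATTERNS if p.search(redacted)]
    if flagged:
        return FilterResult(Verdict.REVIEW, redacted, flagged)

    return FilterResult(Verdict.ALLOW, redacted, [])


def release(result: FilterResult,
            approve: Callable[[FilterResult], bool]) -> Optional[str]:
    """Return the text to publish, or None if it is blocked or not approved."""
    if result.verdict is Verdict.BLOCK:
        return None
    if result.verdict is Verdict.REVIEW and not approve(result):
        return None
    return result.text
```

In this sketch the `approve` callback represents the human reviewer. Keeping it as a parameter means the filtering logic can be exercised without a review UI, and a blocked output can never reach the reviewer's approval path.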

Related Controls

| ID | Title | Description |
| --- | --- | --- |
| CCC.MARefArc.CN02 | User, Application, and Model Firewalling | Establish enforced trust boundaries between the user, the application, and the models and tools by routing all traffic through the agent, LLM, and MCP gateways where guardrails inspect and constrain requests and responses. |
| CCC.MARefArc.CN05 | Legal and Contractual Frameworks for AI Systems | Establish contractual controls with model and MCP service providers covering data handling, retention and deletion, intellectual property, liability, and supply-chain integrity. |
| CCC.MARefArc.CN10 | AI Firewall Implementation and Management | Implement and operate an AI firewall within the guardrail components that inspects prompts, content, and responses for injection, sensitive data, and policy violations. |
| CCC.MARefArc.CN19 | Human Feedback Loop for AI Systems | Capture human feedback on agent outputs through the Feedback Engine and Human Supervision capabilities and feed it into evaluation and improvement of agents and models. |
| CCC.MARefArc.CN20 | Citations and Source Traceability for AI-Generated Information | Attach citations and source traceability to AI-generated information so that outputs can be verified against retrieved sources and decisions can be explained. |
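
As an illustration of the citation control (CN20), the following is a rough sketch of a traceability check, under several assumptions that do not come from the reference architecture: citations appear inline as bracketed source IDs such as `[s1]`, each marker precedes the sentence's closing punctuation, and sentences are split naively on whitespace after `.`, `!`, or `?`. The names `Source` and `check_citations` are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    source_id: str  # ID used in inline markers, e.g. "s1"
    uri: str        # where the retrieved content came from


# Assumed marker format: a bracketed alphanumeric ID such as [s1].
CITATION = re.compile(r"\[(\w+)\]")


def check_citations(answer: str, sources: list) -> dict:
    """Flag citations to unknown sources and sentences with no citation."""
    known = {s.source_id for s in sources}
    cited = set(CITATION.findall(answer))
    unknown_ids = sorted(cited - known)

    # Naive sentence split: break on whitespace that follows ., !, or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    uncited = [s for s in sentences if not CITATION.search(s)]

    return {
        "unknown_ids": unknown_ids,
        "uncited_sentences": uncited,
        "ok": not unknown_ids and not uncited,
    }
```

A check like this only confirms that every claim points at a retrieved source. It does not confirm that the source actually supports the claim, which would still need human review or a separate verification step.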

External Mappings

| Framework | ID | Remarks |
| --- | --- | --- |
| air-vec | AIR-OP-020 | |