Backdoor triggers and safety-mechanism disablement

CCC.MARefArc.TH21

Where model weights are accessible, adversarial fine-tuning, engineered trigger phrases, or weight tampering can disable alignment and content-moderation safeguards, causing targeted unsafe behaviour under specific conditions.

Related Capabilities

ID | Title | Description
CCC.MARefArc.CP16 | Model-interaction zero-trust guardrails | Enforces authentication and authorization for every inference request and applies input validation against prompt injection, output filtering and redaction, access control, rate limits, and cost management before and after model execution.
CCC.MARefArc.CP14 | Approved-model registry and lifecycle | Catalog of approved models with metadata, version information, configuration parameters, and usage constraints, ensuring agents access only models meeting organizational, regulatory, and security standards.
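
One way an approved-model registry could help against tampered weights is to record a cryptographic digest for each approved artifact and refuse to load anything that does not match. The sketch below is illustrative only: the catalog entry does not say the registry stores digests, and the names `sha256_of_file`, `is_approved`, and the `registry` mapping (model ID to SHA-256 hex digest) are hypothetical. A matching digest shows only that the file is the one that was approved; it does not detect a backdoor that was already present at approval time.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_approved(model_id, weights_path, registry):
    """Check a weights file against a hypothetical approved-model registry.

    `registry` is assumed to map model IDs to the SHA-256 hex digest
    recorded when the model was approved. Unknown models are rejected.
    """
    expected = registry.get(model_id)
    if expected is None:
        return False
    actual = sha256_of_file(weights_path)
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(actual, expected)
```

A loader would call `is_approved(model_id, path, registry)` before deserializing the weights and fail closed when it returns `False`.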

Related Controls

ID | Title | Description
CCC.MARefArc.CN05 | Legal and Contractual Frameworks for AI Systems | Establish contractual controls with model and MCP service providers covering data handling, retention and deletion, intellectual property, liability, and supply-chain integrity.
CCC.MARefArc.CN08 | Role-Based Access Control for AI Data | Enforce least-privilege, role-based access control over all AI data stores, including source bases, the vector store, and model artifacts.
CCC.MARefArc.CN13 | MCP Server Security Governance | Govern the onboarding, verification, and ongoing monitoring of MCP servers so that only approved, integrity-verified servers are reachable, and supply-chain compromise is detected.

External Mappings

Framework | ID | Remarks
air-vec | AIR-SEC-008-04 |
air-vec | AIR-SEC-008-05 |