Multi-Agent Reference Architecture Threats

Version: DEV

| ID | Title | Description | External Mappings | Capability Mappings | Control Mappings |
| --- | --- | --- | --- | --- | --- |
| CCC.MARefArc.TH01 | Model memorization leaks sensitive data across sessions | The hosted models accessed through the LLM layer may memorize sensitive inputs or training data and later disclose customer PII, proprietary algorithms, or trading strategies, including cross-user leakage into unrelated sessions. | 1 | 1 | 8 |
| CCC.MARefArc.TH02 | Hosted-provider data-handling exposure | Sensitive data submitted through the LLM gateway to third-party hosted models is exposed when the provider lacks transparent encryption, retention limits, or secure-deletion guarantees, leaving the institution without control over data it no longer holds. | 1 | 1 | 8 |
| CCC.MARefArc.TH03 | Embedding inversion and membership inference on the vector store | Vectors stored for semantic retrieval can be inverted to reconstruct original source text, or probed to infer whether specific confidential information is present, exposing PII or proprietary content held in the knowledge layer. | 1 | 1 | 3 |
| CCC.MARefArc.TH04 | Embedding-store poisoning degrades retrieved context | An actor with write access injects malicious or misleading embeddings into the vector store, degrading the accuracy of retrieved grounding context; the dense numerical representation makes the tampering hard to detect. | 1 | 1 | 3 |
| CCC.MARefArc.TH05 | Vector-store access-control, encryption, and audit gaps | Missing role-based access control, encryption at rest, or audit logging on the vector store allows unauthorized retrieval, modification, or undetected exfiltration of embeddings derived from sensitive internal data. | 1 | 1 | 3 |
| CCC.MARefArc.TH06 | Foundation-model training and fine-tuning data poisoning | Adversaries tamper with training, fine-tuning, or third-party data feeds behind the approved models, mislabeling data or embedding backdoor triggers and biases that corrupt downstream decisions without visible symptoms until a major failure. | 1 | 1 | 3 |
| CCC.MARefArc.TH07 | Adaptive-learning and continuous-learning exploitation | The adaptive-learning capability that refines prompts and configurations from execution outcomes can be steered by an adversary who systematically feeds misleading signals, gradually skewing agent behaviour when validation of learning inputs is inadequate. | 1 | 1 | 3 |
| CCC.MARefArc.TH08 | Denial of Wallet via token-expensive or unthrottled agentic calls | Token-expensive prompts, large-document chunking, or poorly throttled agentic loops drive excessive model and tool invocations, exhausting token budgets, triggering throttling, or inflating cost beyond capacity planning. | 1 | 1 | 5 |
| CCC.MARefArc.TH09 | Technology service provider outage or degradation | Tight coupling to a specific external model provider with limited failover leaves the system exposed to provider outages or performance degradation under load, violating business-continuity expectations. | 1 | 1 | 5 |
| CCC.MARefArc.TH10 | VRAM exhaustion on model-serving infrastructure | Configuration changes, aggressive caching, or memory leaks in model-serving libraries behind the LLM gateway exhaust GPU VRAM, degrading responsiveness or crashing model serving. | 1 | 1 | 5 |
| CCC.MARefArc.TH11 | Direct prompt injection overrides guardrails | An actor interacting through the application crafts inputs that override system prompts, bypass safety guardrails, or coerce disclosure, requiring no special privileges and exploiting any gap in ingress and model-interaction guardrails. | 1 | 1 | 3 |
| CCC.MARefArc.TH12 | Indirect prompt injection via retrieved or processed content | Malicious instructions hidden in retrieved documents, web-search results, tool outputs, or persisted memory are processed by an agent and hijack its decision-making, escalate privileges, trigger unauthorized actions, or exfiltrate data, which is especially dangerous in automated multi-agent workflows. | 1 | 1 | 3 |
| CCC.MARefArc.TH13 | Model profiling and system-prompt extraction | Crafted prompt sequences probe model internals to extract proprietary system prompts, configurations, or fine-tuning and RAG corpus content, enabling intellectual-property theft, model cloning, or follow-on attacks. | 1 | 1 | 3 |
| CCC.MARefArc.TH14 | Model overreach and scope creep beyond validated use | Agents are used beyond their validated scope as users discover new applications or systems are repurposed without re-evaluation, producing unreliable outputs in untested contexts; weak registry scoping and orchestration boundaries accelerate the drift. | 1 | 1 | 4 |
| CCC.MARefArc.TH15 | Reputational harm from offensive or misleading outputs | The system generates offensive, misleading, or inappropriate outputs, or is manipulated into doing so, that are attributed to the organization, with reputational and regulatory impact when output filtering and human review are insufficient. | 1 | 1 | 5 |
| CCC.MARefArc.TH16 | Confident hallucination and fabricated facts | Lacking ground truth and faced with ambiguous prompts or helpfulness-biased tuning, the model fabricates plausible but false facts, figures, or citations, presented with high fluency that makes errors hard to catch and likely to be acted upon. | 1 | 1 | 4 |
| CCC.MARefArc.TH17 | Non-deterministic and non-reproducible outputs | Probabilistic sampling, internal-state variation, context sensitivity, and decoding parameters cause identical inputs to yield different outputs across runs, undermining testing, reproducibility, and reliable evaluation. | 1 | 1 | 5 |
| CCC.MARefArc.TH18 | RAG grounding failures | Even with retrieval, responses may contradict retrieved documents, drop caveats truncated by the context window, fill gaps with incorrect general knowledge, exceed authorized advisory scope, or adopt an inappropriate tone or certainty for the domain. | 1 | 1 | 4 |
| CCC.MARefArc.TH19 | Silent model version, prompt, and deployment drift | Providers silently retrain, re-prompt, or re-architect models, or change deployment and API defaults, shifting behaviour even when inputs are unchanged; without version pinning in the model registry this breaks reproducibility and validated behaviour. | 1 | 1 | 5 |
| CCC.MARefArc.TH20 | Model supply-chain tampering | Adversaries tamper with training data, weights, GPU firmware and operating systems, cloud orchestration, or ML libraries in the provider pipeline, embedding manipulations that are difficult to detect downstream of the LLM gateway. | 1 | 1 | 3 |
| CCC.MARefArc.TH21 | Backdoor triggers and safety-mechanism disablement | Where weights are accessible, adversarial fine-tuning, engineered trigger phrases, or tampering disables alignment and content-moderation safeguards, causing targeted unsafe behaviour under specific conditions. | 1 | 1 | 3 |
| CCC.MARefArc.TH22 | Poor-quality, drifting, and bias-amplifying data | Inaccurate, incomplete, outdated, or biased grounding and training data lead to unreliable outputs, while data and concept drift erodes predictive power over time and amplifies historical errors at scale. | 1 | 1 | 3 |
| CCC.MARefArc.TH23 | Discriminatory outputs from bias | Biased training data, architectural and feature choices, proxy variables such as postal codes, and uncorrected feedback loops cause systematically discriminatory outcomes against protected groups, with legal and reputational exposure. | 1 | 1 | 4 |
| CCC.MARefArc.TH24 | Lack of explainability and traceable rationale | Black-box foundation models produce outputs without traceable rationale, leaving the firm unable to justify AI-driven decisions to regulators, stakeholders, or customers and allowing latent errors or biases to go undetected; observability and human oversight are the principal mitigating surfaces. | 1 | 1 | 1 |
| CCC.MARefArc.TH25 | Non-compliant outputs and model-risk-management gaps | AI-generated advice, marketing, or communications that fail KYC, suitability, disclosure, record-keeping, or model-risk-management expectations create regulatory exposure; weak supervision and accountability lines turn this into direct non-compliance. | 1 | 1 | 7 |
| CCC.MARefArc.TH26 | Intellectual-property leakage and licensing violations | Outputs may replicate copyrighted training material, employees may leak trade secrets into AI tools, and improper platform licensing or terms-of-service violations create contractual and legal liability. | 1 | 1 | 2 |
| CCC.MARefArc.TH27 | Authorization bypass and tool-chain privilege escalation | Agents discover and invoke API endpoints outside their use case, chain individually authorized calls into unauthorized outcomes, circumvent segregation-of-duties workflows, or experience permission creep during operation, defeating intended authorization boundaries. | 1 | 1 | 5 |
| CCC.MARefArc.TH28 | Tool selection, parameter, and sequencing manipulation | Crafted inputs cause agents to select inappropriate tools, inject malicious parameters into legitimate calls, reorder tool execution into dangerous combinations, corrupt tool-state understanding, or pass one tool's output as malicious input to the next. | 1 | 1 | 2 |
| CCC.MARefArc.TH29 | MCP supply-chain compromise | External MCP servers are compromised, receive poisoned updates, are sabotaged by insiders, or have their protocol and transport manipulated through man-in-the-middle or downgrade attacks, or have connections redirected via DNS and infrastructure attacks, injecting malicious data or logic into services agents consume. | 1 | 1 | 2 |
| CCC.MARefArc.TH30 | Agent memory and state poisoning | Injected instructions or corrupted reasoning patterns are written into agent short- or long-term memory, learned behaviours are corrupted over repeated exposure, state storage is attacked directly, and malicious instructions persist across sessions and users. | 1 | 1 | 1 |
| CCC.MARefArc.TH31 | Multi-agent collaboration compromise | Malicious or compromised agents inject harmful data into agent-to-agent channels, contaminate shared resources, impersonate higher-privilege agents, inherit privileges through interaction, or propagate cascade failures across dependent agents. | 1 | 1 | 1 |
| CCC.MARefArc.TH32 | Credential harvesting via agent tools and storage | Agents are manipulated into using file, database, API, and cloud-management tools to enumerate and extract credentials from configuration files, environment variables, process memory, databases, key vaults, and instance metadata, and to correlate fragments into full credentials. | 1 | 1 | 1 |
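
As an illustration of the failure mode described in CCC.MARefArc.TH08 (poorly throttled agentic loops exhausting token budgets), the following is a minimal sketch of a per-run budget guard. It is not part of the catalog or any mapped control: the names `RunBudget`, `BudgetExceeded`, and `run_agent_loop`, and the limits used, are hypothetical, and the per-step token costs are assumed to be estimated by the caller before each model or tool call.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a call would push the run past its token or call limit."""


@dataclass
class RunBudget:
    # Hypothetical limits; real values would come from capacity planning.
    max_tokens: int
    max_calls: int
    tokens_used: int = 0
    calls_made: int = 0

    def charge(self, tokens: int) -> None:
        """Record one model or tool call, refusing it if a limit would be exceeded."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.calls_made + 1 > self.max_calls:
            raise BudgetExceeded("call limit reached")
        if self.tokens_used + tokens > self.max_tokens:
            raise BudgetExceeded("token limit reached")
        self.calls_made += 1
        self.tokens_used += tokens


def run_agent_loop(step_costs, budget):
    """Charge each step's estimated token cost and stop at the first refusal.

    Returns the number of steps that completed within budget.
    """
    completed = 0
    for cost in step_costs:
        try:
            budget.charge(cost)
        except BudgetExceeded:
            break  # stop the loop instead of letting spend grow unbounded
        completed += 1
    return completed
```

In this sketch the budget is checked before each call is made, so a runaway loop is cut off at the limit rather than detected afterwards from a bill; a real deployment would likely enforce the same check at the LLM gateway rather than inside the agent.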