Multi-Agent Reference Architecture Threats

Version: DEV

ID	Title	Description	External Mappings	Capability Mappings	Control Mappings
CCC.MARefArc.TH01	Model memorization leaks sensitive data across sessions	The hosted models accessed through the LLM layer may memorize sensitive inputs or training data and later disclose customer PII, proprietary algorithms, or trading strategies, including cross-user leakage into unrelated sessions.	1	1	8
CCC.MARefArc.TH02	Hosted-provider data-handling exposure	Sensitive data submitted through the LLM gateway to third-party hosted models is exposed when the provider lacks transparent encryption, retention limits, or secure-deletion guarantees, leaving the institution without control over data it no longer holds.	1	1	8
CCC.MARefArc.TH03	Embedding inversion and membership inference on the vector store	Vectors stored for semantic retrieval can be inverted to reconstruct original source text, or probed to infer whether specific confidential information is present, exposing PII or proprietary content held in the knowledge layer.	1	1	3
CCC.MARefArc.TH04	Embedding-store poisoning degrades retrieved context	An actor with write access injects malicious or misleading embeddings into the vector store, degrading the accuracy of retrieved grounding context; the dense numerical representation makes the tampering hard to detect.	1	1	3
CCC.MARefArc.TH05	Vector-store access-control, encryption, and audit gaps	Missing role-based access control, encryption at rest, or audit logging on the vector store allows unauthorized retrieval, modification, or undetected exfiltration of embeddings derived from sensitive internal data.	1	1	3
CCC.MARefArc.TH06	Foundation-model training and fine-tuning data poisoning	Adversaries tamper with training, fine-tuning, or third-party data feeds behind the approved models, mislabeling data or embedding backdoor triggers and biases that corrupt downstream decisions without visible symptoms until a major failure.	1	1	3
CCC.MARefArc.TH07	Adaptive-learning and continuous-learning exploitation	The adaptive-learning capability that refines prompts and configurations from execution outcomes can be steered by an adversary who systematically feeds misleading signals, gradually skewing agent behaviour when validation of learning inputs is inadequate.	1	1	3
CCC.MARefArc.TH08	Denial of Wallet via token-expensive or unthrottled agentic calls	Token-expensive prompts, large-document chunking, or poorly throttled agentic loops drive excessive model and tool invocations, exhausting token budgets, triggering throttling, or inflating cost beyond capacity planning.	1	1	5
CCC.MARefArc.TH09	Technology service provider outage or degradation	Tight coupling to a specific external model provider with limited failover leaves the system exposed to provider outages or performance degradation under load, violating business-continuity expectations.	1	1	5
CCC.MARefArc.TH10	VRAM exhaustion on model-serving infrastructure	Configuration changes, aggressive caching, or memory leaks in model-serving libraries behind the LLM gateway exhaust GPU VRAM, degrading responsiveness or crashing model serving.	1	1	5
CCC.MARefArc.TH11	Direct prompt injection overrides guardrails	An actor interacting through the application crafts inputs that override system prompts, bypass safety guardrails, or coerce disclosure, requiring no special privileges and exploiting any gap in ingress and model-interaction guardrails.	1	1	3
CCC.MARefArc.TH12	Indirect prompt injection via retrieved or processed content	Malicious instructions hidden in retrieved documents, web-search results, tool outputs, or persisted memory are processed by an agent and hijack its decision-making, escalate privileges, trigger unauthorized actions, or exfiltrate data, which is especially dangerous in automated multi-agent workflows.	1	1	3
CCC.MARefArc.TH13	Model profiling and system-prompt extraction	Crafted prompt sequences probe model internals to extract proprietary system prompts, configurations, or fine-tuning and RAG corpus content, enabling intellectual-property theft, model cloning, or follow-on attacks.	1	1	3
CCC.MARefArc.TH14	Model overreach and scope creep beyond validated use	Agents are used beyond their validated scope as users discover new applications or systems are repurposed without re-evaluation, producing unreliable outputs in untested contexts; weak registry scoping and orchestration boundaries accelerate the drift.	1	1	4
CCC.MARefArc.TH15	Reputational harm from offensive or misleading outputs	The system generates, or is manipulated into generating, offensive, misleading, or inappropriate outputs that are attributed to the organization, with reputational and regulatory impact when output filtering and human review are insufficient.	1	1	5
CCC.MARefArc.TH16	Confident hallucination and fabricated facts	Lacking ground truth, and given ambiguous prompts or tuning biased toward helpfulness, the model fabricates plausible but false facts, figures, or citations; their fluent presentation makes the errors hard to catch and likely to be acted upon.	1	1	4
CCC.MARefArc.TH17	Non-deterministic and non-reproducible outputs	Probabilistic sampling, internal-state variation, context sensitivity, and decoding parameters cause identical inputs to yield different outputs across runs, undermining testing, reproducibility, and reliable evaluation.	1	1	5
CCC.MARefArc.TH18	RAG grounding failures	Even with retrieval, responses may contradict retrieved documents, drop caveats truncated by the context window, fill gaps with incorrect general knowledge, exceed authorized advisory scope, or adopt an inappropriate tone or certainty for the domain.	1	1	4
CCC.MARefArc.TH19	Silent model version, prompt, and deployment drift	Providers silently retrain, re-prompt, or re-architect models, or change deployment and API defaults, shifting behaviour even when inputs are unchanged; without version pinning in the model registry this breaks reproducibility and validated behaviour.	1	1	5
CCC.MARefArc.TH20	Model supply-chain tampering	Adversaries tamper with training data, weights, GPU firmware and operating systems, cloud orchestration, or ML libraries in the provider pipeline, embedding manipulations that are difficult to detect downstream of the LLM gateway.	1	1	3
CCC.MARefArc.TH21	Backdoor triggers and safety-mechanism disablement	Where weights are accessible, adversarial fine-tuning, engineered trigger phrases, or tampering disables alignment and content-moderation safeguards, causing targeted unsafe behaviour under specific conditions.	1	1	3
CCC.MARefArc.TH22	Poor-quality, drifting, and bias-amplifying data	Inaccurate, incomplete, outdated, or biased grounding and training data lead to unreliable outputs, while data drift and concept drift erode predictive power over time and amplify historical errors at scale.	1	1	3
CCC.MARefArc.TH23	Discriminatory outputs from bias	Biased training data, architectural and feature choices, proxy variables such as postal codes, and uncorrected feedback loops cause systematically discriminatory outcomes against protected groups, with legal and reputational exposure.	1	1	4
CCC.MARefArc.TH24	Lack of explainability and traceable rationale	Black-box foundation models produce outputs without traceable rationale, leaving the firm unable to justify AI-driven decisions to regulators, stakeholders, or customers and allowing latent errors or biases to go undetected; observability and human oversight are the principal mitigating surfaces.	1	1	1
CCC.MARefArc.TH25	Non-compliant outputs and model-risk-management gaps	AI-generated advice, marketing, or communications that fail KYC, suitability, disclosure, record-keeping, or model-risk-management expectations create regulatory exposure; weak supervision and accountability lines turn this into direct non-compliance.	1	1	7
CCC.MARefArc.TH26	Intellectual-property leakage and licensing violations	Outputs may replicate copyrighted training material, employees may leak trade secrets into AI tools, and improper platform licensing or terms-of-service violations create contractual and legal liability.	1	1	2
CCC.MARefArc.TH27	Authorization bypass and tool-chain privilege escalation	Agents discover and invoke API endpoints outside their use case, chain individually authorized calls into unauthorized outcomes, circumvent segregation-of-duties workflows, or experience permission creep during operation, defeating intended authorization boundaries.	1	1	5
CCC.MARefArc.TH28	Tool selection, parameter, and sequencing manipulation	Crafted inputs cause agents to select inappropriate tools, inject malicious parameters into legitimate calls, reorder tool execution into dangerous combinations, corrupt tool-state understanding, or pass one tool's output as malicious input to the next.	1	1	2
CCC.MARefArc.TH29	MCP supply-chain compromise	External MCP servers are compromised, receive poisoned updates, are sabotaged by insiders, have their protocol and transport manipulated through man-in-the-middle or downgrade attacks, or have connections redirected via DNS and infrastructure attacks, injecting malicious data or logic into the services that agents consume.	1	1	2
CCC.MARefArc.TH30	Agent memory and state poisoning	Injected instructions or corrupted reasoning patterns are written into agent short- or long-term memory, learned behaviours are corrupted over repeated exposure, state storage is attacked directly, and malicious instructions persist across sessions and users.	1	1	1
CCC.MARefArc.TH31	Multi-agent collaboration compromise	Malicious or compromised agents inject harmful data into agent-to-agent channels, contaminate shared resources, impersonate higher-privilege agents, inherit privileges through interaction, or propagate cascade failures across dependent agents.	1	1	1
CCC.MARefArc.TH32	Credential harvesting via agent tools and storage	Agents are manipulated into using file, database, API, and cloud-management tools to enumerate and extract credentials from configuration files, environment variables, process memory, databases, key vaults, and instance metadata, and to correlate fragments into full credentials.	1	1	1
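
To make one row of the table concrete: TH08 (Denial of Wallet) describes poorly throttled agentic loops that exhaust token budgets. The sketch below shows one way a per-session guard could cap both total tokens and the number of calls. It is illustrative only and is not one of the controls mapped in this catalog. The names `TokenBudget`, `BudgetExceeded`, and `run_agent_loop` are hypothetical, the limits are placeholder values, and the per-step token costs are assumed to be supplied by the caller rather than measured from a real model API.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a session would exceed its token or call allowance."""


@dataclass
class TokenBudget:
    # Hypothetical limits; real values would come from capacity planning.
    max_tokens: int
    max_calls: int
    tokens_used: int = 0
    calls_made: int = 0

    def charge(self, tokens: int) -> None:
        """Record one model or tool call, refusing it if a limit would be crossed."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.calls_made + 1 > self.max_calls:
            raise BudgetExceeded("call limit reached")
        if self.tokens_used + tokens > self.max_tokens:
            raise BudgetExceeded("token limit reached")
        self.calls_made += 1
        self.tokens_used += tokens


def run_agent_loop(budget: TokenBudget, step_costs: list[int]) -> int:
    """Simulate an agentic loop; return the number of steps completed."""
    completed = 0
    for cost in step_costs:
        try:
            budget.charge(cost)
        except BudgetExceeded:
            break  # Stop the loop instead of letting cost run away.
        completed += 1
    return completed
```

With `TokenBudget(max_tokens=1000, max_calls=3)` and three steps of 400 tokens each, the loop stops after the second step because the third would cross the token limit. Checking the limit before the call is charged, rather than after, is what keeps a single expensive step from overshooting the budget.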