Vector-based semantic retrieval

CCC.MARefArc.CP13

Vector databases provide semantic search and grounding so that agents can find relevant information in large text corpora.
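As a rough illustration of what semantic retrieval means in practice, the sketch below ranks stored documents by cosine similarity to a query vector. It is a minimal sketch under stated assumptions, not how any particular vector database works: the three-dimensional embeddings are hand-written toy values (a real system would obtain them from an embedding model), the names `STORE`, `cosine_similarity`, and `top_k` are hypothetical, and production stores typically use approximate nearest-neighbor indexes rather than the exhaustive scan shown here.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the ids of the k stored vectors most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in store.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy embeddings; real ones come from an embedding model
# and usually have hundreds or thousands of dimensions.
STORE = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "privacy-notice": [0.0, 0.2, 0.9],
}

# Assumed embedding of a user question about refunds.
query = [0.8, 0.3, 0.0]
results = top_k(query, STORE, k=2)
print(results)
```

The returned document ids would then be used to fetch the source text that grounds the agent's response, which is why the integrity and confidentiality of the stored vectors matter for the threats listed below.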

Related Threats

| ID | Title | Description |
| --- | --- | --- |
| CCC.MARefArc.TH03 | Embedding inversion and membership inference on the vector store | Vectors stored for semantic retrieval can be inverted to reconstruct original source text, or probed to infer whether specific confidential information is present, exposing PII or proprietary content held in the knowledge layer. |
| CCC.MARefArc.TH04 | Embedding-store poisoning degrades retrieved context | An actor with write access injects malicious or misleading embeddings into the vector store, degrading the accuracy of retrieved grounding context; the dense numerical representation makes the tampering hard to detect. |
| CCC.MARefArc.TH05 | Vector-store access-control, encryption, and audit gaps | Missing role-based access control, encryption at rest, or audit logging on the vector store allows unauthorized retrieval, modification, or undetected exfiltration of embeddings derived from sensitive internal data. |
| CCC.MARefArc.TH12 | Indirect prompt injection via retrieved or processed content | Malicious instructions hidden in retrieved documents, web-search results, tool outputs, or persisted memory are processed by an agent and hijack its decision-making, escalate privileges, trigger unauthorized actions, or exfiltrate data, which is especially dangerous in automated multi-agent workflows. |
| CCC.MARefArc.TH18 | RAG grounding failures | Even with retrieval, responses may contradict retrieved documents, drop caveats truncated by the context window, fill gaps with incorrect general knowledge, exceed authorized advisory scope, or adopt an inappropriate tone or certainty for the domain. |
| CCC.MARefArc.TH22 | Poor-quality, drifting, and bias-amplifying data | Inaccurate, incomplete, outdated, or biased grounding and training data lead to unreliable outputs, while data and concept drift erodes predictive power over time and amplifies historical errors at scale. |