Embedding Model Selection

CCC.GenAI.CP03

Ability to select a foundation model that converts text into vector embeddings for tasks such as semantic search, clustering, and document similarity.
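As an illustration of the capability described above, the following is a minimal sketch of selecting an embedding model by name and using it for similarity ranking. It is not a real service API: the model names (`toy-embed-small`, `toy-embed-large`), the `EMBEDDING_MODELS` catalogue, and all function names are hypothetical, and the "models" are toy hashed bag-of-words embedders standing in for real foundation models, which capture meaning far better than token overlap.

```python
import hashlib
import math
from typing import Callable, Dict, List, Tuple

Vector = List[float]
Embedder = Callable[[str], Vector]


def make_hashing_embedder(dim: int) -> Embedder:
    """Build a toy embedder: hashed bag-of-words, L2-normalised.

    Assumption: this stands in for a real embedding model purely so the
    example runs without external dependencies.
    """
    def embed(text: str) -> Vector:
        vec = [0.0] * dim
        for token in text.lower().split():
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            index = int.from_bytes(digest[:4], "big") % dim
            vec[index] += 1.0
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec] if norm else vec

    return embed


# Hypothetical catalogue of selectable embedding models.
EMBEDDING_MODELS: Dict[str, Embedder] = {
    "toy-embed-small": make_hashing_embedder(64),
    "toy-embed-large": make_hashing_embedder(512),
}


def select_embedding_model(name: str) -> Embedder:
    """Return the embedder registered under `name`, or raise ValueError."""
    try:
        return EMBEDDING_MODELS[name]
    except KeyError:
        raise ValueError(f"Unknown embedding model: {name!r}") from None


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("Vectors must come from the same embedding model")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def semantic_search(
    query: str, documents: List[str], model_name: str
) -> List[Tuple[float, str]]:
    """Rank documents by similarity to the query, most similar first."""
    embed = select_embedding_model(model_name)
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(doc)), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Queries and documents must be embedded with the same model, because vectors produced by different models are not comparable, even when their dimensions happen to match. Changing the selected model therefore generally means re-embedding the stored documents.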

Related Threats

ID	Title	Description
CCC.GenAI.TH02	Data Poisoning	Data poisoning occurs when training, fine-tuning or embedding data is tampered with in order to modify the model's behaviour, for example steering it towards specific outputs, degrading performance or introducing backdoors.
CCC.GenAI.TH04	Insecure / Unreliable Model Output	A GenAI model may generate content that is incorrect, misleading or harmful, such as convincing misinformation (hallucinations) or vulnerable or malicious code, due to its reliance on statistical patterns rather than factual understanding. Directly using this flawed output without validation can lead to system compromises, poor decision-making, and legal or reputational damage.
CCC.GenAI.TH08	Model Tampering	Supply chain risks, including tampering with a model's core components at any stage of its lifecycle—from its source code and training data to the final deployable artifact—may result in embedding backdoors or adversarial triggers altering model behaviour under certain conditions.