Ensure all incoming embeddings are structurally and statistically validated before indexing to prevent poisoning or corruption.
Validate Embeddings Before Indexing
CCC.Vector.CN01 · Ingestion
Related Capabilities
| ID | Title | Description |
|---|---|---|
| CCC.Vector.CP02 | Vector Indexing | Provides creation and management of indexes optimized for similarity search, such as HNSW, IVF, or PQ. |
| CCC.Vector.CP05 | Batch Ingestion | Allows for high-throughput batch upload and deletion of vectors and associated metadata. |
| CCC.Vector.CP07 | Index Lifecycle Management | Enables automated or manual creation, optimization, and removal of vector indexes. |
| CCC.Vector.CP08 | Embedding Format Compatibility | Supports standard vector formats and integrates with common embedding generators (e.g., OpenAI, HuggingFace, TensorFlow). |
| CCC.Vector.CP09 | Vector Dimension Management | Supports storing and managing vectors of specific or dynamic dimensionality, depending on model needs. |
| CCC.Core.CP04 | Transaction Rate Limits | The service can throttle, delay, or reject excess requests when transactions exceed a user-specified rate limit, and always provides industry-standard throughput up to that limit. |
| CCC.Core.CP16 | Budgeting | The service may be configured to take a user-specified action when a spending threshold is met or exceeded on a child or networked resource. |
| CCC.Core.CP19 | Child Resource Scaling | The service may be configured to scale child resources automatically or on-demand. |
Related Threats
| ID | Title | Description |
|---|---|---|
| CCC.Vector.TH02 | Embedding and Index Poisoning | Adversaries may insert malicious or adversarial vectors into the index through ingestion endpoints, polluting the dataset and degrading search quality, or subtly steering results toward specific outcomes. |
| CCC.Vector.TH05 | Embedding Format or Dimension Attacks | Poor validation of embedding formats or dimensions can cause service crashes or logic errors. This can result in denial of service or incorrect similarity results. |
| CCC.Core.TH12 | Resource Constraints are Exhausted | Exceeding the resource constraints through excessive consumption, resource-intensive operations, or lowering of rate-limit thresholds can impact the availability of elements such as memory, CPU, or storage. This may disrupt availability of the service or child resources by denying the associated functionality to users. If the impacted system is not designed to expect such a failure, the effect could also cascade to other services and resources. |
Assessment Requirements
| ID | Text | Applicability |
|---|---|---|
| CCC.Vector.CN01.AR01 | When a vector embedding is submitted for indexing, the system MUST validate that it matches the expected schema, dimension, and format profiles. | tlp-clear, tlp-green, tlp-amber, tlp-red |
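
As an illustration of CCC.Vector.CN01.AR01, the following is a minimal sketch of a pre-indexing check, not a reference implementation. The names `EmbeddingProfile` and `validate_embedding` and the norm bounds are hypothetical and are not defined by this control. The sketch assumes embeddings arrive as plain Python lists or tuples of numbers. The norm range stands in for a simple statistical check and would need tuning per embedding model.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingProfile:
    """Hypothetical description of what the target index expects."""
    dimension: int
    min_norm: float = 1e-6   # assumed lower bound; rejects all-zero vectors
    max_norm: float = 100.0  # assumed upper bound; tune per embedding model


def validate_embedding(vector: object, profile: EmbeddingProfile) -> list[str]:
    """Return validation errors; an empty list means the vector passed."""
    # Format: accept only a list or tuple of real numbers (bool is excluded).
    if not isinstance(vector, (list, tuple)):
        return [f"expected list or tuple, got {type(vector).__name__}"]
    if any(isinstance(x, bool) or not isinstance(x, (int, float)) for x in vector):
        return ["all components must be int or float"]

    try:
        values = [float(x) for x in vector]
    except OverflowError:
        return ["component too large to represent as a float"]

    errors: list[str] = []

    # Dimension: must match the dimensionality configured for the index.
    if len(values) != profile.dimension:
        errors.append(
            f"expected dimension {profile.dimension}, got {len(values)}"
        )

    # Numeric sanity: NaN or infinity would corrupt distance computations.
    if not all(math.isfinite(v) for v in values):
        errors.append("vector contains NaN or infinite components")
        return errors

    # Simple statistical check: the L2 norm must fall in the expected range.
    norm = math.hypot(*values)
    if not profile.min_norm <= norm <= profile.max_norm:
        errors.append(
            f"L2 norm {norm:.6g} outside [{profile.min_norm}, {profile.max_norm}]"
        )
    return errors
```

An ingestion endpoint could call `validate_embedding` on each submitted vector and reject the request, or quarantine the record, when the returned list is non-empty. A norm range alone will not detect carefully crafted adversarial vectors, so it complements rather than replaces other poisoning defenses.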
Guideline Mappings
| Framework | ID | Remarks |
|---|---|---|
| FINOS-AIGF | AIR-PREV-002 | Data Filtering From External Knowledge Bases |