Inspect and validate input before it is passed to a GenAI model, so that adversarial queries are filtered or sanitised and sensitive data leakage is prevented.
Model Input Filtering and Sanitisation
CCC.GenAI.CN01 · MachineLearning
Related Capabilities
| ID | Title | Description |
|---|---|---|
| CCC.Core.CP14 | API Access | The service exposes a port enabling external actors to interact programmatically with the service and its resources using HTTP protocol methods such as GET, POST, PUT, and DELETE. |
| CCC.GenAI.CP15 | Text-Based Prompts | Ability to input prompts in plain text. |
| CCC.GenAI.CP16 | Structured Prompts | Ability to provide structured input such as JSON as prompts. |
| CCC.GenAI.CP17 | Contextual Prompts | Ability to provide context or background information within the prompt to guide the response. |
| CCC.GenAI.CP18 | Interactive Prompts | Ability to use conversational prompts to create interactive dialogues. |
| CCC.GenAI.CP19 | Image-Based Prompts | Ability to input an image as a prompt to generate a response. |
| CCC.GenAI.CP20 | Custom Template Prompts | Ability to define custom templates or structures for prompts to standardize interactions with the models. |
| CCC.GenAI.CP21 | Generate Content | Ability to generate a response given a foundation model, parameter values, and a prompt. |
| CCC.GenAI.CP24 | Content Moderation | The service detects and filters abusive, harmful, and sensitive information to support responsible and safe use of the service. |
| CCC.Core.CP02 | Encryption at Rest Enabled by Default | The service automatically encrypts all data using industry-standard cryptographic protocols prior to being written to a storage medium. |
| CCC.Core.CP06 | Access Control | The service automatically enforces user configurations to restrict or allow access to a specific component or a child resource based on factors such as user identities, roles, groups, or attributes. |
| CCC.GenAI.CP22 | Data Control | Ensures prompts, model outputs, embeddings, and training data fed by customers are not used to train foundation models. |
Related Threats
| ID | Title | Description |
|---|---|---|
| CCC.GenAI.TH01 | Prompt Injection | Prompt injection may occur when crafted input is used to manipulate the GenAI model's behaviour, resulting in the generation of harmful or unintended outputs. Prompt injection can be either direct (performed via direct interaction with the model) or indirect (performed via external sources ingested by the model). Both text-based and multi-modal prompt injections are possible. |
| CCC.GenAI.TH03 | Sensitive Information Disclosure | Sensitive data can be memorised by the model from user interaction or training and may then be leaked to unintended and unauthorised parties by querying the model, for example through crafted prompts. |
Assessment Requirements
| ID | Text | Applicability |
|---|---|---|
| CCC.GenAI.CN01.AR01 | Untrusted input such as user queries, RAG data or tool output MUST be validated before it is passed to a GenAI model. | tlp-clear, tlp-green, tlp-amber, tlp-red |
| CCC.GenAI.CN01.AR02 | If malicious patterns such as prompt injection or sensitive data are detected during input validation, the input MUST be blocked or sanitised. | tlp-clear, tlp-green, tlp-amber, tlp-red |
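
The two requirements above can be illustrated with a minimal Python sketch. It is not a reference implementation: the regular expressions, the `ValidationResult` type, and the `validate_input` function are hypothetical names chosen for this example, and a real deployment would more likely rely on a maintained detector or AI firewall than on a short pattern list. The sketch blocks input that matches a prompt-injection pattern and sanitises input that contains sensitive data by redacting it.

```python
import re
from dataclasses import dataclass

# Hypothetical, deliberately small pattern lists (assumptions for illustration only).
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"reveal (your|the) system prompt", re.IGNORECASE),
]

SENSITIVE_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 16-digit card numbers written in groups of four, optionally separated.
    "CARD": re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b"),
}


@dataclass
class ValidationResult:
    allowed: bool        # False means the input must not reach the model
    text: str            # sanitised text to forward (empty when blocked)
    findings: list[str]  # labels describing what was detected


def validate_input(text: str) -> ValidationResult:
    """Validate one piece of untrusted input (user query, RAG chunk, tool output)."""
    # Block outright when a prompt-injection pattern is detected.
    for pattern in INJECTION_PATTERNS:
        if pattern.search(text):
            return ValidationResult(False, "", ["prompt_injection"])

    # Otherwise sanitise by redacting sensitive data before forwarding.
    findings = []
    sanitised = text
    for label, pattern in SENSITIVE_PATTERNS.items():
        sanitised, count = pattern.subn(f"[REDACTED_{label}]", sanitised)
        if count:
            findings.append(f"sensitive:{label}")
    return ValidationResult(True, sanitised, findings)
```

In this sketch the same function would be applied to every untrusted source before the prompt is assembled, so that retrieved documents and tool output receive the same checks as direct user queries. Whether to block or sanitise a given finding is a policy decision; the split shown here is only one possible choice.
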
Guideline Mappings
| Framework | ID | Remarks |
|---|---|---|
| FINOS-AIGF | AIR-PREV-003 | User/App/Model Firewalling/Filtering |
| FINOS-AIGF | AIR-PREV-017 | AI Firewall Implementation and Management |
| FINOS-AIGF | AIR-PREV-002 | Data Filtering From External Knowledge Bases |
| FINOS-AIGF | AIR-DET-001 | AI Data Leakage Prevention and Detection |
| SAIF | Input Validation and Sanitization | |
| MITRE-ATLAS | AML.M0020 | Generative AI Guardrails |
| MITRE-ATLAS | AML.M0021 | Generative AI Guidelines |
| MITRE-ATLAS | AML.M0015 | Adversarial Input Detection |