Probabilistic sampling, internal-state variation, context sensitivity, and decoding parameters cause identical inputs to yield different outputs across runs, undermining testing, reproducibility, and reliable evaluation.
Non-deterministic and non-reproducible outputs
CCC.MARefArc.TH17
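The effect of sampling and decoding parameters named in the description can be illustrated with a toy decoder. The following is a minimal sketch, not a real model: the logits, the `sample_token` and `generate` helpers, and the seed values are all hypothetical and exist only for illustration.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index; temperature 0 is treated as greedy argmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax with temperature, shifted by the max for numerical stability.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

def generate(logits, temperature, seed=None, length=20):
    """Draw `length` tokens from a fixed toy distribution (hypothetical helper)."""
    rng = random.Random(seed)  # seed=None draws entropy from the OS
    return [sample_token(logits, temperature, rng) for _ in range(length)]

TOY_LOGITS = [2.0, 1.5, 0.5, 0.1]  # assumed values, for illustration only

# Greedy decoding: identical on every run.
greedy_a = generate(TOY_LOGITS, temperature=0)
greedy_b = generate(TOY_LOGITS, temperature=0)

# Sampling with a fixed seed: repeatable in this toy setting.
seeded_a = generate(TOY_LOGITS, temperature=1.0, seed=42)
seeded_b = generate(TOY_LOGITS, temperature=1.0, seed=42)

# Sampling without a seed: may differ from run to run.
unseeded = generate(TOY_LOGITS, temperature=1.0)

print(greedy_a == greedy_b, seeded_a == seeded_b)
```

In a hosted model, a fixed seed and zero temperature typically reduce variation but may not remove it, since floating-point ordering, batching, and hardware differences can still change results. Treat the sketch as a demonstration of the sampling source of variation only.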
Related Capabilities
| ID | Title | Description |
|---|---|---|
| CCC.MARefArc.CP14 | Approved-model registry and lifecycle | Catalog of approved models with metadata, version information, configuration parameters, and usage constraints, ensuring agents access only models meeting organizational, regulatory, and security standards. |
| CCC.MARefArc.CP15 | LLM inference gateway routing | Validates inference requests and routes each to the correct model instance, abstracting model hosting behind a consistent interface. |
| CCC.MARefArc.CP20 | Feedback engine | Collects and aggregates structured and unstructured feedback from users, evaluators, and automated systems, including correctness assessments, preference signals, and quality ratings, to inform system improvement. |
Related Controls
| ID | Title | Description |
|---|---|---|
| CCC.MARefArc.CN03 | System Acceptance Testing | Validate agents, models, and end-to-end workflows against accuracy, robustness, bias, drift, and compliance criteria before promotion to production, and re-validate after material changes. |
| CCC.MARefArc.CN07 | AI Model Version Pinning | Pin and record explicit model versions in the Model Registry so that model behaviour is reproducible and provider-side changes are surfaced rather than silently absorbed. |
| CCC.MARefArc.CN17 | AI System Observability | Instrument every layer to emit logs, traces, metrics, and events to the Observability Layer so that behaviour, drift, availability, and data handling are continuously visible and auditable. |
| CCC.MARefArc.CN19 | Human Feedback Loop for AI Systems | Capture human feedback on agent outputs through the Feedback Engine and Human Supervision capabilities and feed it into evaluation and improvement of agents and models. |
| CCC.MARefArc.CN21 | Automated Evaluation Using LLM-as-a-Judge | Use automated model-based evaluation in the Evaluation Layer to assess output quality, grounding, bias, and policy compliance at scale. |
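As one possible reading of how version pinning (CN07) and acceptance testing (CN03) could work together against this threat, the sketch below fingerprints a pinned inference configuration and measures how often repeated runs agree. Every name here (`config_fingerprint`, `agreement_rate`, `check_reproducibility`, `fake_model`, and the configuration keys) is a hypothetical illustration, not part of the catalog or of any provider API.

```python
import hashlib
import json
from collections import Counter

def config_fingerprint(config):
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def agreement_rate(outputs):
    """Fraction of runs that produced the single most common output."""
    if not outputs:
        raise ValueError("need at least one output")
    most_common_count = Counter(outputs).most_common(1)[0][1]
    return most_common_count / len(outputs)

def check_reproducibility(call_model, prompt, config, runs=5, threshold=1.0):
    """Call the model `runs` times with one pinned config and report agreement."""
    outputs = [call_model(prompt, config) for _ in range(runs)]
    rate = agreement_rate(outputs)
    return {
        "fingerprint": config_fingerprint(config),
        "agreement_rate": rate,
        "passed": rate >= threshold,
    }

def fake_model(prompt, config):
    """Deterministic stand-in for a real inference call (assumption)."""
    return f"{config['model_version']}:{prompt.upper()}"

# Assumed configuration keys and values, for illustration only.
pinned = {"model_version": "example-model-v1", "temperature": 0, "seed": 7}
report = check_reproducibility(fake_model, "hello", pinned)
print(report["agreement_rate"], report["passed"])
```

Recording the fingerprint alongside each evaluation result makes it possible to tell whether a change in output followed a change in configuration or occurred under an unchanged one. For free-text outputs, exact-match agreement is likely too strict, and a lower threshold or a semantic comparison may be more appropriate.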
External Mappings
| Framework | ID | Remarks |
|---|---|---|
| air-vec | AIR-OP-006-01 | |
| air-vec | AIR-OP-006-02 | |
| air-vec | AIR-OP-006-03 | |
| air-vec | AIR-OP-006-04 | |