Validation & Claim Boundaries
The bench is designed around a simple principle. The system structures judgement. It does not replace it.
Permanent claim boundaries
These are not phase boundaries. They will not be lifted in a later version.
- No readiness scores
- No diagnostic findings
- No risk ratings
- No compliance determinations
- No legal conclusions
- No deployment approval
- No liability assessments
v0.2 implementation boundaries
These describe the current technical posture of the Mode 2 demo. They may change in later versions, by deliberate design choice and with explicit consent flows.
No autonomous inference on the customer-facing surface
The Deployment Judgement Snapshot does not use an LLM to interpret participant responses. It structures participant-provided context into a reviewable snapshot.
This is deliberate. The bench is designed to preserve human judgement rather than simulate it. Customer-facing outputs remain bounded by question structure, answer metadata, schema, and participant-provided context.
Agents may support research, scenario generation, and build-time tooling. They do not generate customer-facing conclusions in the current Mode 2 surface.
Validation checks
The current Mode 2 package includes local validation scripts that check the following properties of any snapshot output:
- Schema compliance: Output conforms to the snapshot schema. No fields outside the contract.
- Prohibited fields: No score, rating, grade, classification, risk level, or compliance status fields.
- Claim-language leakage: No diagnostic, evaluative, or approval language in copy or output.
- Free-text preservation: Participant responses are reflected verbatim, not summarised, classified, or reinterpreted.
- Answer classification metadata: Classifications are user-assigned, not system-derived.
- Mode 3 absence: No Mode 3 evidence-scaffold material is present in Mode 2 output.
- Local-only operation: No network requests during snapshot generation.
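As an illustration only, two of the checks listed above (prohibited fields and free-text preservation) might be sketched as follows. This assumes a snapshot is a JSON-like dictionary; the key names (`responses`, `question_id`, `text`), the entries in `PROHIBITED_KEYS`, and the function names are all hypothetical, since the actual schema and validation scripts are not published on this page.

```python
from typing import Any, Iterator

# Assumed key spellings, derived from the "Prohibited fields" item above.
# The real schema may name these differently.
PROHIBITED_KEYS = {
    "score", "rating", "grade", "classification",
    "risk_level", "compliance_status",
}


def iter_keys(node: Any, path: str = "") -> Iterator[tuple[str, str]]:
    """Yield (path, key) for every key in a nested dict/list structure."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            yield child, str(key)
            yield from iter_keys(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from iter_keys(item, f"{path}[{index}]")


def find_prohibited_fields(snapshot: dict) -> list[str]:
    """Return the path of every key whose name is in PROHIBITED_KEYS."""
    return [
        path for path, key in iter_keys(snapshot)
        if key.lower() in PROHIBITED_KEYS
    ]


def find_altered_responses(responses: dict[str, str], snapshot: dict) -> list[str]:
    """Return question ids whose snapshot text is missing or not verbatim."""
    reflected = {
        item["question_id"]: item["text"]
        for item in snapshot.get("responses", [])
    }
    return [
        qid for qid, text in responses.items()
        if reflected.get(qid) != text
    ]


# Hypothetical participant input and a snapshot that violates both checks.
participant_responses = {"q1": "We review outputs weekly."}
snapshot = {
    "responses": [
        {"question_id": "q1", "text": "Reviews weekly", "risk_level": "low"},
    ],
    "score": 3,
}

print(find_prohibited_fields(snapshot))
# ['responses[0].risk_level', 'score']
print(find_altered_responses(participant_responses, snapshot))
# ['q1']
```

Both functions report findings and leave the decision to the reviewer. Comparison is by exact string equality, so even a whitespace change in a response counts as a failure of verbatim preservation.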
Artefacts under pilot / NDA
The following artefacts are available only in pilot or partner contexts because they contain the working machinery: schemas, claim boundaries, validation reports, scenario logic, and scaffold specifications. The public site explains the method; partner review exposes the instrument.
- Mode Boundary Contract
- Snapshot Output Schema
- Claim Audit
- Validation Report
- Manual Demo Test Plan
- Mode 1 Simulator (private)
- Mode 3 Evidence Scaffold Specification
- Wind tunnel pattern library
If you are a buyer wondering why this page exists in this much detail: it exists because the distinction between an instrument that helps an organisation see itself and a tool that pretends to see it for them is the only distinction that matters. Everything else is decoration.