English → Verified Agent Pipelines
Compile natural language into typed, capability-bounded, budget-constrained DAGs with machine-checked Lean 4 proofs at every handoff point.
LLMs perform probabilistic retrieval under context pressure. The same prompt on the same model can select different tools on successive invocations. Selection is not deterministic, not provably correct, and not auditable.
Sub-agents receiving full conversation history carry forward irrelevant tokens. This wastes budget, dilutes attention, and introduces surface area for confabulation.
Without declared capability envelopes, a skill authorised to READ a calendar can attempt to WRITE to it. The capability boundary is implicit and unenforced.
For a pipeline of n steps each with success probability pi, joint success = ∏i pi. A 10-step pipeline at 95% per step achieves only 59.9% end-to-end. At 90%: 34.9%.
| Compilation Stage | CSC Equivalent |
|---|---|
| Lexical analysis | LLM-assisted tokenisation of English prompt |
| Parsing / AST | Semantic IR extraction (intent, entities, constraints) |
| Type checking | Lean 4 handoff proofs at every DAG edge |
| Optimisation | Parallel branch detection, budget partitioning |
| Code generation | SkillDAG with execution mode, triage map, traces |
| Test harness | Wind tunnel + chaos monkey |
Structural subtyping: source output has at least the fields required by target input, with compatible types.
No skill escalates beyond its declared envelope. Composed envelope bounded by intersection.
Total DAG budget proved ≤ available quota before any agent executes.
Synthesis only proceeds if all upstream SubTasks pass the ConfabulumRate gate (score < 0.65 on all 9 types). Proved in CSC/ConfabulumGate.lean.
Material decisions require a human auth token. escalation_gate_sound proved in CSC/EscalationGate.lean. Maps to FINRA Rule 3110/3120.
Hash-chain integrity: any modification to a historical record invalidates all successors. chain_tamper_evident proved in CSC/Groundtrace.lean.
Compliance is a path property, not just terminal-state. pathway_compliance_compositional proved. Addresses Goodhart's Law applied to agentic AI.
Advice SubTasks cannot execute without a fresh ClientProfile. profile_required_sound proved in CSC/ClientProfile.lean.
The router minimises token spend under an accuracy floor. route_cost_optimal proved in CSC/Router.lean. 9 theorems total.
HandoffProof structure and replayed on demand. Phase 1–5 theorems extend coverage to epistemic integrity, audit tamper-evidence, agentic risk, and client context freshness. Invariant set: I1–I13.
Tests English → SemanticIR robustness. Given a semantically fixed prompt, does paraphrase variation produce the same canonical IR?
Tests SkillDAG against input variation. Samples from property space of each skill's input types.
structure SkillSignature where
id : SkillId
name : String
level : SkillLevel -- L1 | L2 | L3
operation : Operation -- READ | WRITE | TRANSFORM
input_type : TypeSignature
output_type : TypeSignature
capability_envelope : CapabilityEnvelope
budget_ceiling : Nat
side_effects : List SideEffect
safety_class : SafetyClass -- safe | confirm | dangerous
structure HandoffProof where
edge : DAGEdge
type_proof : TypeProof
capability_proof : CapabilityProof
budget_proof : BudgetProof
structure SemanticIR where
intent : Intent
entities : List Entity
constraints : List Constraint
decomposition : List SubTask
uncertainty : List FormalisationBoundary
| Module | Description | Phase |
|---|---|---|
CSC.Router |
Task classifier — routes Factual / Structural / Formal to optimal execution path. 9 theorems including route_cost_optimal. |
P1 |
CSC.Groundtrace |
GroundtraceRecord hash-chain schema. Tamper-evident per-SubTask audit trail. chain_tamper_evident proved. |
P2 |
CSC.EscalationGate |
I11 — material decision gate requiring human auth token. FINRA Rule 3110/3120 supervisory substitution. escalation_gate_sound proved. |
P3 |
CSC.CompetenceSignal |
Certainty vocabulary gate — confidence level constrains synthesis language register. Addresses the “digital sociopath” problem. | P3 |
CSC.PathwayAudit |
DAG edge compliance monitor — deontic obligation set checked at every step. pathway_compliance_compositional proved. |
P3 |
CSC.ClientProfile |
I13 — first-class client context type. Freshness proof enforced structurally. profile_required_sound proved. |
P4 |
CSC.DataAdapter |
Typed interface for grounded data sources (FRED, EDGAR, PerplexitySearch). Includes stale_ground_blocked theorem. |
P4 |
CSC.FiduciaryScope |
Licensed operator wrapper — hard pipeline gate requiring fiduciary acceptance. Integrates with TokenGov.NormfallAlert. |
P5 |
The fiduciary AI problem (Lo, MIT Sloan, 2026) is that AI systems deliver correct and incorrect answers with equal confidence. CSC addresses this at five levels:
ConfabulumGate + RAG grounding + ConfidenceScore propagation.
9-type hallucination taxonomy with pipeline halts. TruthfulQA regression gate in CI.
EscalationGate (I11) + GroundtraceRecord + PathwayAudit + CompetenceSignal.
FINRA 3110/3120 supervisory substitution. Deontic compliance checked at every DAG edge.
FiduciaryScope.
Append-only AuditStore (Rule 17a-4 path). NormfallAlert monitors DOL/SEC/FINRA regulatory corpus.
ClientProfile (I13) + DataAdapter (FRED / EDGAR).
Staleness invariants enforced structurally. Advice cannot proceed without a resolved, fresh client profile.
FiduciaryScope + NormfallAlert (TokenGov).
Requires licensed operator. Stack refuses financial advice without signed FiduciaryScope. Engineering ceiling: P1 full closure requires regulatory assignment by a licensed entity.
Formal verification, 13 invariants (I1–I13), 4,163+ lines. Machine-checked proofs for type safety, capability safety, budget safety, confabulation gating, tamper-evident audit trails, agentic risk controls, and client context freshness.
Runtime, supervision trees, fault tolerance, 5,800+ lines. BEAM VM provides lightweight processes with preemptive scheduling for concurrent pipeline execution.
ClawCombinator Agent Protocol for execution traces and witnesshood. Provides third-party attestation of execution correctness across agent boundaries.
Mechanism-design token budget allocation (VCG/DSIC). Ensures truthful bidding for compute resources across competing agent pipelines.
CompanyAsCode orchestrator. Models organisational structure as typed, composable skill graphs with formal governance constraints.
Formal legal reasoning in Lean 4. Encodes regulatory obligations as propositions and verifies compliance before pipeline execution.
Formalised economics in Lean 4 — 253 theorems. Provides the auction theory and mechanism design foundations for TokenGov.
Regulatory compliance framework. Maps jurisdiction-specific requirements to machine-checkable constraints on agent behaviour.