Continuous Improvement
A Torvalds-grade review of the 10-system formally-verified economic stack. Every number from source code, not documentation.
"The vision is the hard part, and you have it. But vision without execution is a whitepaper. Stop writing new specs. Sit down, close the sorry counts, and make one end-to-end path work."
Every sorry counted from source code via grep -r "sorry" --include="*.lean". Updated on each CI run.
| System | Theorems | Sorries | Rate | Status |
|---|---|---|---|---|
| EconLib4 | 253 | 191 | 75% | Triaged |
| CSC | 74 | 21 | 28% | CI Live + Phase 1–5 Complete |
| TokenGov | 39 | 12 | 31% | Triaged + Reduced |
| LegalLean | 87 | 29 | 33% | CI Live |
| OpenCompliance | ~5 | 0 | 0% | Trivial proofs |
| TOTAL | 458 | 253 | 55% |
Phase 1–5 fully proved (0 sorry):
chain_tamper_evident, valid_chain_all_links, escalation_gate_sound, escalation_gate_total, no_auth_blocks_material, non_material_always_allowed, pathway_compliance_compositional, empty_pathway_compliant, subpathway_compliant, violation_propagates, profile_required_sound, stale_ground_blocked
GitHub Actions deployed across all code repositories. Sorry regression gates prevent backsliding.
lake build + mix test + sorry regression gates
Automated weekly audit with per-module tracking
Elan: 114 own tests + 1,119 CompanyAsCode suite (corrected in docs)
CSC now require EconLib4 — first cross-system import in the stack
191 sorries classified: 89 provable, 85 hard, 11 axiom-candidate, 6 mathlib-gap
36 → 12 sorry. 13 closed via Float axiom promotion. SORRY_TRIAGE.md added.
3 sorry remaining on 17 theorems — make it the first 0-sorry module
CSC.FiduciaryScope deployed; cross-stack call: FiduciaryScope.check_normfall/1 → TokenGov.NormfallAlert.check_all/0
Templates created. Deploy to repos for reproducible builds.
Focus on SocialChoice (worst) and Information (feeds Spectral)
Implement core compression primitives in Lean. Even 5 proved theorems beats 0.
Write the TypeScript client. 500-line reference that validates one schema.
PARTIALLY DONE: FiduciaryScope (CSC) now calls NormfallAlert (TokenGov) — first live cross-system integration. Full EconLib4 mechanism → TokenGov allocation path still pending.
Synthetic load → CSC routing → latency/correctness metrics capture
Stack TruthfulQA score is 53.3%; regression gate set at ≥95%. RAG pipeline (Phase 1) built but target not yet verified post-RAG. Priority: HIGH.
BenchArena now emits audit_store_<run_id>.jsonl groundtrace files (Phase 2). Wire these as v1 Recursa trace inputs. Priority: HIGH.
89 of 191 sorries classified as provable. Prioritise Finance (put_call_parity already proved) and Information (feeds Spectral downstream). Priority: MEDIUM.
BBH stack=26.7% > standard=13.3% on Tier 7 run. Confirm stability on second run. Priority: MEDIUM.
Merge all Lean codebases. One lake build for the entire proof corpus.
Well-known results become explicit axioms. Honesty over pretence.
StreamData PBT for CSC + Elan. Bridges the proof-runtime gap.
Privacy-preserving verification for allocation + compliance.
Five structural gaps in fiduciary AI (after Lo's "digital sociopath" framing) addressed by the Phase 1–5 roadmap. Status as of April 2026.
| Phase | Gap | Status | Implementation |
|---|---|---|---|
| P1 | Legal Accountability | PARTIAL | FiduciaryScope requires licensed_entity + liability_acceptance_hash; full accountability requires regulatory recognition |
| P2 | Confabulation | CLOSED | ConfabulumRate gate (9 types, halt 0.65) + CompetenceSignal vocabulary gating + TruthfulQA regression gate |
| P3 | Missing Context | CLOSED | ClientProfile (I13, freshness-gated) + DataAdapter (PerplexitySearch + FRED, staleness-blocked) |
| P4 | Agentic Risk | CLOSED | EscalationGate (I11, material-decision HITL) + PathwayAudit (4 FINRA/SEC obligations, compositional proof) + GroundtraceRecord (tamper-evident audit trail) |
| P5 | Governance / Observability | CLOSED | NormfallAlert (struck-norm detection) + FiduciaryScope (licensed operator wrapper) + GroundtraceRecord (Rule 17a-4 path) |
No documented error taxonomy. Lean proofs assume happy paths. The Elixir runtime needs to handle the "impossible" cases the formal model excludes.
No migration story. If EconLib4 types change, downstream systems break silently. No lakefile dependency graph across the stack.
Ten systems, zero metrics, zero tracing, zero structured logging. No way to diagnose cross-layer failures.
TokenGov claims governance invariants but defines no threat model. The gap between "proved in Lean" and "enforced in production" is where exploits live.
Zero performance data. CSC compilation speed, Elan throughput, TokenGov allocation rate — all unknown. Recursa is a spec, not a runner.