Formal Economics in Lean 4
A comprehensive Lean 4 library formalising economics, game theory, mechanism design, trust, cryptography, information theory, and semantic compression. 12,500 lines. 253 theorems. Built on Mathlib4.
12 new sorry-free theorems proved downstream in CSC Phases 1–5 (EscalationGate, PathwayAudit, ClientProfile, DataAdapter)
View on GitHubArchitecture
Six layers of formalisation, from Mathlib foundations to applied economics and finance.
↑ Dependencies flow downward — each layer imports from layers below
Modules
Each module formalises a complete subfield with structures, definitions, and theorems.
Expected utility theory with full VNM axiomatisation, risk aversion classification, and the Arrow-Pratt measure of absolute risk aversion.
Shannon information theory: entropy, mutual information, channel capacity, and the data processing inequality. Includes f-divergences and Fano's inequality.
Normal-form and extensive-form games, Nash equilibria, mixed strategies, Bayesian games, cooperative game theory with Shapley value and nucleolus. Includes classic games.
Direct mechanisms, the revelation principle, VCG with dominant strategy incentive compatibility proof, Clarke payments, and auction theory with revenue equivalence.
Voting rules (plurality, Borda, Condorcet, IRV), Arrow's impossibility theorem, Gibbard-Satterthwaite, and strategy-proofness.
Moral hazard (principal-agent, first/second best) and adverse selection (screening, signalling, Rothschild-Stiglitz separation).
Reputation systems with Bayesian updating (Milgrom-Roberts), commitment devices, escrow, staking, and trust transitivity.
Game-based cryptographic security definitions (IND-CPA/CCA, OWF, PRG, PRF), commitment schemes with hiding/binding, and zero-knowledge proofs.
Source coding theory, Kraft inequality, Shannon's source coding theorem, Huffman coding, Kolmogorov complexity, NCD, and minimum description length.
Verified semantic-preserving compression of LLM prompts. Groundtrace with proved soundness and compression. NoisyChannel models LLMs as Shannon channels. Tersiture metric and confabulation risk. The groundtrace_sound theorem (fully proved ✓) is the formal ancestor of CSC.GroundtraceRecord — a 20-field tamper-evident audit struct deployed in production with SHA-256 hash chaining and Rule 17a-4 compliance path.
Fundamental theorem of asset pricing, no-arbitrage conditions, Black-Scholes, put-call parity (proved), and martingale pricing.
Prospect theory, Kahneman-Tversky value function, probability weighting, Allais paradox, and framing effects.
Online learning with regret bounds, multiplicative weights update (proved regret bound), follow-the-regularised-leader, UCB, and no-regret implies coarse correlated equilibrium.
Theorems
Key theorems and their connections across modules.
| Theorem | Connects | Status |
|---|---|---|
| Groundtrace Soundness | SemanticCompression | Proved ✓ |
| Groundtrace Compression | SemanticCompression | Proved ✓ |
| Put-Call Parity | Finance | Proved ✓ |
| VNM Representation | Utility ↔ Probability | Stated |
| Revelation Principle | MechanismDesign ↔ Bayesian Games | Stated |
| VCG is DSIC | MechanismDesign ↔ Game Theory | Stated |
| Myerson Optimal Auction | Auctions ↔ Utility | Stated |
| Arrow's Impossibility | SocialChoice ↔ Order Theory | Stated |
| Gibbard-Satterthwaite | SocialChoice ↔ MechanismDesign | Stated |
| Shannon Source Coding | Information ↔ Coding | Stated |
| Data Processing Inequality | Information ↔ all downstream | Stated |
| FTAP (finite) | Finance ↔ Probability | Stated |
| No-Regret → CCE | Learning ↔ GameTheory | Stated |
Code
Lean 4 structures and proofs from the library.
structure GroundtraceStep (D : Type*) [SemanticDomain D] where
before : TokenSequence D
after : TokenSequence D
rule : RewriteRule D
sound : SemanticEquivalence before after
compresses : after.length < before.length
structure VCGMechanism (env : QLEnvironment) where
allocation : (∀ i, env.Θ i) → env.X
efficient : ∀ θ, allocation θ ∈ argmax (fun x => ∑ i, env.v i x (θ i))
payment : ∀ i, (∀ j, env.Θ j) → ℝ
dsic : vcg_is_dsic env allocation payment
theorem groundtrace_sound {D : Type*} [SemanticDomain D]
(trace : Groundtrace D) :
SemanticEquivalence trace.source trace.target := by
induction trace.steps with
| nil => exact SemanticEquivalence.refl _
| cons step rest ih =>
exact SemanticEquivalence.trans step.sound ih
SemanticCompression
Formalises verified semantic-preserving compression of LLM prompts. Each compression step carries a machine-checked certificate that meaning is preserved.
Chain of meaning-preserving rewrites. Sound (semantics preserved) and compressing (tokens strictly decrease). Both properties are fully proved in Lean 4 — no sorry.
Models LLMs as Shannon channels. Defines confabulation risk as error probability. Proves rate bounds: if semantic rate exceeds channel capacity, confabulation is inevitable.
Quantitative metric: ratio of current length to optimal (canonical form). tersiture = 1 iff maximally compressed. Gives a precise progress measure for compression.
Per-token information content, surprisal, and redundancy. Connects Shannon information measures to the token-level structure of language model inputs.
Design
The guiding principles behind EconLib4's architecture.
Mechanisms are functors, channels are Kleisli morphisms, games are indexed families in monoidal categories.
Proofs carry constructive witnesses. Don't just prove existence — compute the equilibrium.
Every compression step carries a machine-checked certificate. Soundness and compression are theorem-level guarantees.
Quantitative compression progress metric. tersiture = 1 is the target; every step is measurable.
Contribute
The sorry-marked proofs are an invitation. Each one is a well-typed statement waiting for a constructive proof.
Ecosystem
EconLib4 provides the formal backbone for several downstream projects.
Mechanism-design token allocation. Uses VCG, Reputation, and Learning modules.
Composable Skill Compiler. Uses Shapley, Groundtrace, NoisyChannel, GroundtraceRecord (Phase 2), ConfabulumRate (Phase 1), and CertaintyVocabulary (Phase 1).
Formal legal reasoning. Uses ExtensiveForm, MoralHazard, and Arrow.
CompanyAsCode orchestrator. Uses AdverseSelection and Regret.
Agent protocol. Uses Commitment, ZeroKnowledge, and Trust.
Stack Integration
How EconLib4 abstract concepts are now concretely operationalised in the production stack (Phases 1–5).
| EconLib4 Concept | Stack Operationalisation | Phase |
|---|---|---|
| SemanticCompression.Groundtrace (groundtrace_sound ✓) | CSC.GroundtraceRecord — 20-field SHA-256 hash chain audit trail | Phase 2 |
| SemanticCompression.NoisyChannel / confabulation risk | CSC.ConfabulumRate — 9-type gate, halt at 0.65 aggregate score | Phase 1 |
| SemanticCompression.Tersiture metric | CSC.CertaintyVocabulary — 5-level Verified→Halted confidence vocab | Phase 1 |
| MechanismDesign.VCG + Learning.Regret | TokenGov allocation (unchanged) | Pre-existing |
| Crypto.ZeroKnowledge (special_soundness, fiat_shamir) | Spectral coalition safety proofs (unchanged) | Pre-existing |