Lean 4: A Unified Platform for Programming and Verification

Leo de Moura
Senior Principal Applied Scientist, AWS
Chief Architect, Lean FRO
NASA Formal Methods Symposium | May 6, 2026
Lean

zlib in Lean

A general-purpose AI converted zlib (a C compression library) to Lean, passed the test suite, and proved the code correct.

theorem zlib_decompressSingle_compress (data : ByteArray) (level : UInt8)
    (hsize : data.size < 1024 * 1024 * 1024) :
    ZlibDecode.decompressSingle (ZlibEncode.compress data level) = .ok data

Round-trip correctness, every compression level, machine-checked. lean-zip

When AI Writes the World's Software, Who Verifies It?

Lean

What is Lean?

A programming language and a proof assistant

Same language for code and proofs.

Lean is implemented in Lean. Very extensible.

Lean is scalable.

Small trusted kernel. Proofs can be exported and independently checked.

Lean

Programming and Theorem Proving in Lean

"You have written my favorite computer game" — Kevin Buzzard

def reverse : List α List α | [] => [] | x :: xs => reverse xs ++ [x] theorem reverse_append (xs ys : List α) : reverse (xs ++ ys) = reverse ys ++ reverse xs := by induction xs with | nil => simp [reverse] | cons x xs ih => simp [reverse, ih] #eval reverse [1, 2, 3]
[3, 2, 1]

The "game board": you see goals and hypotheses, then apply moves (tactics).

Each tactic transforms the goal until nothing remains. The kernel checks the final proof term.

Lean

The Kernel Catches Mistakes

example : reverse [1, 2, 3] = [1, 2, 3] := by simp [reverse]

You don't need to trust me, or my automation. You only need to trust the small kernel.

Incorrect proofs are rejected.

Lean

Multiple Independent Kernels

arena.lean-lang.org hosts independent kernel implementations.

  • The reference kernel — C++.

  • nanoda — Rust.

  • lean4lean — Lean.

  • nanobruijn — Rust.

  • Many others.

Anyone can build a kernel and submit it.

Disagreement between independent kernels exposes bugs that neither would catch alone.

Lean

The State of Lean

250,000+ unique installations: VS Code (143K) + Open VSX (107K).

9,000+ GitHub repositories depending on Lean.

Industry: AWS, Google, Microsoft, Mistral, Nethermind, Galois, others.

"Lean has become the de facto choice for AI-based systems of mathematical reasoning." — ACM SIGPLAN Award

Lean

Lean as an IDE

Real-time feedback: errors, goals, and hints as you type. The IDE is where we collaborate with AI now.

Claude Code: sum.lean with sorry

Lean

AI Proves the Theorem

AI found the proof and explained its strategy in natural language:

Claude Code: sum.lean proved

Lean

Interactive Walkthrough: Sum Formula

def sum (n : Nat) : Nat := if n = 0 then 0 else n + sum (n-1) theorem sum_eq : 2 * sum n = n * (n+1) := by induction n with | zero => simp [sum] | succ n ih => unfold sum; simp rw [Nat.mul_add, ih, Nat.add_mul] rw [Nat.add_comm 2 n, Nat.mul_comm]
Lean

Mathlib: 2.2M+ Lines of Formalized Mathematics

  • 270,000+ formalized theorems.

  • 750+ contributors.

  • 2.2M+ lines of Lean.

  • 1,500+ type classes and 20,000+ instances.

  • No competing proof assistant has this engagement.

Mathlib dependency graph

Lean

The Liquid Tensor Experiment

In 2020, Peter Scholze posed a formalization challenge.

"I spent much of 2019 obsessed with the proof of this theorem, almost getting crazy over it. I still have some small lingering doubts." — Peter Scholze

Johan Commelin led a team that verified the proof, with only minor corrections.

They verified and simplified it without fully understanding the proof. Lean was a guide.

"The Lean Proof Assistant was really that: an assistant in navigating through the thick jungle that this proof is." — Peter Scholze

Nature article on LTE

Lean

Lean Is Taking Mathematics by Storm

Six Fields Medalists engaged: Tao, Scholze, Viazovska, Gowers, Hairer, Freedman.

At this scale, mathematics needs build systems, dependency graphs, code review, release engineering, and a precise medium for communication.

Lean

Reading vs Writing Formal Math

  • It is easier to read formal math than write it.

  • Formal math is easier to read than LaTeX.

  • You can click through, inspect definitions, check types.

  • AI can explain formal proofs and is making the writing part easier.

Formal math in Lean

Proof Assistants in the Age of AI

Lean

Verso: Written in Lean. Checked by Lean.

  • These slides are written in Verso.

  • Every code example you have seen — and will see — is type-checked by Lean as part of the build.

  • Verso is a domain-specific language embedded in Lean.

  • lean-lang.org is written in Verso.

  • My homepage is written in Verso.

  • Future math papers will embed type-checked Lean code.

  • Built by David Thrane Christiansen at the Lean FRO.

Verso: source and rendered output

Software Verification

Lean is not only for mathematics.

Lean

Cedar (AWS): Verified Authorization

Cedar — open-source authorization policy language. Used by AWS Verified Permissions and AWS Verified Access.

Cedar spec: the model is written in Lean.

Three examples of verified components: evaluator, authorizer, validator.

  • forbid_trumps_permit — if any forbid policy is satisfied, the request is denied.

  • allowed_only_if_explicitly_permitted — a request is allowed only if at least one permit policy is satisfied.

  • typecheck_is_sound — if the validator accepts a policy, evaluation produces a boolean.

Cedar website

Lean

Cedar — The Engineering Story

Executable Lean model alongside Rust production code. Lean model ~10× smaller than Rust.

Cedar differential testing

  • ~100M differential random tests nightly. Lean: 5 μs/test. Rust: 7 μs/test.

  • Release gate: no Cedar version ships unless model, proofs, and differential tests are current.

"We've found Lean to be a great tool for verified software development." — Emina Torlak

Lean

SampCert (AWS): Verified Differential Privacy

Differential privacy is the gold standard for privacy-preserving data analysis. Implementing it correctly is hard.

Mironov's attack exploited floating-point to break DP. Bugs in random number generation have destroyed privacy guarantees in real systems.

SampCert (Jean-Baptiste Tristan): the first comprehensive, mechanized foundation for executable DP implementations.

  • 12,000+ lines of Lean proof.

  • Discrete Laplace and Gaussian samplers, proved correct using Fourier analysis and the Poisson summation formula from Mathlib.

  • Composition of mechanisms with budgets γ₁ and γ₂ yields budget γ₁ + γ₂.

  • Postprocessing: functions that don't access the secret database don't degrade privacy.

Lean

SampCert — Deployed at AWS

Custom code extractor: the deployed artifact does not go through the Lean compiler. TCB includes the extractor, not the compiler.

Performance: SampCert outperforms diffprivlib by >2× for discrete Gaussians.

Deployed in AWS Clean Rooms Differential Privacy.

PLDI 2025: "Verified Foundations for Differential Privacy."

"SampCert uses the Poisson summation formula from Mathlib to prove a production privacy service correct."

Lean

Strata (AWS): Language Syntax and Semantics

Strata: extensible platform for formalizing language syntax and semantics. Built in Lean. Open source.

Key idea: dialects (inspired by MLIR). Orthogonal, composable building blocks for modeling programming constructs.

  • Core, C_Simp, Laurel, SMTLib, Python, Boole.

  • Laurel pipeline: Python/Java/JavaScript → Laurel → Strata Core → VCG → SMT.

  • Dialect Definition Mechanism (DDM): embedded DSL in Lean for defining syntax and typing rules.

  • Boogie/Dafny-style verification infrastructure, built in Lean and extensible via Lean's metaprogramming.

Strata logo

Lean

Waimea (AWS): Verified Compiler for Trainium

CompCert-style verified compiler for AWS Trainium, Amazon's custom AI accelerator chip. ~200,000 lines of Lean code.

  • The ISA specification changes several times per week. Not a fixed ISA like x86.

  • Lean provides an up-to-date simulator that hardware and verification engineers share.

  • Lean + AI keeps the team in sync with hardware changes. Several hundred instructions per generation.

Long-term goal: end-to-end compiler verification — source program compiles to Trainium machine code, with a proof that compilation preserves semantics.

A single miscompilation on a training run that costs millions in compute is catastrophic.

Lean

SymCrypt (Microsoft): Verified Cryptography

SymCrypt verification with Aeneas

SymCrypt — Microsoft's core cryptographic library, rewritten in Rust.

Verification pipeline: Charon → Aeneas → Lean.

Aeneas: translates safe Rust to pure functional Lean code. The bridge that makes Rust verification in Lean practical.

Lean

SymCrypt — Verifying Code as Engineers Write It

They verify the Rust code as written by software engineers. No special verification-friendly subset.

Code is evolving: new optimizations for specific hardware (AVX-512, Arm NEON, …). Proofs adapt to rewrites.

"We are verifying code faster than we can write it." — Son Ho, SymCrypt, Microsoft

The proof maintenance burden is real and unavoidable. The fact that it stays manageable is the point.

Lean

Anneal (Google): Verified Unsafe Rust

Anneal logo

Google's zerocopy team building "literate verification" for unsafe Rust using Lean and Aeneas.

Unsafe Rust accounts for ~2/3 of vulnerabilities in the Rust ecosystem. Safety invariants are documented only in prose comments.

Anneal: developers write Lean specifications directly in Rust doc comments.

  • spec blocks for safe code.

  • unsafe(axiom) blocks for opaque unsafe operations with pre/post conditions.

  • Type invariants for struct safety properties.

  • cargo anneal verify runs the pipeline: Charon → Aeneas → Lean.

Designed for AI agents. Antigravity demonstrated authoring unsafe Rust and proving its soundness.

Lean

Veil: Verified Distributed Protocols

Veil: verification of distributed protocols. Built on Lean as a library using Lean metaprogramming.

  • Push-button verification via SMT (cvc5/Z3) for decidable fragments.

  • Full interactive proofs in Lean when automation falls short.

  • Seamless transition between the two modes.

  • Foundational: Veil's VCG is proven sound w.r.t. the specification language semantics.

  • 16 distributed protocol case studies. All 16 verified. Ivy failed on 2. 87.5% verified in under 15 seconds.

Veil: ring leader election

Lean

Veil — Beyond EPR + The Rabia Finding

Verified Paxos, Suzuki-Kasami, Reliable Broadcast outside the decidable EPR fragment. This is where Ivy's automation breaks down.

The Rabia finding: while porting Rabia from Ivy + Rocq to Veil, the team spotted an invariant discrepancy that went undetected across two separate tools.

One unified framework caught what two could not.

Loom (POPL 2026): the foundational layer underneath Veil. Correct-by-construction verifiers from monad transformer algebras.

Lean

Kraken (Google): x64 Semantics in Lean

Kraken is an x64 model written in Lean.

Intended for verifying sequential software that performs computations using common registers and memory.

Project started after the Lean@Google Hackathon in December 2025.

Lean

Move Borrow Checker: The Problem

Ilya Sergey (VERSE lab) formalized and verified Move's borrow checker in Lean. 39,000 lines of mechanized metatheory in 27 working days, using Claude as an AI coding assistant.

Move: smart contract language for the Sui and Aptos blockchains. Rust-like ownership discipline. References are access paths rooted in local variables. No lifetime annotations.

The borrow checker: tracks reachability between references using regular expressions over field paths.

Lean

Move Borrow Checker: What Was Proved

Formalized in Lean:

  • Relational type system encoding

  • Executable algorithmic type checker

  • Soundness proof: the algorithmic checker is sound w.r.t. the relational rules

  • Runtime semantics as a definitional interpreter

  • Type soundness: progress + preservation. Any function accepted by the type checker never reaches a preventable error across all execution paths.

39,000 non-blank lines. 267 commits. 153 preservation lemmas across 23 statement forms.

Lean

Move Borrow Checker: AI's Role

Claude (Opus 4.5/4.6) handled proof repair, pattern propagation, routine preservation cases.

Human insight still needed for novel proof strategies (e.g., weakening lemmas, inter-procedural reasoning for function calls took two days).

27 working days. Estimated 5-6 months without AI. Second largest PL metatheory mechanization in Lean, and very likely the largest accomplished predominantly using AI.

Blog post

Lean

CSLib: A Mathlib for Computer Science

CSLib aims to be a foundation for teaching, research, and new verification efforts, including AI-assisted.

CSLib: adding to CS foundations

Steering committee: Clark Barrett, Swarat Chaudhuri, Jim Grundy, Pushmeet Kohli, Leo de Moura, Fabrizio Montesi.

Lean

Signal Shot

Signal Shot is a public moonshot to verify the Signal protocol and its Rust implementation using Lean.

It is a joint effort of Signal (Rolfe Schmidt), the Beneficial AI Foundation (Max Tegmark), and the Lean FRO.

The pieces:

  • Aeneas,

  • Mathlib and CSLib,

  • Symbolic proof automation (grind) and scalable verification generation (SymM),

  • and AI.

Proof Automation

Lean

Do We Still Need Proof Automation?

"I thought AI would prove all theorems for us now."

  • Tactics are like game moves.

  • Better tactics = shorter proofs.

  • Better tactics = more powerful AI.

  • Compact proofs = better training data = better AI provers.

40 lines replaced by one grind call

Lean

What Is grind?

New proof automation, shipped in Lean v4.22. Kim Morrison and me.

A virtual whiteboard, inspired by modern SMT solvers.

  • Writes facts on the board. Merges equivalent terms.

  • Cooperating engines: congruence closure, E-matching, constraint propagation, guided case analysis.

  • Satellite theory solvers: cutsat (linear integer arithmetic), commutative rings (Gröbner), linarith, AC.

  • Native to dependent type theory. No translation to FOL.

  • Produces ordinary Lean proof terms. Kernel-checkable.

Lean Goal
Preprocessing
Internalization
E-graph
cutsat
rings
linarith
orders
ac

Lean

grind — Theory Combination

example [CommRing α] [NoNatZeroDivisors α] (a b c : α) (f : α Nat) : a + b + c = 3 a^2 + b^2 + c^2 = 5 a^3 + b^3 + c^3 = 7 f (a^4 + b^4) + f (9 - c^4) 1 := by grind

Three solvers meet at the E-graph:

  • Ring solver derives a^4 + b^4 = 9 - c^4.

  • Congruence closure lifts it to f (a^4 + b^4) = f (9 - c^4).

  • Linear integer arithmetic closes 2 * f (9 - c^4) ≠ 1.

The Nelson-Oppen playbook. Inside dependent type theory. No SMT translation layer.

Lean

grind — Typeclass-Parameterized Solvers

Satellite solvers activate automatically when the type classes are present.

  • cutsatToInt class. Int, Int32, BitVec 64, Fin n.

  • RingCommRing, CommSemiring, Field, IsCharP, NoNatZeroDivisors.

  • linarith — module over the integers. Preorders, partial, linear orders.

  • AC — any associative-commutative operator.

example (x y : Int) : 27 11*x + 13*y 11*x + 13*y 45 -10 7*x - 9*y 7*x - 9*y 4 False := by grind -- BitVec 8 is a CommRing of characteristic 256. -- No bitvector-specific theory needed. example (x : BitVec 8) : (x - 16) * (x + 16) = x^2 := by grind
Lean

grind — Annotations and E-Matching

Library authors annotate theorems; grind instantiates them by E-matching — pattern matching modulo the E-graph.

  • [grind =] — use the LHS as the E-matching pattern.

  • [grind →] — forward reasoning: patterns from hypotheses.

  • [grind ←] — backward reasoning: patterns from the conclusion.

  • grind_pattern — custom multi-patterns.

  • Constraint system on patterns: guard, check, is_value, is_strict_value.

@[grind =] theorem fg {x} : f (g x) = x := by unfold f g; omega example {a b c} : f a = b a = g c b = c := by grind

4,000+ grind uses in Mathlib.

Lean

Better Tactics = Better AI

example {α} [CommRing α] [IsCharP α 0] [NoNatZeroDivisors α] (d t d_inv : α) (Δ40 : d * (d + t + d * t) = 0) (Δ41 : d^2 * (d + d * t - 2 * d * t^2 + d * t^4 + d^2 * t^4) = 0) : d * d_inv = 1 t + 2 * t^2 - t^3 - 2 * t^4 + t^5 = 0 := by grind

Quantum algebra example from "Categories generated by a trivalent vertex" — Morrison, Peters, Snyder.

When grind closes a goal in one step instead of fifty, the AI search tree shrinks dramatically.

Lean

Tao on grind

"I have been experimenting recently with the new tactic grind in Lean, which is a powerful tool inspired by 'good old-fashioned AI' such as satisfiability modulo theories (SMT) solvers... When I apply grind to a given subgoal, it can report a success within seconds... But, importantly, when this does not work, I quickly get a grind failed message." — Terence Tao

VC Generation & Scalability

Lean

VC Generation

A VC generator turns code and specifications into proof obligations.

Lean-based VC generators:

  • Aeneas — Rust verification via translation to Lean.

  • Velvet — a Dafny-style verifier built in Lean.

  • mvcgen — Lean's VC generator for monadic programs.

Issue: Verification condition generation with proof assistants has not scaled so far.

Adam Chlipala's group documented this for Rocq. The same story holds in Lean.

Lean

Andres' Challenge — The Language

Andres Erbsen — Google ISE Formal. Bedrock2. Fiat Cryptography.

A Rocq veteran distilled the scaling problem into a minimal challenge at Lean@Google Hackathon.

A small deeply embedded imperative language.

inductive Binop where | add | sub deriving Repr inductive Expr where | literal (v: Int) | var (x: String) | op (bop : Binop) (e1 e2 : Expr) deriving Repr inductive Cmd where | skip : Cmd | set : String Expr Cmd | seq : Cmd Cmd Cmd | input : String Cmd -- .. deriving Repr abbrev Memory := PartialMap Word Byte abbrev LocalContext := PartialMap String Word
Lean

Andres' Challenge — Operational Semantics

inductive IOEvent | IN : Word IOEvent | OUT : Word IOEvent abbrev Post := List IOEvent Memory LocalContext Prop inductive Exec : Cmd List IOEvent Memory LocalContext Post Prop | skip : post es m l Exec .skip es m l post | seq : Exec c1 es m l mid ( es' m' l', mid es' m' l' Exec c2 es' m' l' post) Exec (.seq c1 c2) es m l post | set : (h_eval : Expr.eval m l e = some v) (h_post : post es m (PartialMap.put l x v)) Exec (.set x e) es m l post | input : ( v, post (IOEvent.IN v :: es) m (PartialMap.put l x v)) Exec (.input x) es m l post -- .. theorem Exec.seq_cps (c1 c2 : Cmd) (es : List IOEvent) (m : Memory) (l : LocalContext) (post : Post) : Exec c1 es m l (fun es' m' l' => Exec c2 es' m' l' post) Exec (.seq c1 c2) es m l post := by intro h; apply Exec.seq; apply h; intros; assumption
Lean

Andres' Challenge — A Parametric Program

notation:130 a " ;;\n" b => Cmd.seq a b notation:150 x " ::= " e:150 => Cmd.set x e notation:160 a " +' " b:160 => Expr.op Binop.add (Expr.var a) (Expr.var b) notation:160 a " -' " b:160 => Expr.op Binop.sub (Expr.var a) (Expr.var b) /- Generates a command sequence of the form input b a := b a := a + a; a := a - b ... a := a + a; a := a - b -/ def generated_cmd (n : Nat) (a b : String) : Cmd := .input b ;; a ::= .var b ;; repeated_cmds n a b abbrev spec (m : Memory) : Post := fun es m' l => m' = m v : Word, es = [IOEvent.IN v] l.get "a" = some v def Goal (n : Nat) : Prop := m l, Exec (generated_cmd n "a" "b") [] m l (spec m)
Lean

Andres' Challenge — Baseline

example : Goal 1 := by intro m l; simp only [generated_cmd, repeated_cmds]; unfold spec apply Exec.seq_cps; apply Exec.input; intro v apply Exec.seq_cps; apply Exec.set; simp_vc; rfl apply Exec.seq_cps; apply Exec.set; simp_vc; rfl apply Exec.seq_cps; apply Exec.set; simp_vc; rfl apply Exec.skip simp only [List.cons.injEq, IOEvent.IN.injEq, and_true, PartialMap.put_put, PartialMap.get_put, Option.some.injEq, and_self, exists_eq']

With Lean's default tactic framework, MetaM, this is superlinear.

Andres' Challenge tactics time by goal size

Lean

What Must Be Fast in VC Generation

A VC generator turns code and specifications into proof obligations.

Lean-based VC generators: Aeneas, Velvet, mvcgen.

All three share the same idiom — and need:

  • Efficient apply.

  • Efficient metavariable management.

  • Preserve term sharing.

  • Reuse simp results across VCs and across simulation steps.

  • Do not traverse the same subterms over and over.

Lean

SymM: Making Verification Scale

  • SymM: new monadic framework for high-performance software verification. Born after the Lean@Google Hackathon.

  • Designed for tools like Aeneas, mvcgen and Velvet that need to discharge thousands of verification conditions efficiently.

Without this, verifying something as large as Signal is slow and painful. With it, verification conditions get discharged efficiently.

SymM vs MetaM benchmark

SymM with cache reuse

Lean

SymM Interactive Mode

Humans and AI both need to inspect and steer VC generation.

example (p q : Prop) : p q p q := by sym => intro hp hq apply And.intro apply hp apply hq register_sym_simp chainSimp where post := ground >> rewrite [Nat.add_zero, Nat.zero_add] example (x y : Nat) (h : x y) : (1 - 1) + x 2*y + (1 + 0) := by sym => simp chainSimp lia
Lean

mvcgen — Ported to SymM

Lean's VC generator for monadic programs. Ported to SymM by Sebastian Graf (Lean FRO).

Andres' challenge using Lean monadic code. Linear out to n=1000.

Tactic time

Kernel time

Lean

Velvet — Ported to SymM

A Dafny-style verifier in Lean. Ported to SymM by Vova (Lean FRO intern). Preliminary results.

  • Before: Velvet ran ~3× slower than Dafny.

  • After: Faster than Dafny on 24 of 27 benchmarks; competitive on the other 3 (differenceMinMax, findEvenNumbers, mergeSorted).

One port beat Dafny on 24 of 27. Dafny has a vast TCB.

Dafny vs Velvet vs Velvet + Loom2 verification time

AI

Accelerating formal verification.

Lean

"Vibe Proving"

So many people are choosing Lean that even casual users are proving theorems with ChatGPT.

Erdos Prize paper

"The bottleneck for using AI to create strategies and make conjectures is we have to rely on human experts and the test of time to validate whether something is plausible or not." — Terence Tao, Dwarkesh Patel interview

Lean

International Mathematical Olympiad (IMO)

To date, every AI achieving medal-level IMO performance with formal proofs used Lean.

  • AlphaProof (Google DeepMind) — silver medal, IMO 2024

  • Aristotle (Harmonic) — gold medal, IMO 2025

  • SEED Prover (ByteDance) — silver medal, IMO 2025

Lean

AI startups

  • Leanstral (Mistral): the first open-source code agent designed for Lean 4.

  • Axiom: solved 12/12 problems on Putnam 2025.

  • DeepSeek Prover-V2: 88.9% on miniF2F. Open-source, 671B parameters.

  • Harmonic: Built the Aristotle AI — gold medal, IMO 2025.

Startups using Lean

Lean

The Lean FRO

Nonprofit. 20 engineers. Open source. Controlled by no single company.

Sebastian Ullrich and I launched it in August 2023.

29 releases and 8,500+ pull requests merged since launch.

Lean is not a research project anymore. We run the Lean FRO as a startup.

Public roadmaps | Our team

Lean

Conclusion

The Lean is extensible, scalable, and trusted.

SymM turn a superlinear bottleneck into a linear one.

  • Adoption: 250,000+ unique installs. 9,000+ GitHub repositories.

  • Trust: Multiple independent kernels. Comparator. Small TCB.

  • Mathematics on Lean: Six Fields Medalists. Liquid Tensor Experiment. Fermat's Last Theorem. Sphere packing. Equational Theories Project. Tao's Analysis I.

  • AI on Lean: Every AI achieving medal-level IMO performance with formal proofs used Lean (AlphaProof, Aristotle, SEED Prover). Axiom: 12/12 on Putnam 2025. zlib verified by a general AI.

  • Lean FRO: 20 engineers building Lean full-time.

Thank You

NFM 2026