Lean: Machine-Checked Mathematics and Verified Programming

Leo de Moura
Senior Principal Applied Scientist, AWS
Chief Architect, Lean FRO
University of Minnesota | March 30, 2026
Lean

Last Month, Something Unexpected Happened

A general-purpose AI, with no special training for theorem proving, converted zlib (a C compression library) to Lean, passed the test suite, and then proved that the code is correct.

Not tested. Proved. For every possible input. lean-zip

"This was not expected to be possible yet."

How? Let me show you.

Lean

What is Lean?

  • A programming language and a proof assistant

  • Same language for code and proofs. No translation layer.

  • Small trusted kernel. Proofs can be exported and independently checked.

  • Lean is implemented in Lean. Very extensible.

  • The math community made Lean their own: >50,000 lines of extensions.

def reverse : List α List α | [] => [] | x :: xs => reverse xs ++ [x] theorem reverse_append (xs ys : List α) : reverse (xs ++ ys) = reverse ys ++ reverse xs := by induction xs with | nil => simp [reverse] | cons x xs ih => simp [reverse, ih] #eval reverse [1, 2, 3]
[3, 2, 1]
Lean

Propositions as Types

def odd (n : Nat) : Prop := k, n = 2 * k + 1
  • The type of odd 3 is Prop, a statement that can be true or false.

  • We can't evaluate it: odd returns a Prop, not a Bool.

  • But we can prove it by providing a witness.

example : odd 3 := 1, by decide
Lean

Theorem Proving Is a Game

"You have written my favorite computer game" — Kevin Buzzard

The "game board": you see goals and hypotheses, then apply "moves" (tactics).

theorem square_of_odd_is_odd : odd n odd (n * n) := by intro k₁, e₁ simp [e₁, odd] exists 2 * k₁ * k₁ + 2 * k₁ grind

Each tactic transforms the goal until nothing remains. The kernel checks the final proof term.

Lean

The Kernel Catches Mistakes

example : odd 3 := 2, by decide

You don't need to trust me, or my automation. You only need to trust the small kernel.

An incorrect proof is rejected immediately.

Lean has multiple independent kernels.

You can build your own and submit it to arena.lean-lang.org.

Lean

Lean Is an IDE

  • Rich user interface and integrated tooling.

  • Build system (Lake), LSP server, and VS Code plugin.

  • Real-time feedback: errors, goals, and hints as you type.

Lean IDE with Claude Code

The IDE is where I collaborate with AI now.

Lean

Lean Is an IDE (part 2)

I asked Claude to prove append_length. It wrote the proof, Lean checked it. No goals remaining.

Claude proves append_length

Lean

Mathlib: The Mathematical Library

  • 200,000+ theorems. 750+ contributors.

  • From basic algebra to modern research mathematics.

  • Fewer than 1,000 missing definitions from coverage of modern research math.

  • The dataset every AI math system trains on.

Mathlib dependency graph

Lean

The Liquid Tensor Experiment

In 2020, Peter Scholze posed a formalization challenge.

"I spent much of 2019 obsessed with the proof of this theorem, almost getting crazy over it. I still have some small lingering doubts." — Peter Scholze

Johan Commelin led a team that verified the proof, with only minor corrections.

They verified and simplified it without fully understanding the proof. Lean was a guide.

"The Lean Proof Assistant was really that: an assistant in navigating through the thick jungle that this proof is." — Peter Scholze

Nature article on LTE

Lean

Lean Is Taking Mathematics by Storm

Five Fields Medalists engaged: Tao, Scholze, Viazovska, Gowers, Hairer.

Tao on Lex Fridman

Lean

Tao Finding a Bug in His Own Paper

Tao finding a bug via Lean

Lean's linarith couldn't close a step because he only had n > 2 but needed 0 < n - 3. He added a footnote to the published paper.

Even the best mathematicians benefit from machine checking.

Lean

Tao on Refactoring

"We had formalized the proof with this constant 12, and then when this new paper came out, we said, 'Okay, let's update the 12 to 11.' And what you can do with Lean is that you just in your headline theorem change a 12 to 11. You run the compiler and... of the thousands of lines of code you have, 90% of them still work." — Terence Tao, Lex Fridman interview

Formal math is maintainable. This is new.

Lean

Reading vs Writing Formal Math

  • It is easier to read formal math than write it.

  • Formal math is easier to read than LaTeX.

  • You can click through, inspect definitions, check types.

  • AI can explain formal proofs.

  • AI is making the writing part easier.

  • Humans need to be able to read formal math.

Formal math in Lean

Lean

What We Learned from Mathematics

  • Machine-checkable proofs enable a new level of collaboration.

  • Auto refactoring and generalization for free.

  • It is not just about proving but also understanding complex objects and proofs, getting new insights, and navigating through structures beyond our cognitive abilities.

Auto-generalization in Mathlib

Lean

Do We Still Need Proof Automation?

"I thought AI would prove all theorems for us now."

  • Tactics are like "game moves".

  • Better tactics = shorter proofs.

  • Better tactics = more powerful AI.

  • grind is a new "game move."

40 lines replaced by one grind call

Lean

What is grind?

New proof automation (since Lean v4.22), developed by Kim Morrison and me.

A proof-automation tactic inspired by modern SMT solvers. Think of it as a virtual whiteboard:

  • Discovers new equalities, inequalities, etc.

  • Writes facts on the board and merges equivalent terms

  • Multiple engines cooperate on the same workspace

Cooperating Engines:

  • Congruence closure; E-matching; Constraint propagation; Guided case analysis

  • Satellite theory solvers (linear integer arithmetic, commutative rings, linear arithmetic, AC)

Supports dependent types, type-class system, and dependent pattern matching

Produces ordinary Lean proof terms for every fact.

Lean

Tao on grind

Tao on grind

"I have been experimenting recently with the new tactic grind in Lean, which is a powerful tool inspired by 'good old-fashioned AI' such as satisfiability modulo theories (SMT) solvers... When I apply grind to a given subgoal, it can report a success within seconds... But, importantly, when this does not work, I quickly get a grind failed message."

Lean

How Does AI Affect Proof Automation?

  • I built heuristics for Z3: quantifier instantiation, case-split ordering, proof search.

  • AI can do this better. AI replaces the heuristics.

  • The algorithmic core (congruence closure, linear arithmetic, decision procedures) is more valuable than ever.

  • grind is designed this way: strong algorithms, AI-replaceable search.

example (f : R Nat) : max 3 (4 * f ((cos x + sin x)^2)) 2 + f (2 * cos x * sin x + 1) := by grind => use [Nat.max_def, trig_identity] ring cases_next
Lean

Better Tactics = Better AI

  • When grind closes a goal in one step instead of fifty, the AI's search tree shrinks dramatically.

  • Compact proofs = better training data = better AI provers.

  • The flywheel: better automation, compact proofs, better AI, more users, more libraries, repeat.

example {α} [CommRing α] [IsCharP α 0] [NoNatZeroDivisors α] (d t d_inv : α) (Δ40 : d * (d + t + d * t) = 0) (Δ41 : d^2 * (d + d * t - 2 * d * t^2 + d * t^4 + d^2 * t^4) = 0) : d * d_inv = 1 t + 2 * t^2 - t^3 - 2 * t^4 + t^5 = 0 := by grind

Quantum Algebra example from “Categories generated by a trivalent vertex”, Morrison, Peters, and Snyder

Lean

Every AI System Used Lean

Every AI system that solved IMO problems with formal proofs used Lean.

  • AlphaProof (Google DeepMind) — silver medal, IMO 2024

  • Aristotle (Harmonic) — gold medal, IMO 2025

  • SEED Prover (ByteDance) — silver medal, IMO 2025

Three independent teams. No competing platform was used by any of them.

Lean

The AI Scoreboard

  • Axiom: valued at $1.5 billion. 12/12 on Putnam 2025.

  • DeepSeek Prover-V2: 88.9% on MiniF2F-test. Open-source, 671B parameters.

  • Harmonic: valued at $1.45 billion. Built the Aristotle AI.

None of these systems existed in early 2024.

Startups using Lean

Lean

Aristotle — Harmonic's AI for Lean

Aristotle proving Gauss-Lucas theorem in Mathlib

"The proof was found by Aristotle AI, then rewritten by me in Mathlib style."

Lean

Why Lean?

  • Lean is implemented in Lean. You can access Lean's internal data using Lean.

  • Mathlib is the training data. 200,000+ theorems. 2.2M+ lines.

  • Machine-checkable proofs = unfakeable evaluation metric.

  • Open source. No vendor lock-in.

  • Five Fields Medalists. 750+ contributors.

Lean

"I F*cking Love This Sh!t"

"I genuinely do not know of another tool that can do all of the following:" — Markus de Medeiros

  • ML-style (not insane) functional programming

  • Metaprogramming so easy you can do it just to make your life easier

  • Access to every C library ever written

  • Honest to god proofs for your pure code

  • Monadic style (not insane) imperative programming

He implemented a cryptographic protocol in Lean: OpenSSL bindings via FFI, a custom DSL via metaprogramming, and formal proofs of correctness.

All in one language. Blog post

Lean

"Vibe Proving"

So many people are choosing Lean that even casual users are proving theorems with ChatGPT. "We used ChatGPT to vibe code a Lean proof."

Erdos prize paper

"The bottleneck for using AI to create strategies and make conjectures is we have to rely on human experts and the test of time to validate whether something is plausible or not." — Terence Tao, Dwarkesh Patel interview

Lean

The Moment AI Changed Everything for Me

AI is not just changing who uses Lean. It changed how I work.

Kim Morrison (Lean FRO) showed me a major refactoring step: more than 10,000 lines changed across Mathlib, implemented by Claude Code.

Since then, AI has removed the pain of developing software.

I used to believe my superpower was that I could tolerate a lot of pain. That superpower is no longer relevant.

"Oh my gosh, how do you manage to do this?" — Jeremy Avigad, watching me analyze gigantic Lean traces to isolate a bug (2016)

AI now does this. It isolates issues, tries new ideas, runs experiments, analyzes performance, and much more.

Lean

Optimizing Lean for AI

  • AI will be Lean's main user. I seldom use the Lean tools directly anymore.

  • Humans will interact with Lean at a much higher level of abstraction.

  • Many improvements coming soon:

    • Scalability: we are optimizing every subsystem to support 100x more formalized mathematics

    • Disk-based resumption: >100x faster AI-Lean interaction

    • AI-optimized protocol for proof exploration at scale

    • grind diagnostics designed for AI consumption

    • Python interop for the AI ecosystem

Lean

Reward Hacking in RL

"It's really important with these formal proof assistants that there are no backdoors or exploits you can use to somehow get your certified proof without actually proving it, because reinforcement learning is just so good at finding these backdoors." — Terence Tao

The kernel's soundness is AI safety infrastructure.

Who Watches the Provers? | Comparator | Validating a Lean Proof

Lean

Your Program Is Already a Specification

The kernel is trustworthy. So what can you build on that trust?

An inefficient program is a specification. I wrote this:

inductive Tree (α : Type u) where | leaf | node (left : Tree α) (a : α) (right : Tree α) def Tree.map (f : α β) : Tree α Tree β | .leaf => .leaf | .node l a r => .node (map f l) (f a) (map f r)

My prompt to AI: "Implement a tail-recursive version, and prove it computes the same result."

Lean

AI's Response: Setup

def size : Tree α Nat | .leaf => 0 | .node l _ r => 1 + l.size + r.size /-- Stack frame for tail-recursive tree map. -/ inductive Frame (α : Type u) (β : Type v) where | visit (val : β) (right : Tree α) | done (left : Tree β) (val : β) def stackWeight : List (Frame α β) Nat | [] => 0 | .visit _ r :: s => 4 * r.size + 3 + stackWeight s | .done _ _ :: s => 1 + stackWeight s

stackWeight is the termination measure. It assigns a weight to each stack frame so that every transition (go-to-go, go-to-back, back-to-go, back-to-back) strictly decreases. visit frames are heavier because they still carry unprocessed subtrees.

Lean

AI's Response: Tail-Recursive Implementation

mutual def Tree.mapGo (f : α β) (stack : List (Tree.Frame α β)) (t : Tree α) : Tree β := match t with | .node l a r => mapGo f (.visit (f a) r :: stack) l | .leaf => mapBack f stack .leaf termination_by 4 * t.size + stackWeight stack + 1 decreasing_by all_goals simp only [stackWeight, size]; omega def Tree.mapBack (f : α β) (stack : List (Tree.Frame α β)) (result : Tree β) : Tree β := match stack with | .visit a r :: rest => mapGo f (.done result a :: rest) r | .done l a :: rest => mapBack f rest (.node l a result) | [] => result termination_by stackWeight stack decreasing_by all_goals simp only [stackWeight]; omega end

Two mutually recursive functions with an explicit stack. A termination measure. omega closes all obligations.

Lean

The Equivalence Theorem

theorem Tree.mapTR_eq_map (f : α β) (t : Tree α) : Tree.mapTR f t = Tree.map f t := by simp only [Tree.mapTR, Tree.mapGo_spec, Tree.applyStack]

The key idea: applyStack interprets the stack as a context. The mutual lemmas show both functions respect this interpretation. The main theorem follows in one line.

I wrote 3 lines. AI wrote 70 lines of optimized code and correctness proof. No supervision. The simple version was the specification. Full source

Lean

AI Is Rewriting Critical Software

  • Google: 30%+ of new code is AI-generated. Microsoft: 25-30%. Meta: targeting 50%.

  • 45% of AI-generated code fails basic security tests (Veracode, 100+ models). Newer, larger models do not help.

  • "I 'Accept All' always, I don't read the diffs anymore." — Andrej Karpathy

The same AI that creates this problem can generate verified proofs. Both sides run on Lean.

Lean

zlib in Lean

AI converted zlib (a C compression library) to Lean.

The Lean version passed the test suite, and then proved that the code is correct.

theorem zlib_decompressSingle_compress (data : ByteArray) (level : UInt8)
    (hsize : data.size < 1024 * 1024 * 1024) :
    ZlibDecode.decompressSingle (ZlibEncode.compress data level) = .ok data

Decompression always recovers the original data. Every compression level. Machine-checked.

The barrier is no longer AI capability. It is platform readiness.

When AI Writes the World's Software, Who Verifies It?

Lean

Radix: 10 AI Agents, One Weekend

10 AI agents built a verified embedded DSL with proved optimizations in a single weekend.

  • Zero sorry — 52 theorems, all complete

  • 5 verified compiler optimizations

  • Determinism, type safety, memory safety — all proved

  • Interpreter correctness — sound and complete

  • 27 modules, ~7,400 lines of Lean

  • I did not touch the code. Zero lines. Full agent autonomy.

Lean

Radix — A Real Program

def bubbleSort := `[RStmt| let arr := new uint64[][10]; arr[0] := 5; arr[1] := 3; arr[2] := 8; arr[3] := 1; arr[4] := 9; arr[5] := 2; arr[6] := 7; arr[7] := 4; arr[8] := 6; arr[9] := 0; i := 0; while (i < 9) { j := 0; while (j < 9 - i) { let a : uint64 = arr[j]; let b : uint64 = arr[j + 1]; if (a > b) { arr[j] := b; arr[j + 1] := a; } j := j + 1; } i := i + 1; } first := arr[0]; last := arr[9]; ] #eval! run bubbleSort "first" #eval! run bubbleSort "last"
Lean

Radix — Big-Step Semantics — 16 Rules

-- Big-step: ⟨σ, s⟩ ⇓ r -- 16 rules: skip, assign, decl, seqNormal, seqReturn, -- ifTrue, ifFalse, whileTrue, whileReturn, whileFalse, -- alloc, free, arrSet, ret, block, callStmt, scope example : BigStep σ .skip (.normal σ) := BigStep.skip example (he : e.eval σ = some v) (hs : σ.setVar x v = some σ') : BigStep σ (.assign x e) (.normal σ') := BigStep.assign he hs example (h₁ : BigStep σ₁ s₁ (.normal σ₂)) (h₂ : BigStep σ₂ s₂ r) : BigStep σ₁ (s₁ ;; s₂) r := BigStep.seqNormal h₁ h₂ example (hc : e.eval σ₁ = some (.bool true)) (hb : BigStep σ₁ b (.normal σ₂)) (hw : BigStep σ₂ (.while e b) r) : BigStep σ₁ (.while e b) r := BigStep.whileTrue hc hb hw
Lean

Radix — Determinism — grind Does The Work

-- Same state + same statement → same result -- 80-line proof by induction on h₁ -- grind closes equational contradictions example (h₁ : BigStep σ s r₁) (h₂ : BigStep σ s r₂) : r₁ = r₂ := BigStep.det h₁ h₂

In the real proof: cases h₂ with | assign => grind handles equational cases. grind decides contradictions like some true = some false. Explicit IH application for recursive cases (seq, while, call, scope).

Lean

Radix — Interpreter Correctness — Sound and Complete

A fuel-based interpreter (Stmt.interp) proved equivalent to the relational semantics in both directions:

-- Completeness: BigStep implies interp succeeds example (h : BigStep σ s r) : fuel, s.interp fuel σ = (.ok r.retVal, r.state) := Stmt.interp_complete h -- Soundness: interp success implies BigStep example (h : s.interp fuel σ = (.ok rv, σ')) : BigStep σ s (toStmtResult rv σ') := Stmt.interp_sound h

The relational semantics is the spec. The interpreter is the implementation. These two theorems say they agree exactly.

Lean

Radix — Verified Optimizations

example (h : BigStep σ s r) : BigStep σ s.constFold r := Stmt.constFold_correct h example (h : BigStep σ s r) : BigStep σ s.deadCodeElim r := Stmt.deadCodeElim_correct h example (h : BigStep σ s r) : BigStep σ s.copyPropagation r := Stmt.copyProp_correct h

Each pass: Stmt → Stmt with a correctness theorem. "If the original runs, the optimized runs too."

Lean

Radix — Lessons Learned

Unlike most human users, Claude reads the documentation. One stuck agent? I said "Read the grind docs." It downloaded, understood, unblocked.

Examples are powerful: agents found our DSL examples and adopted the same pattern.

Our AI agents are better Lean users than our human users.

  • More precise: structured error context, not "it doesn't work"

  • More systematic: they hit every edge case, not just the one they needed

  • More actionable: "tactic X fails on pattern Y because Z"

Lean

Cedar (AWS)

"We've found Lean to be a great tool for verified software development." — Emina Torlak

  • Proved validator soundness in 18 person-days.

Cedar QED comparison

Lean

SymCrypt (Microsoft)

SymCrypt verification with Aeneas

  • Core cryptographic library. Verified using Aeneas + Lean.

  • They verify the Rust code as written by software engineers.

  • Code is evolving (new optimizations for specific hardware). They must adapt to rewrites.

Lean

Veil: Verified Distributed Protocols

  • Built on Lean by Ilya Sergey's group.

  • Combines model checking with full formal proof.

  • Found an inconsistency in a prior formal verification of Rabia consensus that went undetected across two separate tools.

  • Even the verifiers can be wrong. Lean caught it.

Veil code

Verifying Distributed Protocols in Veil

Lean

The Bottleneck Is Shifting

  • AI will write proofs. AI will write code. AI will even draft specifications from natural language.

  • But reading the formal specification and saying "yes, this is what I meant" remains the human job.

  • The formal statement is the trust anchor.

  • Designing the right abstractions, choosing the right interfaces: that's the human contribution.

Proof Assistants in the Age of AI

Lean

The Lean FRO

Nonprofit. 20 engineers. Open source. Controlled by no single company.

Sebastian Ullrich and I launched it in August 2023.

28 releases and 8,000+ pull requests merged since its launch.

"Lean is not a research project anymore. We run the Lean FRO as a startup."

Public roadmaps | Our team

ACM SIGPLAN Award 2025: "Lean has become the de facto choice for AI-based systems of mathematical reasoning."

The Lean project was featured in: NY Times, Quanta, Scientific American, Wired, etc.

Lean

The Ecosystem

  • 8,000+ GitHub repos.

  • 200,000+ VS Code installs.

  • Mathlib Initiative

  • CSLib: a Mathlib for computer science (Stanford, Amazon, Google DeepMind).

  • PhysLean: Digitalizing Physics in Lean 4

  • Companies pivoting math-proven AIs toward program verification:

    • spec auto-formalization + program synthesis + verification as a service.

Lean

Verso: Written in Lean. Checked by Lean.

  • Future math papers will have Lean code that is checked.

  • Lecture notes, papers, and slides — all in one tool.

  • These slides are written in Verso.

  • lean-lang.org is written in Verso.

  • leodemoura.github.io is written in Verso.

Verso: source and rendered output

Lean

Reservoir

Reservoir: the registry for all Lean packages

Lean

Conclusion

I opened this talk with something unexpected: a general-purpose AI converted a production C library to Lean and proved the Lean version correct. A year ago, no one thought that was possible yet.

“It's quite possible at the high school level... that you could get involved in a math project and actually make a real contribution because of all these AI tools, Lean, and everything else.” — Terence Tao, Dwarkesh Patel interview

Formalization isn't just about proving. It's about understanding, collaborating, and building trust at a scale beyond our cognitive abilities.

The entire discipline thrives when no one has to “take it on faith.”

Thank You

Leo de Moura
lean-lang.org
leanprover.zulipchat.com

X: @leanprover
LinkedIn: Lean FRO
Mastodon: @leanprover@functional.cafe
#leanlang #leanprover

leodemoura.github.io
University of Minnesota | March 30, 2026