Lean: A Unified Platform for Verification, Programming, and AI

Leo de Moura
Senior Principal Applied Scientist, AWS
Chief Architect, Lean FRO

ETAPS Industry Day | April 15, 2026

Monday recap

On Monday: What Lean is and how it works.

Today: Lean applications and who uses Lean

For anyone who missed Monday:

Programming language and proof assistant. Same language for code and proofs.
Small trusted kernel. Multiple independent kernels.
Lean is implemented in Lean. Very extensible.

AWS and Lean

AWS uses Lean across multiple domains:

Cedar — verified authorization policy language
SampCert — verified differential privacy
Strata — language syntax and semantics
Waimea — tensor compiler for Trainium

Cedar: Authorization Policy Language

Cedar: open-source authorization policy language.

Cedar spec: model written in Lean.

Cedar website

Cedar: Verified Components

Used by AWS Verified Permissions and AWS Verified Access. Policies separate from application logic.

Three examples of verified components: evaluator, authorizer, validator.

forbid_trumps_permit: if any forbid policy is satisfied, the request is denied
allowed_only_if_explicitly_permitted: a request is allowed only if at least one permit policy is satisfied
typecheck_is_sound: if the validator accepts a policy, evaluation produces a boolean

Cedar: Verification Queries

Verification queries proved sound and complete against a formalized symbolic compiler. SMT encoder and solver are trusted.

verifyNeverErrors — policy won't error
verifyAlwaysAllows, verifyAlwaysDenies
verifyImplies — one policy set implies another
sound_policy_slice — authorization result is the same for the full set or the relevant slice

Originally built with Dafny, migrated to Lean.

"We've found Lean to be a great tool for verified software development." — Emina Torlak

Cedar: The Engineering Story

Executable Lean model alongside Rust production code. Lean model ~10x smaller than Rust.

Cedar differential random testing: random inputs fed to both Lean model and Rust production, outputs compared

~100M differential random tests nightly. Lean: 5 μs/test. Rust: 7 μs/test.
Release gate: no Cedar version ships unless model, proofs, and differential tests are current.

cedar-spec | lean-lang.org/use-cases/cedar

SampCert: Verified Differential Privacy

Differential privacy: the gold standard for privacy-preserving data analysis.

The problem: implementing DP correctly is hard. Bugs in random number generation and floating-point approximations have destroyed privacy guarantees in real systems. Mironov's attack exploited floating-point to break DP.

SampCert (led by Jean-Baptiste Tristan): first comprehensive, mechanized foundation for executable DP implementations.

12,000+ lines of Lean proof
Discrete Laplace and Gaussian sampling algorithms, proved correct
Generic DP foundation: pure DP, zero-concentrated DP, Renyi DP
Composition theorems verified generically

SampCert: What's Verified

Discrete Laplace sampler: proved to produce samples following the exact PMF
Discrete Gaussian sampler: proved correct using Fourier analysis and the Poisson summation formula from Mathlib
Composition: composing mechanisms with budgets γ₁ and γ₂ yields budget γ₁ + γ₂
Postprocessing: functions that don't access the secret database don't degrade privacy

Key innovation: a probability monad (SLang) embedded in Lean.

Heavy use of Mathlib: Fourier analysis, Poisson summation, measure theory, number theory, topology.

SampCert: Deployed at AWS

Custom code extractor: the deployed artifact does not go through the Lean compiler. TCB includes the extractor, not the Lean compiler.

Performance: SampCert outperforms diffprivlib by >2x for discrete Gaussians.

Deployed in AWS Clean Rooms Differential Privacy, a production AWS service.

PLDI 2025: "Verified Foundations for Differential Privacy."

SampCert uses the Poisson summation formula from Mathlib to prove a production privacy service correct.

SampCert | PLDI paper

Strata: Language Syntax and Semantics

Strata: extensible platform for formalizing language syntax and semantics. Built in Lean. Open source.

Key idea: dialects (inspired by MLIR). Orthogonal, composable building blocks for modeling programming constructs. Define a dialect, get AST, parser, pretty printer, type checker.

Core, C_Simp, Laurel, SMTLib, Python, Boole
Laurel pipeline: Python/Java/JavaScript → Laurel → Strata Core → VCG → SMT
Dialect Definition Mechanism (DDM): embedded DSL in Lean for defining syntax and typing rules

Strata: Why Lean

Lean's role:

Extensibility via metaprogramming (DDM is an embedded DSL)
Verification of Strata's own implementation
Long-term: using Lean to reason about programs written in Strata dialects

Monday's MiniRadix was a toy version of this idea. Strata is the industrial-scale version.

Boogie/Dafny-style verification infrastructure, built in Lean and extensible via Lean's metaprogramming.

strata-org/Strata

Waimea: Verified Compiler for Trainium

CompCert-style verified compiler for AWS Trainium, Amazon's custom AI accelerator chip. ~200,000 lines of Lean code.

The ISA specification changes several times per week. Not a fixed ISA like x86.
Lean provides an up-to-date simulator that hardware and verification engineers share.
Lean + Claude keeps the team in sync with hardware changes.
Several hundred instructions per generation. AI is essential for keeping up.

Long-term goal: end-to-end compiler verification. Source program compiles to Trainium machine code, with a proof that the compilation preserves semantics.

Waimea: Why It Matters

A single miscompilation on a training run that costs millions of dollars in compute is catastrophic.

Also used Claude Code to port the Flocq floating-point library from Rocq to Lean autonomously, building the numeric foundations needed for hardware verification.

Anneal: Verified Unsafe Rust at Google

Google's zerocopy team building "literate verification" for unsafe Rust using Lean and Aeneas.

Unsafe Rust accounts for ~2/3 of vulnerabilities in the Rust ecosystem. Safety invariants documented only in prose comments.

Anneal: developers write Lean specifications directly in Rust doc comments. cargo anneal verify runs the pipeline: Charon → Aeneas → Lean.

Quick Aeneas recap (Monday covered SymCrypt (Microsoft) which also uses Aeneas): Aeneas (Son Ho, Microsoft) translates safe Rust to pure functional Lean code. The bridge that makes Rust verification in Lean practical.

Anneal: AI-Ready Verification

Specs and proofs live in Rust doc comments alongside the code.

spec blocks for safe code
unsafe(axiom) blocks for opaque unsafe operations with pre/post conditions
Type invariants for struct safety properties

Designed for AI agents: Anneal demonstrated that their AI (Antigravity) can author unsafe Rust code and prove its soundness.

zerocopy/anneal | design doc

Veil: Verified Distributed Protocols

Veil: verification of distributed protocols. Built on Lean as a library using Lean metaprogramming.

Push-button verification via SMT (cvc5/Z3) for decidable fragments
Full interactive proofs in Lean when automation falls short
Seamless transition between the two modes
Foundational: Veil's VCG is proven sound w.r.t. the specification language semantics.
16 distributed protocol case studies. All 16 verified. Ivy failed on 2. 87.5% verified in under 15 seconds.

Veil: ring leader election

Veil: Beyond EPR

Verified Paxos, Suzuki-Kasami, Reliable Broadcast outside the decidable EPR fragment. This is where Ivy's automation breaks down.

The Rabia finding: while porting Rabia from Ivy + Rocq to Veil, they spotted an invariant discrepancy that went undetected across two separate tools. One unified framework caught what two couldn't.

Loom (POPL 2026): the foundational layer underneath Veil. Correct-by-construction verifiers from monad transformer algebras.

verse-lab/veil | CAV paper

One Platform, Many Domains

Cedar is access control. SampCert is privacy. Strata is languages. Waimea is compiler. SymCrypt is Crypto. Anneal is unsafe Rust. Veil is distributed protocols.

AWS, Google, Microsoft, NUS, Nethermind, Galois use Lean.

Independent third-party adoptions:

Nethermind: verified INTMAX2 zkRollup (5,000 lines of Lean), CertiPlonk, Halo2, EVM + Yul
Galois: verified Jolt zkVM. Custom tactic saved 6,800+ lines of proof. One XOR proof: 213 lines → one automated line.

AI + Lean

Companies are being built on this: Harmonic, Axiom, Math Inc, Logical Intelligence, Google DeepMind, ByteDance teams.

Many job opportunities.

Startups using Lean

Harmonic Aristotle: IMO Gold

Harmonic: a startup building AI systems that reason formally.

IMO 2025 Gold Medal: Aristotle solved 5 out of 6 problems with formal Lean proofs. First AI system to achieve gold with formal proofs.

Architecture:

Lean proof search: MCTS with a learned value function. RL-trained on Lean proof search.
Informal reasoning: generates conjectures and lemmas in natural language, then formalizes them as Lean statements.
Geometry solver: dedicated symbolic solver. Open-sourced as Yuclid.

Training: RL on Lean proof search. Test-time training.

Aristotle: Beyond Competition Math

Aristotle made contributions to Mathlib. Category theory, homological algebra.

Verina benchmark: 96.8% resolution rate (160/189 Lean specifications proved). Found specification bugs in the benchmarks.

#1 on ProofBench by 15% margin.

Fully agentic: Aristotle works directly in Lean projects. It reads project files, understands dependencies, writes proofs that integrate into existing code.

Lean serves as the training environment, the evaluation harness, and the deployment target.

harmonic.fun | aristotle.harmonic.fun

Move Borrow Checker: The Problem

Ilya Sergey (VERSE lab) formalized and verified Move's borrow checker in Lean. 39,000 lines of mechanized metatheory in 27 working days, using Claude as an AI coding assistant.

Move: smart contract language for the Sui and Aptos blockchains. Rust-like ownership discipline. References are access paths rooted in local variables. No lifetime annotations.

The borrow checker: tracks reachability between references using regular expressions over field paths.

Move Borrow Checker: What Was Proved

Formalized in Lean:

Relational type system encoding
Executable algorithmic type checker
Soundness proof: the algorithmic checker is sound w.r.t. the relational rules
Runtime semantics as a definitional interpreter
Type soundness: progress + preservation. Any function accepted by the type checker never reaches a preventable error across all execution paths.

39,000 non-blank lines. 267 commits. Zero axioms, zero sorry. 153 preservation lemmas across 23 statement forms.

Move Borrow Checker: AI's Role

Claude (Opus 4.5/4.6) handled proof repair, pattern propagation, routine preservation cases.

Human insight still needed for novel proof strategies (e.g., weakening lemmas, inter-procedural reasoning for function calls took two days).

27 working days. Estimated 5-6 months without AI. Second largest PL metatheory mechanization in Lean, and very likely the largest accomplished predominantly using AI.

Blog post

Your Program Is Already a Specification

"I am running out of things to Prove." — Son Ho, SymCrypt, Microsoft

In the AI era, writing specifications is a critical skill.

A naive program is a specification.

inductive Tree (α : Type u) where
  | leaf
  | node (left : Tree α) (a : α) (right : Tree α)

def Tree.map (f : α → β) : Tree α → Tree β
  | .leaf => .leaf
  | .node l a r => .node (map f l) (f a) (map f r)

My prompt to AI: "Implement a tail-recursive version, and prove it computes the same result."

AI's Response: Tail-Recursive Implementation

mutual
  def Tree.mapGo (f : α → β) (stack : List (Tree.Frame α β))
      (t : Tree α) : Tree β :=
    match t with
    | .node l a r => mapGo f (.visit (f a) r :: stack) l
    | .leaf => mapBack f stack .leaf
  termination_by 4 * t.size + stackWeight stack + 1
  decreasing_by all_goals simp only [stackWeight, size]α:Type u_1β:Type u_2stack:List (Frame α β)⊢ stackWeight stack < 4 * 0 + stackWeight stack + 1; omegaAll goals completed! 🐙

  def Tree.mapBack (f : α → β) (stack : List (Tree.Frame α β))
      (result : Tree β) : Tree β :=
    match stack with
    | .visit a r :: rest => mapGo f (.done result a :: rest) r
    | .done l a :: rest => mapBack f rest (.node l a result)
    | [] => result
  termination_by stackWeight stack
  decreasing_by all_goals simp only [stackWeight]α:Type u_1β:Type u_2result:Tree βl:Tree βa:βrest:List (Frame α β)⊢ stackWeight rest < 1 + stackWeight rest; omegaAll goals completed! 🐙
end

Two mutually recursive functions with an explicit stack. A termination measure. omega closes all obligations.

The Equivalence Theorem

theorem Tree.mapTR_eq_map (f : α → β) (t : Tree α) :
    Tree.mapTR f t = Tree.map f t := byα:Type u_1β:Type u_2f:α → βt:Tree α⊢ mapTR f t = map f t
  simp only [Tree.mapTR, Tree.mapGo_spec, Tree.applyStack]All goals completed! 🐙

I wrote 6 lines. AI wrote 70 lines of optimized code and correctness proof, with no supervision. The simple version was the specification.

On Trust

What do I have to trust in order to trust a Lean proof?

Trust is layered. Each layer reduces what you have to believe.

Everyday: blue check + #print axioms
Replay: lean4checker re-runs the kernel on .olean output
Gold standard: comparator in a sandbox with multiple independent kernels

Validating a Lean Proof

Everyday Trust: Axioms and the Blue Check

Blue checkmark: the elaborator accepted the proof.

A double blue check mark

#print axioms thmName: list the axioms the theorem transitively depends on.

Axioms in stdlib: propext, Classical.choice, Quot.sound.
sorryAx → incomplete proof.
Custom axioms → soundness depends on them.
Lean.trustCompiler → native evaluation in the TCB (deprecated since 4.29).

Re-checking Proofs: `lean4checker`

lean4checker replays declarations from .olean files through the kernel in a fresh context.

Catches bugs in meta-programs that intentionally bypass the kernel or corrupt memory.

Caveat: it trusts that .olean files are well-formed. A malicious library cannot be validated this way.

Gold Standard: `comparator` + External Checkers

Comparator

Builds the proof in a sandbox
Exports it in a format validated by multiple checkers
Replays through Lean's kernel and independent implementations (e.g., nanoda in Rust)
Curated formal statements of known conjectures (e.g., Formal Conjectures) provide a human-verified reference point.
External checkers implement their own pretty printers independently of Lean.

Implementation bugs in any single checker are caught by disagreement with another.

Lean Kernel Arena hosts additional external checkers for manual verification.

Coming soon: Comparator at live.lean-lang.org

On Trust: Lean as a Programming Language

If you are using Lean as a programming language, you are also trusting:

Our compiler, implemented in Lean
Our runtime, implemented in C/C++
LLVM
Any other code external to Lean that is linked in

CSLib

CSLib aims to be a foundation for teaching, research, and new verification efforts, including AI-assisted.

A Foundation for Computer Science in Lean

CSLib: adding to CS foundations

Steering committee: Clark Barrett, Swarat Chaudhuri, Jim Grundy, Pushmeet Kohli, Leo de Moura, Fabrizio Montesi.

"Lean Rocks": A Student's Story

A student with no formal cryptography training used Lean to implement point-and-permute garbled circuits (MPC).

Circuit DSL via metaprogramming, OpenSSL FFI bindings, monadic imperative evaluation, and formal proofs of encode/decode correctness.

"I genuinely do not know of another tool that can do all of the following: ML-style functional programming, metaprogramming so easy you can do it just to make your life easier, access to every C library ever written, honest to god proofs for your pure code, monadic style imperative programming."

"This was so easy I could actually use it to understand something! I was assisted! By my proof assistant!" — Markus de Medeiros

markusde.ca/pages/leanrocks.html

Why Lean?

Kernel trust: RL can't hack it. Multiple independent kernels.
Mathlib as training data: 200,000+ theorems, 2.2M+ lines.
Extensibility: tactics, metaprogramming, embedded DSLs.
Community: 8,000+ GitHub repos, 200,000+ VS Code + Open VSX installs.

ACM SIGPLAN Award 2025: "Lean has become the de facto choice for AI-based systems of mathematical reasoning."

Why Lean?

How to Get Started

Concrete entry points:

Summary

Lean FRO is a nonprofit. 20 engineers. Not controlled by any single company.

One platform: verification, programming, AI.

Deployed across multiple domains.

The Making of Lean

lean-lang.org | lean-lang.org/fro

Thank You

Leo de Moura
lean-lang.org
leanprover.zulipchat.com

leodemoura.github.io

ETAPS Industry Day | April 15, 2026

Lean: A Unified Platform for Verification, Programming, and AI

Monday recap

AWS and Lean

Cedar: Authorization Policy Language

Cedar: Verified Components

Cedar: Verification Queries

Cedar: The Engineering Story

SampCert: Verified Differential Privacy

SampCert: What's Verified

SampCert: Deployed at AWS

Strata: Language Syntax and Semantics

Strata: Why Lean

Waimea: Verified Compiler for Trainium

Waimea: Why It Matters

Anneal: Verified Unsafe Rust at Google

Anneal: AI-Ready Verification

Veil: Verified Distributed Protocols

Veil: Beyond EPR

One Platform, Many Domains

AI + Lean

Harmonic Aristotle: IMO Gold

Aristotle: Beyond Competition Math

Move Borrow Checker: The Problem

Move Borrow Checker: What Was Proved

Move Borrow Checker: AI's Role

Your Program Is Already a Specification

AI's Response: Tail-Recursive Implementation

The Equivalence Theorem

On Trust

Everyday Trust: Axioms and the Blue Check

Re-checking Proofs: lean4checker

Gold Standard: comparator + External Checkers

On Trust: Lean as a Programming Language

CSLib

"Lean Rocks": A Student's Story

Why Lean?

How to Get Started

Summary

Thank You

Re-checking Proofs: `lean4checker`

Gold Standard: `comparator` + External Checkers