Lean Together 2025

Lean together, Jan 2025

Real World (Pragmatic)

Autoformalization

Siddhartha Gadgil

Indian Institute of Science, Bangalore

https://github.com/siddhartha-gadgil/LeanAide

Autoformalization is the translation of natural language mathematics to Lean code.
This can help with formalization:
- Interactively, helping with syntax and Mathlib terminology (having a search component).
- By producing a draft translation (in the spirit of the Mathlib port), with LLMs also adding details.
Autoformalization also provides a bridge between the natural language reasoning of foundation models and Lean.
Lean helps both with ensuring correctness and organizing mathematics in a modular fashion.

AI/ML is deeply integrated with Lean in LeanAide.
This helps with:
- Better Autoformalization.
- Usability in Lean.
Tools and workflows are a central focus for us, and we make tools available at all stages.
We hope to iteratively improve and extend capabilities focusing on MWEs.
Ideally, Lean and AI can complement each other in tools for mathematicians without expertise in either.
We see our first demo (note that the responses are very fast because of caching).

LeanAide: Prompting

Suppose we are given the statement of a theorem.
We find (say) 20 documentation strings in Lean's mathematical library Mathlib that are closest to the statement (for OpenAI embeddings) and include them in the prompt as examples.
We query GPT-4 with the prompt for (say) 10 responses.

LeanAide: post-processing

We discard responses that are not valid Lean code (do not parse or do not elaborate).
However, the correct response may be preceded by some text or enclosed in a code block (in spite of instructions). So we attempt to parse and elaborate for each group of consecutive lines.
We also do some mild autocorrections, mainly for convention issues like Prime versus IsPrime.
We also use majority voting, grouping together responses that can be proved equivalent by the Aesop tactic.

If a vector space is $2$-dimensional, then it is finite dimensional.


								[{"role": "system",
								"content": "You are a coding assistant who translates from natural language to Lean Theorem Prover code following examples. Follow EXACTLY the examples given."},
								{"role": "user",
								"content": "If a vector space has a finite basis, then it is finite-dimensional. "},
								{"role": "assistant",
								"content": "∀ {K : Type u} {V : Type v} [inst : DivisionRing K] [inst_1 :
								AddCommGroup V] [inst_2 : Module K V] {ι : Type w}\n [inst_3 : Finite ι], Basis ι
								K V → FiniteDimensional K V"},
								...
								{"role": "user",
								"content": "If a vector space has dimension ‘2‘ then it is finite dimensional."}]

Translation to Lean: If a vector space is $2$-dimensional, then it is finite dimensional.


							theorem: ∀ {K : Type u} {V : Type v} 
							[inst : DivisionRing K] [inst_1 : AddCommGroup V] 
							[inst_2 : Module K V],
								Module.rank K V = 2 → FiniteDimensional K V

This was one of our $40$ test silly statements (to avoid contamination).
We also had $40$ theorems and $40$ false statements.
On the test set of 120 example problems, we had a success rate of about 85%.
However, on the Putnam bench the success rate is about 50%.

Informalization and Data Augmentation

Informalization is describing a formal theorem, proof or some code in natural language.
We informalized all Lean theorems that either occured in more than 3 proofs or had docstrings.
To do this, we included in the prompt all the definitions in the statement of the theorem.
These have also been incorporated in autoformalization in LeanAide, along with the APIs from LeanSearch and Moogle.

Better translation?

We perform roundtrips and use a reasoning LLM to compare statements to both evaluate and filter.
Many failed translations were because of missing definitions, which could be fixed by:
- Getting a language model to generate definitions.
- Translating these and including in the prompt.
Even when the definition is present in the statement or in Mathlib, it helps to include it in the prompt.
Rewording the statements avoiding concise phrases like greatest odd divisor.
Meta-Fix: e.g. exists-unique with multiple variables.

Larger-scale Formalization

and

Natural language reaoning

Iterative formalization


					flowchart LR
					  P[(Problem)] --> reasoning[Reasoning by LLM]
					  reasoning --> pf[(Theorem and Proof)];
					  L[(Literature)] --> pf;
					  ans[(Notes/Answers)] --> ocr[OCR and formatting];
					  ocr --> pf;
					  pf --> |LLM|struct[(Structured JSON text)];
					  struct --> form[Autoformalization];
					  form --> lean[(Lean code)];
					  lean --> elab{Elaborate};
					  elab --> |Success|done((Done));
					  elab --> |Error|error((Fix error));
					  elab --> |Sorries|sorry((Fix Sorry));
					  sorry --> |Gap|P;
					  error --> |Details|pf;
					  error --> |Try again|form;

We can start with a problem statement, a theorem in the literature or handwritten notes/answers.
We generate a text, then structured JSON by LLM.
This is translated to Lean programmatically and elaborated, with errors used to iterate.

Lean code via Structured Proofs

A language model is used to translate text to a JSON format with detailed instructions, with each object having a type field as "definition", "theorem", "assumption", "assertion" etc.
We translate theorems and assertions into statements including their context built from assumptions, lets etc and purge duplicates.
Assertions within a proof become have tactics.
Proofs end with (or consist of) a call to the Aesop search tactic with appropriate configuration; including "known results" and fallback to sorry.

My work has depended on:
- Collaborators: Anand Rao Tadipatri, Ayush Agrawal, Navin Goyal, Ashvni Narayanan (phase 1); Anirudh Gupta, Vaishnavi Shirsath, Ajay Nair, Malhar Patel, Sushrut Jog (phase 3).
- Compute/Models: Microsoft's Accelerating Foundation Model Research program; Google's Research credits; DST-FIST funding.

Hopes and Plans

Fix code generation.
Improve translation at sentence and proof level.
Handle issues regarding toolchains and embeddings to make using LeanAide straightforward.
Tasks and prompts to autoformalize at a larger scale by decomposing, summarizing etc.
Add commands for a Lean project to be better suited for AI and software assisted mathematical discovery.