Official repository for the paper
MathlibLemma: Folklore Lemma Generation and Benchmark for Formal Mathematics
Xinyu Liu, Zixuan Xie, Amir Moeini, Claire Chen, Shuze Daniel Liu, Yu Meng, Aidong Zhang, Shangtong Zhang
This repository contains MathlibLemma, a dataset of synthesized "folklore" lemmas that bridge the gap between high-level intuition and formal verification in Lean 4. It serves both as a library of verified theorems and a benchmark for evaluating the formal reasoning capabilities of LLMs.
The repository is organized into two main components:
benchmark/: Contains the 4,028 type-checked statements (without proofs). These serve as the test set for auto-formalization and theorem proving tasks.lemma/: Contains the 1,812 formally verified theorems (with complete proofs generated by our pipeline).
Each component is categorized into three domains based on the complexity and abstraction level of the mathematics:
| Directory | Description |
|---|---|
foundational/ |
This domain covers the “bread and butter” of formalized mathematics: real analysis, discrete structures (e.g., sets, matrices), basic probability, and others. |
applied/ |
This domain includes domains such as advanced probability (martingales), information theory, and convex analysis. |
abstract/ |
This domain spans abstract fields including category theory, algebraic topology, and differential geometry. |
This is a standard Lean 4 project. To build the project and verify the lemmas:
- Install Lean 4: Follow the instructions at leanprover-community/lean4.
- Clone the repository:
git clone https://github.com/Sequential-Intelligence-Lab/MathlibLemma.git cd MathlibLemma - Get dependencies:
lake exe cache get
- Build:
lake build
A key contribution of this project is the discovery of missing lemmas that have been accepted into the official Mathlib library. Below are the representative merged Pull Requests:
-
- Adds
gronwallBound_mono:gronwallBoundis monotone non-decreasing in timex(given non-negative parametersδ,ε,K).
- Adds
-
- Adds
Kernel.restrict_const: restricting a constant kernel to a measurable set commutes with restricting the underlying measure.
- Adds
-
- Adds
centralMoment_congr_ae: central moments agree for a.e.-equal random variables.
- Adds
Contributed by Sequential Intelligence Lab (SIL), University of Virginia.
If you find this repository useful, please cite our paper.
@article{liu2026mathliblemma,
title={MathlibLemma: Folklore Lemma Generation and Benchmark for Formal Mathematics},
author={Xinyu Liu and Zixuan Xie and Amir Moeini and Claire Chen and Shuze Daniel Liu and Yu Meng and Aidong Zhang and Shangtong Zhang},
year={2026},
journal={arXiv preprint arXiv:2602.02561}
}This project is licensed under the Apache License 2.0; see LICENSE.