Futures of Mathematics Institute

An independent research institute dedicated to the discovery, formalization, and radical extension of mathematical knowledge — through the collaboration of human and artificial intelligence, and through the convening of the global mathematical community.

Purpose and goals

FMI was founded around a thesis about where formal reasoning is heading. Language models on their own produce fluent output without enforcing consistency; the resulting hallucinations are a property of the substrate, not a fixable defect. Paired with proof assistants — Lean, Coq, Agda — a different computational medium emerges, in which an assertion must pass type-checking and proof-checking before it counts as having been said. Validity is no longer evaluated after generation; it is a precondition of generation.

Louis Herlands set out this thesis in The Crystalline Phase (October 2025), the position paper that frames the institute's intellectual program. The thesis applies first to mathematics, where formalization of contemporary work has reached scale, but it is not confined to mathematics. High-assurance software, scientific inference, and any field that depends on logical derivation are subject to the same shift. FMI is organized around the institutional, infrastructural, and educational adaptations that this shift requires.

The present moment is best understood as an interregnum. The destination is not in serious dispute, and many of the field's most senior mathematicians are publicly supportive; their day-to-day practice is slower to follow. Hand formalization in Lean is widely treated as a net positive for individual mathematical understanding — the discipline of forcing implicit assumptions into the open is itself clarifying. Autoformalization is more contested. Long autoformalized proofs have been observed to introduce unstated axioms in place of unproven steps, and to certify nearby statements that are not the lemma a human submitted. A formal certificate is not nothing; without careful human-readable audit, it is not yet enough to warrant the trust on which a working mathematician's reputation depends.

FMI's primary objectives and purposes are:

to advance and openly share mathematical knowledge
to support formal methods and human–AI collaboration in mathematical research
to convene the international mathematical community across regions
to support early-career mathematicians, particularly from regions historically underrepresented at the frontier of the field
to publish openly the recordings, transcripts, and proceedings of FMI programs

FAIR for Formal Mathematics

FMI's principal program is FAIR for Formal Mathematics (F4FM), a community-led standards effort to apply the FAIR Data Principles — Findable, Accessible, Interoperable, Reusable — to the outputs of formalized mathematics. The principles were originally articulated for scientific data management (Wilkinson et al., 2016) and are now standard across most quantitative disciplines. Formal mathematics has not adopted them in a coordinated way.

The discipline is at a scaling threshold. Mathlib has crossed two million lines of Lean. Major projects — the Liquid Tensor Experiment, the Polynomial Freiman–Ruzsa conjecture, sphere packing, Fermat's Last Theorem — now ship with blueprints, large contributor counts, and increasing AI assistance. Infrastructure for attribution, provenance, and reuse has not scaled with the mathematics. There are no persistent identifiers at the theorem level. Refactors erase contribution chains. AI-assisted proofs carry no machine-readable record of which agent generated which step. Cross-project provenance — from paper to blueprint to Lean source to upstreamed Mathlib namespace — exists only in human memory.

Deliverables

F4FM is organized around one concrete deliverable per FAIR pillar:

Findable. Theorem-level persistent identifiers issued via DataCite, surviving rename, generalization, and refactor; resolvable from any paper, blueprint, or downstream library.
Accessible. A machine-readable provenance schema attached to every declaration: original author or authors, AI agents involved (linked to the KYA Registered Agent Registry), refactor lineage, blueprint of origin, and license.
Interoperable. A cross-linking standard between paper sources (LaTeX, arXiv), blueprint nodes, Lean declarations, and Mathlib namespaces — bidirectional, queryable, and version-stable.
Reusable. Attribution preservation under upstreaming, so that the chain of credit carries forward when a private-repository lemma is generalized into Mathlib rather than being flattened.

Why now

Three catalysts converge in 2026. AI co-authorship is structural rather than occasional, and the human–AI dyad cannot be governed without provenance. Mathlib's refactor cadence has outpaced its attribution model — the maintenance burden was documented in the August 2025 Growing Mathlib paper, and attribution loss is a known cost. Funders across quantitative disciplines are moving toward requiring FAIR compliance; formal mathematics will need this for major grants regardless, and is better positioned leading the standard than retrofitting to one.

Plan

FMI will convene a F4FM working group in Q4 2026, with a draft specification v0.1 targeted for Q1 2027. The standard and reference tooling will be released publicly under a permissive license. Two pilot implementations are planned, one Mathlib-internal and one in an external blueprint project.

Partners

F4FM will engage natural partners across the Lean ecosystem (the Lean FRO, the Mathlib Initiative), persistent-identifier infrastructure (ORCID, DataCite), publication infrastructure (arXiv), and major mathematics institutes. Participation will be confirmed publicly as memoranda of understanding are signed.

Licensing

Patents covering the application of FAIR principles to Lean and related formal-mathematics infrastructure are made available, on a non-exclusive basis and free of charge, to academic and non-profit institutions for research and educational use — in the model adopted by the Broad Institute for CRISPR. Commercial use is governed by separate license terms.

The Crystalline Phase only compounds capability if every wave is verified, and every contribution is citable. Verification without attribution is fluency without responsibility.

Mathematics for Hardware Verification

A second area of FMI focus is the mathematics underlying the formal verification of hardware. Modern chip design is dominated by simulation and coverage-driven testing; rigorous mathematical proof of correctness remains the exception. FMI's working hypothesis is that the proof-assistant infrastructure now hosting formal mathematics — most prominently Lean 4 and Mathlib4 — can serve as a substrate for hardware verification, alongside the engineering formal-methods tradition represented by communities such as the NASA Formal Methods Symposium.

The program connects naturally to F4FM's standards work on attribution and provenance for formal-mathematical artifacts. Materials and tooling produced under the program will be released openly when available, with the same licensing posture as F4FM — non-exclusive, free of charge to academic and non-profit institutions for research and educational use.

LLM Ethology and AI Safety

The behavior of large language models is increasingly observable, increasingly consequential, and increasingly poorly characterized. Existing evaluation methods — capability benchmarks, accuracy on held-out tasks — capture what a model can do but say little about how it behaves under conversational and contextual variation. Behavioral changes between model versions are noted by users but not systematically measured. The question of when a model has changed enough to warrant a new name is currently unanswerable on empirical grounds.

FMI is developing a program for the rigorous behavioral characterization of large language models, drawing on the discipline of animal ethology. The program builds on the field of "machine behavior" articulated by Rahwan and colleagues in Nature (Rahwan et al., 2019), which argued that AI systems should be objects of behavioral study in their own right. The methodological scaffolding the field has been waiting for can be drawn from ethology — the observational study of behavior in context — which has long-developed methods for characterizing systems whose neural substrate is opaque, requires no access to internal mechanism, and is therefore well-suited to LLMs.

Methodology

The program organizes behavioral characterization into a six-family test battery: boundary behavior under variation, long-context consistency, register stability under pressure, self-report calibration, cross-session variability, and generational comparison. Together these produce a behavioral profile for a given model under controlled conditions, supporting cross-model comparison and the detection of meaningful between-version shifts.

A key methodological design choice is the separation of front-of-house personality (system-prompt behaviors, RLHF artifacts, character cards) from substrate behavior (capability profile, reasoning patterns). The two layer separately and require separate measurement. The methodology requires no access to model weights, training data, or internal activations — the entire approach is on observable behavior, in the manner of classical ethology applied to biological subjects whose neural substrate is opaque. The constraint makes the methodology lab-independent and enables research access without compromising proprietary material.

The full methodology is documented in the working draft An Ethological Approach to Large Language Model Behavior: Study Design and Methodology (May 2026). A naturalistic pilot observation conducted in May 2026 informed the design and identified candidate behavioral features for inclusion in the structured battery, including a candidate seventh family on wrapper-substrate decomposition.

AI safety relevance

Three institutional needs depend on systematic behavioral characterization that does not currently exist: detection of behavioral signatures that indicate model malfunction or unwarranted capability shift; behavioral envelopes against which agentic AI systems can be monitored under governance frameworks; and stable behavioral characterization that makes agent attribution in provenance metadata empirically meaningful, as required by F4FM. The LLM ethology program is the basic-science engine that makes the governance and standards programs possible.

Plan

A six-month pilot phase will produce a methodology paper, a behavioral characterization protocol, and a first cohort of model behavioral profiles, including a wrapper-versus-substrate comparison drawn from a personality-mode and its underlying default deployment. Subsequent phases extend characterization to additional models and to generational comparison across versions of the same underlying architecture.

Partners

The program will be led by a principal investigator with primary expertise in the behavioral characterization of living systems, working with collaborators on the machine-learning side. Research access for systematic behavioral study will be sought through the established research-access programs at the major model-developing laboratories. FMI provides convening capacity and editorial coordination. Materials produced under the program will be released openly, with the same licensing posture as F4FM — non-exclusive, free of charge to academic and non-profit institutions for research and educational use.

Programs

Convening

Annual Symposium, rotating across regions
Regional Salons
Futures Reports, published open access
Open Archive of recordings, transcripts, and proceedings

Training

Math + AI Bootcamps and Hackathons
Bridge Fellowships for early-career researchers
Mentoring Network

Infrastructure

Compute Access Coalition
Regional anchor partnerships

Training and Infrastructure programs launch in 2026–2027 with named partners.

Symposium

Stanford University · May 1–2, 2026

The two-day program addressed formal methods and Lean-based formalization, and AI for mathematics, and closed with a Fields Medalist roundtable.

This event is co-sponsored by Stanford HAI and SISL, in partnership with the Futures of Mathematics Institute, with support from OpenAI, KeyBank, BroadRiver Asset Management, and Renaissance Philanthropy. Special thanks to all the volunteers whose work made this possible.

Recordings

Day 1 — Formal Methods

Day 2 — AI for Mathematics

Speakers

Terence Tao, UCLA “New mathematical workflows” (keynote)
Maryna Viazovska, EPFL “Formalizing the sphere-packing problem” (keynote)
Michael Freedman, Harvard “Compression is all you need — modeling mathematics” (keynote)
Mohammed Abouzaid, Stanford
Sanjeev Arora, Princeton “Rethinking thoughts — the power of self-reflection for AI”
Clark Barrett, Stanford “CSLib: A platform for AI-assisted formal verification”
Andrea Bertozzi, UCLA “A foray into AI for mathematics”
Adam R. Brown, Google DeepMind / Stanford “A.G.I. and the future of reasoning”
Sébastien Bubeck, OpenAI “Probability, combinatorics, and optimization — GPT-5 to GPT-5.5”
Kevin Buzzard, Imperial College London “On autoformalisation”
Emmanuel Candès, Stanford
Leonardo de Moura, AWS · Lean FRO “Machine-checked mathematics in the age of AI”
Sergei Gukov, Caltech “AI tools for long-horizon, sparse-reward tasks”
Deirdre Haskell, Fields Institute “Mathematical AI at the Fields Institute”
Thang Luong, Google DeepMind “Towards AI superhuman reasoning”
Ravi Vakil, Stanford

Panels

Formal and Informal Methods — Day 1: Sanjeev Arora (Princeton), Clark Barrett (Stanford), Leonardo de Moura (Amazon), Sergei Gukov (Caltech).
AI for Mathematics — Day 2: Mohammed Abouzaid (Stanford), Adam Brown (DeepMind), Thang Luong (DeepMind), Ravi Vakil (Stanford).
Fields Medalist Roundtable — Day 2: Deirdre Haskell (Fields Institute), Terence Tao (UCLA), Ravi Vakil (Stanford), Maryna Viazovska (EPFL).

Peer Institutions

FMI's program sits within a wider field of independent research institutes whose work in mathematics has shaped the discipline. Several of them are natural collaborators on the work ahead.

Clay Mathematics Institute — mathematical research foundation; the Millennium Prize Problems.
Fields Institute for Research in Mathematical Sciences — thematic mathematical-sciences programs, Toronto.
Institute for Advanced Study — independent research institute, Princeton; School of Mathematics.
American Institute of Mathematics — collaborative-research institute, San Jose; long-running role in formalization workshops.
Institut des Hautes Études Scientifiques (IHES) — advanced research in mathematics and theoretical physics, France.
SLMath — Mathematical Sciences Research Institute, Berkeley (formerly MSRI).

Philanthropy Partners

Mathematical research has been sustained over decades by a small number of philanthropic institutions whose long horizons have made foundational work possible.

Simons Foundation — private foundation supporting mathematics and the basic sciences.
Alfred P. Sloan Foundation — philanthropic support for science, technology, and economic research.
Renaissance Philanthropy — coordinated philanthropy for science, technology, and education at scale.

Tools: The Lean FRO

Mathematics progresses by asking the right questions and by having industrial-strength tools. The Lean FRO is building the tools.

The Lean Focused Research Organization, founded by Leonardo de Moura, advances Lean — the proof assistant that hosts F4FM and much of the Crystalline Phase agenda.

People

Louis Herlands, PhD — Founder and Director

Through Yale's Scholars of the House program, Louis Herlands studied the structure of the nucleosome in the laboratory of Donald M. Crothers. He earned his PhD from The Rockefeller University in the laboratory of Vincent Allfrey, the father of epigenomics, and held an NSF Postdoctoral Fellowship in Gerd Maul's laboratory at the Wistar Institute, working in collaboration with Günter Blobel's nuclear pore complex group at Rockefeller; the research isolated NPCs from clam oocytes by gentle methods Maul had developed — without high salt, detergents, or nucleases — in pursuit of structural RNA as a possible constituent of the complex. He spent his career as a life sciences operating executive, leading drug development programs in neuropathic pain, obesity, diabetes, and cancer, and co-founding biotechnology and pharmaceutical companies. He serves on the Rockefeller University Council and the McLean Hospital National Council. FMI is based in Woods Hole, where Herlands attended the Physiology Course at the Marine Biological Laboratory in 1977.

A Board of Directors, Scientific Advisory Council, Executive Director, and Editorial Board are being constituted. Appointments will be posted as confirmed.

Support

Contributions to FMI support the Annual Symposium, Futures Reports, and Bridge Fellowships. For donor enquiries, write to support@futuresofmath.org. For partnership and program enquiries, join@futuresofmath.org.

FMI is preparing its Form 1023 application for federal 501(c)(3) determination, which has not yet been granted. Donors who wish to make tax-deductible contributions to mathematics today may prefer to give through established peer institutions — the Clay Mathematics Institute, the Fields Institute, or the American Mathematical Society — whose work overlaps substantially with FMI's. We are happy to make introductions.

Contact

Futures of Mathematics Institute
13 Church Street, Woods Hole, MA 02543
contact@futuresofmath.org