How the ESM3 Model Simulates 500 Million Years of Evolution: A Real-World "Evolution Simulator"

Rodrigo
Mar 12

Updated: Apr 10

In the Japanese manga Doraemon, there exists a fictional gadget called the "Evolution and Devolution Ray." When a creature is struck by this device, it can either evolve forward into a future form or revert to an ancestral state. The concept is amusing, but everyone understands that such a device cannot exist in reality. Biological evolution is an extraordinarily complex and prolonged process. Humans cannot directly observe organisms evolving in real time with a machine, let alone simulate the entire evolutionary history of a protein.

Yet a study published in Science in 2025, titled "Simulating 500 million years of evolution with a language model," challenges this assumption. In this research, scientists used generative artificial intelligence to design a new green fluorescent protein (GFP) whose sequence lies about as far from known fluorescent proteins as roughly 500 million years of natural evolution would carry it. In simpler terms, the researchers created a system that functions somewhat like Doraemon's fictional ray: a computer model that iteratively rewrites a protein sequence until it yields a new protein that actually fluoresces.

So how did the research team use artificial intelligence to simulate evolution?

Fig. 1. GFP sequence model (Image source: Zephyris, CC BY-SA 3.0)

Building an Evolutionary Language Model

The researchers first developed a generative language model known as ESM3. This model can generate large numbers of protein sequences by learning patterns from extensive protein databases.

To make the system capable of reasoning about proteins, the researchers represented proteins using three types of tokens:

sequence
structure
function

These tokens can be thought of as a biological language. Within this framework, ESM3 learns relationships between amino-acid sequences, the three-dimensional structures those sequences form, and the biological functions that emerge from those structures.
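The three-track idea can be pictured with a small data structure. The sketch below is only an illustration: the class name `ProteinTracks`, the use of `None` as a mask marker, and the example token values are all assumptions made for this article, not ESM3's actual tokenization or API.

```python
from dataclasses import dataclass
from typing import List, Optional

MASK = None  # hypothetical convention: None marks a hidden (masked) token


@dataclass
class ProteinTracks:
    """Toy container holding one token per residue on each of three tracks."""
    sequence: List[Optional[str]]   # amino-acid letters such as "M" or "G"
    structure: List[Optional[int]]  # discrete structure-token ids (made up here)
    function: List[Optional[str]]   # function keywords attached to residues

    def __post_init__(self) -> None:
        lengths = {len(self.sequence), len(self.structure), len(self.function)}
        if len(lengths) != 1:
            raise ValueError("all three tracks must have the same length")

    def masked_fraction(self) -> float:
        """Fraction of all token slots, across the three tracks, that are masked."""
        slots = self.sequence + self.structure + self.function
        if not slots:
            return 0.0
        return sum(tok is MASK for tok in slots) / len(slots)


# A partially specified prompt: a few known residues and structure tokens,
# with everything else left for a model to fill in.
prompt = ProteinTracks(
    sequence=["M", MASK, MASK, "Y", "G"],
    structure=[MASK, MASK, 17, 42, MASK],
    function=[MASK] * 5,
)
print(f"{prompt.masked_fraction():.2f} of the prompt is masked")
```

Keeping the tracks aligned position by position is what would let a model condition any one track on the others, for example proposing sequence tokens that are compatible with a given structure token.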

Through training on massive datasets of natural proteins, the model learns which sequences tend to produce certain structures and which structural patterns are associated with specific biochemical functions.

The Transformer Engine Behind ESM3

At its core, ESM3 uses a multi-layer transformer architecture, a type of neural network widely used in modern AI language models. The system was trained at three different scales:

small: 1.4 billion parameters
medium: 7 billion parameters
large: 98 billion parameters

The model processes the sequence, structural, and functional tokens simultaneously within a shared latent representation. This allows the system to reason about proteins in an integrated manner.

To handle protein geometry, the researchers added a geometric attention mechanism that lets the model take the three-dimensional arrangement of residues into account. This helps keep generated structures physically plausible rather than unrealistic.

Simulating Natural Selection with Masked Language Modeling

In addition to generating sequences, the researchers designed a system resembling evolutionary selection.

The model uses a masked language modeling framework, where parts of a protein's tokens are hidden and the model attempts to predict them. By repeatedly masking and filling in tokens, the model explores many possible sequence variants.

One can think of this loop as an analogue of evolution: masking and refilling tokens plays the role of mutation, and filtering the results plays the role of natural selection. Sequences that fail to satisfy the given conditions are discarded, while promising candidates continue to evolve through further iterations.

This iterative process allows the system to explore protein sequence space in a way that resembles evolutionary trial and error.
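The mask, refill, and filter cycle can be sketched as a toy loop. This is a minimal illustration under strong assumptions: `propose_variant` refills masked positions at random instead of calling a trained model, and `toy_score` simply counts matches to an arbitrary target string. None of these names come from ESM3 or the paper.

```python
import random
from typing import Callable, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def propose_variant(seq: str, mask_fraction: float, rng: random.Random) -> str:
    """Mask a random subset of positions and refill them.

    A real model would predict the hidden residues from context; this
    stand-in samples uniformly at random.
    """
    n_mask = max(1, int(len(seq) * mask_fraction))
    positions = rng.sample(range(len(seq)), n_mask)
    residues = list(seq)
    for pos in positions:
        residues[pos] = rng.choice(AMINO_ACIDS)
    return "".join(residues)


def evolve(start: str, score: Callable[[str], float], rounds: int,
           pool_size: int, keep: int, rng: random.Random) -> str:
    """Repeat: generate variants from the survivors, then keep the best few."""
    survivors: List[str] = [start]
    for _ in range(rounds):
        pool = list(survivors)  # survivors stay in the pool, so the best score never drops
        while len(pool) < pool_size:
            parent = rng.choice(survivors)
            pool.append(propose_variant(parent, 0.2, rng))
        pool.sort(key=score, reverse=True)  # selection step
        survivors = pool[:keep]
    return survivors[0]


def toy_score(seq: str) -> int:
    """Hypothetical fitness: number of positions matching an arbitrary target."""
    target = "MSKGEELFTG"
    return sum(a == b for a, b in zip(seq, target))


rng = random.Random(0)
start = "AAAAAAAAAA"
best = evolve(start, toy_score, rounds=30, pool_size=50, keep=5, rng=rng)
print(best, toy_score(best))
```

The difference from this toy is where the variants come from: a trained model proposes refills that are already biased toward plausible proteins, so far fewer candidates are wasted than with random sampling.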

Testing Whether ESM3 Can Simulate Evolution

To test the system, the researchers attempted to generate a fluorescent protein similar to GFP.

They began with a starting protein sequence that shared only limited similarity with known fluorescent proteins. Then they provided the model with specific prompts, including:

key amino acid residues involved in GFP fluorescence
atomic-level structural information such as catalytic site coordinates
functional keywords such as "fluorescent" or "autocatalytic"

The model then iteratively generated candidate proteins by filling in masked tokens and producing new sequence variants.

These candidates were evaluated using structural prediction tools such as ESMFold, with selection criteria including:

structural confidence (pTM > 0.8)
structural similarity to the prompt (cRMSD < 1.5 Å)

Through repeated sampling and filtering, the model gradually produced increasingly promising designs.
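The two thresholds above amount to a simple filter over candidate designs. In the sketch below the `Candidate` class and the example scores are invented for illustration; in practice the pTM and cRMSD values would come from a structure predictor and a structural comparison against the prompt.

```python
from dataclasses import dataclass
from typing import List

PTM_MIN = 0.8    # structural confidence threshold from the text
CRMSD_MAX = 1.5  # maximum deviation from the prompt, in angstroms


@dataclass
class Candidate:
    name: str
    ptm: float    # predicted TM-score reported by a structure predictor
    crmsd: float  # RMSD to the prompted coordinates, in angstroms


def passes_filters(candidate: Candidate) -> bool:
    """Keep a design only if it is both confident and close to the prompt."""
    return candidate.ptm > PTM_MIN and candidate.crmsd < CRMSD_MAX


# Made-up scores for three hypothetical designs.
candidates: List[Candidate] = [
    Candidate("design_a", ptm=0.91, crmsd=0.9),  # passes both criteria
    Candidate("design_b", ptm=0.75, crmsd=0.8),  # confidence too low
    Candidate("design_c", ptm=0.88, crmsd=2.3),  # drifts from the prompt
]

kept = [c.name for c in candidates if passes_filters(c)]
print(kept)  # ['design_a']
```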

A Synthetic Protein with the Equivalent of 500 Million Years of Evolution

Eventually, the system generated a protein variant called esmGFP.

Remarkably, its amino-acid sequence shared only 58% sequence identity with the closest known fluorescent protein. Based on evolutionary comparisons, the researchers estimate that this degree of divergence corresponds to roughly 500 million years of natural evolution.
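A percentage like this is normally computed from an alignment of the two sequences. The function below is a minimal sketch that assumes the sequences are already aligned, with `-` marking gaps, and that gapped columns are left out of the count. Conventions for the denominator vary between tools, and the short example sequences are invented rather than taken from esmGFP.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns where both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented example: 8 ungapped columns, 6 of them identical.
print(percent_identity("MSKGE-LFT", "MTKGEELYT"))  # 75.0
```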

To verify whether the protein truly functioned as a fluorescent protein, researchers synthesized the sequence and expressed it in Escherichia coli.

The result was striking: the engineered protein emitted fluorescence with brightness comparable to natural GFP proteins.

This experiment demonstrated that the model was not merely predicting protein sequences but could actually generate functional proteins far outside the sequence space explored by natural evolution.

Future Possibilities

As a protein design tool, ESM3 opens far broader possibilities than simply simulating evolutionary history.

In the future, scientists could potentially design entirely new proteins on demand. Possible applications include:

antibodies for cancer therapy
enzymes capable of degrading plastic waste
new classes of protein-based materials

With larger models and expanding protein databases, the potential applications in medicine, biotechnology, and synthetic biology may continue to grow.

Fig. 2. Natural source of GFP, the jellyfish Aequorea victoria (Image source: Mnolf, CC BY-SA 3.0)

Ethical Considerations

Like Doraemon's fictional evolution ray, however, this technology can be seen as a double-edged sword.

While the ability to design proteins offers enormous benefits, it also raises important questions about safety and misuse. Artificially designed proteins could potentially be applied in harmful ways if not carefully regulated.

Addressing these ethical and security concerns will become an increasingly important challenge as AI-driven biological design technologies advance.

A New Era of Programmable Evolution

This study represents a milestone in the history of protein science. Instead of merely predicting protein structure or function, researchers have begun to simulate the evolutionary process itself.

Although the technology cannot yet replicate evolution as freely as the fictional gadget from Doraemon, ESM3 marks the beginning of a new era in which protein evolution can be explored through programmable computation.

For more information about the ESM3 model, see the official project website: https://www.evolutionaryscale.ai/blog/esm3-release

Author: Rodrigo

Reference:

Hayes, T. et al. (2025). Simulating 500 million years of evolution with a language model. Science, 387(6736), 850-858. https://doi.org/10.1126/science.ads0018