Drug Discovery, STAT! NVIDIA, Recursion Speed Pharma R&D With AI Supercomputer

Described as the largest system in the pharmaceutical industry, BioHive-2, housed at Recursion’s Salt Lake City headquarters, debuts today at No. 35 on the latest TOP500 list of the world’s fastest supercomputers, up more than 100 spots from its predecessor.

The advance represents the company’s most recent effort to accelerate drug discovery with NVIDIA technologies.

“Just as with large language models, we see AI models in the biology domain improve performance substantially as we scale our training with more data and compute horsepower, which ultimately leads to greater impacts on patients’ lives,” said Recursion’s CTO, Ben Mabey, who’s been applying machine learning to healthcare for more than a decade.

BioHive-2 packs 504 NVIDIA H100 Tensor Core GPUs linked on an NVIDIA Quantum-2 InfiniBand network to deliver 2 exaflops of AI performance. The resulting NVIDIA DGX SuperPOD is nearly 5x faster than Recursion’s first-generation system, BioHive-1.
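As a rough sanity check on those figures, the short Python sketch below divides the quoted aggregate AI performance by the GPU count. It assumes that "2 exaflops" is a peak aggregate number spread evenly across all 504 GPUs. The article does not state the numeric precision behind the figure, and the function name is only illustrative.

```python
# Back-of-envelope arithmetic using only the two figures quoted in the article.
NUM_GPUS = 504            # H100 GPUs in BioHive-2 (from the article)
TOTAL_AI_EXAFLOPS = 2.0   # quoted aggregate AI performance (from the article)


def per_gpu_petaflops(total_exaflops: float, num_gpus: int) -> float:
    """Convert an aggregate exaflops figure into petaflops per GPU.

    Assumes the aggregate is split evenly across GPUs (an assumption,
    not something the article states).
    """
    return total_exaflops * 1000.0 / num_gpus  # 1 exaflop = 1,000 petaflops


per_gpu = per_gpu_petaflops(TOTAL_AI_EXAFLOPS, NUM_GPUS)
print(f"{per_gpu:.2f} petaflops per GPU")
```

The result, a little under 4 petaflops per GPU, is in line with the peak low-precision tensor throughput NVIDIA publishes for the H100. That suggests the headline number is a theoretical peak rather than a measured benchmark result.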

Performance Powers Through Complexity

That performance is key to rapid progress because “biology is insanely complex,” Mabey said.

Finding a new drug candidate can take scientists years of work and millions of wet-lab experiments.

That work is vital; Recursion’s scientists run more than 2 million such experiments a week. But going forward, they’ll use AI models on BioHive-2 to steer their platform toward the areas of biology where experiments look most promising.

“With AI in the loop today, we can get 80% of the value with 40% of the wet lab work, and that ratio will improve going forward,” he said.

Biological Data Propels Healthcare AI

Recursion is collaborating with biopharma companies such as Bayer AG, Roche and Genentech. Over time, it has also amassed a database of more than 50 petabytes of biological, chemical and patient data, which helps it build powerful AI models that accelerate drug discovery.

“We believe it’s one of the largest biological datasets on Earth — it was built with AI training in mind, intentionally spanning biology and chemistry,” said Mabey, who joined the company more than seven years ago in part due to its commitment to building such a dataset.

Creating an AI Phenomenon

Processing that data on BioHive-1, Recursion developed a family of foundation models called Phenom. They turn a series of microscopic cellular images into meaningful representations for understanding the underlying biology.

A member of that family, Phenom-Beta, is now available as a cloud API and the first third-party model on NVIDIA BioNeMo, a generative AI platform for drug discovery.

Over several months of research and iteration, BioHive-1 trained Phenom-1 using more than 3.5 billion cellular images. Recursion’s expanded system enables training even more powerful models with larger datasets in less time.

The company also used NVIDIA DGX Cloud, hosted by Oracle Cloud Infrastructure, for additional supercomputing resources to power its work.

Animation: how Recursion trains AI models for drug discovery on NVIDIA GPUs.
Much as LLMs are trained to generate the missing words in a sentence, Phenom models are trained by asking them to generate the masked-out pixels in images of cells.
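The masked-pixel objective described above can be illustrated with a minimal NumPy sketch. This is not Recursion's implementation: the image size, patch size, 75% mask ratio, function names and the trivial stand-in "model" are all assumptions chosen for illustration. The sketch shows only the general shape of the technique: hide part of an image, predict the hidden part, and score the prediction on the hidden region alone.

```python
import numpy as np


def patchify(image: np.ndarray, patch: int) -> np.ndarray:
    """Split an (H, W) image into non-overlapping, flattened square patches."""
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    return (image.reshape(h // patch, patch, w // patch, patch)
                 .transpose(0, 2, 1, 3)
                 .reshape(-1, patch * patch))


def random_mask(num_patches: int, mask_ratio: float,
                rng: np.random.Generator) -> np.ndarray:
    """Return a boolean vector where True marks a patch hidden from the model."""
    num_masked = int(round(num_patches * mask_ratio))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.choice(num_patches, size=num_masked, replace=False)] = True
    return mask


def masked_reconstruction_loss(patches: np.ndarray, predicted: np.ndarray,
                               mask: np.ndarray) -> float:
    """Mean squared error computed over the hidden patches only."""
    diff = predicted[mask] - patches[mask]
    return float(np.mean(diff ** 2))


rng = np.random.default_rng(0)
image = rng.random((64, 64))          # hypothetical stand-in for one cell image
patches = patchify(image, patch=16)   # 16 patches of 256 pixels each
mask = random_mask(len(patches), mask_ratio=0.75, rng=rng)

# Placeholder "model": fills every patch with the mean of the visible patches.
# A real model would be a trained neural network that sees only the visible patches.
visible_mean = patches[~mask].mean(axis=0)
predicted = np.tile(visible_mean, (len(patches), 1))

loss = masked_reconstruction_loss(patches, predicted, mask)
print(f"masked patches: {mask.sum()} of {len(patches)}, loss: {loss:.4f}")
```

In actual training, the loss would be minimized by gradient descent over many images, so that the model learns image structure well enough to fill in what it cannot see.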

The Phenom-1 model serves Recursion and its partners in several ways, including finding and optimizing molecules to treat a variety of diseases and cancers. Earlier models helped Recursion predict drug candidates for COVID-19 nine out of 10 times.

The company announced its collaboration with NVIDIA in July. Less than 30 days later, the combination of BioHive-1 and DGX Cloud screened and analyzed a massive chemical library to predict protein targets for approximately 36 billion chemical compounds.

In January, the company demonstrated LOWE, an AI workflow engine with a natural-language interface to help make its tools more accessible to scientists. And in April it described a billion-parameter AI model it built to provide a new way to predict the properties of key molecules of interest in healthcare.

Recursion uses NVIDIA software to optimize its systems.

“We love CUDA and NVIDIA AI Enterprise, and we’re looking to see if NVIDIA NIM can help us distribute our models more easily, both internally and to partners,” Mabey said.

A Shared Vision for Healthcare

The efforts are part of a broad vision that Jensen Huang, NVIDIA founder and CEO, described in a fireside chat with Recursion’s chairman as moving toward simulating biology.

“You can now recognize and learn the language of almost anything with structure, and you can translate it to anything with structure … This is the generative AI revolution,” Huang said.

“We share a similar view,” said Mabey.

“We are in the early stages of a very interesting time where just as computers accelerated chip design, AI can speed up drug design. Biology is much more complex, so it will take years to play out, but looking back, people will see this was a real turning point in healthcare,” he added.

Learn about NVIDIA’s AI platform for healthcare and life sciences and subscribe to NVIDIA healthcare news.

Pictured at top: BioHive-2 with a few members of the Recursion team (from left) Paige Despain, John Durkin, Joshua Fryer, Jesse Dean, Ganesh Jagannathan, Chris Gibson, Lindsay Ellinger, Michael Secora, Alex Timofeyev, and Ben Mabey. 
