Annual Report 2020

Revolutionizing Simulations of the Universe With AI

Center for Computational Astrophysics

Reverse engineering the universe is a tricky thing. To figure out how dying stars explode, an astronomer can’t blow up a thousand stars. But they can create and destroy virtual stars on a computer, where lines of code emulate the laws of physics. Such simulations can predict the fate of not just single stars but also solar systems, galaxies and even the entire universe.

But the practicality of simulations is often limited by the sheer amount of computing resources they require. That’s where machine learning, and in particular ‘deep learning,’ is starting to help. Neural networks, loosely modeled on the brain’s web of biochemical connections, can study a handful of simulations and learn to fill in the gaps. Critically, they can do so millions of times faster than running a continuum of simulations from scratch.

“We usually want to understand some fundamental quantities of the universe; it could be the planets, it could be gravitational waves, black holes, it could be the universe itself,” says Shirley Ho, head of the Cosmology X Data Science group at the Flatiron Institute’s Center for Computational Astrophysics (CCA). “We want to accelerate simulations to produce the observations we want … and then compare them to the data.” 

The top panels show the large-scale distribution of dark matter for thousands of simulations performed with the IllustrisTNG (left) and SIMBA (right) galaxy formation models as part of the CAMELS project. The bottom panels compare the distribution of dark matter, galaxies (and their stars), gas density and gas temperature for one representative simulation as performed by each model with the same initial conditions. Credit: Francisco Villaescusa-Navarro, Daniel Anglés-Alcázar and Shy Genel

Researchers at the CCA are using deep learning to speed up simulations of vast volumes of space that enclose millions of galaxies. The goal is to get more out of upcoming telescopes such as the Simons Observatory, whose observations will refine measurements of fundamentals such as the evolution of dark energy, the enigmatic force accelerating the expansion of the cosmos.

“We’ve got a new set of tools,” says CCA director David Spergel. “Machine-learning tools may let us even recover the initial conditions of the universe.”


Cosmological simulations that track the assembly of gargantuan superclusters of galaxies are massively ‘multi-scale’ problems: Cumulative effects from even individual stars can ripple across millions of light-years to alter the fates of entire galaxies. Building and running a simulation that connects all these scales is no small feat, often requiring millions of CPU hours. 

The spatial distribution of dark matter in a region of space roughly 300 million light-years across, taken from one of the Quijote simulations, which astrophysicists use to train neural networks. Dense halos connected by long thin filaments form a cosmic web; the densest regions (dots) house galaxy clusters.

But neural networks are good at linking multi-scale phenomena, says Stéphane Mallat, a distinguished research scientist at the Flatiron’s Center for Computational Mathematics (CCM). What’s more, a neural network doesn’t need to see simulations of every imaginable scenario. It can study representative samples and then churn out new simulation results without working through all the underlying physics from scratch.

“We think we can train a neural network to learn the relationship between dark matter and galaxies, or dark matter and gas,” says Rachel Somerville, leader of the CCA’s Galaxy Formation group. “Then the network is much faster than running this full simulation.” Early results are promising, she says. “We’ve already started doing some tests. We know that it kind of works.”

One ongoing test for training a neural network is CAMELS, the Cosmology and Astrophysics with MachinE Learning Simulations. The project is led by CCA associate research scientist Daniel Anglés-Alcázar, CCA associate research scientist Shy Genel and former CCA research fellow Francisco Villaescusa-Navarro. Within CAMELS, CCA researchers recently trained a neural network on thousands of cosmological simulations to accomplish a number of tasks, such as predicting the star formation rate of galaxies from only a few parameters, including the abundance of matter. Another effort, known as dm2gal (for ‘dark matter to galaxies’), taught a neural network to add the right amount of stellar mass to virtual galaxies using knowledge of how much dark matter was present.

While these researchers take on the universe, others at the CCA are using machine learning on a smaller scale. 

Colliding pairs of black holes are all the rage: Since 2015, astronomers have detected gravitational waves from 46 such mashups, and they’d like to know how these pairings form. Perhaps they come from massive stars paired up since birth, wandering their home galaxy as a lonely duo. Or maybe they’re churned out in jam-packed globular star clusters, where hundreds of thousands of stars routinely trade partners.

Flatiron researchers set a neural network loose on the problem. They fed the machine the output from simulations that predicted black hole masses produced in these two scenarios. The network deduced the relevant relationships connecting various parameters of binary stars and globular clusters to the black hole masses that each produces, and determined that globular clusters account for about 80 percent of the known binary black holes.

That result is far from the final word on the matter, partly because the neural network didn’t know about other formation options. But it demonstrates the power of machine learning in exploring binary black hole origins, says study author Katelyn Breivik, a research fellow at the CCA. 

“The best way to understand all of this would be to just simulate all possible scenarios,” she says. “And that’s, of course, completely intractable. But you can pick points along the way and then fill in the other points with machine learning, and then it’s not intractable.”
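
The idea Breivik describes, running a costly simulation at a few chosen points and letting a learned model fill in the rest, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `toy_simulation` is a made-up one-parameter function standing in for an expensive simulation, and a simple Gaussian-kernel regressor stands in for the neural network. It is not the method used in the study.

```python
import math


def toy_simulation(param):
    """Hypothetical stand-in for an expensive simulation: one parameter in, one number out."""
    return math.sin(3.0 * param) + 0.5 * param


def fit_emulator(params, outputs, bandwidth=0.05):
    """Return a Gaussian-kernel regressor (Nadaraya-Watson) built from the sampled runs."""
    def predict(x):
        # Weight each sampled run by how close its parameter is to the query point.
        weights = [math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in params]
        return sum(w * y for w, y in zip(weights, outputs)) / sum(weights)
    return predict


# "Pick points along the way": run the simulation at only 11 parameter values in [0, 1].
train_params = [i / 10 for i in range(11)]
train_outputs = [toy_simulation(p) for p in train_params]
emulator = fit_emulator(train_params, train_outputs)

# "Fill in the other points": query the emulator halfway between the sampled values.
test_params = [0.05 + i / 10 for i in range(10)]
max_error = max(abs(emulator(p) - toy_simulation(p)) for p in test_params)
print(f"largest emulator error on unseen points: {max_error:.3f}")
```

In a real application each simulation run might cost hours or days, so the payoff comes from calling the cheap emulator instead of the simulation for every point in between.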

Remarkably, these machines know nothing about astronomy or physics. But could a machine learn the laws of physics, including some not yet discovered? 

That’s what Princeton graduate student Miles Cranmer set out to do. Working closely with researchers at the Flatiron, he showed a neural network a simulation of particles moving about, subject to typical forces found in nature. Just from watching the particles move, the machine “discovered” physics standbys such as Newton’s law of gravity and Hooke’s law of spring force.
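
To give a flavor of what recovering a law from data can mean, here is a deliberately simplified sketch. It is not Cranmer’s method: it assumes the force already has the form F = k · m1 · m2 · r^p and uses ordinary least squares on logarithms to recover only the exponent p and the constant k from noise-free toy data. The sample sizes, mass ranges and distance ranges are arbitrary choices for illustration.

```python
import math
import random

random.seed(0)
G = 6.674e-11  # gravitational constant in SI units

# Toy "observations": the force between two bodies at random separations.
samples = []
for _ in range(200):
    m1 = random.uniform(1.0, 10.0)
    m2 = random.uniform(1.0, 10.0)
    r = random.uniform(1.0, 100.0)
    samples.append((m1, m2, r, G * m1 * m2 / r**2))

# Assumed form: F = k * m1 * m2 * r**p, so log(F / (m1 * m2)) = log(k) + p * log(r).
xs = [math.log(r) for _, _, r, _ in samples]
ys = [math.log(force / (m1 * m2)) for m1, m2, _, force in samples]

# Ordinary least-squares fit of a straight line to (xs, ys).
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x
fitted_k = math.exp(intercept)

print(f"fitted exponent p = {slope:.3f}, fitted constant k = {fitted_k:.3e}")
```

The fitted exponent comes out at the inverse-square value and the fitted constant matches G, but only because the functional form was supplied in advance. The harder problem described in the article is finding the form of the equation itself.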

He then set the network on an astronomical problem: Is there a way to predict how much dark matter gathers in the center of the dark matter “halos” that envelop every galaxy based only on a halo’s mass and the mass of halos around it? Although astronomers have come up with such a relationship, its precision is a bit shoddy. Cranmer’s neural network cranked out an equation that was far more accurate than the one humans produced.

“It’s a long process to unravel these mysteries,” says Cranmer. “But with artificial intelligence, we can tether science to Moore’s law — the law that says you get an exponential increase in computing power — and maybe get an exponential increase in knowledge, too.”

The Flatiron is well poised to lead the way. Connections across fields as diverse as astrophysics and machine learning are opening possibilities for researchers of all stripes.

“We have maximum freedom at Flatiron, and we have this encouraging interdisciplinary atmosphere,” Ho says. “Here, it’s encouraged to work across fields. It’s encouraged to take a little bit of risk and do something different.”