Annual Report [2018]

Developing the Common Language of Computational Science

Flatiron Institute

The universe has an inherent elegance illuminated by mathematics. A single class of equations can help describe how planets spin around a star, how blood cells flow through a vein, and how electrons travel along a wire.

In October 2018, the Flatiron Institute established its fourth research center to further the computational tools that play a crucial role in modern science and engineering and strengthen their mathematical foundations. The Center for Computational Mathematics (CCM) collaborates with the institute’s centers for astrophysics, biology and quantum physics and conducts its own research on problems faced by the scientific community at large.

“Like the other centers, CCM will be a place that builds software tools for the greater academic community,” says CCM director Leslie Greengard, who previously directed the Center for Computational Biology (CCB). “The difference is that the other centers typically have a particular application in mind, but the nice thing about mathematics is that often the solutions you develop apply to multiple fields.”

The CCM embodies one of the Flatiron Institute’s core tenets, “that math is the common language of science,” says CCM project leader Christian Müller. That idea has formed a cornerstone of the Flatiron Institute since its inception as the Simons Center for Data Analysis in 2013.

“Since the beginning, there was a desire for people in different centers to interact and work together,” says CCM research scientist Eftychios Pnevmatikakis. “But often we were getting lost in the details of the applications. The astronomers couldn’t talk biology, and vice versa. With CCM, I see the interactions happen much more organically. We all talk math. CCM looks like it will be a bridge for the different centers and also have a life of its own.”

At full capacity, the CCM will house about 50 scientists, mathematicians and programmers. Many of the initial staff transferred from groups at the CCB, taking with them mathematical and computational problems rooted in biology. As the CCM continues to grow, so too will the breadth of the inspirations and applications of the center’s work.

The 3-D structure of an 80S ribosome molecule from Plasmodium falciparum, a protozoan parasite responsible for around 50 percent of all malaria cases in humans. Researchers reconstructed the shape from electron microscopy measurements. Structures that vary little from molecule to molecule are shown in blue, whereas regions that vary a lot are shown in red and may need additional measurements and analysis to produce an accurate reconstruction. Image courtesy of Joakim Andén; data from W. Wong et al./eLife 2014

“At Flatiron, CCM is surrounded by a problem-rich environment and a diverse set of experts we can collaborate with,” says CCM group leader Alex Barnett. “Opportunities like this center only come along once in a lifetime.”

One of the CCM’s areas of interest is leveraging machine-learning techniques. Machine-learning models ‘learn’ by ingesting large amounts of example training data. After training, the models can produce results not possible through conventional methods.

The problem, though, is that machine-learning algorithms such as neural networks “are black boxes,” says Stéphane Mallat, CCM distinguished research scientist and a professor at Collège de France and École Normale Supérieure in Paris. “It works well, but we don’t know what’s being learned. It doesn’t help us understand the phenomena.”

The tech companies driving machine-learning development, including Google and Facebook, focus on applications such as image recognition, natural language and marketing. “Right now, machine learning is very much an empirical field,” Mallat says. “There are many algorithms which are working well, but we don’t understand what type of structures they learn and the mathematics behind them. This means that we cannot interpret results or guarantee their robustness.”

At the CCM, researchers will reverse engineer solutions from machine-learning applications to figure out what led to the result. Using that information, the researchers hope to learn more about real-world systems and improve conventional methods. The CCM hosts regular meetups in which researchers from all four centers discuss the latest developments in the field and talk about their machine-learning projects.

“There’s a need for new mathematics in this area,” Mallat says.

Another research focus for the CCM also involves working backward, so to speak, but using experimental results. Scientists often calculate an effect from a known cause: for example, how light from a lamp will bounce around a room. A trickier question is the reverse, known as an inverse problem. Given the lighting in a room, what can you learn about the light source? Inverse problems crop up in astronomy and neuroscience as well as in medical-imaging applications such as CT and MRI scans.
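
To make the lamp example concrete, here is a minimal Python sketch of a forward problem and its inverse. It is an illustration only, not a method used at the CCM. The setup is assumed: a single lamp at a fixed height above a row of sensors, with brightness falling off as the inverse square of distance. The function names and the grid search are hypothetical choices made for this example.

```python
def forward(source_x, power, sensors, height=1.0):
    """Forward problem: predict the light reaching each sensor from a known lamp."""
    return [power / ((s - source_x) ** 2 + height ** 2) for s in sensors]


def inverse(measured, sensors, height=1.0, lo=0.0, hi=10.0, steps=1001):
    """Inverse problem: find the lamp position and power that best explain the data."""
    best = None
    for i in range(steps):
        x = lo + (hi - lo) * i / (steps - 1)  # candidate lamp position
        basis = forward(x, 1.0, sensors, height)
        # For a fixed position the model is linear in power, so the
        # least-squares power has a closed form.
        power = sum(m * b for m, b in zip(measured, basis)) / sum(b * b for b in basis)
        misfit = sum((m - power * b) ** 2 for m, b in zip(measured, basis))
        if best is None or misfit < best[0]:
            best = (misfit, x, power)
    return best[1], best[2]


sensors = [0.0, 2.5, 5.0, 7.5, 10.0]
measured = forward(3.7, 2.0, sensors)   # simulate noise-free readings
x_est, p_est = inverse(measured, sensors)
print(x_est, p_est)
```

With noise-free readings the search recovers the lamp's position and power. Real inverse problems are harder because measurements are noisy and several different causes can explain the same data almost equally well.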

CCM research scientist Joakim Andén focuses on an inverse problem related to discerning the 3-D layout of a molecule. The experiment involves chilling molecules to extremely low temperatures and bombarding them with electrons. The electrons graze off the molecules, losing some speed. Based on how much each electron slowed down, scientists deduce the molecule’s shape. A challenge is that molecules break down when hit by too many electrons, meaning scientists can only get a relatively small number of data points from each experiment.

“It just looks like pure static,” Andén says. “If I showed you one of these images, you wouldn’t believe that there was anything in there.” He and his colleagues at the CCM are working to make sense of the static faster and more accurately.

The CCM also focuses on speeding up basic computational tasks used across many applications. One such task involves solving partial differential equations, which appear in a variety of areas, ranging from acoustics to astrophysics to fluid dynamics. Those equations, dubbed PDEs, arise whenever a quantity depends on more than one independent variable — such as the three spatial coordinates and time — and the rates of change in these variables are coupled in a known way.
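
As a small illustration of what solving a PDE involves, the sketch below steps the one-dimensional heat equation forward in time, in which the rate of change of temperature in time is tied to its curvature in space. This is a textbook explicit finite-difference scheme chosen for brevity, not one of the fast solvers developed at the CCM. The grid size, diffusivity and time step are assumed values.

```python
def heat_step(u, alpha, dx, dt):
    """Advance u_t = alpha * u_xx by one explicit time step; end values stay fixed."""
    r = alpha * dt / dx ** 2
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + r * (u[i + 1] - 2.0 * u[i] + u[i - 1])
    return new


n = 21                          # number of grid points on the interval [0, 1]
dx = 1.0 / (n - 1)
alpha = 1.0                     # assumed diffusivity
dt = 0.4 * dx ** 2 / alpha      # this scheme is stable only if dt <= dx**2 / (2 * alpha)

u = [0.0] * n
u[n // 2] = 1.0                 # a single hot spot in the middle of a cold rod

for _ in range(100):
    u = heat_step(u, alpha, dx, dt)

print(max(u))
```

The stability limit on the time step hints at why accuracy is expensive: halving the grid spacing forces the time step to shrink by a factor of four in this scheme.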

Solving PDEs accurately is often incredibly slow, says CCM research scientist Manas Rachh. Much work has focused on removing this computational speed bump. In 1986, Greengard and Vladimir Rokhlin co-invented the fast multipole method — a technique that accelerates the calculation of long-range forces in problems with many components that influence one another. This method has played a pivotal role in the development of fast, robust and accurate PDE solvers. At the CCM, Rachh and others continue to hunt for shortcuts for solving PDEs.
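
The cost that the fast multipole method attacks is easiest to see in the brute-force baseline. The sketch below is that direct summation, not the fast multipole method itself. It assumes a simple 1/r interaction between point charges, and the function name is hypothetical.

```python
import math


def direct_potentials(points, charges):
    """Sum the 1/r contribution of every other charge at each point."""
    n = len(points)
    phi = [0.0] * n
    pair_evals = 0
    for i in range(n):
        for j in range(n):
            if i != j:
                phi[i] += charges[j] / math.dist(points[i], points[j])
                pair_evals += 1
    return phi, pair_evals


points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
charges = [1.0, 1.0, 1.0]
phi, pair_evals = direct_potentials(points, charges)
print(phi[0], pair_evals)
```

Every component interacts with every other, so the work grows with the square of the number of components: three points need six evaluations, while a million points need about a trillion. The fast multipole method groups distant components and approximates their combined influence to a chosen precision, which brings the cost down to roughly linear in the number of components.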

With fast enough solutions, engineers could potentially design, test and optimize devices such as microfluidic controllers and computer chips without the need for the costly hassle of producing and testing prototypes. “That’s what we’re working towards,” Rachh says. “We want tools robust and accurate enough for engineering applications.”

The CCM has many other research focuses, each with potential benefits to many research areas. Barnett expects the center to continue taking on new ideas as it expands.

“We shouldn’t be scared of leaping into problems that are new to us,” Barnett says. “Academia doesn’t often reward that risk-taking, or the software development that needs to go along with it, and instead encourages you to do an incremental version of what you did before. Here, we are lucky enough to be able to take such risks, and that can lead to larger breakthroughs.”