Community Perspective – Anima Anandkumar

Q&A with Anima Anandkumar, AI2050 Senior Fellow

Hurricane prediction, nuclear fusion reactor safety, and medical catheter design don’t appear to have much in common—but they’re all recent applications of Anima Anandkumar’s research. Anandkumar’s work tackles physical processes that are governed by complex mathematical equations, where microscopic interactions shape macroscopic phenomena. For example, wind velocity might be affected by fine-scale differences in altitude, air pressure, or temperature that ultimately influence the path of a hurricane. While traditional methods must re-run their calculations for each new scenario, the framework Anandkumar recently developed, called neural operators, does not. Instead, neural operators learn from data, finding shortcuts that produce predictions far more efficiently. They generalize solutions for entire families of equations at multiple scales, enabling applications across many scientific domains.

“Neural operators learn at many different resolutions,” says Anandkumar. “It’s giving us the flexibility to first of all incorporate data that may be available at different resolutions, but also any kind of physical constraints and other knowledge we have. That additional power helps us model scientific processes accurately.”

Anima Anandkumar is a 2023 AI2050 Senior Fellow, and Bren Professor of Computing at Caltech. Her work has been recognized with numerous awards and fellowships, including the 2025 Institute of Electrical and Electronics Engineers (IEEE) Kiyo Tomiyasu Award, a 2023 Guggenheim Fellowship, and a 2014 Sloan Research Fellowship. She has been invited to speak on AI’s impacts on science and society at a meeting of the President’s Council of Advisors on Science and Technology, and given a TED talk at TED2024. She is also a fellow of the IEEE and the Association for Computing Machinery, and is part of the World Economic Forum’s Expert Network.

Her AI2050 project aims to use neural operators to simulate complex multi-physics systems such as climate modeling or drug discovery, with orders-of-magnitude improvements in speed and accuracy. This project addresses Hard Problem #4, which calls for leveraging AI to make contributions to humanity’s greatest challenges and opportunities.

The AI2050 initiative gratefully acknowledges Fayth Tan for assistance in producing this community perspective.


Your work proposes applying AI to such a wide range of applications, from weather forecasting to climate change to drug discovery. For some extra context—take something familiar, like weather forecasting. How do we typically forecast the weather?

The standard approach to doing that is to think about modeling all the processes in a bottom-up fashion. How do the clouds turn turbulent? How do they move about? That’s fluid dynamics, what we call the Navier-Stokes equations. You have the heating up of the atmosphere during the day, so what are the thermal properties? You can think of that as multi-physics modeling—there are different kinds of physical phenomena, they have partial differential equations, and you’re coupling all of them. You’re trying to solve those equations given the satellite observations. Based on that, you make a prediction for the next one-to-two-week range, because that’s when the weather is predictable and you can give some level of confidence.

Instead of this bottom-up approach, AI learns from historical data in order to make predictions. What AI is able to do is learn that very well, because historical data is openly available. Our lab was the first to do weather models at that high a resolution, [using] the highest-resolution publicly available dataset.

For a weather phenomenon like a hurricane, predicting where it’s going to move next really requires the finer details. If you only look at it at coarse resolution, you may lose those finer details. Physics is all about those fine details—the weather has lots of variables for wind speed, pressure, and temperature, all at different levels of the atmosphere, [and] you’re learning how all those variables evolve over time.

Earlier attempts were using AI as part of the numerical method, not to completely replace it, [but] maybe augment it. People were very skeptical—AI may not be able to do such a good job compared to decades of innovation with traditional weather models. What I think everyone was surprised about was [that] it did very well—not only on typical weather forecasts, but also on extreme weather forecasts. 

When it came to Hurricane Lee last September, AI-based weather models [predicted] the hurricane three to four days earlier than numerical weather models. In the beginning, the numerical weather models thought Hurricane Lee would not come to the coast, [but] our AI-based model correctly predicted the landfall. More recently, our predictions for Hurricane Beryl this July had much lower uncertainty and were closer to the ground truth compared to numerical models. The European Centre for Medium-Range Weather Forecasts hosts our model and other AI-based weather models, so you can go and study the charts of different models. There’s a lot more activity in this space, because people are very surprised—AI models can do forecasting tens of thousands of times faster, so [it’s] cheaper to do, but also better in terms of accuracy.


When most people think about a hurricane, they mostly think about the whole phenomenon. But [it] can be broken down into much finer-scale physical properties—would neural operators help model all of those properties in concert?

The analogy I like to give is graphics—there’s vector graphics and raster graphics. With vector graphics, you can keep zooming in as much as you like, and you still get the shape. Whereas in raster, there’s a limited number of pixels, so if you keep zooming, it just gets blurry. The way we want to think of neural operators is the same analogy compared to a neural network. A neural network—let’s say, image models—they all work at a fixed resolution. If you zoom in, it just gets blurry. But neural operators have an output that can be varied at any resolution, so you can ask it to provide very high resolution outputs.


The fact that it was able to predict hurricane landfall—that's not everyday weather, that's something quite extreme. When we think about climate change, we need to be able to predict rare outcomes. Could you speak more to neural operators’ ability to predict rare events?

Phenomena like hurricanes have a lot of very clear signatures—there’s a pressure difference, there’s hot and cold air coming together. Scientists understand that those signatures exist, but we didn’t explicitly say they were present in the data. If there are specific signatures like that, machine learning models tend to pick them up. But the other aspect, as you mentioned, is that with climate change, we’re going to see an increasing number of such extreme weather events, and not all of them will be predictable.

In the example I mentioned, we did predict early, but in other cases we may not be able to, because these are what we call chaotic events. What you need for that is risk assessment—what’s the probability it’s going to hit the coast of Florida, [for example], and that probability [keeps] getting updated as the hurricane comes closer. But there is uncertainty. What AI-based models help with is reducing that uncertainty, but also giving us an accurate measure of what that uncertainty is. 

This is where we can run different scenarios. [If] I observe a storm developing somewhere far away from the coast, I can slightly add noise to the current measured variables and look at how the path of my forecast will change—how my forecast is sensitive to different perturbations. This is what we call an ensemble. In current weather systems, there are only around 50 ensemble members, because ensembles are expensive to generate.

Our model is much faster—tens of thousands of times faster. We’re able to generate much larger ensembles—you can now generate 1,000 or 10,000 different paths. This way, you can come up with very accurate estimates of probability. That’s critical for chaotic events like hurricanes or heat waves, but also for longer ranges. If you go beyond about two weeks, there’s no predictability, but you have averages. At some point, weather becomes sub-seasonal, and if you run it long enough, it’s climate—the average statistics over a long time.

For climate change, you want those statistics, and you need to run these different scenarios and get averages to come up with [them]. That’s where these models are also going to be very impactful. With current climate models, some of the highest-resolution [models] can barely do one run, so they literally just run the trajectory once. But that’s not how we get estimates for making policies or any kind of insurance—all of those require probability estimates. Once we’ve trained these models, people can run them under different scenarios—we call this climate forcing. If there are changes in the model, as long as it’s not a big change, we can fine-tune models and do that—and we continue to do that in so many other applications too.
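The ensemble procedure described above (perturb the measured initial conditions, run many cheap forecasts, and read probabilities off the spread of outcomes) can be sketched in a few lines. Everything here is illustrative: `forecast` is a hypothetical stand-in for a fast AI weather model, and the toy dynamics, names, and thresholds are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def forecast(state):
    """Hypothetical stand-in for a fast AI weather model: maps an
    initial state vector to a single scalar outcome (e.g. a landfall
    coordinate). Uses a toy chaotic map, not real physics."""
    x = state.mean()
    for _ in range(10):
        x = 3.7 * x * (1 - x)  # logistic map step: sensitive to initial conditions
    return x

def ensemble_forecast(initial_state, n_members=10_000, noise_scale=1e-3):
    """Perturb the measured initial conditions with small noise, run one
    forecast per ensemble member, and return the spread of outcomes."""
    outcomes = []
    for _ in range(n_members):
        perturbed = initial_state + rng.normal(0.0, noise_scale, initial_state.shape)
        outcomes.append(forecast(perturbed))
    return np.array(outcomes)

state = np.full(5, 0.4)          # toy "measured variables"
paths = ensemble_forecast(state)
# Probability estimate for some region of interest (invented threshold):
p_hit = (paths > 0.5).mean()
```

Because each ensemble member is just one cheap forward pass through the model, growing the ensemble from ~50 members to 10,000 costs only linearly more compute, which is what makes the probability estimates sharp.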


It sounds like these AI models would also allow updated domain-specific knowledge to be incorporated much more easily and quickly. Would you be able to incorporate [new knowledge] more quickly?

Nuclear fusion is one such example—in our recent paper, we [were] looking to prevent disruptions before they happen in real time, so that we can take corrective actions and prevent damage to the reactor. 

Nuclear fusion has a lot of challenges, but one of the important ones is that you try to bring this hot plasma and confine it—that’s how you hope to get to ignition and start the fusion reaction. But it’s very unstable; it can quickly change, reach a very high temperature near the central coil of the reactor, and damage [it]. You want to be able to keep detecting this and take corrective actions, like reducing the temperature, to prevent damage. Our models are able to train on the camera data of the reactor and, in real time, predict that a disruption is likely to occur. We can do these predictions much faster—in this case, a million times faster than numerical methods. Plasma is a very challenging case because it’s highly sensitive to small changes; its behavior can suddenly become unstable. That’s where I think AI is able to do well, because it’s quite hard to model those effects—when we have experimental data, AI models can directly learn [from] it.


It seems like a field that really benefits from curiosity—could you talk about one of your most unexpected collaborations?

I would say the medical catheter, right here at Caltech. A catheter is how you draw fluids out of the human body—apparently, that is the biggest source of infection in hospitals. People who are in long-term care almost always get an infection through catheters. We were working with many collaborators who knew fluid modeling or worked in healthcare—they initially thought the infection is caused by bacteria swimming upstream. The bacteria are able to do that near the wall [of the catheter] because the flow isn’t as fast there. Thinking of it in terms of the physics of the fluid flow, their idea was introducing triangular shapes [to] create vortices, so it becomes harder for the bacteria to come near the wall and swim upstream. The idea was entirely theirs—we weren’t at all involved.

Then they kept thinking, “Oh, how should that triangle be? Should the triangle be this angle, or that [angle], or equilateral?” That’s where it’s very hard for humans to come up with the optimal shape, because this is a highly nonlinear phenomenon. If you slightly change the shape, you can change bacterial contamination quite a bit. You also don’t want those triangles on the wall to be very big, because you want the fluid flow to be seamless. They came to my students, and that’s where we were able to design the optimal shape of the triangles. Our AI model learned the fluid flow—how the fluid flow would change with each change in the shape, but also how the bacteria [was] swimming upstream. We were able to model that, and we directly optimized and came up with a shape that the AI model said [was] the best.

They went ahead, did the 3D printing—and it resulted in a 100-fold reduction in bacterial contamination. It was done just once, and that’s now patented. In a typical scenario, you would have to go back and forth—the initial design doesn’t work, they come back, and so on. That’s what this really avoided, and it led to such promising results. I was not expecting that [the] one design it proposed [would] lead to such a big impact. I think that’s been very fulfilling, but also surprising.
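The design loop described above (a learned surrogate standing in for expensive fluid simulations, searched over candidate triangle geometries) can be sketched roughly as follows. This is a toy illustration, not the actual method: `surrogate_contamination` is a hypothetical stand-in for a trained AI model, and every number here is invented for the example.

```python
import numpy as np

def surrogate_contamination(angle_deg):
    """Hypothetical stand-in for a trained surrogate that maps a
    triangle apex angle (degrees) to predicted bacterial contamination.
    The real workflow would use a model trained on fluid simulations;
    this toy function simply has one minimum at an invented angle."""
    return (angle_deg - 55.0) ** 2 / 100.0 + 1.0

def optimize_shape(lo=20.0, hi=90.0, n=701):
    """Dense scan over candidate angles through the surrogate. This is
    affordable only because one surrogate evaluation replaces an entire
    fluid-dynamics simulation run."""
    angles = np.linspace(lo, hi, n)          # candidate geometries
    scores = surrogate_contamination(angles)  # predicted contamination for each
    return angles[np.argmin(scores)]          # best candidate found

best = optimize_shape()
```

Because a surrogate evaluation is orders of magnitude cheaper than a full simulation, even a brute-force scan over shapes is practical; with a differentiable surrogate, gradient-based optimization over the geometry is the natural refinement.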