How can we make AI more equitable? Much of the research around this question is concerned with how AI functions once it is already in use. While that work is undoubtedly important, Emma Pierson thinks that a critical, underappreciated piece of the discussion should occur much earlier. Before thinking about how an AI system operates, Pierson emphasizes that we should consider what it is being used for.
“If you care about equity, you should be thinking about the entire pipeline. You should not just be thinking about how to mitigate bias in tools whose applications are already set,” says Pierson. “You should be thinking more fundamentally: what uses are we going to apply this to from the outset?”
Emma Pierson is a 2023 AI2050 Early Career Fellow and Assistant Professor of Computer Science at Berkeley, affiliated with the Berkeley AI Research Lab, Computational Precision Health, and the Center for Human-Compatible AI. She was recognized with a National Science Foundation CAREER award in 2022, a Rising Star in EECS award in 2018, and a Rhodes Scholarship and Hertz Fellowship in 2014. She was also named to MIT Technology Review's 35 Innovators Under 35 in 2021 and Forbes 30 Under 30 in Science in 2019. She has written about her work for publications including The New York Times, FiveThirtyEight, and Wired.
Pierson develops data science and machine learning methods to study issues of social inequality and healthcare. Her AI2050 project leverages large language models (LLMs) to create structured databases of health equity-related data, and seeks to identify additional promising avenues of research for LLMs to improve health equity. This project addresses Hard Problem #6, which concerns equitable access to AI’s capabilities and benefits.
The AI2050 initiative gratefully acknowledges Fayth Tan for assistance in producing this community perspective.
You’re a computer scientist who is researching health equity. How did you find yourself working on these issues?

I started as a physics major — I’ve always been a math nerd. Halfway into college, I got some bad health news. I’d inherited a genetic mutation that predisposed me to a high risk of cancer, [and] it upended my academic trajectory as a 20-year-old. Once I got over the shock, I came across a paper that offered me a lot of hope. It was an AI paper from a professor at Stanford, Daphne Koller. They took images of patients’ cancer pathology slides and used them to predict patient outcomes. They showed that AI [found] features that had not previously been appreciated as important. This was a very exciting possibility to me. I went from being a physics major to being a computer science major, and got into computational biology. It’s not quite my area now, but that was where my interest in AI for healthcare originated.
My interest in equity…I think it came from the statistician’s desire to chase signals in the data. Features like age or race or gender or socioeconomic status produce big signals. It didn’t exactly come from an ideological perspective — [it] was clearly where the signal was, mathematically.
You’ve spoken about pursuing research that is intrinsically equity-enhancing. How does this perspective motivate your research?

Oftentimes we’re trying to mitigate the harms caused by AI from an equity standpoint — for example, reducing bias in AI models. Sometimes overlooked in this discussion is how we can also affirmatively apply AI models to improve equity. The use cases and problems that you choose are a major determinant of the impact of your product. If you’re helping an insurance company’s claim-denial program write slightly fairer notes while it mass-denies people’s claims, that is never going to be intrinsically equity-enhancing. Thinking about that from the outset is important.
AI has powerful applications to improve social equity because we get vast new datasets — from medical images or mobile health apps, for example — that humans are never going to make sense of. AI offers you the ability to make sense of them. If you choose to focus that lens on underserved populations, that’s a very powerful application.
Could you speak to some examples of equity-enhancing applications of AI?

A lot of the work I did in my Ph.D. was focusing on policing. We collected this dataset of 100 million police stops, analyzed it statistically, and found evidence of discrimination. That study eventually influenced policing policy in major American cities.
Near the end of my Ph.D., I started working with mobility datasets, which track people’s movements over time from cell phone data. We started out using them to study segregation — how much the rich and the poor mix. Usually, when we measure segregation, we use Census data, [which] has obvious limitations in terms of quantifying who people actually interact with. Using cell phone data, you can do a better job. While we were doing this work on segregation and how people intermingle, there was [also] this mysterious disease rising, and how people intermingle had important implications for what we later realized was COVID-19. We used that data to study its spread — in particular, how inequality in mobility was linked to inequality in infection rates.
More recently, I’ve been interested in inequality in a whole host of different areas. For example, we took dashcam images from drivers in New York City. We wanted to map out inequality in where the police were deployed. Specifically, we made a map across New York City of where police presence was heaviest — data the police have often been unwilling to release. That work got interest from public defenders.
How do you identify avenues for AI applications that would benefit a community or be of interest to policymakers?

It’s from talking to other folks. The COVID-19 mobility modeling arose from my collaborators’ recognition that this was a growing problem, and also from talking to a lot of public health people. Once we combined our experience with the data, it became clear that this was a powerful application. After we developed our initial COVID-19 model, we talked to departments of public health to build out a dashboard for them.
In New York City, Cornell Tech has deep connections to urban decision-makers, and we talk to them to see how the work could be useful. The policing work was done in close collaboration with journalists, and those journalists wrote articles about the analysis to increase the impact on policymakers.
One context where this happens a lot is in healthcare, where making sense of health data, and the processes that generate it, without a clinician is not something you want to attempt. Working closely with clinicians to understand the problems that matter, but also, on a more basic level, to [understand] where the data comes from — it’s something we do all the time.
Are there challenges that come from working interdisciplinarily?

Getting familiar with the nuances of the data. Each dataset has its own devilish qualities that will totally upend your analysis, and you can’t anticipate them — they’re only learned through difficult and painful experience. Often with difficult datasets, you want to work with an expert in that particular type of data to make sure you’re analyzing it rigorously.
Another challenge is that people in different academic fields are trained to speak different languages and respond to different incentives. I’m trained as a statistician to think in generalities, like the population average, whereas a clinician will often be trained to think about particular patients and the nuances of a situation.
To navigate these challenges, it’s key to focus on the upsides — namely, that a lot of the time this work is only possible through interdisciplinary collaboration.
Academia can be isolating and hyper-specific — could you speak to the importance of being curious and reaching out to people around you?

Read broadly — don’t just read things in your field. When we were working on the policing projects, I read books about mass incarceration and the history of policing in America to get the broader context of what we were working on.
In my life, I’ve been lucky enough to be in spaces where there are people from a whole range of different fields, like the AI2050 fellowship. I’ve realized that there are certain groups of disciplines I particularly enjoy talking to — they’re people who are a medium distance away from me. They’re not computer scientists, but they’re also not too far away — we have some common language. It gives me a bit of a headache to talk to them and do the translation, but I can do it. Finding those medium-distance ties can be intellectually rewarding.
What advice would you give to younger researchers with a technical background who want to apply their skills to equity-focused problems?

Recognize that the problems you [choose to] work on are really critical. Prioritize working with people who will prioritize your intellectual growth, and treat you as a human being and not a means to the end of producing papers. Academia — we can be hit or miss in that regard.
Having a general technical skill set [means] that you have a Swiss Army knife that can be applied to many problems. People will often naturally want to work with you — it’s a question of finding and talking to people who have problems that you can solve, and seeing how you can be helpful.