Clear data for clear air: data science gives new insight into air pollution in the US

At MIT’s Henry W. Kendall Memorial Lecture, Harvard professor Francesca Dominici illuminates the interplay between air pollution, environmental injustice, and Covid-19.

“To do really important research in environmental policy,” said Francesca Dominici, “the first thing we need is data.” Dominici is a professor of biostatistics at the Harvard T.H. Chan School of Public Health, and co-director of the Harvard Data Science Initiative. By leveraging massive amounts of data, Dominici and a consortium of her colleagues across the nation are revealing, on a grand scale, the effects air pollution levels have on human health in the United States. Their efforts are critical for providing a data-driven foundation on which to build environmental regulations and human health policy. “When we use data and evidence to inform policy, we can get very excellent results,” said Dominici.

Overall, air pollution has dropped dramatically nationwide in the past twenty years, thanks to regulations dating back to the Clean Air Act of 1970. “On average, we are all breathing cleaner air,” said Dominici. But the research efforts of Dominici and her colleagues show that even relatively low air pollution levels, like those currently present in much of the country, can fall well within national regulations and still be harmful to health. Moreover, recent patterns of decreasing air pollution have left certain geographic areas worse off than others, and exacerbated environmental injustice in the process. “We are not cleaning the air equally for all of the racial groups,” said Dominici.

Speaking over Zoom to audience members tuning in from around the world, Dominici presented these findings and discussed the underlying methodologies at the 18th Henry W. Kendall Memorial Lecture on April 21st. This annual lecture series, which is co-sponsored by the MIT Center for Global Change Science (CGCS) and MIT’s Department of Earth, Atmospheric and Planetary Sciences (EAPS), honors the memory of the late MIT professor of physics Henry W. Kendall. Professor Kendall was instrumental in bringing awareness of global environmental threats to the world stage through the World Scientists’ Warning to Humanity in 1992 and the Call for Action at the Kyoto Climate Summit in 1997. The Kendall Lecture spotlights leading global change science by outstanding researchers, according to Ron Prinn, TEPCO Professor of Atmospheric Science in EAPS and Director of CGCS.

Video: How Much Evidence Do You Need? 18th Henry W. Kendall Lecture

In the various studies Dominici discussed, she and her colleagues homed in on a specific kind of harmful air pollution called fine particulate matter, or PM2.5. These tiny particles, less than 2.5 microns in diameter, come from a variety of sources including vehicle emissions and industrial facilities that burn fossil fuel. “Particulate matter can penetrate very deep into the lungs [and] it can get into our blood,” said Dominici, noting that this can lead to systemic inflammation, cardiovascular disease, and a compromised immune system.

To analyze how much of a risk PM2.5 poses to human health, Dominici and her colleagues turned to the data, specifically to large datasets about people and the environment they experience. One dataset provided fine-grained information on the more than 60 million Americans enrolled in Medicare, including not only their health history, but also factors like socioeconomic status and ZIP code. Meanwhile, a team led by Joel Schwartz, a professor of environmental epidemiology at the Harvard T.H. Chan School of Public Health, amassed satellite data on air pollution, weather, land use, and other variables, combined it with air quality data from the EPA’s national network, and created a model that provides daily levels of PM2.5 for every square kilometer in the continental United States over the last 20 years. “In this way we could assign, to every single person enrolled in the Medicare system, their daily exposure to PM2.5,” said Dominici.
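For readers curious how such an assignment might work in practice, the linkage described above can be pictured as a lookup that joins each person's location to a modeled pollution value for each day. The sketch below is illustrative only: the records, the ZIP-code-level lookup table, and the function names are all hypothetical, and the researchers' actual pipeline (which works from a one-square-kilometer grid) is not described in the lecture coverage.

```python
from datetime import date

# Hypothetical modeled PM2.5 levels (micrograms per cubic meter),
# keyed by (ZIP code, day). Values are made up for illustration.
pm25_by_zip_day = {
    ("02115", date(2016, 7, 1)): 7.9,
    ("02115", date(2016, 7, 2)): 9.4,
    ("10027", date(2016, 7, 1)): 11.2,
    ("10027", date(2016, 7, 2)): 12.6,
}

# Hypothetical enrollee records: an identifier and a residential ZIP code.
enrollees = [
    {"id": "A", "zip": "02115"},
    {"id": "B", "zip": "10027"},
]


def daily_exposure(enrollee, day, grid):
    """Return the modeled PM2.5 level for the enrollee's ZIP code on one day."""
    return grid.get((enrollee["zip"], day))  # None when no estimate exists


def mean_exposure(enrollee, days, grid):
    """Average the available daily PM2.5 levels over a list of days."""
    values = [daily_exposure(enrollee, d, grid) for d in days]
    known = [v for v in values if v is not None]
    return sum(known) / len(known) if known else None


days = [date(2016, 7, 1), date(2016, 7, 2)]
exposures = {e["id"]: mean_exposure(e, days, pm25_by_zip_day) for e in enrollees}
```

Here `exposures` maps each hypothetical enrollee to an average exposure over the two days, which is the kind of per-person quantity a health analysis could then relate to outcomes.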

Combining and analyzing these datasets provided a holistic look at how PM2.5 affects the population enrolled in Medicare, and yielded several important findings. Based on the current national ambient air quality standards (NAAQS) for PM2.5, levels below 12 micrograms per cubic meter are considered “safe.” However, Dominici’s team pointed out that even levels below that standard are associated with a higher risk of death. They further showed that making air quality regulations more stringent by lowering the standard to 10 micrograms per cubic meter would save an estimated 140,000 lives over the course of a decade.

The scope of the datasets enabled Dominici and her colleagues to use not only traditional statistical approaches, but also a method called matching. They compared pairs of individuals who had the same occupations, health conditions, and racial and socioeconomic profiles, but who differed in terms of PM2.5 exposure. In this way, the researchers could eliminate potential confounding factors and lend further support to their findings.
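As a rough illustration of the idea behind matching, the sketch below pairs people who share an identical profile but differ in exposure, then compares outcomes within each pair. It assumes the simplest variant, exact one-to-one matching on a few categorical traits; the records, field names, and the binary "high exposure" flag are hypothetical, and the article does not specify which matching procedure the researchers actually used.

```python
from collections import defaultdict

# Hypothetical records: a covariate profile (occupation, health condition,
# income group), an exposure flag, and an outcome (1 = died during follow-up).
people = [
    {"profile": ("retired", "diabetes", "low_income"), "high_pm25": True, "died": 1},
    {"profile": ("retired", "diabetes", "low_income"), "high_pm25": False, "died": 0},
    {"profile": ("retired", "none", "high_income"), "high_pm25": True, "died": 0},
    {"profile": ("retired", "none", "high_income"), "high_pm25": False, "died": 0},
    # No low-exposure counterpart exists for this profile, so it goes unmatched.
    {"profile": ("farmer", "asthma", "low_income"), "high_pm25": True, "died": 1},
]


def match_pairs(records):
    """Pair each high-exposure person with a low-exposure person who has
    exactly the same profile; people without a counterpart are dropped."""
    groups = defaultdict(lambda: {True: [], False: []})
    for r in records:
        groups[r["profile"]][r["high_pm25"]].append(r)
    pairs = []
    for g in groups.values():
        pairs.extend(zip(g[True], g[False]))  # zip discards unmatched extras
    return pairs


def mean_outcome_difference(pairs):
    """Average within-pair difference in outcome (high minus low exposure)."""
    if not pairs:
        return None
    diffs = [hi["died"] - lo["died"] for hi, lo in pairs]
    return sum(diffs) / len(diffs)


pairs = match_pairs(people)
effect = mean_outcome_difference(pairs)
```

Because the two people in each pair are alike on every recorded trait, a difference in outcomes is harder to attribute to those traits, which is why matching helps rule out confounding. It cannot account for traits that were never recorded.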

Their research also illuminated issues of environmental injustice. “We started to see some drastic environmental differences in risk across socioeconomic and racial groups,” said Dominici. Black Americans have a risk of death from exposure to PM2.5 that is three times higher than the national average. Asian and Hispanic populations, as well as people with low socioeconomic status, are also more at risk than the national population as a whole.

One factor behind these discrepancies is that air pollution has been decreasing at different rates in different parts of the country over the past twenty years. In 2000, nearly the entire eastern half of the US had relatively high levels of PM2.5 at 8 micrograms per cubic meter or higher. In 2016, those pollution levels had dropped dramatically across much of the map, but remained high in areas with the highest proportions of Black residents. “Racial inequalities in air pollution exposure are actually increasing over time,” said Dominici. She noted that one thing to consider is whether future regulations can tackle such inequities while also lowering air pollution for the entire nation on average.

Issues of both air pollution and environmental injustice have been thrown into stark relief during the Covid-19 pandemic. An early study on Covid-19 and air pollution led by Dominici showed that long-term exposure to higher levels of air pollution increased the risk of dying from Covid-19, and that areas with more Black Americans are even more at risk. Additional research showed that during last year’s wildfire season in California, up to 50% of Covid-19 deaths in some areas were attributable to the spikes in PM2.5 that result from wildfires.

Due to a lack of data on individual Covid-19 patients, some of these analyses were based on county-level data, which Dominici noted was a major limitation. “Fortunately, in some geographical areas, we’ve started getting access to individual-level records,” said Dominici. Access to more and better data has sparked additional research around the world on the link between air pollution and Covid-19. Dominici was also part of an international collaboration that estimated, for example, that 13% of Covid-19 deaths in Europe were attributable to fossil-fuel related emissions.

For Dominici, “a data scientist at heart,” findings like these highlight the role of data science in influencing critical environmental policy decisions. “Our all being devastated by this pandemic could provide an additional source of evidence of the importance of controlling fossil-fuel related emission.”

Story Image: A map of the NYC tri-state area showing the average tropospheric NO2 concentrations prior to the March 2020 COVID-19 stay-at-home measures, superimposed over the hazy NYC skyline. Credit: courtesy NASA