Deep Learning for Air Quality | Sep 4, 2025

This is a poster I am going to present at the 2025 SMARTFIRES all-hands meeting (a gathering for the grant that paid me this summer) that I’ve transferred from LaTeX to HTML for this site.

I have mixed feelings about the research this summer, but they tell me I did a pretty decent job. I definitely went a little off track at times. I didn’t get to the deep learning part of the project until late in the summer, had quite a time of it learning PyTorch while working with neural nets for the first time, and ended up running out of time without getting a positive result. This poster is also my first attempt at any kind of scientific writing, and as clumsy as it might be, I am quite happy with it for the amount of time I got to spend on it.

Overall I learned quite a bit, and I’m staying on for the semester to continue working on this, which is nice :)

- Lorn  






Extending Deep Learning Models of Wildfire Spread to Air Quality


Lorn Jaeger Rupert Williams Kyle Krstulich Jake Bova
Jesse Johnson


Introduction


The AI/ML research thrust of the SMART Fires project aims to model prescribed fires and their emissions. This poster builds on earlier work predicting next-day wildfire spread, extending the approach to a new target variable, namely PM 2.5, a major component of harmful emissions. The ultimate goal is to predict how prescribed burn emissions will disperse, given information about the area and expected fire spread, helping land managers plan burns that minimize risks to air quality and public health.


Dataset


The base of our training dataset comes from the WildfireSpreadTS (WSTS) next-day wildfire spread benchmark. WSTS is a collection of 607 U.S. fire events (2018–2021), with daily observations for each fire, including fuel/vegetation, topography, current and forecasted weather, and landcover.

Table 1. WildfireSpreadTS Fields
Category Source Resolution Feature
Vegetation VIIRS 375–750 m VIIRS bands M11, I2, I1; NDVI; EVI2
Weather GridMET 4 km Total precipitation; Wind speed and direction; Min/Max temperature; Energy release component; Specific humidity; PDSI
Topography USGS 30 m Slope; Aspect; Elevation; Landcover type
Forecast Weather GFS 28 km Total precipitation; Wind speed and direction; Temperature; Specific humidity
Target VIIRS 375 m Active fire

WSTS improved upon prior datasets by adding multi-day time series inputs and higher temporal resolution paired with high spatial resolution. Our lab has further extended it with several additional years of fires, bringing the total to 1,067 events from 2016–2024.


Selecting a Source of PM 2.5 Data


Selecting the best source of PM 2.5 data is not a trivial task. There are many datasets available, but they vary in the methods used to produce estimates, usually relying on a combination of physics-based models, reanalysis, and satellite measurements of aerosol optical depth. We evaluated four of these.

CAMS (Copernicus Atmosphere Monitoring Service) is a global, 3-hourly PM 2.5 forecast from the European Centre for Medium-Range Weather Forecasts that uses the IFS chemistry model, assimilates satellite aerosol optical depth, and reports PM 2.5 directly without sensor correction. MERRA-2 (NASA) is a global GOCART reanalysis where PM 2.5 is calculated from aerosol components; it does not use ground sensors directly. MERRA2R is the same dataset but bias-corrected against AirNow monitors to improve surface accuracy. NCAR (National Center for Atmospheric Research) provides a 12 km CONUS reanalysis using the WRF-CMAQ model that assimilates satellite AOD, designed for higher resolution U.S. air quality.

We tested each source by checking its accuracy against a dataset of PM 2.5 sensor readings provided by AirNow.

Figure 1. AirNow sensor locations.

The metrics we really care about are RMSE, which reflects overall accuracy, and slope, which shows how well the dataset scales as PM 2.5 concentrations increase. Because these datasets tend to underpredict high PM 2.5 events, a slope closer to 1 means the dataset may better capture the high PM 2.5 coming from a fire.
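As a rough sketch of how these two metrics can be computed (the arrays here are made-up toy values, not real AirNow or reanalysis data):

```python
import numpy as np

def evaluate_source(pred, obs):
    """Compare a candidate PM 2.5 dataset (pred) against sensor readings (obs)."""
    rmse = float(np.sqrt(np.mean((pred - obs) ** 2)))
    # Slope of the least-squares fit pred ≈ slope * obs + intercept;
    # a slope near 1 means the dataset scales with true concentrations.
    slope, _intercept = np.polyfit(obs, pred, 1)
    return rmse, float(slope)

# Toy example: a source that underpredicts high concentrations.
obs = np.array([5.0, 10.0, 20.0, 40.0, 80.0])
pred = np.array([6.0, 11.0, 18.0, 30.0, 50.0])
rmse, slope = evaluate_source(pred, obs)  # slope comes out well below 1
```

A slope well below 1, as in this toy case, is exactly the pattern that makes a source unsuitable for capturing fire-driven PM 2.5 spikes.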

Table 2. Performance metrics by source. Bold values indicate best performance per column.
Source Spearman RMSE Slope
NCAR 0.475 20.600 1.336
CAMS 0.352 40.890 0.189
MERRA2 0.426 54.019 0.166
MERRA2R 0.493 20.323 1.071

MERRA2R outperforms the others on all three metrics while being available globally from 2001–2024, and is what we will use as the target variable for our training runs.


Model Selection & Training


Training a model to predict fire spread is a binary semantic segmentation task: the input is a multichannel raster built from the WildfireSpreadTS dataset, the target is a binary mask of the next day’s burned pixels, and the output is a raster of per-pixel probabilities of burning the next day. To train a model to predict PM 2.5 we do something similar. Each input is the same raster with MERRA2R PM 2.5 added, and the target is a mask of healthy / unhealthy PM 2.5 concentrations (under / over 80 µg/m³).

There are a few options for model architecture. Every architecture is a deep neural net that takes a tensor representation of our raster as input and applies some form of convolution, sometimes also using techniques like attention to take advantage of WSTS’s temporal component. Previous work has used U-Net, ConvLSTM, SwinUnet, and U-TAE architectures. We chose U-TAE, which had performed best on the original WildfireSpreadTS dataset.

Figure 2. Illustration of the U-Net and UTAE models.

We train with three loss functions designed for binary semantic segmentation. Binary Cross-Entropy (BCE) evaluates each pixel individually:

L_BCE = average over all pixels of [ -(y * log(ŷ) + (1 - y) * log(1 - ŷ)) ]

Jaccard Loss optimizes for overlap between predicted and true masks:

IoU = (intersection of prediction and ground truth) / (union of prediction and ground truth)
L_Jaccard = 1 - IoU

Dice Loss is similar to Jaccard but gives more weight to small regions:

DSC = (2 * intersection of prediction and ground truth) / (size of prediction + size of ground truth)
L_Dice = 1 - DSC

BCE focuses on per-pixel accuracy, while Jaccard and Dice better capture spatial overlap, which is useful for segmentation.
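The three losses above can be written down directly from their definitions. This sketch uses numpy and "soft" (probability-weighted) intersections rather than hard masks, a common choice for differentiable training, but it is not the exact implementation used in our runs:

```python
import numpy as np

def bce_loss(y, p, eps=1e-7):
    # Per-pixel binary cross-entropy, averaged over the raster.
    p = np.clip(p, eps, 1 - eps)  # avoid log(0)
    return float(np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p))))

def jaccard_loss(y, p, eps=1e-7):
    # 1 - IoU, with a soft intersection of probabilities and mask.
    inter = np.sum(y * p)
    union = np.sum(y) + np.sum(p) - inter
    return float(1 - (inter + eps) / (union + eps))

def dice_loss(y, p, eps=1e-7):
    # 1 - DSC: twice the intersection over the summed sizes.
    inter = np.sum(y * p)
    return float(1 - (2 * inter + eps) / (np.sum(y) + np.sum(p) + eps))

y = np.array([[1.0, 0.0], [1.0, 0.0]])   # ground-truth mask
p = np.array([[0.9, 0.1], [0.8, 0.2]])   # predicted probabilities
```

A perfect prediction (p equal to y) drives all three losses to zero, which is a quick sanity check when wiring these up.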

Figure 3. Model Training Diagram.

For training we use the setup established by our lab’s previous modeling improvements. The previous five days of data are used as input, with dates encoded positionally. Our encoder is initialized with weights from a pre-trained ResNet-18 model and our decoder is trained from scratch. We run a grid search over all three loss functions. Our split uses 2020 as the validation set, 2021 as the test set, and the rest of the dataset for training.
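The year-based split can be sketched as follows (`fire_years`, mapping a fire event ID to the year it burned, is a hypothetical structure for illustration):

```python
def split_by_year(fire_years):
    """Partition fire events into train/val/test by burn year."""
    train, val, test = [], [], []
    for event, year in fire_years.items():
        if year == 2020:      # validation year
            val.append(event)
        elif year == 2021:    # held-out test year
            test.append(event)
        else:                 # everything else is training data
            train.append(event)
    return train, val, test

events = {"fire_a": 2018, "fire_b": 2020, "fire_c": 2021, "fire_d": 2023}
train, val, test = split_by_year(events)
```

Splitting by whole years, rather than randomly by event, keeps fires from the same season (and the same weather regime) out of both train and test.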


Results


Each model failed to converge to anything predictive, with average precision under 0.09 and flat loss curves. Several factors may have contributed.

Future directions include expanding the spatial domain of the dataset to better capture full smoke plumes, incorporating covariates more directly relevant to PM 2.5 formation and transport, evaluating alternative PM 2.5 datasets with finer resolution or better integration with wildfire models, and including future burn area in the model input. We would also like to focus on predicting other harmful emissions more localized to the fire source.



