Air Quality Data Sources
Learning Objectives
- Compare regulatory monitoring networks with low-cost sensor deployments
- Navigate EPA AQS and AirNow data portals
- Evaluate data quality using accuracy, precision, and completeness metrics
- Understand satellite-derived air quality products and their limitations
- Access and format air quality datasets for analysis
The Data Landscape
"Air quality monitoring has evolved from sparse regulatory networks to dense deployments of low-cost sensors and global satellite coverage. How do we integrate these different data sources while understanding their strengths and limitations?"
Types of Air Quality Monitoring
| Type | Instruments | Strengths | Limitations |
|---|---|---|---|
| Regulatory (FRM/FEM) | BAM, TEOM, chemiluminescence | High accuracy, QA/QC, legal standing | Sparse network, expensive, reporting lag |
| Low-cost sensors | Plantower PMS, Sensirion SPS30, electrochemical cells | Dense deployment, real-time, affordable | Variable accuracy, drift, interference |
| Satellite | MODIS, TROPOMI, GOES | Global coverage, spatial patterns | Column measurements, cloud interference |
| Mobile monitoring | Vehicle-mounted sensors | High spatial resolution, hot-spot detection | Limited temporal coverage, expensive |
EPA Data Systems
Air Quality System (AQS)
- Historical data back to 1980
- All criteria pollutants and hazardous air pollutants (HAPs)
- Hourly, daily, annual summaries
- Site metadata and methods
- Access: AQS API or pre-generated file downloads
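As a rough illustration of API access, the sketch below only builds request URLs for daily PM2.5 summaries and does not call the service. The base URL, the `dailyData/byState` endpoint, the query field names, and parameter code 88101 for PM2.5 are assumptions here and should be checked against the current AQS API documentation. The email and key are placeholders, and `aqs_daily_by_state_url` is a hypothetical helper name. Each URL covers one calendar year, on the understanding that the API limits a request to a single year.

```python
from urllib.parse import urlencode

# Assumed base URL; confirm in the AQS API documentation.
AQS_BASE = "https://aqs.epa.gov/data/api"


def aqs_daily_by_state_url(email, key, state_fips, year, param="88101"):
    """Build a request URL for one state's daily summaries in one year."""
    query = {
        "email": email,            # account email registered with the API
        "key": key,                # API key issued for that email
        "param": param,            # 88101 is assumed to be the PM2.5 code
        "bdate": f"{year}0101",    # begin date, YYYYMMDD
        "edate": f"{year}1231",    # end date, same calendar year
        "state": state_fips,       # two-digit state FIPS code
    }
    return f"{AQS_BASE}/dailyData/byState?{urlencode(query)}"


# Placeholder credentials; "06" is the FIPS code for California.
urls = [
    aqs_daily_by_state_url("user@example.com", "YOUR_KEY", "06", year)
    for year in range(2019, 2024)
]
for url in urls:
    print(url)
```

Each URL could then be passed to an HTTP client of your choice. The shape of the JSON response is not shown here because it should be read from the API documentation, not guessed.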
AirNow
- Real-time data (hourly updates)
- AQI calculations and forecasts
- Fire and smoke data integration
- Public-facing with visualizations
- Access: AirNow API, open data
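To make the AQI calculation concrete, here is a minimal sketch of the piecewise-linear interpolation that maps a PM2.5 concentration onto the AQI scale. The breakpoint table is an assumption, written to match EPA's 2024 update as best understood, and should be verified against the current EPA table before use. The function name `pm25_to_aqi` is hypothetical. AirNow's real-time values use NowCast-averaged concentrations, which this sketch does not implement.

```python
# PM2.5 breakpoints in µg/m³: (C_lo, C_hi, I_lo, I_hi).
# Assumed to match EPA's 2024 AQI update; verify before relying on them.
PM25_BREAKPOINTS = [
    (0.0, 9.0, 0, 50),
    (9.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 125.4, 151, 200),
    (125.5, 225.4, 201, 300),
    (225.5, 325.4, 301, 500),
]


def pm25_to_aqi(conc):
    """Interpolate an AQI value from a PM2.5 concentration in µg/m³."""
    # Truncate to 0.1 µg/m³ so values never fall between two categories.
    c = int(conc * 10 + 1e-9) / 10
    for c_lo, c_hi, i_lo, i_hi in PM25_BREAKPOINTS:
        if c_lo <= c <= c_hi:
            # Linear interpolation within the matching category.
            return round((i_hi - i_lo) / (c_hi - c_lo) * (c - c_lo) + i_lo)
    raise ValueError(f"concentration {conc} is outside the breakpoint table")


for value in (5.0, 20.0, 60.0):
    print(value, pm25_to_aqi(value))
```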
Data Quality Metrics
Key Quality Indicators
- Accuracy: Closeness to true value (bias). Assessed via colocation with reference instruments.
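- Precision: Reproducibility of measurements. Quantified as the coefficient of variation (CV) or standard deviation of repeated or collocated measurements.
- Completeness: Percentage of scheduled measurements that are valid. EPA generally requires ≥75% completeness for regulatory calculations.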
- Detection limit: Minimum concentration reliably measured.
- Drift: Change in response over time. Requires periodic calibration.
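The first three indicators can be sketched with the standard library alone. The function names and all sample values below are hypothetical and chosen only so the arithmetic is easy to follow. Bias is taken here as the mean sensor-minus-reference difference, and CV as the sample standard deviation divided by the mean; other definitions are in use.

```python
from statistics import mean, stdev


def mean_bias(sensor, reference):
    """Mean of (sensor - reference) over paired, colocated measurements."""
    return mean(s - r for s, r in zip(sensor, reference))


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


def completeness(valid_count, expected_count):
    """Percentage of expected measurements that are valid."""
    return 100.0 * valid_count / expected_count


# Hypothetical colocation data in µg/m³.
sensor = [12.0, 15.0, 9.0, 20.0]
reference = [10.0, 14.0, 8.0, 18.0]
# Hypothetical simultaneous readings from three identical sensors.
replicates = [10.0, 10.5, 9.5]

print(mean_bias(sensor, reference))          # positive means the sensor reads high
print(coefficient_of_variation(replicates))  # percent
print(completeness(300, 365))                # percent of a daily schedule
```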
For low-cost PM2.5 sensors, EPA's recommended performance targets include R² ≥ 0.70 and a slope of 1.0 ± 0.35 when the sensor is compared with a colocated reference monitor.
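A colocation comparison of this kind can be sketched as an ordinary least-squares fit of sensor readings against reference readings, followed by a check of the two thresholds stated above. The helper names `ols_fit` and `meets_targets` and the data are hypothetical. This checks only R² and slope, not any other metric an evaluation protocol may require.

```python
def ols_fit(x, y):
    """Ordinary least squares of y on x; returns (slope, intercept, r2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r2


def meets_targets(slope, r2, r2_min=0.70, slope_tol=0.35):
    """Check the R² and slope thresholds quoted in the text."""
    return r2 >= r2_min and abs(slope - 1.0) <= slope_tol


# Hypothetical hourly PM2.5 pairs in µg/m³ (x = reference, y = sensor).
reference = [5.0, 10.0, 15.0, 20.0, 25.0]
sensor = [7.0, 13.0, 19.0, 25.0, 31.0]

slope, intercept, r2 = ols_fit(reference, sensor)
print(slope, intercept, r2, meets_targets(slope, r2))
```

In this made-up case the sensor tracks the reference perfectly but reads about 20% high, so it passes both thresholds while still needing a correction factor.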
Satellite-Derived Products
Common Products
| Product | Source | Resolution | Application |
|---|---|---|---|
| AOD (Aerosol Optical Depth) | MODIS, VIIRS | 3-10 km | PM2.5 estimation |
| Tropospheric NO2 | TROPOMI | 3.5 x 7 km | Emission mapping |
| CO column | MOPITT, TROPOMI | 22 km (MOPITT) | Fire, transport |
| O3 column | OMI, TROPOMI | 13 x 24 km (OMI) | Stratospheric ozone |
Key limitation: Satellites measure column-integrated values, not surface concentrations. Statistical models are needed to estimate ground-level pollution from them.
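The simplest possible version of such a statistical model is a straight-line fit of surface PM2.5 against colocated AOD, then applied where only satellite data exist. The sketch below uses made-up numbers and a hypothetical function name `fit_line`. Real AOD-to-PM2.5 models typically add predictors such as meteorology and boundary-layer height, which are omitted here.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares line y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = (
        sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        / sum((xi - mx) ** 2 for xi in x)
    )
    return slope, my - slope * mx


# Hypothetical training pairs: satellite AOD (unitless) over a monitor
# and the monitor's PM2.5 in µg/m³ on the same days.
aod = [0.1, 0.2, 0.3, 0.4, 0.5]
pm25 = [6.0, 10.0, 14.0, 18.0, 22.0]

slope, intercept = fit_line(aod, pm25)

# Estimate surface PM2.5 for a location with AOD but no ground monitor.
estimated_pm25 = intercept + slope * 0.25
print(slope, intercept, estimated_pm25)
```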
Activity: Data Exploration
- Access EPA AQS data for your state for the past 5 years (PM2.5 daily means)
- Download data from at least 3 monitoring sites in different settings (urban, suburban, rural)
- Calculate data completeness for each site and year
- Use the AirNow Fire and Smoke Map to identify low-cost sensors near those sites
- Compare the number of regulatory monitors vs. PurpleAir sensors in your county
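For the completeness step, one possible approach is to count distinct valid days per site and year and divide by the number of days in that year. The record layout, the site label `site_A`, and the function name are hypothetical, and the sketch assumes an every-day sampling schedule. Monitors that sample every third or sixth day would need the expected count adjusted accordingly.

```python
import calendar
from collections import defaultdict
from datetime import date, timedelta


def completeness_by_site_year(records):
    """Percent of days with a valid value, keyed by (site_id, year).

    records: iterable of (site_id, date, value); value is None when the
    daily mean is missing or invalidated.
    """
    valid_days = defaultdict(set)
    for site_id, day, value in records:
        if value is not None:
            valid_days[(site_id, day.year)].add(day)
    return {
        (site_id, year): 100.0 * len(days) / (366 if calendar.isleap(year) else 365)
        for (site_id, year), days in valid_days.items()
    }


# Hypothetical site with every fifth day of 2023 missing.
start = date(2023, 1, 1)
records = [
    ("site_A", start + timedelta(days=i), None if i % 5 == 0 else 8.0)
    for i in range(365)
]

result = completeness_by_site_year(records)
print(result)
```

Site-years with no valid days at all do not appear in the result, so a missing key should be read as 0% completeness.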
Questions to explore:
- How representative are regulatory monitors of neighborhood-level exposure?
- What populations might be underserved by current monitoring?
- How could low-cost sensors fill monitoring gaps?
Key Takeaway
Air quality data comes from diverse sources with different strengths and limitations. Regulatory monitoring provides accurate, legally defensible data but with sparse coverage. Low-cost sensors enable dense networks but require careful calibration and quality control. Satellites offer global coverage but measure different quantities than ground monitors. Effective data science integrates these sources while accounting for their uncertainties.