According to Thomson Reuters, people who live in places with high levels of air pollutants have a 20% higher risk of dying from lung cancer than those living in less polluted areas. TADA can help meteorologists anticipate pollution peaks before they occur, giving them the means to take action to protect populations.

Industry

Environment

Project Duration and Effort

One week

Type of Prediction

Binary Classification

Customer Benefits

  1. 90% accuracy in identifying pollution peaks before they occur.
  2. Identification of which criteria, among the many available, are critical indicators of incoming pollution peaks.

Problem to solve

Pollution is one of the greatest global killers, affecting more than 100 million people. That is comparable to global diseases such as malaria and HIV, according to the World Health Organization (WHO), which reports that 4.2 million deaths each year result from ambient (outdoor) air pollution. Moreover, according to Our World in Data, 95% of the global population is exposed to average particulate matter concentrations that exceed the WHO's prescribed limit of 10 micrograms per cubic meter.

We want to see whether TADA can predict pollution peaks based on a dataset of 602 daily air quality records. Each record includes 19 fields of information, which can be summarised as:

  • The season
  • Various measures of carbon monoxide (CO)
  • Various measures of ozone (O3)
  • Various measures of sulfur dioxide (SO2)


Pollution peaks have traditionally been predicted using mathematical models.

TADA’s Machine Learning platform can help meteorologists predict pollution peaks before they occur. A well-anticipated pollution peak makes it possible to take proactive measures to protect populations, such as asking vulnerable people to stay home and limiting traffic.

Objectives

  • Make the distinction between a day with a pollution peak and a typical day with good accuracy
  • Understand which criteria are indicators of incoming pollution peaks

 

This raises the following meteorological question:

Can a pollution peak be anticipated before it occurs?

Solution

Air pollution is a significant problem, especially in urban areas. It is typically caused by fossil fuel combustion in the transportation and industrial sectors, which emits harmful pollutants such as PM2.5 and nitrogen dioxide. While people from all walks of life are affected, the most polluted areas are typically in developing countries. In 2019, the most polluted country in the world was Bangladesh, which had dangerously high PM2.5 concentrations. In the same year, Delhi, India, was ranked as the world’s most polluted capital city for the second year. People worldwide suffer from high exposure to air pollution, which causes millions of premature deaths every year.

In this classification case, TADA predicts pollution peaks with 90% accuracy and a 95% AUC.
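For readers unfamiliar with the two metrics, here is a minimal sketch of how accuracy and AUC are computed for a binary classifier. It is not TADA's implementation: the function names, the 0.5 threshold, and the ten labels and scores are invented purely for illustration and are not taken from the 602-record dataset.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Share of days whose thresholded prediction matches the true label."""
    correct = sum((p >= threshold) == bool(t) for t, p in zip(y_true, y_prob))
    return correct / len(y_true)


def roc_auc(y_true, y_prob):
    """Probability that a random peak day scores higher than a random
    typical day (ties count as half)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = pollution peak, 0 = typical day.
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1]
# Hypothetical predicted probabilities of a peak.
y_prob = [0.9, 0.2, 0.4, 0.7, 0.1, 0.45, 0.6, 0.3, 0.05, 0.8]

print(accuracy(y_true, y_prob))  # 0.8
print(roc_auc(y_true, y_prob))   # 23/24, about 0.958
```

Accuracy depends on the chosen threshold, whereas AUC measures how well the scores rank peak days above typical days across all thresholds, which is why the two figures can differ.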

TADA has selected the following four main criteria out of the 19 available in the dataset:

  • The O3 maximum record, which accounts for 34% of TADA’s decision
  • The air quality index for ozone, which accounts for 33% of TADA’s decision
  • The season, which accounts for 17% of TADA’s decision
  • The average record for carbon monoxide, which accounts for 14% of TADA’s decision

Notably, ozone levels, combined with the season, are an essential indicator of incoming pollution peaks.
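As a quick check on the weights listed above, the following sketch tallies them. The dictionary keys are shorthand labels chosen here for illustration, not field names from the dataset.

```python
# Weights (in %) of the four main criteria, as listed above.
weights = {
    "O3 maximum record": 34,
    "Air quality index for ozone": 33,
    "Season": 17,
    "Average carbon monoxide record": 14,
}

# Combined share of the two ozone-related criteria.
ozone_share = sum(w for name, w in weights.items()
                  if "O3" in name or "ozone" in name)

# Combined share of all four main criteria.
top_four_share = sum(weights.values())

print(ozone_share)     # 67
print(top_four_share)  # 98
```

The two ozone-related criteria together account for about two thirds of the decision, and the four main criteria for 98% of it, which suggests the remaining fields contribute very little.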

Live Predict

Customer Benefits

In one week, meteorologists were equipped with an accurate model for predicting pollution peaks. They were able to:

  • Distinguish between an incoming pollution peak and a typical day with an accuracy of 90% and an AUC of 95%
  • Obtain an immediate “what-if” analysis (live predict) linking the critical criteria to the likelihood of pollution peaks