Integration of Detection Techniques and Machine Learning to Improve Data Quality in Atmospheric Monitoring

Authors

  • Eladio Quintero, Universidad Tecnológica de Panamá, Panama
  • Jonathan González, Universidad Tecnológica de Panamá, Panama
  • Felisindo García, Universidad Tecnológica de Panamá, Panama
  • Edwin Collado, Universidad Tecnológica de Panamá, Panama; Centro de Estudios Multidisciplinarios en Ciencias, Ingeniería y Tecnología (CEMCIT AIP)
  • Antony García, Universidad Tecnológica de Panamá, Panama
  • Yessica Saez, Universidad Tecnológica de Panamá, Panama; Centro de Estudios Multidisciplinarios en Ciencias, Ingeniería y Tecnología (CEMCIT AIP)

DOI:

https://doi.org/10.18687/LACCEI2025.1.1.1268

Keywords:

Data analysis, convolutional autoencoder, machine learning, particulate matter, environmental monitoring.

Abstract

Concentrations of particulate matter (PM) in the air pose a significant risk to human health and the environment. Accurate measurement of these pollutants is critical for effective air quality management; however, monitoring stations produce erroneous and inconsistent data that undermine the reliability of the analysis. This study presents and compares several methods based on data science and machine learning to correct and improve the quality of PM measurements. The methodology comprises an exploratory data analysis to identify temporal patterns in PM pollution, detection and removal of outliers using the interquartile range method, normalization and transformation of temporal variables, and a convolutional autoencoder model for correcting missing data. The methodology was applied to a dataset collected by a monitoring station in Panama. The results showed that removing outliers significantly reduced the distortion in the data, while the autoencoder achieved a moderate reconstruction of missing values, with a mean absolute error (MAE) of 0.1322 and a coefficient of determination R² of 0.5770. The findings suggest that combining statistical techniques with machine learning models improves the reliability of PM monitoring data, providing more accurate information for environmental decision-making. In addition, this study opens new lines of research, such as the development of low-cost correction models for community stations, the analysis of the impact of meteorological events on particulate matter concentrations, and the comparison of pollution patterns in different urban environments.
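
The statistical steps and the two reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation. The 1.5 × IQR fence multiplier, the min-max scaling, the sine/cosine encoding of the hour, and every function name and sample reading below are assumptions chosen for illustration; the abstract does not specify them. The convolutional autoencoder itself is not shown.

```python
import math
import statistics


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR.

    k = 1.5 is the conventional multiplier, assumed here.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(values, k=1.5):
    """Keep only the readings that fall inside the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if lower <= v <= upper]


def min_max_normalize(values):
    """Scale readings to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def encode_hour(hour):
    """Map an hour of day (0-23) to (sin, cos) so 23:00 and 00:00 stay close."""
    angle = 2.0 * math.pi * hour / 24.0
    return math.sin(angle), math.cos(angle)


def mae(y_true, y_pred):
    """Mean absolute error between observed and reconstructed values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up PM readings with one obvious spike; not data from the study.
    readings = [12.0, 14.0, 13.0, 15.0, 11.0, 250.0, 14.0, 13.0]
    cleaned = remove_outliers(readings)
    print(cleaned)
    print(min_max_normalize(cleaned))
    print(encode_hour(6))
```

In a pipeline of this kind, the metrics would typically be computed on values that were deliberately masked before reconstruction, so that the model output can be compared against known ground truth.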

Published

2025-04-09

How to Cite

Quintero, E., González, J., García, F., Collado, E., García, A., & Saez, Y. (2025). Integration of Detection Techniques and Machine Learning to Improve Data Quality in Atmospheric Monitoring. LACCEI, 1(12). https://doi.org/10.18687/LACCEI2025.1.1.1268
