Clustering Algorithms based Noise Identification from Air Pollution Monitoring Data

Xinyi Fang, Chak Fong Chong, Xu Yang, Yapeng Wang

Published in 2022 at the 2022 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE)

ABSTRACT

Noise detection has been widely discussed as data science has developed, yet no single method is universally best. In this paper, we propose a clustering-based solution to identify and remove noise from air pollution data collected with mobile portable sensors. The test dataset consists of air pollution measurements collected by portable sensors over three seasons on a campus in Macao. To find the most appropriate algorithm for this task, we apply and compare six clustering algorithms: Simple K-means, Hierarchical Clustering, Cascading K-means, X-means, Expectation Maximization, and Self-Organizing Map. Their performance is evaluated by accuracy and by the best number of clusters as determined by the Silhouette Coefficient. Additionally, a J48 decision tree classifier is used to extract the key attributes and to identify the noise cluster in future unlabeled data that may contain noise. The experimental results indicate that Expectation Maximization and Cascading K-means perform best. Moreover, temperature and carbon dioxide are vital attributes for identifying the noise cluster.
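The pipeline outlined in the abstract (cluster the readings, choose the number of clusters by Silhouette Coefficient, then train a tree to recognize the noise cluster) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the readings are synthetic, the two columns and their values are hypothetical, the smallest cluster is simply treated as the noise cluster, plain K-means stands in for the six algorithms compared, and scikit-learn's CART decision tree stands in for J48 (a C4.5 implementation).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

FEATURES = ["temperature", "co2"]  # hypothetical attribute names


def make_synthetic_readings(seed=0):
    """Build fake sensor readings: 300 plausible rows plus 30 implausible ones."""
    rng = np.random.default_rng(seed)
    clean = np.column_stack(
        [rng.normal(25.0, 1.0, 300), rng.normal(420.0, 10.0, 300)]
    )
    noisy = np.column_stack(
        [rng.normal(45.0, 1.0, 30), rng.normal(900.0, 10.0, 30)]
    )
    return np.vstack([clean, noisy])


def best_k_by_silhouette(X, k_values=range(2, 7), seed=0):
    """Run K-means for each candidate k and keep the k with the highest silhouette."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    best_k = max(scores, key=scores.get)
    return best_k, scores


readings = make_synthetic_readings()
# Standardize so that CO2 (hundreds of ppm) does not dominate the distances.
X = StandardScaler().fit_transform(readings)

best_k, scores = best_k_by_silhouette(X)
labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)

# Assumption for this sketch only: the smallest cluster is the noise cluster.
noise_cluster = int(np.argmin(np.bincount(labels)))
is_noise = labels == noise_cluster

# CART tree (not J48) trained on the raw readings to flag future noisy rows.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(readings, is_noise)
importances = dict(zip(FEATURES, tree.feature_importances_))

print("best k:", best_k)
print("rows flagged as noise:", int(is_noise.sum()))
print("feature importances:", importances)
```

Training the tree on the raw rather than standardized readings keeps its split thresholds in physical units, which makes the learned rule easier to read. On real data the noise cluster would need to be identified by inspection or labels instead of by cluster size.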
