Application of Improved DBSCAN Clustering Algorithm on Industrial Fault Text Data

Publication: Article, 20 Jul 2020
Publisher: IEEE
Journal: 2020 IEEE 18th International Conference on Industrial Informatics (INDIN)

Authors: Xiaohan Wang; Lin Zhang; Xuesong Zhang; Kunyu Xie

doi: 10.1109/indin45582.2020.9442093

Abstract

Industrial fault text data are a special type of short text drawn from factory fault records. Clustering these data can reduce redundancy and uncover hidden information, which is of great significance for improving their utilization. Because industrial fault text data are unstructured and irregular, clustering them faces several challenges. This paper introduces some existing algorithms for short-text clustering and briefly analyzes their shortcomings. It argues that the main problem in clustering industrial fault text data is the contradiction between the clustering requirements and the parameter settings, which leads to low accuracy when clustering corpora of different sizes. To increase clustering accuracy, an improved clustering algorithm is proposed that resolves this contradiction. Comparative experiments show that the improved algorithm outperforms DBSCAN on industrial fault text corpora of different sizes.
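To make the parameter issue concrete, here is a minimal sketch of plain DBSCAN (the baseline, not the improved algorithm proposed in the paper) applied to a handful of short texts. The whitespace tokenization, the Jaccard distance, the sample records, and the names `jaccard_distance`, `dbscan`, `eps`, and `min_pts` are all assumptions made for illustration; the abstract does not specify the text representation or settings the authors used.

```python
from collections import deque

def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two token sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def dbscan(texts, eps, min_pts):
    """Plain DBSCAN over whitespace-tokenized texts; label -1 marks noise."""
    tokens = [set(t.lower().split()) for t in texts]
    n = len(tokens)
    # Each neighborhood includes the point itself, as in standard DBSCAN.
    neighbors = [
        [j for j in range(n) if jaccard_distance(tokens[i], tokens[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbors[i])
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point becomes border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # only core points expand the cluster
    return labels

# Hypothetical fault records, invented for this sketch.
records = [
    "motor bearing overheated",
    "motor bearing overheated again",
    "bearing of motor overheated",
    "conveyor belt slipped",
    "conveyor belt slipped off",
    "sensor cable broken",
]

print(dbscan(records, eps=0.3, min_pts=2))  # [0, 0, 0, 1, 1, -1]
print(dbscan(records, eps=0.3, min_pts=3))  # [0, 0, 0, -1, -1, -1]
```

With `min_pts=2` the two conveyor records form their own cluster, while `min_pts=3` turns them into noise. A fixed parameter pair that suits one corpus size can therefore fail on another, which is one plausible reading of the contradiction the abstract describes.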

Related Organizations

Jilin University, China (People's Republic of)
Beihua University, China (People's Republic of)
Beihang University, China (People's Republic of)

Impact by BIP!

selected citations: 2. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
