
Clustering is one of the prominent tasks in discovering hidden patterns in data streams. It refers to the process of assigning newly arrived data to segmentation patterns that change continuously and dynamically. This article presents a stream mining algorithm that clusters a data stream with a focus on its evolution and concept drift. Concept drift is caused by changes in the data distribution over time. Even though concept drift is expected to be present in data streams, explicit drift detection is rarely performed in stream clustering algorithms. The relationship between concept drift and the occurrence of physical events is studied by applying the algorithm to an education data stream. The study uses Viber education data streams produced by Viber Groups in our Computer Science Department. The results show that the proposed algorithm outperforms existing ones on the purity, entropy, and sum of squared error measures. The experiments led to the conclusion that concept drift accompanied by a change in the number of clusters and outliers indicates a significant education event. This kind of online monitoring and its results can be utilized in education systems in various ways, such as presenting the capabilities of participants.
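The three evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It uses the common textbook definitions of purity, size-weighted cluster entropy, and sum of squared error, which may differ in detail from the variants used in the article. The function names and the toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def _labels_by_cluster(cluster_ids, true_labels):
    """Group ground-truth labels by the cluster each point was assigned to."""
    groups = defaultdict(list)
    for cluster, label in zip(cluster_ids, true_labels):
        groups[cluster].append(label)
    return groups


def purity(cluster_ids, true_labels):
    """Fraction of points that carry the majority label of their cluster."""
    groups = _labels_by_cluster(cluster_ids, true_labels)
    majority = sum(Counter(ys).most_common(1)[0][1] for ys in groups.values())
    return majority / len(true_labels)


def entropy(cluster_ids, true_labels):
    """Size-weighted average of the label entropy (in bits) of each cluster."""
    groups = _labels_by_cluster(cluster_ids, true_labels)
    n = len(true_labels)
    total = 0.0
    for ys in groups.values():
        size = len(ys)
        h = -sum((k / size) * math.log2(k / size) for k in Counter(ys).values())
        total += (size / n) * h
    return total


def sse(points, cluster_ids):
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    groups = defaultdict(list)
    for point, cluster in zip(points, cluster_ids):
        groups[cluster].append(point)
    total = 0.0
    for pts in groups.values():
        dim = len(pts[0])
        centroid = [sum(p[d] for p in pts) / len(pts) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in pts for d in range(dim))
    return total


# Toy data (hypothetical): six 2-D points in two clusters with known labels.
points = [(0, 0), (2, 0), (1, 0), (10, 10), (10, 12), (10, 11)]
cluster_ids = [0, 0, 0, 1, 1, 1]
true_labels = ["a", "a", "b", "b", "b", "b"]

print(purity(cluster_ids, true_labels))
print(entropy(cluster_ids, true_labels))
print(sse(points, cluster_ids))
```

Higher purity is better, while lower entropy and lower sum of squared error are better. In a streaming setting these measures would typically be recomputed per window or time horizon rather than once over the whole data set.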
Big Data, Chemistry, Clustering Educational Data, Physics, QC1-999, QD1-999, Data Stream Clustering Algorithms
