Abstract: Cluster computing was introduced as an alternative to supercomputers and can address problems that supercomputers cannot handle effectively. In this paper, we evaluate the performance of cluster computing by executing a data mining technique in a cluster environment. The experiment predicts flight delays using the random forest algorithm, with Apache Spark as the cluster computing framework. The results show that a cluster of five PCs with equal specifications can increase computation performance by up to 39.76% compared with a standalone machine. Attaching more nodes to the cluster makes the process significantly faster.

Keywords: cluster computing, random forest, flight delay prediction, PySpark, Apache Spark.
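To put the reported figure in context, the sketch below relates a percentage improvement to a speedup factor and to per-node efficiency. It assumes the 39.76% refers to a reduction in wall-clock time relative to the standalone run; the abstract does not state how the figure was computed, so this reading is an assumption. The function names are hypothetical and the calculation is illustrative, not the authors' method.

```python
def runtime_reduction_pct(t_standalone: float, t_cluster: float) -> float:
    """Percentage reduction in wall-clock time relative to the standalone run."""
    if t_standalone <= 0 or t_cluster < 0:
        raise ValueError("timings must be positive")
    return (t_standalone - t_cluster) / t_standalone * 100.0


def speedup_factor(reduction_pct: float) -> float:
    """Convert a runtime reduction (in percent) to a speedup T_standalone / T_cluster."""
    if not 0.0 <= reduction_pct < 100.0:
        raise ValueError("reduction must be in [0, 100)")
    return 1.0 / (1.0 - reduction_pct / 100.0)


def parallel_efficiency(speedup: float, nodes: int) -> float:
    """Speedup divided by node count; 1.0 would be ideal linear scaling."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return speedup / nodes


if __name__ == "__main__":
    # Assumption: 39.76% is a runtime reduction on a 5-node cluster.
    s = speedup_factor(39.76)
    print(f"speedup: {s:.2f}x")
    print(f"efficiency on 5 nodes: {parallel_efficiency(s, 5):.2f}")
```

Under this reading, a 39.76% reduction corresponds to a speedup of roughly 1.66x, well below the ideal 5x for five nodes. Some of that gap would be expected from coordination and data-shuffling overhead in a distributed framework.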
Information Security, Computer Science, Information Technology, Data mining, Cluster Computing
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| Popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| Influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| Impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| Views | | 7 |
| Downloads | | 10 |

Views provided by UsageCounts
Downloads provided by UsageCounts