Powered by OpenAIRE graph
Found an issue? Give us feedback
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/ International Journa...arrow_drop_down
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Article . 2022
License: CC BY
Data sources: ZENODO
versions View all 2 versions
addClaim

End-to-End Machine Learning Pipeline for Real-Time Network Traffic Classification and Monitoring in Android Automotive

Authors: Sriram M; Susmithaa Raam A; Vignesh B; Dr. Balasubramanian V;

End-to-End Machine Learning Pipeline for Real-Time Network Traffic Classification and Monitoring in Android Automotive

Abstract

The aim of this work is to build a network traffic monitoring application that is capable of categorizing network data traffic based on their application usage into 7 types: Browsing, Chat, Email, File Transfer, Streaming, VoIP and P2P. Flow-wise data is analyzed after the traffic stream is fed into the CICFlowmeter. Live traffic flow is fed to various ML models and algorithms such as K-Means Clustering algorithm, Agglomerative Clustering, Mean-shift algorithm, Random Forest Classifier, Adaptive Boosting algorithm, Gradient Boosting algorithm, Linear Discriminant analysis, Naive Bayes classifier, Classification and regression trees and the Support Vector Machine model. K-fold cross validation test is conducted, which derived results depicting the best of the models to be the Random Forest Classifier. We used 23 features for model training based on their importances. Model evaluation is done using the confusion matrix. Class imbalances are handled effectively with a comparative study of both under-sampling and oversampling of the dataset. Oversampling using SMOTE produces better results. The important timebased features in classification is recorded for further studies. The model used was fast enough to classify the flows in real time and display the analytics in the dashboard. The Flask framework is used to build a live dashboard to display the network traffic classified along with the several important features. We were able to prove that network traffic classification cam be done using time-based features which does not violate data protection laws. Network traffic classification using Random forest algorithm on oversampled dataset gave an overall accuracy of 0.92 was achieved.

Keywords

Machine Learning, Android Automotive, CICFlowmeter, Network Flow Classifier

  • BIP!
    Impact byBIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    2
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
    OpenAIRE UsageCounts
    Usage byUsageCounts
    visibility views 8
    download downloads 10
  • 8
    views
    10
    downloads
    Powered byOpenAIRE UsageCounts
Powered by OpenAIRE graph
Found an issue? Give us feedback
visibility
download
selected citations
These citations are derived from selected sources.
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
views
OpenAIRE UsageCountsViews provided by UsageCounts
downloads
OpenAIRE UsageCountsDownloads provided by UsageCounts
2
Average
Average
Average
8
10
Green
gold