
This work presents a multi-stage feature selection framework for efficient and accurate classification of encrypted network traffic from flow-level features. Starting from an initial set of 202 features extracted with Tranalyzer2, the pipeline progressively removes redundant features through four stages: domain-driven pruning, correlation analysis, wrapper-based selection, and importance-guided elimination. The framework integrates multiple machine learning models and evaluates them under the same stratified cross-validation protocol, comparing classification performance, training cost, and inference latency. Experimental results show that the approach reduces the feature space to a compact subset of seven features while maintaining high classification accuracy. Among the evaluated models, gradient boosting models, particularly LightGBM, achieve the best trade-off between predictive performance and computational efficiency, making the framework suitable for real-world, resource-constrained deployment scenarios.
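As a rough illustration of the correlation-analysis stage only, the following minimal sketch greedily drops any feature whose absolute Pearson correlation with an already kept feature exceeds a threshold. The 0.9 threshold, the greedy keep-first ordering, the function names, and the toy column names are all assumptions made for illustration; the abstract does not specify how the correlation stage is implemented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: treat as uncorrelated (assumption)
    return cov / sqrt(vx * vy)


def prune_correlated(columns, threshold=0.9):
    """Keep a feature only if |r| with every kept feature is <= threshold.

    `columns` maps feature name -> list of values; earlier features win ties.
    The threshold value is a hypothetical choice, not taken from the paper.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy flow-level data with illustrative (hypothetical) feature names.
flows = {
    "num_pkts_snt": [10, 20, 30, 40, 50],
    "num_bytes_snt": [1000, 2000, 3000, 4000, 5000],  # scales with packet count
    "duration": [0.5, 0.1, 0.9, 0.3, 0.2],
}
selected = prune_correlated(flows)
print(selected)
```

In this toy data the byte count is a linear function of the packet count, so it is dropped as redundant, while the weakly correlated duration column is retained. The later wrapper-based and importance-guided stages would then operate on the surviving columns.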
