Abstract:
With the high volume of data captured in computer networks, anomaly detection has become one of the main challenges. To address it, some works have applied machine learning algorithms and feature selection methods using traditional tools that are not designed for big data analysis, while others have run machine learning algorithms on big data frameworks without applying feature selection. In this paper, we propose an approach that aims to detect network intrusions with higher accuracy while using a minimal number of features and supporting massive data. The approach combines machine learning algorithms, feature selection methods, and the Spark framework. For the experiments, we use the UNSW-NB15 dataset. The results obtained and the comparisons carried out show that the proposed approach provides better accuracy using a small subset of features.