Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Tamilselvan Arjunan
DOI Link: https://doi.org/10.22214/ijraset.2024.58946
Abstract: In light of the increasing sophistication of cyberattacks and the rapid growth in network traffic, it is essential to detect network traffic anomalies or intrusions as they occur. Manual inspection is inefficient due to the large volume, speed, and variety of network traffic data. This paper proposes using deep learning techniques to build intelligent models that can detect network traffic anomalies automatically within big data environments. We present a framework for anomaly detection using long short-term memory (LSTM) and convolutional neural network (CNN) models. The models are trained on data extracted from packet captures. They are evaluated on benchmark intrusion datasets as well as a large-scale real network traffic dataset. The results show that deep learning models are able to detect anomalies more effectively than traditional shallow learning methods. The models can handle high-volume streaming data with low latency and in real time. To improve detection efficiency, we also propose optimization methods such as model compression and transfer learning. This work shows the effectiveness of deep learning for real-time anomaly detection within big data environments.
I. INTRODUCTION
As network bandwidth continues to expand exponentially and new applications emerge, the volume of network traffic data has reached unprecedented levels. This surge presents a significant challenge for network traffic analysis, necessitating the real-time processing of massive data streams at high speeds. Concurrently, the landscape of cybersecurity threats is evolving rapidly, with attacks becoming more frequent, sophisticated, and damaging. Attackers continually devise new tools and techniques to breach network defenses. Therefore, the timely and accurate detection of anomalies and intrusions is paramount for ensuring network security [1]. This requires advanced analytical methods and technologies capable of identifying suspicious patterns and behaviors amidst the vast sea of network traffic data, enabling proactive defense measures to mitigate potential threats effectively [2]. Traditional anomaly detection techniques relying on manual inspection and rule-based systems are inefficient and ineffective for modern networks. Data mining and machine learning have been applied for automated network traffic analysis. However, shallow learning models like support vector machines (SVMs) and random forests have limited capability in handling complex networks with dynamic behavior [3].
Deep learning has become a powerful force in many domains. Its ability to detect intricate patterns and relationships among vast datasets is what makes it so transformative.
The remarkable success of deep learning in fields such as computer vision, natural language processing, and time series analysis highlights its versatility and effectiveness. Deep neural networks are particularly good at extracting abstract features from network data and capturing its complex nonlinear dynamics.
Recent studies have revealed the potential of deep learning to enhance network traffic classification systems and anomaly detection, leading the way to more intelligent and adaptive security solutions. Deep learning techniques can be used to strengthen network defenses, and protect against new threats.
This paper focuses on the use of deep learning to detect network traffic anomalies in real time in big data environments. Traditional analytics are challenged by the volume, velocity, and variety of data generated from network traffic. We use deep learning's predictive ability to build highly accurate models capable of processing streaming network data at scale and with low latency.
Framework of the anomaly network traffic detection system [5]
The main contributions of this paper are:
The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 explains the proposed methodology. Section 4 presents the experimental setup and results. Section 5 concludes the paper.
II. RELATED WORK
This section reviews research on network traffic analysis and anomaly detection using machine learning and deep learning models.
As network traffic grew in volume and complexity, the limitations of traditional machine learning algorithms became more evident. Researchers and practitioners started exploring deep learning models, in particular deep neural networks (DNNs), to address the challenges of network traffic classification and anomaly identification [6]. Unlike shallow learning models, DNNs can learn hierarchical data representations automatically. This allows them to capture intricate patterns within large and complex datasets. DNNs are also well suited to tasks that require complex feature extraction and representation because of their ability to handle high-dimensional data. The adoption of deep learning techniques has shown promising results for improving accuracy and scalability in network traffic analysis systems. This has paved the way for more sophisticated approaches to security and network management.
As deep learning techniques continue to evolve, researchers have increasingly turned to deep neural networks (DNNs) to tackle various challenges in network traffic analysis. Notably, deep belief networks have emerged as a promising approach for classifying different network application types, offering improved accuracy and efficiency. Additionally, the utilization of autoencoders has facilitated anomaly detection in Software Defined Networks (SDNs), leveraging their capability to reconstruct input data and identify deviations from normal behavior [8]. Convolutional neural network (CNN) architectures have demonstrated remarkable success in accurately classifying encrypted traffic, showcasing their efficacy in handling complex data formats. Moreover, recurrent neural networks (RNNs) equipped with Long Short-Term Memory (LSTM) cells have exhibited superior performance in network intrusion detection tasks, particularly evidenced by their robust results on benchmark datasets like NSL-KDD, outperforming conventional machine learning models. These advancements underscore the growing significance of deep learning methodologies in enhancing the security and efficiency of network traffic analysis systems [9].
Researchers have also developed hybrid deep learning architectures combining CNN and LSTM for network traffic analysis. A 7-layer CNN-LSTM model outperformed shallow models for malware detection. A similar CNN-LSTM model detected denial-of-service (DoS) attacks with high accuracy. Another study combined 1D CNN, LSTM, and SVM ensembles for accurate detection of DoS and distributed DoS (DDoS) attacks.
In spite of promising results, the majority of existing research relies on offline training and evaluation using small datasets. Online processing of large data streams is required for real-world network analysis. Recent works have used deep learning to analyze online network traffic. A dual-stage PCA-LSTM system detected anomalies with low latency in real time.
Our research focuses on developing deep learning models capable of processing large volumes of heterogeneous traffic data to detect anomalies in a low-latency manner. We evaluate the performance of our models on large datasets that represent big data. The models are designed to provide real-time security threat identification through situational awareness.
III. METHODOLOGY
This section explains our methodology for real-time network traffic anomaly detection using deep learning. We first present the formulation of the anomaly detection problem. Next, we provide details on the CNN and LSTM models used for detection. Finally, we describe the model training process and optimization techniques.
The input layer takes sequential windows of network flow data. The 1D CNN layers extract spatial features and reduce data dimensionality. We use small 3x1 convolutions and max pooling to capture local dependencies and patterns between adjacent flows. The LSTM layers model temporal behavior and long-term dependencies in the traffic sequence. Bidirectional LSTMs process the data in both forward and reverse order. The outputs are concatenated to capture past and future context. Dropout and batch normalization enhance model generalization.
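The windowing, convolution, and pooling steps described above can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the single-feature input, the window size, the step, and the hand-picked kernel values are all assumptions made for illustration, and in a trained model the kernel weights would be learned.

```python
from typing import List, Sequence


def sliding_windows(flows: Sequence[float], size: int, step: int = 1) -> List[List[float]]:
    """Cut a flow-feature sequence into fixed-length, possibly overlapping windows."""
    return [list(flows[i:i + size]) for i in range(0, len(flows) - size + 1, step)]


def conv1d(x: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Valid (unpadded) 1D cross-correlation, the core operation of a 1D CNN layer."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k)) for i in range(len(x) - k + 1)]


def max_pool1d(x: Sequence[float], pool: int = 2) -> List[float]:
    """Non-overlapping max pooling; a trailing partial block is dropped."""
    return [max(x[i:i + pool]) for i in range(0, len(x) - pool + 1, pool)]


# Hypothetical, already-scaled single-feature flow sequence (assumption).
flows = [0.1, 0.2, 0.9, 0.8, 0.1, 0.0, 0.3, 0.4]
windows = sliding_windows(flows, size=6, step=2)

# A size-3 kernel that responds to rising values between adjacent flows.
# Chosen by hand for illustration only; a real model learns these weights.
edge_kernel = [-1.0, 0.0, 1.0]
features = [max_pool1d(conv1d(w, edge_kernel)) for w in windows]
```

Each window of six flows is reduced to two pooled features, which shows how convolution followed by pooling shortens the sequence that the LSTM layers then consume.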
The dense output layers classify the input windows as normal or anomalous. For binary classification, we use a sigmoid activation. The model is trained end-to-end to minimize the binary cross-entropy loss.
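As a small illustration of the output stage, the sketch below computes a sigmoid activation and the mean binary cross-entropy over a batch of raw scores (logits). The function names and the clipping constant `eps` are assumptions for this example, not details taken from the paper.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function mapping a logit to (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binary_cross_entropy(y_true: Sequence[int], logits: Sequence[float],
                         eps: float = 1e-12) -> float:
    """Mean binary cross-entropy; y is 1 for anomalous windows and 0 for normal ones."""
    total = 0.0
    for y, z in zip(y_true, logits):
        # Clip the probability so log() never receives exactly 0 or 1.
        p = min(max(sigmoid(z), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


# Confident correct predictions give a low loss, confident wrong ones a high loss.
loss_good = binary_cross_entropy([1, 0], [4.0, -4.0])
loss_bad = binary_cross_entropy([1, 0], [-4.0, 4.0])
```

An uninformative model that outputs a logit of 0 for every window has a loss of ln 2, which is a useful reference point when reading training curves.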
Model Training: We train the models on servers equipped with Nvidia GPUs, which speed up deep learning computations. The flow data is preprocessed to normalize the features to the 0-1 scale. We use 80% of the traffic for training, 10% for validation, and 10% for testing. The Adam optimizer is used to train the models for up to 50 epochs. If the validation loss does not decrease for 5 consecutive epochs, we stop training. Checkpoint callbacks save the model weights that minimize validation loss. The batch size is tuned as a hyperparameter to improve model convergence.
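The preprocessing and stopping rules in this paragraph can be sketched as follows. This is a simplified illustration under stated assumptions: the split is taken in the order of the data (the paper does not say whether the data is shuffled or stratified), the helper names are hypothetical, and the validation losses are made-up numbers used only to exercise the stopping rule.

```python
from typing import List, Sequence, Tuple


def min_max_scale(column: Sequence[float]) -> List[float]:
    """Scale one feature column to the 0-1 range; a constant column maps to 0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def split_80_10_10(items: Sequence) -> Tuple[list, list, list]:
    """Split into 80% training, 10% validation, and 10% test, preserving order."""
    n = len(items)
    n_train = n * 8 // 10
    n_val = n // 10
    items = list(items)
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]


def early_stopping_epoch(val_losses: Sequence[float], patience: int = 5) -> Tuple[int, int]:
    """Return (best_epoch, stop_epoch), both zero-based.

    Training stops once the validation loss has not improved for `patience`
    consecutive epochs; the weights from `best_epoch` are the ones to keep.
    """
    best_loss, best_epoch = float("inf"), -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(val_losses) - 1


# Made-up validation losses: the best value is at epoch 2, then five epochs without improvement.
history = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.64, 0.65, 0.50]
best_epoch, stop_epoch = early_stopping_epoch(history, patience=5)
```

In practice the minimum and maximum used for scaling would normally be computed on the training portion only and then reused for the validation and test portions, so that no information leaks from held-out data.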
We use class weighting to avoid bias, as network traffic is highly imbalanced and contains far more normal than anomalous flows. We also experiment with SMOTE and other oversampling methods to synthesize more minority class examples.
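Both ideas can be sketched in a few lines. The weighting below uses the common inverse-frequency heuristic, and the oversampler is a deliberately simplified, SMOTE-like variant that interpolates toward the single nearest minority neighbor, whereas the original SMOTE algorithm picks among k nearest neighbors. Function names and sample data are assumptions for illustration.

```python
import random
from collections import Counter
from typing import Dict, List, Sequence


def balanced_class_weights(labels: Sequence[int]) -> Dict[int, float]:
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def smote_like(minority: List[List[float]], n_new: int, rng: random.Random) -> List[List[float]]:
    """Simplified SMOTE: interpolate between a minority point and its nearest minority neighbor."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    synthetic = []
    for _ in range(n_new):
        a = rng.choice(minority)
        # Nearest other minority sample by squared Euclidean distance.
        b = min((p for p in minority if p is not a),
                key=lambda p: sum((x - y) ** 2 for x, y in zip(a, p)))
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return synthetic


# Hypothetical label distribution: 90 normal flows (0) and 10 anomalous flows (1).
weights = balanced_class_weights([0] * 90 + [1] * 10)
new_points = smote_like([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]], n_new=5, rng=random.Random(0))
```

With this label distribution the anomalous class receives a weight of 5.0 and the normal class about 0.56, so each anomalous example contributes roughly nine times as much to the loss.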
a. Model Optimization: Training deep models on large datasets is computationally intensive. We propose optimization techniques to improve detection efficiency:
b. Transfer Learning: Transfer learning involves initializing models with weights pretrained on similar network datasets. Fine-tuning on new data is faster than training from scratch, as it leverages the knowledge already encoded in the pretrained weights. This approach significantly reduces training time and computational resources, making it suitable for scenarios with limited resources or time constraints.
c. Model Compression: Model compression techniques, including quantization, pruning, and knowledge distillation, are employed to compress trained models with minimal accuracy loss. By reducing the size of the model, these techniques enable more efficient inference and deployment on resource-constrained devices such as mobile phones or IoT devices. Compact models require less computation during both training and inference, making them particularly valuable in applications where computational resources are limited or latency is critical.
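Two of the compression techniques named above, quantization and pruning, can be sketched on a flat list of weights. This is an illustrative sketch only: it assumes symmetric per-tensor 8-bit quantization and unstructured magnitude pruning, which are common choices but are not stated in the paper, and the weight values are invented.

```python
from typing import List, Sequence, Tuple


def quantize_int8(weights: Sequence[float]) -> Tuple[List[int], float]:
    """Symmetric 8-bit quantization: w is approximated by q * scale with q in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: Sequence[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integer codes."""
    return [v * scale for v in q]


def magnitude_prune(weights: Sequence[float], sparsity: float) -> List[float]:
    """Zero out the given fraction of weights with the smallest absolute value."""
    k = int(len(weights) * sparsity)
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k]
    pruned = list(weights)
    for i in smallest:
        pruned[i] = 0.0
    return pruned


# Invented weights for illustration.
w = [0.5, -1.27, 0.003, 0.9]
codes, scale = quantize_int8(w)
restored = dequantize(codes, scale)
sparse = magnitude_prune(w, sparsity=0.5)
```

The rounding error per weight is at most half the scale, which is why accuracy loss tends to be small when the weight distribution has no extreme outliers. Knowledge distillation is not shown because it requires a trained teacher model.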
d. Parallelism: Parallelism plays a crucial role in accelerating the training of deep learning models. By splitting data across multiple GPUs and utilizing data parallelism, models can be trained faster, effectively reducing the overall training time. Moreover, in production environments, parallelism enables low-latency concurrent inference by streaming data to multiple models simultaneously. This distributed approach enhances throughput and responsiveness, making it suitable for real-time applications such as video processing or autonomous driving [13].
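The core of data parallelism is that each worker computes a gradient on its own shard and the results are averaged before the weight update. The sketch below shows that pattern for a one-parameter linear model. The model, the data, and the use of a thread pool as a stand-in for multiple GPUs are assumptions for illustration; threads in CPython do not speed up this pure-Python arithmetic, so the example demonstrates the structure rather than the speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (x, y) pair


def grad_mse(w: float, shard: Sequence[Sample]) -> float:
    """Gradient of the mean squared error of y ~ w * x over one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def data_parallel_grad(w: float, data: Sequence[Sample], n_workers: int) -> float:
    """Split the batch into equal shards, compute per-shard gradients, and average them."""
    if len(data) % n_workers != 0:
        raise ValueError("this sketch assumes the batch divides evenly across workers")
    size = len(data) // n_workers
    shards: List[Sequence[Sample]] = [data[i * size:(i + 1) * size] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        grads = list(pool.map(lambda s: grad_mse(w, s), shards))
    return sum(grads) / n_workers


# Synthetic data generated from y = 3x.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
full_batch = grad_mse(0.0, data)
averaged = data_parallel_grad(0.0, data, n_workers=4)
```

With equal-size shards the averaged gradient equals the full-batch gradient, so the parallel update is mathematically the same as the single-device update.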
e. Incremental Learning: Incremental learning allows models to be updated incrementally on new data without requiring full retraining from scratch. Continual learning, a form of incremental learning, adapts models to evolving traffic patterns or changing environments. This capability is particularly beneficial in dynamic domains where the data distribution may change over time, such as in online advertising or recommendation systems. By continuously incorporating new information, models can maintain their performance and relevance without the need for periodic retraining, ensuring adaptability and responsiveness to emerging trends or shifts in user behavior.
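Incremental updating can be illustrated with a model that exposes a batch-wise update step instead of a one-off fit. The sketch below uses online logistic regression trained by stochastic gradient descent as a stand-in for the deep model; the class name, learning rate, and toy data are assumptions, and the second batch deliberately reverses the labels to mimic a shift in traffic patterns.

```python
import math
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (features, label)


class OnlineLogisticRegression:
    """Logistic regression that is updated one mini-batch at a time."""

    def __init__(self, n_features: int, lr: float = 0.1) -> None:
        self.w: List[float] = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x: Sequence[float]) -> float:
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        if z >= 0:
            return 1.0 / (1.0 + math.exp(-z))
        e = math.exp(z)
        return e / (1.0 + e)

    def partial_fit(self, batch: Sequence[Example]) -> None:
        """Apply one SGD step per example, continuing from the current weights."""
        for x, y in batch:
            err = self.predict_proba(x) - y  # gradient of the log loss w.r.t. the logit
            self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
            self.b -= self.lr * err


model = OnlineLogisticRegression(n_features=2)

# First batch: feature 0 active means anomalous (toy data).
model.partial_fit([([1.0, 0.0], 1), ([0.0, 1.0], 0)] * 50)
before_drift = model.predict_proba([1.0, 0.0])

# Later batch after a simulated shift: the same pattern is now normal.
model.partial_fit([([1.0, 0.0], 0), ([0.0, 1.0], 1)] * 300)
after_drift = model.predict_proba([1.0, 0.0])
```

The model adapts to the new labeling without restarting from scratch. The same property makes it forget the earlier pattern, which is the trade-off continual learning methods try to manage.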
IV. EXPERIMENTS AND RESULTS
This section evaluates the proposed deep learning framework for real-time network traffic anomaly detection on benchmark and large-scale real-world datasets.
1) Experimental Setup: We conduct experiments using the CNN-LSTM model architecture shown in Figure 1. The model hyperparameters are tuned by grid search over learning rate, layers, filters, and batch size. This systematic approach ensures that the model is optimized for performance while avoiding overfitting or underfitting to the training data. By exploring a range of hyperparameters, we aim to identify the combination that yields the best results in terms of accuracy, precision, recall, F1-score, and latency.
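A grid search of this kind amounts to evaluating every combination of candidate values and keeping the best one. The sketch below shows the enumeration with `itertools.product`. The candidate values are hypothetical, since the paper does not list them, and `toy_validation_f1` is a stand-in scoring function that replaces actual model training and validation.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def grid_search(grid: Dict[str, List[float]],
                evaluate: Callable[[Config], float]) -> Tuple[Config, float]:
    """Evaluate every combination in the grid and return the best config and its score."""
    names = list(grid)
    best_cfg, best_score = {}, float("-inf")
    for values in product(*(grid[name] for name in names)):
        cfg = dict(zip(names, values))
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


# Hypothetical candidate values (assumption; not taken from the paper).
grid = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "lstm_layers": [1, 2],
    "filters": [32, 64],
    "batch_size": [64, 128, 256],
}


def toy_validation_f1(cfg: Config) -> float:
    """Stand-in for 'train the model with cfg and return validation F1'."""
    return (1.0
            - abs(cfg["learning_rate"] - 1e-3) * 10
            - abs(cfg["lstm_layers"] - 2) / 100
            - abs(cfg["filters"] - 64) / 1000
            - abs(cfg["batch_size"] - 128) / 10000)


best_cfg, best_score = grid_search(grid, toy_validation_f1)
```

This grid has 36 combinations, and the count grows multiplicatively with every added hyperparameter, which is the main cost of exhaustive search.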
2) Results on Benchmark Datasets: Table 1 shows model results on the NSL-KDD dataset. The CNN-LSTM model achieves the highest accuracy, precision, recall and F1-score compared to the baseline models. The deep model effectively learns complex features needed to distinguish between different types of attacks and normal traffic.
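For reference, the four reported metrics can be computed from the confusion counts as in the sketch below. The labels and predictions are invented for illustration and are not results from the paper; the anomalous class is treated as the positive class (label 1).

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1-score for binary labels (1 = anomalous)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented example: 3 anomalous and 5 normal windows.
metrics = classification_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 0, 0],
    y_pred=[1, 1, 0, 1, 0, 0, 0, 0],
)
```

On imbalanced traffic, accuracy alone can look high even when most anomalies are missed, which is why precision, recall, and F1-score are reported alongside it.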
V. CONCLUSION
The paper proposes a novel deep learning methodology leveraging Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks for the real-time detection of network traffic anomalies within big data environments. By employing these advanced neural network architectures, the research aims to enhance the accuracy and efficiency of anomaly detection systems in handling large-scale and rapidly changing network traffic streams [18]. Through rigorous evaluation against shallow baseline models using both benchmark datasets and large real-world network traffic data, the efficacy of the deep learning approach is thoroughly assessed. The findings reveal that deep learning models exhibit remarkable capabilities in accurately identifying anomalies within high-volume and high-velocity traffic streams while maintaining low latency, thus showcasing their potential for deployment in real-time network security systems [19]. Moreover, the deep learning models demonstrate a superior ability to learn complex traffic representations and temporal dynamics compared to traditional machine learning techniques, leading to improved detection performance [20]. This study underscores the significant promise of deep learning methodologies in addressing the challenges posed by big data and evolving cyber threats in the domain of network security. By leveraging the scalability and adaptability inherent in deep learning architectures, organizations can develop robust and scalable real-time network security systems capable of effectively mitigating a wide range of cyber threats. The integration of CNNs and LSTMs enables the models to capture intricate patterns and correlations within the network traffic data, facilitating more accurate anomaly detection even in dynamic and heterogeneous environments. Furthermore, the low-latency nature of the proposed approach ensures timely detection and response to emerging threats, thereby enhancing overall cybersecurity posture.
These findings contribute to advancing the field of network security by offering a data-driven and scalable solution that aligns with the requirements of modern big data environments [21]. Future research directions may involve exploring additional deep learning architectures and techniques to further enhance the performance and robustness of real-time anomaly detection systems, as well as investigating the applicability of the proposed approach to other domains beyond network security. Additionally, efforts to optimize the computational efficiency of deep learning models for deployment in resource-constrained environments could further broaden the practical utility of these systems [22]. Overall, this study underscores the transformative potential of deep learning in revolutionizing the landscape of network security and lays the groundwork for future advancements in this critical domain. Future work can further optimize deep model performance and efficiency for deployment. Testing on very large real-world network data at scale would better validate operational feasibility [23]. Ensembling diverse models and incorporating expert domain knowledge could improve detection accuracy. Automated hyperparameter tuning would simplify model development. Overall, advanced deep learning models show immense capability for automated real-time analysis of massive, complex network traffic data [24].
REFERENCES
[1] Z. Huabing, Y. Sisi, C. Xiaoming, and L. Zhida, “Real-time detection method for mobile network traffic anomalies considering user behavior security monitoring,” in 2021 International Conference on Computer, Blockchain and Financial Development (CBFD), Nanjing, China, 2021.
[2] M. Muniswamaiah, T. Agerwala, and C. Tappert, “Big Data in Cloud Computing Review and Opportunities,” arXiv [cs.DC], 17-Dec-2019.
[3] O. I. Sheluhin and I. Y. Lukin, “Network traffic anomalies detection using a fixing method of multifractal dimension jumps in a real-time mode,” Autom. Contr. Comput. Sci., vol. 52, no. 5, pp. 421–430, Sep. 2018.
[4] F.-B. Meng, N. Jiang, B. Liu, R. Li, and F. Xia, “A real-time detection approach to network traffic anomalies in communication networks,” DEStech Trans. Eng. Technol. Res., no. ssme-ist, Nov. 2016.
[5] C. Yang, “Anomaly network traffic detection algorithm based on information entropy measurement under the cloud computing environment,” Cluster Comput., vol. 22, no. S4, pp. 8309–8317, Jul. 2019.
[6] Z. R. Zaidi, S. Hakami, B. Landfeldt, and T. Moors, “Real-time detection of traffic anomalies in wireless mesh networks,” Wirel. Netw., vol. 16, no. 6, pp. 1675–1689, Aug. 2010.
[7] M. Muniswamaiah, T. Agerwala, and C. C. Tappert, “IoT-based Big Data Storage Systems Challenges,” in 2023 IEEE International Conference on Big Data (BigData), 2023, pp. 6233–6235.
[8] P. Białczak and W. Mazurczyk, “Characterizing anomalies in malware-generated HTTP traffic,” Secur. Commun. Netw., vol. 2020, pp. 1–26, Sep. 2020.
[9] R. Fontugne, T. Hirotsu, and K. Fukuda, “A visualization tool for exploring multi-scale network traffic anomalies,” J. Netw., vol. 6, no. 4, Apr. 2011.
[10] I. Doghudje and O. Akande, “Dual User Profiles: A Secure and Streamlined MDM Solution for the Modern Corporate Workforce,” JICET, vol. 8, no. 4, pp. 15–26, Nov. 2023.
[11] W. Wang, T. Guyet, R. Quiniou, M.-O. Cordier, F. Masseglia, and X. Zhang, “Autonomic intrusion detection: Adaptively detecting anomalies over unlabeled audit data streams in computer networks,” Knowledge-Based Systems, vol. 70, pp. 103–117, Nov. 2014.
[12] B. Zhong et al., “Research on the identification of network traffic anomalies in the access layer of power IoT based on extreme learning machine,” in 2022 International Conference on Artificial Intelligence, Information Processing and Cloud Computing (AIIPCC), Kunming, China, 2022.
[13] I. M. Lavrovsky and State University of Telecommunications, “Detection of traffic anomalies in the home Wi-Fi network using Waidps and Nzyme utilities,” Modern Information Security, vol. 52, no. 4, 2022.
[14] N. Kuchuk, A. Kovalenko, H. Kuchuk, V. Levashenko, and E. Zaitseva, “Mathematical methods of reliability analysis of the network structures: Securing QoS on hyperconverged networks for traffic anomalies,” in Lecture Notes in Electrical Engineering, Cham: Springer International Publishing, 2022, pp. 223–241.
[15] M. Muniswamaiah and T. Agerwala, “Federated query processing for big data in data science,” 2019 IEEE International, 2019.
[16] H. Deng, W. Chen, and G. Huang, “Deep insight into daily runoff forecasting based on a CNN-LSTM model,” Nat. Hazards (Dordr.), vol. 113, no. 3, pp. 1675–1696, Sep. 2022.
[17] L. Zhang, “The evaluation on the credit risk of enterprises with the CNN-LSTM-ATT model,” Comput. Intell. Neurosci., vol. 2022, p. 6826573, Sep. 2022.
[18] H. Li, Z. Wang, and Z. Li, “An enhanced CNN-LSTM remaining useful life prediction model for aircraft engine with attention mechanism,” PeerJ Comput. Sci., vol. 8, no. e1084, p. e1084, Aug. 2022.
[19] N. Thakur and C. Y. Han, “Indoor localization for personalized ambient assisted living of multiple users in multi-floor smart environments,” Big Data Cogn. Comput., vol. 5, no. 3, p. 42, Sep. 2021.
[20] J. P. Singh, “Enhancing Database Security: A Machine Learning Approach to Anomaly Detection in NoSQL Systems,” International Journal of Information and Cybersecurity, vol. 7, no. 1, pp. 40–57, 2023.
[21] D. Gudu, M. Hardt, and A. Streit, “On MAS-based, scalable resource allocation in large-scale, dynamic environments,” in 2016 Intl IEEE Conferences on Ubiquitous Intelligence & Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress (UIC/ATC/ScalCom/CBDCom/IoP/SmartWorld), Toulouse, 2016.
[22] J. P. Singh, “Mitigating Challenges in Cloud Anomaly Detection Using an Integrated Deep Neural Network-SVM Classifier Model,” Sage Science Review of Applied Machine Learning, vol. 5, no. 1, pp. 39–49, 2022.
[23] O. Kamara-Esteban et al., “Bridging the gap between real and simulated environments: A hybrid agent-based smart home simulator architecture for complex systems,” in 2016 Intl IEEE Conferences on Ubiquitous Intelligence & Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress (UIC/ATC/ScalCom/CBDCom/IoP/SmartWorld), Toulouse, 2016.
[24] H. Lauer and N. Kuntze, “Hypervisor-based attestation of virtual environments,” in 2016 Intl IEEE Conferences on Ubiquitous Intelligence & Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress (UIC/ATC/ScalCom/CBDCom/IoP/SmartWorld), Toulouse, 2016.
Copyright © 2024 Tamilselvan Arjunan. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET58946
Publish Date : 2024-03-12
ISSN : 2321-9653
Publisher Name : IJRASET