Machine learning (ML) focuses on the development of computer programs that can access data and use it for further learning. In this era of automation, due to the great success of artificial intelligence, ML is being integrated into almost everything. In this article, we will see how ML is solving the problems of complex networks and helping with network security.
“Machine learning is the field of study that gives computers the
ability to learn without being explicitly programmed.”
— Arthur Samuel, 1959
According to Brownlee J. in his work ‘Practical Machine Learning Problems’, there are four broad categories of problems that can leverage ML, namely, clustering, classification, regression and rule extraction. In clustering problems, the main objective is to group similar data while increasing the separation between the groups. In classification and regression problems, the goal is to map a set of new input data to a set of discrete or continuous-valued outputs, respectively. Rule extraction problems are essentially different; here, identifying statistical relationships in data is the main goal.
Machine learning for/in networking
A lot of research is being done today on adopting AI as a tool for solving the problems of modern computer networks, as these are becoming increasingly complex and dynamic. AI is revolutionising network services by making more informed decisions based on the vast amounts of data available. It is a central component in cognitive networks and communication research.
The applications of AI, ML and deep learning in computer networks and communication include, but are not limited to:
- Autonomous management of data centres and cloud infrastructures
- Modern applications of AI or ML in the management of networks and services
- Cyber security, including anomaly detection, malware detection, etc
- Modern approaches in cognitive computing
- Self-managing middleware and tools for extreme scales
- Big Data analytics frameworks for networking data
- Network monitoring and performance anomaly detection
- Machine learning for multimedia networking
- Resource allocation in networks using ML
- Deep learning and reinforcement learning in network control and management
- Applications of game theory in computer networks
- Applications of evolutionary computing in network optimisation
- Applications of AI in network configuration tuning
- Testing of cyber-physical systems using AI and ML
- Autonomous sensor networks and self-organising systems
- Adaptive stream-mining and resource-efficient scientific computing
The major networking areas that use machine learning
Traffic routing: Routing, which entails selecting a path for packet transmission, is one of the fundamental functions of a network. Routing decisions take into consideration cost minimisation, maximisation of link utilisation, operational policies and a few other attributes. ML models for routing must therefore cope and scale with today’s dynamic and complex network topologies. They should also be able to learn the correlation between the selected path and the resulting performance, and predict the consequences of a particular routing decision. Reinforcement learning has proved particularly effective for traffic routing.
The initial use of reinforcement learning was done through the Q-routing (based on Q-learning) algorithm, in which a router ‘X’ learns to map a particular routing policy (for example, to destination ‘D’ via neighbour ‘Y’) to its Q-value. This Q-value is nothing but an estimate of the time that will be taken by the packet to reach ‘D’ via ‘Y’ including all the queue and transmission delays over the link.
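The Q-value update described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the class name QRouter, the learning rate of 0.5 and the delay figures are assumptions made for the example. Router ‘X’ forwards a packet for ‘D’ to neighbour ‘Y’, receives Y’s own best estimate for reaching ‘D’, and moves its Q-value towards the newly observed delivery time.

```python
from collections import defaultdict


class QRouter:
    """Q-routing table for a single router (illustrative sketch)."""

    def __init__(self, neighbours, learning_rate=0.5):
        self.neighbours = list(neighbours)
        self.learning_rate = learning_rate  # assumed value, tune per network
        # q[dest][neighbour] = estimated delivery time to dest via neighbour
        self.q = defaultdict(lambda: {n: 0.0 for n in self.neighbours})

    def best_neighbour(self, dest):
        """Return the neighbour with the lowest estimated delivery time."""
        return min(self.neighbours, key=lambda n: self.q[dest][n])

    def best_estimate(self, dest):
        """Lowest estimate this router holds for dest (reported upstream)."""
        return min(self.q[dest].values())

    def update(self, dest, neighbour, queue_delay, transmission_delay,
               neighbour_estimate):
        """Move Q(dest, neighbour) towards the newly observed delivery time."""
        old = self.q[dest][neighbour]
        target = queue_delay + transmission_delay + neighbour_estimate
        self.q[dest][neighbour] = old + self.learning_rate * (target - old)
        return self.q[dest][neighbour]


# Router X with neighbours Y and Z; all delays are made-up numbers.
router_x = QRouter(["Y", "Z"])
new_q = router_x.update("D", "Y", queue_delay=2.0, transmission_delay=1.0,
                        neighbour_estimate=5.0)
print(new_q, router_x.best_neighbour("D"))
```

Because every estimate starts at zero, the untried neighbour ‘Z’ looks cheaper after this single update, which nudges the router to explore it next.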
Although the Q-routing algorithm performs exceptionally well in a dynamically changing network topology, under heavy load it constantly changes the routing policy, which creates bottlenecks in the network. The most successful model was ‘Team-Partitioned Opaque-Transition Reinforcement Learning (TPOT-RL)’, proposed by Stone and Veloso. However, this algorithm has high computational complexity, given the very large number of states to be explored, and high communication overhead.
Traffic prediction: Network traffic prediction plays a major role in today’s complex and diverse networks. Time series forecasting (TSF) is the main approach used to forecast future traffic in a network. A TSF model is essentially a regression model that captures the correlation between future traffic and previously observed traffic volumes.
The existing models for traffic prediction are statistical analysis models and supervised ML models. Statistical analysis models are usually built on the autoregressive integrated moving average (ARIMA) model, while most of the learning-based models are supervised neural networks. But due to the rapid growth of networks and the corresponding complexity of traffic, the traditional TSF models fall short, which has led to the rise of advanced machine learning models.
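To make the regression idea concrete, the sketch below fits a first-order autoregressive model, x[t] = c + phi * x[t-1], by ordinary least squares and rolls it forward. It covers only the autoregressive part of ARIMA and is not a full ARIMA implementation; the function names and the traffic figures are assumptions made for the example.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]."""
    x_prev = series[:-1]
    x_next = series[1:]
    n = len(x_prev)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    var = sum((p - mean_prev) ** 2 for p in x_prev)
    if var == 0:
        raise ValueError("series is constant; slope is undefined")
    cov = sum((p - mean_prev) * (q - mean_next)
              for p, q in zip(x_prev, x_next))
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi


def forecast(series, steps, c, phi):
    """Roll the fitted model forward from the last observation."""
    predictions = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions


# Made-up traffic volumes (for example, Mbit/s per measurement epoch).
traffic = [10, 12, 14, 16, 18]
c, phi = fit_ar1(traffic)
print(forecast(traffic, 2, c, phi))
```

Real traffic is rarely this regular; the differencing and moving-average terms of ARIMA, or a neural network, are what handle trends and noise in practice.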
As per the survey (https://jisajournal.springeropen.com/articles/10.1186/s13174-018-0087-2#Sec1) by Raouf Boutaba et al, “Eswaradass proposed an MLP-NN based bandwidth prediction system for grid environments and compared it to the Network Weather Service (NWS) bandwidth forecasting AR models for traffic monitoring and measurement.
The goal of the system is to forecast the available bandwidth on a given path by feeding the NN with the minimum, maximum and average number of bits per second used on that path in the last epoch (ranging from 10s to 30s).”
Apart from TSF based solutions, network traffic can also be predicted through non-TSF methods, such as frequency domain based methods applied to network traffic flows, including links dominated by elephant flows. One of the non-TSF implementations incorporates a feedforward neural network (FNN) trained with backpropagation using simple gradient descent, together with a wavelet transform, which enables the model to capture both the frequency and time features of the traffic time series.
Traffic classification: Traffic classification is essential for a wide range of network operations, including capacity planning, security and intrusion detection, and performance monitoring. In a large network, unnecessary traffic competing with business-critical applications is a waste of resources.
We have predefined classes of traffic such as HTTP, FTP, WWW, DNS and P2P for training supervised models. But payload based traffic classification can also be performed without any prior knowledge of the application classes using unsupervised learning, as shown by Ma et al in ‘Unexpected Means of Protocol Inference’ (2006). For host-behaviour based traffic classification, the four features mentioned by Schatzmann (service proximity, activity profiles, session duration and periodicity) act as good discriminators for a support vector machine (SVM) classifier to distinguish between Web mail and non-Web mail traffic, evaluated using a five-fold cross-validation scheme. One of the oldest successful traffic classifiers was by Roughan, who used K-Nearest Neighbour and Linear Discriminant Analysis to map network traffic into different classes of interest.
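As a rough illustration of the K-Nearest Neighbour idea, the sketch below labels a flow by majority vote among its closest training flows. The two features (mean packet size in bytes and flow duration in seconds), the sample values and the function name knn_classify are assumptions made for the example; they are not taken from the work cited above.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training flows.

    `train` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical flows: (mean packet size in bytes, duration in seconds).
training_flows = [
    ((1200, 30.0), "FTP"),
    ((1100, 25.0), "FTP"),
    ((1300, 40.0), "FTP"),
    ((80, 0.1), "DNS"),
    ((90, 0.2), "DNS"),
    ((100, 0.1), "DNS"),
]

print(knn_classify(training_flows, (95, 0.15)))
print(knn_classify(training_flows, (1150, 28.0)))
```

In practice the features would be scaled to comparable ranges first, since Euclidean distance is otherwise dominated by whichever feature has the largest values.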
Congestion control: Congestion control is the mechanism responsible for throttling the number of packets entering the network. It also helps ensure network stability, fair resource utilisation and an acceptable packet loss ratio. Fonseca and Crovella proposed a Bayesian packet loss classifier with a detection probability of up to 90 per cent on different data sets such as BU and PMA. When an analytic Markov model was used to evaluate a TCP variant enhanced with this classifier, the result was a throughput improvement of up to 25 per cent over the classical TCP-Reno algorithm.
In optical networks, congestion control has been tackled in optical burst switching (OBS) networks. The data was collected by simulation with OBS modules, and a new feature, the number of bursts between failures (NBBF), was derived from the observed losses. The TCP variant of the expectation-maximisation (EM) algorithm performed better than EM with Hidden Markov Models and clustering.
Queue management is an additional mechanism in the intermediate nodes that complements TCP congestion control. It is responsible for dropping packets whenever necessary to control the queue length in the intermediate nodes. The conventional approach to queue management is the drop-tail mechanism. Artificial Neural Network Active Queue Management extends the neuron Proportional-Integral-Derivative (PID) controller by including another PID. Put simply, this implementation improves performance compared to the PID NN in real-life scenarios, but incurs higher computational overhead.
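The difference between the two approaches can be sketched as follows. The first class is a plain drop-tail queue, which discards arrivals only when the buffer is full. The second is an ordinary discrete PID controller that turns the gap between the current and target queue length into a drop probability. It is a plain PID, not the neural network variant described above, and the class names, gains and target length are assumptions made for the example.

```python
from collections import deque


class DropTailQueue:
    """Conventional drop-tail: drop arriving packets once the buffer is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.packets = deque()
        self.dropped = 0

    def enqueue(self, packet):
        if len(self.packets) >= self.capacity:
            self.dropped += 1
            return False
        self.packets.append(packet)
        return True

    def dequeue(self):
        return self.packets.popleft() if self.packets else None


class PidDropController:
    """Discrete PID controller mapping queue-length error to a drop probability."""

    def __init__(self, target_length, kp, ki, kd):
        self.target_length = target_length
        self.kp, self.ki, self.kd = kp, ki, kd  # assumed gains, need tuning
        self.integral = 0.0
        self.prev_error = 0.0

    def drop_probability(self, queue_length):
        error = queue_length - self.target_length
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        raw = (self.kp * error + self.ki * self.integral
               + self.kd * derivative)
        return min(1.0, max(0.0, raw))  # clamp to a valid probability


queue = DropTailQueue(capacity=2)
results = [queue.enqueue(p) for p in ("p1", "p2", "p3")]
print(results, queue.dropped)

controller = PidDropController(target_length=10, kp=0.05, ki=0.0, kd=0.0)
print(controller.drop_probability(20))
```

An active scheme starts dropping before the buffer overflows, which is what lets it signal congestion to TCP senders earlier than drop-tail can.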
Resource management: Resource management covers the vital resources of a network, including the CPU, memory, switches, routers and frequencies. Here, resource allocation becomes a binary classification or decision problem, in which resources should be actively managed to meet long-term goals of resource utilisation. Admission control, which is a subdivision of resource management, was addressed by Blenk. He employed a recurrent neural network (RNN) for the online virtual network embedding (VNE) problem, predicting the probability that a virtual network request would be accepted by the VNE algorithm before running that algorithm itself. The prediction was based on the current state of the substrate and the request. These RNNs achieved an accuracy of about 90 per cent using the previous performance data of the VNE algorithms.
Machine learning in network security
As per the SANS Institute, network security is the process of taking preventive measures with respect to the hardware and software, to protect the underlying networking infrastructure from unauthorised access, misuse, malfunction, modification, destruction, or improper disclosure, thereby creating a secure platform for computers, users and programs to perform their permitted critical functions.
There are various specialised techniques to implement this defence. Cisco has broken down network security into the following types:
- Access control
- Anti-malware
- Application security
- Behavioural analytics
- Data loss prevention
- Email security
- Firewalls
- Intrusion detection and prevention
- Mobile device and wireless security
- Network segmentation
- Security information and event management
- VPN
- Web security
We will now look into the solutions provided by machine learning for the prevention of various types of intrusions.
Misuse based intrusion detection: In misuse based detection, abnormal system behaviour is defined first, and everything else is treated as normal. Hence, anything that does not match a known attack is considered normal. There have been proposals for real-time misuse based intrusion detection systems.
In one such proposal, information gain was used to reduce the number of features. The best technique for this purpose turned out to be the decision tree, which ran on traces collected in 2-second intervals and achieved 98 per cent detection accuracy. This implementation could detect only two types of attacks, DoS and Probe, and remained vulnerable to persistent threats and distributed attacks. Further research has led to works that use a transductive confidence machine for K-NN with a strangeness measure.
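Information gain measures how much knowing a feature reduces the entropy of the class labels, so features with a gain close to zero can be dropped. A minimal sketch for discrete features is given below; the feature names and the four sample records are made up for illustration and are not taken from the work described above.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the entropy left after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(group) / total * entropy(group)
                    for group in groups.values())
    return entropy(labels) - remainder


# Four made-up connection records.
labels = ["attack", "attack", "normal", "normal"]
connection_rate = ["high", "high", "low", "low"]  # separates the classes
protocol = ["tcp", "udp", "tcp", "udp"]           # carries no information

print(information_gain(connection_rate, labels))
print(information_gain(protocol, labels))
```

Continuous features, such as byte counts, would need to be binned or split on a threshold before this calculation applies.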
Anomaly based intrusion detection: The major parameter on which the anomaly detection system relies is the network behaviour. If the network behaviour is found to be within the predefined behaviour, the network transaction is accepted; otherwise, an alert gets triggered in the system.
One of the recent solutions for anomaly based intrusion detection used a support vector machine (SVM) with a radial basis function (RBF) kernel. This RBF-SVM was used to devise an IDS for SDN based malware detection. A limited set of features (number of packets, number of bytes, flow duration, byte rate, packet rate, length of the first packet and average packet length) was collected via SDN switches and evaluated. This model achieved 98 per cent accuracy on malware traces.
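The sketch below shows how the seven flow features listed above could be assembled from raw counters, and how an RBF kernel compares two such vectors. The function names, the gamma value and the counter values are assumptions made for the example; a full SVM would additionally learn support vectors and weights, which is not shown here.

```python
import math


def flow_features(n_packets, n_bytes, duration_s, first_packet_len):
    """Build the seven-feature vector from (hypothetical) per-flow counters."""
    if duration_s <= 0 or n_packets <= 0:
        raise ValueError("flow must have positive duration and packet count")
    return [
        n_packets,
        n_bytes,
        duration_s,
        n_bytes / duration_s,    # byte rate
        n_packets / duration_s,  # packet rate
        first_packet_len,
        n_bytes / n_packets,     # average packet length
    ]


def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


flow_a = flow_features(n_packets=10, n_bytes=5000, duration_s=2.0,
                       first_packet_len=60)
flow_b = flow_features(n_packets=12, n_bytes=5400, duration_s=2.0,
                       first_packet_len=60)
print(flow_a)
print(rbf_kernel(flow_a, flow_b, gamma=1e-6))
```

The kernel returns 1 for identical flows and decays towards 0 as they diverge, so gamma controls how local the resulting decision boundary is. Features on very different scales are normally standardised before the kernel is applied.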
Hybrid intrusion detection: Apart from the above IDSs, we also have custom/hybrid IDSs, which apply both misuse based and anomaly based intrusion detection. Their sole purpose is to achieve high accuracy in detecting patterns of known attacks along with detecting new attacks in the system.
When compared to the SVM based solution, neural networks based hybrid intrusion detection systems take more training time and computational power. An SVM solution can achieve 99.5 per cent accuracy within a training time of 17.77 seconds on the KDD Cup data set, hence outperforming neural networks both with respect to accuracy and runtime. Further improvements in this field have resulted in developing a hierarchical IDS framework based on RBF for hybrid intrusion detection, reducing the complexity of the system. This implementation was also evaluated using the KDD data set against a backpropagation learning algorithm and achieved an accuracy of around 99.2 per cent.
However, more research is needed in the field of using ML for network security. Solutions can be developed for some of the problems that remain unaddressed, and can also be made more efficient than the ones that exist.
The integration of ML into networks is not limited to the topics mentioned above; it has a much wider range of uses, as shown in Figure 1.