Imtiaz Ali Soomro , Hamood ur Rehman Khan , Syed Jawad Hussain , Zeeshan Ashraf , Alnfiai Mrim M. and Alotaibi Nouf NawarLightweight Privacy-Preserving Federated Deep Intrusion Detection for Industrial Cyber-Physical SystemAbstract: The emergence of Industry 4.0 entails extensive reliance on industrial cyber-physical systems (ICPS). ICPS promises to revolutionize industries by fusing physical systems with computational functionality. However, this potential increase in ICPS makes them prone to cyber threats, necessitating effective intrusion detection systems (IDS) systems. Privacy provision, system complexity, and system scalability are major challenges in IDS research. We present FedSecureIDS, a novel lightweight federated deep intrusion detection system that combines CNNs, LSTMs, MLPs, and federated learning (FL) to overcome these challenges. FedSecureIDS solves major security issues, namely eavesdropping and man-in-the-middle attacks, by employing a simple protocol for symmetric session key exchange and mutual authentication. Our Experimental results demonstrate that the proposed method is effective with an accuracy of 98.68%, precision of 98.78%, recall of 98.64%, and an F1-score of 99.05% with different edge devices. The model is similarly performed in conventional centralized IDS models. We also carry out formal security evaluations to confirm the resistance of the proposed framework to known attacks and provisioning of high data privacy and security. Keywords: Federated learning , industrial cyber-physical systems , Internet of things , intrusion detection system , symmetric key I. INTRODUCTIONINDUSTRIAL cyber-physical systems (ICPS) aim to establish remote connections between industrial physical systems and control systems by incorporating cyber components [1]. These systems consist of coupled sensors and actuators. ICPS provides real-time monitoring, control, and automation of industrial operations, boosting efficiency and productivity and enabling diverse and versatile services and capabilities. These include smart transportation, intelligent manufacturing, medical systems, and other kinds of industrial automation [2], [3]. Despite advantages, the increased interconnectivity of ICPS exposes them to significant security threats [4], such as unauthorized access, denial of service (DoS) attacks, man-in-the-middle (MITM) attacks, and data breaches. IDS plays a crucial role in identifying and alleviating potential security threats while safeguarding the integrity and dependability of industrial operations [5]. IDS systems typically operate by centralized learning and information sharing. Conventional information-sharing methods dealing with raw data can impede the viability of centralized learning, particularly when it comes to sensitive data. This constraint is more pronounced when confronted with the difficulties linked to sharing raw data with external entities. Sharing raw data can present challenges due to various data management regulations and privacy considerations [6]. Individuals may also be hesitant to disclose their information, further complicating the situation [7]. ICPS fundamentally operates at three layers: Perception, transmission, and application [8]. Each layer is defined by the type of devices within it and the related functions that should be implemented [9]. Based on the functions achieved at each layer, this paper considers a three-level CPS architecture with different attack types as shown in Fig. 1. Federated learning (FL) provides a potential solution to addressing data privacy and confidentiality concerns by serving as a decentralized platform for machine learning (ML). FL allows for the creation of an effective IDS for IoT devices through training while maintaining privacy. Clients conduct local training and send their models to a central server, which combines them to form a global IDS model. The FL training procedure consists of various stages, such as client selection, local model training, and global model aggregation, as illustrated in Fig. 2. Although FL-based IDS has been widely adopted for its privacy-enhancing features, it is vulnerable to data leakage attacks launched by malicious clients [11]. Recent research shows that a malevolent device can introduce backdoors or replace the global FL model [12]–[14]. The issue of potential private user data extraction from gradients in FL has garnered more attention in the data security and AI ethics field, commonly known as the data leakage problem [15]–[17]. It is becoming more and more crucial to tackle possible security risks as the adoption of FL-based IDS is growing in popularity. One notable threat is the MITM attack, which presents a considerable risk to the integrity of FL-based IDS. In this attack, the attacker strategically places themselves between the communication channels of the device and the central server, as illustrated in Fig. 3. The attacker can exploit vulnerabilities to intercept, manipulate, or infiltrate malicious data and thus jeopardize the quality and data protection guarantees of FLbased IDS. Therefore, addressing the potential security threats is crucial for the success of FL-based IDS in ensuring data privacy and security. To address the challenges of safeguarding ICPS, we propose FedSecureIDS, a novel FL-based IDS. FedSecureIDS enhances security by enabling multiple distributed entities to collaboratively train a global intrusion detection model without sharing sensitive data, thereby preserving data privacy. Imagine a smart manufacturing factory where FedSecureIDS operates within an ICPS environment. The factory employs various interconnected devices, including sensors on machinery, actuators controlling robotic arms, and a central control system managing the entire production line. During normal operation, sensors continuously monitor machinery performance, collecting data on temperature, vibration, and operational speed. Actuators adjust machinery parameters based on control system commands to ensure optimal performance, while the control system processes sensor data in real-time, optimizing operations and preventing potential issues. In the above scenario, a potential threat could arise when an attacker attempts an MITM attack to intercept and alter sensor readings, causing the control system to make incorrect decisions that could damage machinery or halt production. FedSecureIDS counteracts such threats through a multi-step process. First, data is collected by sensors on the factory floor and continuously sent to local edge devices for initial processing. Local intrusion detection modules analyze the data using long short-term memory (LSTM) and convolutional neural networks (CNN) models to identify any anomalies indicative of an MITM attack. The local models on each edge device are periodically updated and sent to the FL Coordinator on the cloud server. This coordinator aggregates the local models to update the global intrusion detection model without accessing raw data, thus preserving data privacy. The updated global model is then distributed back to the edge devices, enhancing their capability to detect sophisticated cyber threats. Upon detecting anomalies, the system triggers alerts to the control system, which initiates predefined countermeasures such as isolating affected components and verifying data integrity through secure communication protocols. FedSecureIDS ensures resilience and privacy-preservation by leveraging FL, allowing collaborative training without sharing sensitive data. This approach not only enhances detection accuracy but also ensures data confidentiality, protecting the factory’s operations from evolving cyber threats. A. Motivation· Motivated by the need to address the existing limitations of data leakage attacks on FL, our research endeavors to enhance the security landscape by proposing innovative countermeasures. · Our research is driven by the complex threat scenario in ICPS, where the presence of diverse cyber threats demands an advanced IDS to effectively tackle challenges such as denial-of-service, reconnaissance, tampering, exploitation, and weaponization attacks. · In ICPS, mobile devices have to operate with limited computational resources. Our research is prompted by the imperative for an efficient security model. Thus, we strive for a lightweight solution to the problem. · The need to strengthen security systems is brought into further consideration by the increasing number of MITM and eavesdropping attacks being reported. B. ContributionsOur contributions main contributions are as follows: · We propose advanced countermeasures in the FedSecureIDS model to enhance data protection during FL, addressing the vulnerabilities associated with data leakage. · We introduce a novel, FL-based hybrid IDS designed for ICPS (FedSecureIDS). This model seamlessly incorporates diverse components to proficiently identify cyber threats, including DoS attacks, reconnaissance attacks, tampering incidents, exploitation attempts, and weaponization attacks, demonstrating a noteworthy increase in detection accuracy relative to centralized MLbased methods. · Our study introduces a lightweight and self-validated symmetric session key exchange algorithm tailored for resource-constrained devices. The algorithm effectively produces session keys that ensure secure communication while meeting these constraints. · Our research develops a robust cryptographic algorithm that uses self-authentication and a pre-shared session key with advanced encryption standard (AES) to ensure secure communication and data confidentiality, effectively countering eavesdropping. · With the help of BAN logic and the AVISPA tool, we provide both formal and informal security proofs for our proposed model. The rest of the paper is structured as follows. Section II delivers a comprehensive review of existing research on IDS designed for ICPS, with a particular focus on IDS techniques leveraging FL. Section III outlines the system model assumptions and the threat model employed in this work. Section IV then delves into the methodology underpinning the proposed framework. The experimental setup is outlined in Section V, detailing the implementation of our proposed model, the datasets used for evaluation, and the metrics employed to assess the system’s performance. A thorough security analysis of this framework is subsequently provided in Section VI. Section VII presents the experimental results and a discussion of their implications. Finally, Section VIII concludes the paper by summarizing the key findings and outlining potential avenues for future research. II. RELATED WORKSA. Anomaly-based IDSIn their work [18], the authors proposed a framework for identifying anomalies in cyber-physical systems (CPS). This framework involves a group of agents representing virtual digital shells of assets in the production line. These agents collect data from the assets, and a central agent, functioning as middleware, employs a learning algorithm to detect anomalous behaviour in the others. The framework is specifically designed for tasks related to anomaly identification, where intelligent agents oversee real-time data collection, analysis, and processing, while ML models, termed predictive models, are integrated into the multi-agent system to forecast the typical state of the cyber-physical system. The study presented in [19] introduces a methodology for detecting anomalies in cyber-physical systems through the application of long shortterm memory recurrent neural networks (LSTM-RNN). By employing LSTM-RNN to forecast data series, the approach addresses the temporal nature of anomalies and cyber-attacks, allowing for the correlation of time-series data over intervals. The model characterizes normal behaviour, serves as a predictor, and integrates the cumulative sum approach to identify anomalous behaviour with a low false positive rate. This method is beneficial in real-world CPS applications, where instances of unusual behaviour are infrequent. Notably, the research leverages a modest dataset collected from a single component of the system for training and validation, a practical consideration dictated by limited resources and infrastructure. In the research detailed in [20], the authors put forth a fresh perspective on identifying replay attacks within cyberphysical systems. This involves a comprehensive assessment of system stability coupled with the implementation of lossless watermarking techniques. Each agent in this framework is equipped with both a local estimator and an anomaly detector. The paper also introduces a method for detecting adversaries by incorporating a watermarked control approach, distributing the watermarking signal among the various agents in the network. However, it’s noteworthy that the incorporation of watermarked signals has been observed to potentially diminish the overall performance of the network. In [21], [22] researchers proposed an ML-based approach for detecting and mitigating communication threats in cooperative autonomous car applications. The methodology revolves around constructing models with the ability to comprehend typical behaviour patterns linked to benign V2X communication. It aims to discern abnormal behaviour, enabling the identification of potentially hazardous communication instances. The method is utilized in various cooperative autonomous vehicular situations, such as platooning, cooperative adaptive cruise control, intelligent intersection detection, and dynamic cooperative route management. However, the architecture only takes into account attacks that are carried out through a single communication channel. To detect potential dangers of fake data injection in smart grid networks, the authors of [23] put up a real-time detection method. An analytical model was created using the adaptive CUMSUM algorithm to build a detection system that adheres to essential criteria for ensuring performance assurance. This method is capable of detecting fake data attacks even in cases where the probability density function post-change is not known. In [24], the authors described an alternative approach where a real-time detection method was implemented to combat denial-of-service (DoS) attacks. The fuzzy IDS effectively distinguishes between regular and malicious network traffic under uncertainty. The system demonstrates an impressive detection rate of 99.9% over a dataset of over 5 million test sessions, with a low false alarm rate of approximately 1600. The method is based on a limited number of features to identify DoS flooding attacks. In [25], a comprehensive analysis of distributed denial of service attacks (DDoS), intrusion tolerance, and various mitigation techniques is provided. The paper then explores the impact of DDoS attacks on cloudbased services, highlighting how attackers can leverage the distributed nature of the cloud to amplify the effects of their attacks. Additionally, the authors examine the characteristics of different types of DDoS attacks, such as the use of botnets to generate large volumes of traffic. The core of the paper focuses on evaluating various mitigation techniques, including rate limiting, traffic filtering, and intrusion tolerance strategies. The authors provide a detailed assessment of the strengths and weaknesses of each approach due to a need for multifaceted defense against the evolving threat of DDoS attacks. B. FL for Anomaly-based IDSTo distribute the workload and improve scalability, FL uses a server-client design that allows computing at both ends. Within this paradigm, [26] presents FL models for IDS that use multi-layer perceptron (MLP) and autoencoders. The efficacy of these frameworks was assessed by conducting tests with FL, distributed, and centralized systems and then by using rigorous statistical analytics. A unique mitigation architecture called FLEAM, which leverages FL, was presented in [28] to combat massive DDoS attacks within the context of the industrial Internet of things (IIoT). Through the integration of FLEAM, fog computing, and cloud computing, the workload is distributed, which increases the effectiveness of the mitigation process. Each fog node in the network receives a pre-trained model, does local retraining, and then uploads the modified model parameters to the cloud as part of the operational workflow. In the research conducted by [29], a semi-supervised learning technique grounded in disagreement is introduced for collaborative IDS, employing FL to address limitations related to data size. Iterative model updates are made following each labeling phase in the suggested method, which entails training three classifiers and using a majority vote mechanism to give labels to unlabeled data. In terms of prediction accuracy and false alarm minimization, comparative evaluations show that their approach performs better than supervised ML techniques. A unique technique for deep learning (DL) training was suggested by the researchers in [30]. This approach utilized a privacy-preserving federated architecture, which capitalized on the heterogeneity of real-time intrusion data. Researchers in this study used a method called MT-DNN-FL to identify outliers using the same model and data. By using this method, it became easier to pinpoint and classify VPN traffic. This method not only lowered the amount of energy required for communication and the expenses associated with training, but it also demonstrated superior performance in comparison to other single-task learning methods. C. Cryptographic based IDSThe most efficacious measures for countering a diverse range of threats, including MITM attacks, unauthorized access to agents, provenance attacks, and tampering with agent logs, are represented by the encryption and authentication of sensitive data. In [31], a message-oriented middleware was proposed to improve communication performance in multiagent systems. The authors introduced a new component called the certification authority service, which generates certificates for agents to authenticate their identities and securely transmit messages in the architecture for cyber-physical systems. However, the technique falls short of addressing the concern regarding an agent’s discretion in placing trust in a particular message sender, thereby leaving room for potential vulnerabilities. In [32]–[38], the researchers used asymmetric key-based cryptosystems to provide security services such as privacy-preservation and data confidentiality. Most of these researches used asymmetric key-based Pallier cryptosystems for data confidentiality. Authentication services were also provided by asymmetric key-based cryptosystems. This results in increased computation and communication costs, making these cryptosystems heavyweight for smart devices. In asymmetric cryptosystems, the size of the secret keys is large, increasing key generation time. Our proposed model, FedSecureIDS, differs from existing FL for anomaly-based IDS and cryptographic-based IDS in the following ways. Traditional FL models often rely solely on a complex single architecture, such as an MLP, an autoencoder, or some other semi-supervised learning method that faces challenges like data leakage and high computational overhead. In contrast, FedSecureIDS integrates CNNs, LSTMs, and MLPs to enhance detection accuracy while utilizing a lightweight symmetric cryptographic approach to ensure data privacy. Also, unlike conventional cryptographic-based IDS that rely on symmetric key systems with high computational costs due to complex key exchange algorithms, FedSecureIDS utilizes a lightweight symmetric key exchange mechanism to deliver security services, encompassing authentication and data confidentiality. Symmetric key cryptosystems utilize smaller secret keys in comparison to asymmetric keys while ensuring an equivalent level of security. As per the National Institute of Standards and Technology (NIST), the strength of a 128-bit symmetric key is deemed comparable to that of a 3072-bit asymmetric key [39]. This approach thus reduces overhead and delivers robust security for resource-constrained devices within ICPS. The robustness of the symmetric key-based cryptosystem for mobile server-client communication has been formally and informally proved against several known attacks [40]. Table I presents a comparative analysis of the different IDS methodologies cited above, detailing their core techniques and strengths. This systematic comparison highlights the trade-offs and effectiveness inherent in various IDS approaches. TABLE I
III. SYSTEM MODEL, ASSUMPTIONS, AND THREAT MODELThe system model, assumptions, and threat model used in our proposed collaborative learning framework for ICPS are presented. The system model includes edge devices, cloud servers/aggregators, and a registration authority (R.A.) We assume that the edge devices have partial trustworthiness and that the R.A. and cloud servers are completely trustworthy. The threat model accounts for potential risks in ICPS and FL, including attacks such as MITM attacks, DDoS attacks, eavesdropping, membership inference attacks, and unintended data leakage attacks. A. ICPS based System ModelWe describe both the ICPS and our proposed FL framework within ICPS. The ICPS model involves remote users, IoT-based sensor nodes, gateways, and access points collaborating to ensure secure data exchange, thereby optimizing industrial operations. Our FL framework, illustrated in Fig. 4, strategically integrates edge devices, cloud servers/aggregators, and the R.A. This tailored system model addresses inherent challenges and risks in FL within ICPS, specifically focusing on privacy and security threats, providing a holistic approach to a solution. 1) Edge devices: The proposed FL framework consists of several edge devices, each representing a node that participates in the FL process. These devices collect local data from their respective sensors and use it to build a local machine-learning model. 2) Cloud server: The FL framework incorporates a cloud server with dual functions. In the beginning, it will initialize the parameters of the global model and then send them to the edge devices. Next, it takes the parameters that were provided by the edge devices and combines them until the model converges. After that, it sends the updated model back to the edge devices. These operations play a pivotal role in constructing an IDS model through FL. 3) Registration authority: Besides its role as a central aggregator, the cloud server serves as a registration authority that maintains a database of registered agents and their associated information. The R.A. facilitates secure communication between registered agents by distributing symmetric session keys. It is responsible for verifying the authenticity of registered agents and establishing a secure communication channel between them. B. AssumptionsOur model operates under the following assumptions: Firstly, the cloud server is regarded as a trustworthy entity. The R.A. is assumed to be fail-safe in ensuring that communication between edge devices and the cloud server is safe and reliable. The edge devices are regarded as somewhat trustworthy, suggesting that they will follow the established protocol but may also have an interest in the data resources of other edge devices. C. Adversary ModelThe Dolev-Yao adversary model [46] was chosen for this research. In computer and network security, the Dolev-Yao adversary model is a popular formal model that describes the actions of an attacker who controls the entire communication network. In the Dolev-Yao model, the adversary is assumed to be an attacker who can intercept, modify, delete, and insert messages in the communication channels between the participants of the protocol. In addition to this, it is presumed that the adversary possesses a limitless amount of processing power and is familiar with the whole protocol specification. The Dolev-Yao model operates under the presumption that messages are sent via a secure channel that is guarded by encryption methods. In an MITM attack, the adversary may gain access to encryption keys by intercepting the communication between parties or through other means, allowing them to decrypt and potentially manipulate the exchanged data. The adversary can also eavesdrop on the communication channels and observe the behaviour of the participants. The Dolev-Yao model is widely used in the analysis of security for cryptographic protocols, providing researchers with a means to identify and address potential vulnerabilities before they can be exploited by real attackers. Furthermore, we have assumed that the adversary would be unable to retrieve the data contained in the server’s database due to the secure nature of the R.A.’s database. IV. METHODOLOGYA. Data Collection and Pre-processingIn our study, we utilized the X-IIoTID dataset to ensure the generalizability of our proposed methods. The dataset includes various industrial-related data, enabling us to validate our FL-based IDS across different industrial settings with diverse processes, devices, and network configurations. For model training and testing, we partitioned the dataset into 80% for training and 20% for testing. The training data was evenly distributed among the smart devices using it to train their local models. To address variations in data strengths across clients, a scaling factor was incorporated into the federated framework to ensure fair consideration of all client data while accounting for uneven data distribution. We employed a multi-stage data preprocessing pipeline to ensure the quality and consistency of the dataset used to train and evaluate FedSecureIDS. This involved addressing missing values, handling categorical features, normalizing numerical features, and addressing class imbalance. To prepare data for ML algorithms, all features must be converted into numerical values. One approach to accomplish this involves label encoding for features in string format. Once the label encoding is done, the next step is to use the min-max normalization technique to normalize all feature values [47]. This technique scales the minimum feature value to 0 and the maximum value to 1. Normalization ensures that all feature values fall within a similar range, aiding in faster model convergence. After normalization, the feature values lie within a range of 0 and 1. In the proposed model, the pre-processing unit serves as the first stage, where the data is transformed for the input layer. This approach enhances the robustness and applicability of our IDS framework, ensuring its effectiveness in real-world industrial scenarios. These preprocessing steps are in line with established practices in [48]–[50]. B. Proposed FedSecIDS FrameworkThe proposed framework, FedSecureIDS, is designed to enhance cybersecurity within ICPS by leveraging FL to develop a privacy-preserving IDS. This FL-based IDS enables multiple distributed entities to collaboratively train a global anomaly detection model without sharing sensitive data, ensuring robust data privacy. The framework comprises several key components, including data collection agents, local intrusion detection modules, an FL coordinator, and a security manager. The proposed framework comprises two integral components. Firstly, we develop an FL-based IDS capable of addressing various types of attacks, including DDoS attacks. In the second phase, we focus on designing a robust communication protocol to ensure secure and resilient data exchange within the system. This dual-phase approach aims to enhance the overall cybersecurity posture by proactively detecting and mitigating potential threats, coupled with the implementation of a secure communication infrastructure. The complete FL framework technique, which permits collaborative learning among multiple agents, is described in Algorithm 1. The system enables the building of a strong and comprehensive FLbased IDS model that is trained on numerous data resources. This approach enhances the performance of DL models and enables the building of a highly effective IDS. The proposed privacy-preserving FL framework comprises several phases, as shown in Fig. 4, and details are described below. 1) In the first phase of the proposed system, the registered agents connect to the cloud server which also works as a registration authority. The agent first exchanges a session symmetric key with the server described in Section IV-E. Then, in order to confirm the validity of the registered agent, an authentication procedure is carried out on both ends using a pre-shared session key. If the authentication completes and the connection is established, an agent will send or receive secure model parameters to the server. If the authentication fails, the connection immediately terminates. 2) The cloud server sends the first set of model parameters, which include batch size, learning rate, and loss function, to the edge devices. At the same time, the server receives data size information from every edge device to determine their respective contribution ratios. The weight given to the gradient updates from each edge device during model training is determined through the utilization of the contribution ratio. 3) The IDS model parameters acquired from the cloud server are utilized to train the IDS model using the local dataset [TeX:] $$D_i$$ from each edge device [TeX:] $$E_n$$. The IDS model utilized for training is a hybrid CNN, LSTM, and MLP model, as detailed in Section IV-D. Algorithm 1 explains the stages for local model training, gradient calculation, and uploading to the cloud server. The technique also outlines the cloud server’s mechanism for collecting model gradients and updating global model parameters. 4) The edge devices in the suggested architecture use symmetric key encryption to secure the model’s gradients [TeX:] $$W_r$$ after training it on their local data [TeX:] $$D_i$$. It is represented by [TeX:] $$W_r$$ that the model’s gradients are in the Rth round, following training at edge device [TeX:] $$E_n$$. The whole global model is then constructed by aggregating the generated encrypted gradients [TeX:] $$E\left(W_r\right)$$ in the cloud. 5) Each local device updates its model and sends it back to the central server once training is complete. The server then compiles all of the modifications and uses them to create a new global model. C. Federated Learning AlgorithmAlgorithm 1 assumes a synchronous, non-IID setting where each client has access to a local dataset [TeX:] $$D_i \text {, and } f_i\left(w_{i, t}, B_{i, j}\right)$$ is the loss function for client i at round t with local model [TeX:] $$w_{i, t}$$ and mini-batch [TeX:] $$B_{i, j}.$$ At each round, the algorithm employs random selection to choose a subset of clients, and each client conducts local training for a predetermined number of epochs E. Following this, the updated model is transmitted back to the server for aggregation. The final global model is generated by the aggregation process, which involves averaging the local models of the selected clients after executing T rounds in the algorithm. D. Proposed Deep Learning-based Intrusion Detection ModelThis section presents the introduction of the proposed FedSecureIDS model, which combines elements from CNN, LSTM, and MLP neural networks to form a hybrid architecture. Before finalizing this particular ML model, we conducted comprehensive testing and evaluation of multiple DL methodologies, including CNN, LSTM, and MLP. The proposed model comprises four distinct units, i.e., a pre-processing unit, CNN unit, LSTM unit, and MLP unit. The integration of CNN and LSTM networks enables precise extraction of time-series patterns in network traffic data, which is vital for ensuring robust IDS. The success of the model can largely be attributed to CNN’s effective capability in extracting highlevel feature representations. The exceptional performance of this ML model has led to its selection as the IDS model within the proposed FL architecture, which aims to identify cyber threats in ICPS. Each constituent element of the proposed FL model will be thoroughly explored and analyzed in subsequent sections of the research paper. 1) CNN unit: The CNN unit’s architecture in the proposed IDS model is the focus of this section. CNNs are employed for feature extraction due to their proven effectiveness in identifying spatial hierarchies in data. They can automatically and adaptively learn spatial hierarchies of features from input data, making them highly suitable for analyzing network traffic data to detect anomalies indicative of cyber threats. According to [51], the CNN unit consists of five components: An input layer, a convolutional layer, a pooling layer, a fully connected (FC) layer, and an output layer. The CNN architecture in this study was customized to meet the specific needs of the use case. This involved the utilization of convolutional layers, pooling layers, and FC layers, as visually depicted in Fig. 5. The convolutional layer plays a pivotal role in the primary task of extracting features from the input data. It consists of multiple convolution kernels, each with a weight and deviation coefficient. During the convolution process, the weight coefficient and deviation quantity for each kernel are denoted as [TeX:] $$w_k \text { and } a_k,$$ respectively, and the input of convolutional layer k is represented as [TeX:] $$Y_{k-1}$$ in (1).
In the proposed CNN architecture, the convolutional layer is responsible for feature extraction from the input data. It is comprised of multiple convolution kernels, each with a corresponding weight and deviation coefficient. The operation of the convolution kernel k is represented by [TeX:] $$Y_k,$$ and it applies the activation function [TeX:] $$f(Y)$$, where RELU is chosen in this study as shown in (2). This activation function has advantages over others, such as sigmoid and tanh in terms of easy derivation, faster model training, and preventing gradient disappearance. The output result of the convolutional layer is obtained through regular sweeping of the input data by the convolution kernels to extract essential information.
(2)[TeX:] $$\operatorname{ReLU}\left(Y_k\right)= \begin{cases}Y_k, & \left(Y_k \gt 0\right) \\ 0, & \left(Y_k \leq 0\right)\end{cases}$$The pooling layer in a CNN is used to reduce the complexity of the network by downsampling and removing unnecessary information. In this research, max pooling was employed as the selected type of pooling due to its capability of preserving essential information by selecting the highest value within a specific area to represent that area. Max pooling can be expressed as in (3):
In the max pooling operation, [TeX:] $$G^{(m)}$$ represents the output result of the pooling region k, and [TeX:] $$g_k^m$$ denotes the element of the pooling region k. In the CNN model used for this research, the FC layers act as classifiers to evaluate the features extracted by convolutional and pooling layers. These features are then mapped to a hidden layer space and then again to a sample-marker space. To prevent the model from becoming too specialized during training and performing poorly on unseen data, a dropout operation is added to the FC layer. Dropout randomly removes some of the neurons to create a more generalized model. 2) LSTM component description: The LSTM network stands out in ML for its efficacy in handling sequential data, overcoming the challenges of gradient vanishing or exploding found in traditional RNNs. Thanks to its memory function, LSTM units adeptly retain both long-term and short-term information. Integral to this architecture are three gate structures: Forget, input, and output gates, which manage the retention or discard of information. The forget gate, particularly significant, determines the extent of information to discard based on the previous hidden state and current input, generating a value between 0 and 1, indicating the amount of information to retain or discard from the current cell state.
In (4). [TeX:] $$g_t$$ represents the gate value at time step t, σ denotes the sigmoid function, [TeX:] $$W_{g^{\prime}} \text { and } b_{g^{\prime}}$$ are the weight and bias parameters, respectively. [TeX:] $$h_{t-1}$$ is the previous hidden state, and [TeX:] $$x_t$$ is the current input. The input gate [TeX:] $$g_t$$ is pivotal in LSTM networks for determining how to update the cell state based on new input data. It undergoes two main steps: Firstly, it assesses the significance of information to retain or discard using the sigmoid function, as depicted in (5). Secondly, it generates alternative information to adjust the cell state via the hyperbolic tangent function, as illustrated in (6). The cell state is then updated from its previous state [TeX:] $$C_{t-1} \text { to } C_t$$ by considering the current input gate [TeX:] $$i_t$$ and the candidate values [TeX:] $$\tilde{C}_t.$$ This update involves discarding irrelevant information from the previous state and integrating relevant information from the current candidate state, as shown in (7).
In these equations, [TeX:] $$W_{i^{\prime}} \text { and } b_{i^{\prime}}$$ represent the weight and bias parameters for the input gate, respectively. Similarly, [TeX:] $$W_{c^{\prime}} \text{ and } b_{c^{\prime}}$$ denote the weight and bias parameters for the candidate cell state. The current time step’s forgotten gate is labeled as [TeX:] $$f_t,$$ while the input gate is denoted by [TeX:] $$i_t,$$ and the candidate cell state is represented as [TeX:] $$\tilde{C}_t .$$ The forget gate primarily discards specific information from the previous cell state, whereas the input gate integrates new candidate values and updates relevant information. Additionally, the output gate in an LSTM neural network serves as both an input for the subsequent cell state and a key determinant in establishing the final output based on the current cell state. A fundamental role of the output gate is to utilize the sigmoid function to identify which portion of the cell state should be outputted. Moreover, the final output is determined by the content identified by the output gate and involves the utilization of the hyperbolic tangent function. The equation for the output gate, denoted as (8), is presented as follows:
Additionally, the updated hidden state [TeX:] $$h_t,$$ indicated by (9), is computed by element-wise multiplication of the output gate value [TeX:] $$g_t$$ and the hyperbolic tangent of the cell state [TeX:] $$C_t.$$ LSTMs are chosen over other RNNs due to their capability to capture long-term dependencies and temporal patterns in sequential data without suffering from the vanishing gradient problem. This makes LSTMs particularly effective for recognizing patterns over time, which is crucial for identifying complex attack behaviours in network traffic. 3) Multilayer perceptron layer unit: The final component of the proposed model MLP is due to their robust classification capabilities. They can learn complex mappings from input features to output classes, making them suitable for distinguishing between normal and malicious network activities. MLP unit includes a dropout layer added between the dense layer and the output layer to counteract overfitting. The LSTM output, denoted as O in (10), serves as input to the MLP unit, undergoing processing as follows, integrating the dropout layer:
In (11), the output of the dropout layer is represented as D, while Y denotes the output of the MLP with dropout, as shown in (12). Considering the predicted class output can be either ”attack” or ”normal,” the sigmoid function is applied to pass the output of the MLP with dropout to the output layer. The loss function for the proposed ML model is binary crossentropy, expressed in (13):
(13)[TeX:] $$\eta=-\frac{1}{B} \sum_{i=0}^1 t_i \log \hat{t_i}+\left(1-t_i\right) \log \left(1-\hat{t_i}\right).$$In the proposed IDS model, edge devices labeled as [TeX:] $$E_n$$ train local models on their respective local data [TeX:] $$D_n.$$ During each communication round, an edge device receives a ciphertext C, which is decrypted to obtain new gradients [TeX:] $$\hat{w}^r.$$ These gradients are then used to update the local model weights [TeX:] $$W^r.$$ This iterative process continues until convergence is achieved. The model parameters are adjusted using the binary cross-entropy loss function, as both predicted and target values evolve, where B denotes the batch size, [TeX:] $$t_i$$ represents the target value, and [TeX:] $$\hat{t_i}$$ represents the predicted value. E. Secure Communication ProtocolThis section describes the secure communication protocol designed to establish and maintain secure communication between the cloud server and a variety of devices, employing symmetric session keys described in Algorithm 2. The protocol follows several stages: Device registration, login, secret key generation, and mutual authentication, as detailed below. 1) Device registration phase: Each authorized device must undergo registration with the R.A. Throughout this registration process, the R.A. assigns a distinct identity, denoted as [TeX:] $$I D_D$$, to each device and generates a confidential secret value. For security purposes, the device’s hash-based identity, denoted as [TeX:] $$H I D_D$$, is computed as outlined in (14), while the secret value, termed HSV, is determined according to the procedure specified in (15).
The server stores the device’s confidential credentials in its database. If a device tries to register with an identity already on record, the attempt is rejected. 2) Login phase: During the login phase, the device sends [TeX:] $$H I D_D$$ to the server for validation. After receiving the login details from the remote device, the server looks for the corresponding [TeX:] $$H I D_D$$ in its database. If [TeX:] $$H I D_D$$ is found in the database, the server retrieves the associated HSV for that device. Next, the server generates two large random numbers, R1 and R2. Using (16) and (17), the server computes [TeX:] $$X^* \text { and } Y^*$$, respectively, and sends them to the remote device.
The device obtains the values of [TeX:] $$X^* \text { and } Y^*$$ from the server and calculates [TeX:] $$R_1 \text { and } R_2$$ using (18) and (19)
3) Key exchange phase: The system produces a 128-bit random number [TeX:] $$R_D.$$ It then multiplies [TeX:] $$R_D \text{ by } R_1,$$ adds [TeX:] $$R_1$$ to the product, and calculates the final result, as shown in (20).
The device calculates a final result and sends it to the server as [TeX:] $$FR_D.$$ It is important to note that [TeX:] $$FR_D,$$ the output value, does not directly reveal the key. Therefore, if a hacker intercepts FR_D, they will not be able to obtain the key.
Upon receiving the final result [TeX:] $$FR_D$$ from the device, the server infers the hidden number [TeX:] $$R_D$$ by utilizing both numbers [TeX:] $$R_1 \text{ and } R_2$$ as shown in (21). Then, the server calculates the difference between [TeX:] $$R_1 \text{ and } R_2,$$ yielding [TeX:] $$\operatorname{Res}_S$$ as outlined in (22).
The server extracts [TeX:] $$R_D,$$ as presented in (23)
When both parties possess shared secret numbers, denoted as [TeX:] $$R_D \text{ and } R_S,$$ each side performs a bitwise XOR operation between [TeX:] $$R_D \text{ and } R_S,$$ followed by a modulus calculation with respect to the variable M. The resultant value, referred to as the final session key [TeX:] $$K_S,$$ is clandestinely derived on both ends. This process is illustrated in (24), where M represents the larger value of 128 bits. Initially, it limits the size of the key to 128 bits. We can increase the size of the key easily.
4) Authentication phase: After successfully exchanging symmetric session keys, the cloud server, and registered agent will authenticate their identities to prevent MITM attacks. Both parties utilize HMAC with a symmetric session key for authentication. The HMAC calculation involves using their IP addresses, combining them with random numbers and symmetric session keys, as illustrated in (25) and (26), and exchanging the results. SHA-2-256 hashing algorithm is employed to produce HMAC, generating a 64-bit hexadecimal fixed-length code. HMAC values are computed on both ends for verification against the received value. Authentication is deemed successful if the calculated and received HMAC values match on both ends. In case of a mismatch, the connection is promptly terminated. In equation (25) and (26), IA represents the IP address.
5) Encryption/decryption phase: After mutual authentication on both sides, the cloud server and agent proceed to exchange encrypted data. The AES algorithm is used for encryption and decryption to achieve this. AES works on a block-by-block basis, with each block being 128 bits in size. The encryption and decryption procedures follow the equations presented in (27) and (28) correspondingly.
Each agent securely transmits encrypted data to the cloud server. Upon receiving the encrypted data, the cloud server decrypts it using the symmetric session key. Subsequently, it aggregates all the data and securely sends it to all agents in an encrypted format. This approach ensures that the data remains protected against any potential adversary. V. EXPERIMENTAL SETUPIn this section, we detail the experimental setup used to evaluate the effectiveness and performance of our proposed FedSecureIDS scheme within the context of a privacypreserving FL-based IDS. The evaluation focuses on several key aspects, including the implementation details, the datasets employed, and the metrics used for performance assessment. A. Dataset DescriptionThe X-IIoTID dataset, comprising 68 attributes related to network traffic and system activities, was utilized to evaluate the proposed framework. It covers data on nine types of attacks, including reconnaissance, weaponization, lateral movement, C&C, tampering, RDoS, exfiltration, and cryptoransomware. The diversity and breadth of this dataset support the generalizability of the proposed framework across different industrial contexts, as it encapsulates a wide range of scenarios and potential threats commonly encountered in industrial environments. However, this research focused solely on distinguishing between normal requests and attacks by classifying them under the “attack” category. Future work will explore methods for categorizing multiple attack types. The X-IIoTID dataset comprises 820,834 cases categorized as normal or attack. Among these, 421,417 were labeled as normal and 399,417 as attacks. B. Implementation DetailThe FL-based IDS model presented in this study was developed using TensorFlow, an open-source Python library for DL, and the Keras API. The implementation and evaluation of the framework were conducted on a system with a 4-core CPU, 16 GB RAM, and 512 SSD using Python 3.0. C. Evaluation MetricsFour common evaluation metrics, classification accuracy, precision, recall, F1-score, and RoC, were employed to evaluate the efficacy of the suggested framework. These metrics depend on values like true positive, true negative, false positive, and false negative.
(32)[TeX:] $$\text { F1-score }=2 \times\left(\frac{\text { Recall } \times \text { Precision }}{\text { Recall }+ \text { Precision }}\right)$$In (29) to (32): - [TeX:] $$C_P$$ represents the count of true positive predictions. - [TeX:] $$C_N$$ represents the count of true negative predictions. - [TeX:] $$I_P$$ represents the count of false positive predictions. - [TeX:] $$I_N$$ represents the count of false negative predictions. - Accuracy measures the overall correctness of the predictions. - Recall measures the proportion of actual positive cases that were correctly identified. - Precision measures the proportion of positive predictions that were correctly identified. - F1-score is the harmonic mean of precision and recall, providing a balance between the two metrics. VI. SECURITY ANALYSISWe demonstrate the robustness of our proposed lightweight FL-based IDS model through informal and formal security analysis. A. Informal Security AnalysisThe informal security analysis covers protection against known attacks like replay attacks, impersonation attacks, MITM attacks, side-channel attacks, DoS attacks, and perfect forward secrecy discussed in [40], [45]. Different mechanisms and standards have been used to protect against known attacks. B. Security Analysis through AVISPAFor formal security analysis, we utilize the AVISPA tool based on the Dolev-Yao adversary model. The AVISPA tool code can be found on GitHub [52]. Validation tests using OFMC and CL-AtSe were conducted to evaluate the resilience of our security model against severe assaults. The test results, shown in Fig. 6, validate the security and efficacy of our model. C. Security Analysis through BAN LogicIn this section, we showcase the resilience of our proposed scheme against known attacks by leveraging Burrows-Abadi- Needham (BAN) logic. BAN logic, a symbolic logic system devised for scrutinizing security protocols, is utilized to validate the adherence of a protocol to various security properties. A comprehensive BAN logic analysis for the user registration phase of our proposed protocol, complete with equations and explanations, is provided below: – Initial assumption: – Device D places trust in server S. – Device D selects a distinctive identity [TeX:] $$HID_D$$ along with a secret value HSV . – Utilizing a hash function, both [TeX:] $$HID_D$$ and HSV are generated to facilitate secure communication. – Server S securely stores the confidential credentials of devices, encompassing device type and serial number. – Server S authenticates the identity of devices by crossreferencing the stored data in its secure and protected database. – Idealized protocol model: – Protocol description: – Formal agreement analysis: 1) BAN logic formal analysis for device registration phase:: – Device D selects a unique identity [TeX:] $$ID_D$$ and a secret value [TeX:] $$HSV_D$$. – Device D computes hash-based identity [TeX:] $$HID_D=\operatorname{hash}\left(I D_D\right)$$ and hash-based secret value [TeX:] $$HSV=\operatorname{hash}\left(I D_D \| P S W_D\right).$$ Verification of [TeX:] $$HID_D$$ by server S: · For verification, device D transmits [TeX:] $$HID_D$$ to server S. · Server S cross-references the received [TeX:] $$HID_D$$ with its stored database to validate the identity. · If the identity is confirmed, server S continues; otherwise, it terminates the connection. 2) Equations and BAN logic examination:: · Initial assumptions (idealization): – D places trust in [TeX:] $$\left\{\mathrm{S}, K_S\right\} \text { security: D } \mid \mathrm{S}, K_S$$ – D selects a distinct identity and robust password: [TeX:] $$\mathrm{D} \mid \left\{I D_D, P S W_D\right\}$$ – D trusts the security of the hash function: [TeX:] $$\mathrm{D} \mid \operatorname{hash}\left(I D_D\right)$$ – S securely maintains user credentials: [TeX:] $$\mathrm{S} \mid\{\text { Secrets }\}$$ – S verifies identities using stored data: [TeX:] $$\mathrm{S} \mid\{\text { Verified }\}$$ · Idealized protocol model (idealization): – Device D dispatches [TeX:] $$HID_D$$ to server S for verification: [TeX:] $$\mathrm{D} \rightarrow \mathrm{~S}:\left\{H I D_D\right\}$$ – Server S scrutinizes [TeX:] $$HID_D$$ in its database, confirming identity: [TeX:] $$\mathrm{S} \rightarrow \mathrm{D}:\{\text { Verified }\}$$ · Protocol description (formalization): – Device D assumes server S received [TeX:] $$HID_D$$: [TeX:] $$\text { D | S: } \left\{H I D_D\right\}$$ – Device D assumes server S verified the identity: [TeX:] $$\text { D | S: } \{\text { Verified }\}$$ · Formal agreement analysis (inference rules): – Device D believes server S verified identity with [TeX:] $$H I D_D: \mathrm{D} \mid \mathrm{S}:\left\{H I D_D, \text { Verified }\right\}$$ – Server S securely preserves user credentials: [TeX:] $$\mathrm{S} \mid \{\text { Secrets \} }$$ – Server S authenticates identities using stored data: [TeX:] $$\mathrm{S} \mid \text \{\text { Verified }\}$$ – Device D securely registers with server S: [TeX:] $$\mathrm{D} \mid \mathrm{S}: \text \{\text { Registered }\}$$ 3) BAN logic assessment: The device registration phase is designed to ensure the secure registration of device D with server S, utilizing a unique identity and a robust password. server S meticulously validates identities by cross-referencing with stored data, thereby permitting only authorized individuals to complete the registration process successfully. The protocol guarantees the precise utilization of the designated identity and password. This BAN logic formal analysis definitively confirms the adherence of the user registration phase to security properties, consequently diminishing the likelihood of unauthorized access. Comparable analyses can be conducted for additional protocol phases as deemed necessary. D. Attack Scenarios and Mitigation StrategiesTo further illustrate the practical implications of our security analysis, this section presents detailed scenarios for different attack types and discusses strategies for mitigating vulnerabilities. These scenarios demonstrate how FedSecureIDS detects and responds to various threats, reinforcing the findings from our formal and informal analyses. Due to the constraints of paper length, we will focus on two critical attack scenarios: DoS attacks and MITM attacks. 1) DoS attacks: In a DoS attack, an attacker aims to disrupt the communication channels within an ICPS. As illustrated in Fig. 7, the attack agent targets the wireless channels between the control and monitoring center and the actuators, plants, and sensors. The attacker floods these wireless channels with excessive traffic, preventing legitimate control signals and sensor data from being transmitted. Specifically, the attacker overwhelms the wireless channel from the control and monitoring center to the actuators, making it impossible for the center to send control commands. Simultaneously, the wireless channel from the sensors to the control and monitoring center is flooded, causing a loss of critical sensor data needed for real-time monitoring and decision-making. This disruption can result in halted operations, malfunction of the physical plant, and the control system operating blindly without sensor updates, potentially causing unsafe conditions or operational inefficiencies. To counteract DoS attacks, FedSecureIDS employs a multilayered approach combining real-time monitoring, anomaly detection, and dynamic response mechanisms. Firstly, FedSecureIDS continuously monitors traffic on the wireless channels using distributed edge devices, collecting and analyzing traffic patterns to establish a baseline of normal operations. CNNs are then utilized to detect anomalies in traffic patterns, examining features such as packet rate, packet size, and traffic volume to identify abnormal spikes indicative of a DoS attack. Upon detecting an anomaly, rate limiting is applied to control the flow of traffic, ensuring that no single source can overwhelm the wireless channels, and traffic-shaping techniques prioritize critical data such as control commands and sensor updates. 2) MITM Attacks: In an MITM attack, an attacker intercepts and manipulates the communication between components of an ICPS. As depicted in Fig. 8, the attacker, acting as a false data generator, positions themselves between the sensors, actuators, and the control and monitoring center. The attacker intercepts the data transmitted from the sensors, alters it, and sends false information to the control and monitoring center. Similarly, they intercept control commands from the control and monitoring center, modify them, and send erroneous commands to the actuators. This manipulation can cause the system to make incorrect decisions, leading to potential damage to the physical plant, inefficiencies, or unsafe operating conditions. To mitigate MITM attacks, FedSecureIDS utilizes a combination of real-time monitoring, cryptographic protocols, and anomaly detection techniques. Firstly, FedSecureIDS implements end-to-end encryption for all communications between the control and monitoring center, sensors, and actuators using strong cryptographic protocols such as transport layer security (TLS). This ensures that even if an attacker intercepts the data, they cannot read or modify it without the encryption keys. FedSecureIDS also continuously monitors network traffic for signs of anomalies that indicate an MITM attack. Local Intrusion Detection Modules on edge devices employ LSTM networks to analyze temporal patterns in the data, detecting subtle changes indicative of data manipulation. CNNs are used to examine features such as packet integrity, timing, and sequence to identify discrepancies that suggest the presence of an attacker. By implementing these strategies, FedSecureIDS effectively mitigates the risk of MITM attacks, ensuring the integrity and authenticity of data within the ICPS environment. This approach enhances the resilience of the system against sophisticated cyber threats, maintaining safe and efficient operations. TABLE II
VII. RESULT AND DISCUSSIONA. Performance Comparison with Baseline StudiesThe FL-based IDS model’s effectiveness is compared to other well-known research utilizing FL frameworks like [41]–[44]. DL models from these studies are replicated and assessed against the proposed model, as shown in Figs. 9–14. Experimental results demonstrate the superiority of the proposed IDS model in accuracy, precision, recall, and F1-score over current leading models. The FL-based IDS model’s performance typically enhances as the number of communication rounds R escalates from 3 to 15, stabilizing at a sufficiently large R. Figs. 9–11 shows the accuracy of five federated learning algorithms [41]–[44] and the proposed FedSecureIDS model across varying rounds R and a number of edge device [TeX:] $$E_n$$ (3, 7, and 15). In all the considered scenarios, the proposed method consistently achieves the highest accuracy, approaching 0.995%, demonstrating superior performance and scalability. Algorithm [44] generally ranks second, maintaining an accuracy of around 0.98%, while [43] shows moderate performance. Algorithms [41] and [42] exhibit the lowest accuracy, remaining around 0.975% and below. These findings are encapsulated in Fig. 15, which presents the same results in a bar graph format, reinforcing the superior performance of the proposed method. Similarly, for the F1-score, the proposed FedSecureIDS again leads, indicating its superior precision and recall, as shown in Figs. 12–14. This illustrates that with 3, 7, or 15 edge devices, it exhibited slight variation in F1-scores, which evidences its robustness. An algorithm in [44] remains the second-best performer, while [43] demonstrates moderate results. Algorithms [41] and [42] exhibit the lowest F1-scores, reflecting poorer performance. These findings are summarized in Table III, which presents the same results in a tabular format, reinforcing the superior performance of the proposed method. The results of the proposed FedSecureIDS confirm the approach’s its robustness and adaptability in diverse federated learning setups. TABLE III
B. Performance Comparison with Centralized ModelThis section evaluates an FL-based IDS model for detecting cyber threats in CPS while preserving data privacy. We compare its performance with a centralized ideal model that has access to all data. The comparison is shown in Fig. 16 presents a comparative analysis of detection accuracy between the centralized approach and proposed FedSecureIDS across five categories of cyberattacks, i.e., DoS, weaponization, exploitation, tampering, and reconnaissance. Both methods achieve high accuracy rates, exceeding 0.85 in all categories. However, the proposed method consistently outperforms the centralized approach across all attack vectors. This consistent superior performance of the proposed methodology underscores its effectiveness in accurately identifying and mitigating a range of cyber threats, highlighting its potential for enhanced application in cybersecurity contexts. These results are presented in Table II, which includes a detailed comparison of accuracy, F1-score, precision, and recall for both methodologies across the five attack categories. Fig. 17 compares the performance of a centralized learning model and multiple FedSecureIDS configurations. The area under the curve (AUC) values indicate the effectiveness of each model in distinguishing between classes. The centralized model achieves the highest AUC of 0.9429, while various FL setups, including 3 clients with [TeX:] $$E_n=8 \text { and } E_n=10,$$ and 10 clients with [TeX:] $$E_n=9 \text{ and } E_n=10,$$ exhibit slightly lower but comparable AUC values, demonstrating the feasibility of FL in distributed environments. The centralized model achieves the highest accuracy with an AUC of 0.9429 due to its access to a complete, unified dataset. This comprehensive data exposure allows for robust feature learning. Conversely, the FedSecureIDS model achieves an accuracy of 0.93, is trained on subsets of data across clients, and faces challenges such as non-IID data and limited data per client, slightly reducing their AUC values. Despite this, FL models maintain strong performance while enhancing data privacy, making them suitable for sensitive applications in ICPS. FedSecureIDS demonstrates high performance in detecting known cyber threats, its efficacy against emerging, sophisticated attacks may vary. Additionally, environments with constrained computational resources might face challenges in implementing the model efficiently. 1) Key management analysis: Table IV shows the comparative analysis between the asymmetric key management protocols and our proposed symmetric key exchange protocol. Comparison results show that symmetric key exchange protocols have advantages over asymmetric key management protocols in terms of key size, key generation time, computation speed, communication overhead, and security threats. TABLE IV
2) Symmetric and asymmetric cryptography performance analysis: We compared the performance of asymmetric and symmetric key cryptographic systems. Results show that our proposed symmetric key cryptographic system has advantages over asymmetric cryptographic systems in terms of communication and computation overheads [53]. 3) Operational efficiency and performance analysis: When we compared the performance of our proposed secure symmetric key cryptographic system with other systems in terms of computation and communication costs, our proposed secure system showed better performance in terms of communication and computation costs. We implemented our proposed system on an industrial NFV-Based IPv6 network and measured the operational efficiency [45]. VIII. CONCLUSION AND FUTURE WORKICPS in Industry 4.0 is highly vulnerable to cyber threats due to device interconnectivity, making intrusion detection, and data privacy a significant challenge. This paper proposed a FedSecureIDS model where multiple smart devices collaboratively construct an IDS using CNN, LSTM, and MLP architectures. Rigorous testing on a real-world CPS dataset demonstrated the method’s superior performance, achieving 98.68% accuracy, 98.78% precision, 98.64% recall, and a 99.05% F1-score. A lightweight symmetric encryption technique was integrated to secure communications between smart devices and servers, ensuring data privacy. The method’s resilience to well-known cyber attacks was validated through formal analysis using tools such as AVISPA and BAN logic, offering strong assurance of effectiveness. Furthermore, informal security analysis confirmed its ability to mitigate cyber threats, establishing the framework as a reliable option for securing ICPS environments. Several challenges in FL based IDS persist, including vulnerability to data poisoning attacks, which necessitate robust countermeasures. Enhancing model training efficiency and managing communication overhead through client clustering is crucial. Additionally, it is important to reduce latency and computational overhead for real-world deployment and to use different testing environments for model validation. Optimizing latency and computational overhead for real-world deployment alongside diverse testing environments is essential for model validation. Data inference attacks also pose risks, demanding stronger privacy-preserving techniques. Addressing these challenges will improve the security and efficiency of FL-based IDS in ICPS. BiographyImtiaz Ali SoomroImtiaz Ali Soomro is pursuing a Ph.D. in Electrical and Computer Engineering at Sir Syed CASE Institute of Technology, Islamabad, Pakistan. He obtained his B.E. in Telecommunications from Hamdard University, Islamabad, Pakistan 2010. He earned his M.S. in Electrical Engineering with a specialization in Telecom and Networking from COMSATS University, Islamabad, Pakistan, in 2012. His research interests are focused on the application of federated learning for IoT, wireless networks, and cybersecurity, particularly on privacy-preserving technologies and secure communication in distributed systems. He is also an IEEE Member, actively contributing to the research community through his innovative work in machine learning, IoT, and cybersecurity. BiographyHamood ur Rehman KhanHamood ur Rehman Khan obtained his B.S. degree in Electronics Engineering from Ghulam Ishaq Khan Institute of Technology in 2000, M.S. degree from the University of Michigan in 2005 and Ph.D. degree from King Fahd University of Petroleum and Minerals in 2019, both in Electrical Engineering. He is currently a Senior Member Technical Staff at the Center for Advanced Research and Engineering (CARE). He is currently an Adjunct Professor with the Computer Science Department, Sir Syed-CASEInstitute of Technology. At CARE he has been leading projects ranging from IoT Platform-as-a-Service systems, cybersecurity products, and advanced VLSI based AI platforms for large language model (LLM) inference. During the period 2000-2003 he was with Avaz Networks Inc, California as a Senior VLSI Design Engineer working on highdensity voice over IP (V oIP) system-on-chips for gateway media switches. His primary research interests are information theory and signal processing for networked systems like WSNs and IoT. BiographySyed Jawad HussainSyed Jawad Hussain is an Associate Professor and Chairperson at the Sir Syed Case Institute of Technology, Islamabad, Pakistan. He holds a Ph.D. in Computer Science from Massey University, New Zealand, focusing on developing high-definition video quality experience models. His research interests include multimedia communication networks, machine learning, quality of service (QoS), quality of experience (QoE), data and network security, and statistical modeling. Dr. Hussain has extensive experience in academia and industry, having held various leadership roles, including Head of Department positions at institutions in Pakistan and abroad. He has worked on numerous research and consultancy projects, focusing on machine learning, data security, and multimedia communications. Dr. Hussain has published extensively in prestigious journals and conferences, contributing significantly to the field of computer science. BiographyZeeshan AshrafZeeshan Ashraf, SMIEEE received the MSCS de- gree in Computer Networking and Security Services from UMT, Lahore, Pakistan, and the Ph.D. degree in Computer Science from IQRA University, Islam- abad Campus, Islamabad, Pakistan. He has more than 15 years of experience in the IT field. He is CCNP (routing and switching) certified by CISCO. He is the Author of several books published by international publishers. His several research articles have been published in well-reputed international journals, having a good impact factor in Q1 and Q2. Currently, he is working as an Associate Professor in the Department of Computing and Information Technology, at IISAT, Gujranwala, Pakistan. His main research interests include next-generation virtualized internet archi- tecture, IPv6 routing, performance modeling, IoT, optimization techniques, and security services in different networks. BiographyMrim M. AlnfiaiMrim M. Alnfiai is currently an Associate Pro- fessor in Information Technology with Taif Univer- sity, Saudi Arabia. Her research interests include assistive technology, human–computer interaction, accessibility, usable security, AI, and machine learn- ing. She has published several papers at ISI jour- nals and assistive technology, HCI, and accessibility conferences, including ASSETS, ANT, FNC, CIST, JAIHC, and ICCA. Her current research interest includes designing accessible tools for visually im- paired people, including people with no or low vision. She has conducted several studies and experiences to understand visually impaired abilities and behaviors and design accessible systems that help them interact easily with technology. She has also published several articles on accessibility and authentication mechanisms for visually impaired users. She has also published articles related to using NFC technology and machine learning to enhance the healthcare systems. BiographyNouf Nawar AlotaibiNouf Nawar Alotaibi is an Assistant Professor in the Faculty of Education at Najran University. She teaches courses in gifted education and talent development, creativity, and the history and systems of psychology with educational applications. She is studying learning designs that combine advanced content, complex thinking, conceptual understand- ing, and intellectual character development. Cur- rently, she is working on complex thinking, specif- ically focused on developing students’ abilities to think both critically and creatively within domain-specific traditions. References
|