CN113992419A - User abnormal behavior detection and processing system and method thereof - Google Patents

User abnormal behavior detection and processing system and method thereof

Info

Publication number
CN113992419A
Authority
CN
China
Prior art keywords
abnormal
user
data packet
flow data
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111268433.1A
Other languages
Chinese (zh)
Other versions
CN113992419B (en)
Inventor
崔宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN202111268433.1A priority Critical patent/CN113992419B/en
Publication of CN113992419A publication Critical patent/CN113992419A/en
Application granted granted Critical
Publication of CN113992419B publication Critical patent/CN113992419B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416Event detection, e.g. attack signature detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425Traffic logging, e.g. anomaly detection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/50Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Security & Cryptography (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a system and a method for detecting and processing abnormal user behavior. The system comprises: a user data processing module, which trains on user data and uploads the resulting model parameters as a traffic data packet; a network detection and identification classification module, which extracts and classifies the feature information of the traffic data packet and marks feature vectors; an anomaly identification module, which identifies whether the traffic data packet is abnormal, processes abnormal traffic data packets, and forwards the normal traffic data packets together with the abnormal traffic data packets that it cannot process; and a parameter execution aggregation module, which identifies and processes the abnormal traffic data packets that could not be processed, aggregates the model parameters in the normal traffic data packets to form a global model, and sends the global model to each user. Compared with the prior art, the invention can efficiently detect abnormal user behavior and process abnormal data in time without leaking any user data, and ensures security during the encrypted data interaction.

Description

User abnormal behavior detection and processing system and method thereof
Technical Field
The invention relates to the field of computer network security, in particular to a user abnormal behavior detection and processing system and a method thereof.
Background
With the wide adoption of network applications and the continuous development of network attack techniques, all sectors of society pay close attention to cyberspace security technologies, and intrusion detection is a problem that must be solved in the field of cyberspace security.
In recent years, detection of abnormal behavior of users has become an important branch of intrusion detection. Since each user has different work tasks and personal habits, the user command input has characteristics of serialization and diversification. It is therefore necessary to design a detection system to audit shell commands entered by users to quickly detect and prevent malicious activities.
However, the Shell commands entered by the user relate to operational privacy, and many users cannot share the personal data set for algorithm model training. Most intrusion detection systems are constructed based on conventional machine learning algorithms and are difficult to train using a user's local data set without involving the user's privacy.
Disclosure of Invention
The present invention aims to overcome the defects of the prior art and provide a user abnormal behavior detection and processing system and a method thereof, which can efficiently detect the abnormal behavior of the user and process the abnormal data in time under the condition of not leaking any user data, and ensure the security in the data encryption interaction process.
The purpose of the invention can be realized by the following technical scheme:
according to an aspect of the present invention, there is provided a user abnormal behavior detection and processing system, comprising: the system comprises a user data processing module, a network detection and identification classification module, an abnormality identification module and a parameter execution aggregation module;
a user data processing module: training by adopting local data of a user, and uploading model parameters obtained by training as a flow data packet;
the network detection and identification classification module: detecting and extracting and classifying feature information of the uploaded flow data packet, and marking corresponding feature vectors for different types of feature information;
an anomaly identification module: identifying whether the traffic data packet is abnormal according to the characteristic vector and the characteristic information, storing and processing the abnormal traffic data packet, and sending the normal traffic data packet and the abnormal traffic data packet which cannot be processed to a parameter execution aggregation module;
a parameter execution aggregation module: and identifying and processing the abnormal flow data packet which cannot be processed, and aggregating model parameters in the processed normal flow data packet to further form a global model for combined modeling of each user, and sending the global model to each user.
Preferably, the user data processing module and the network detection and identification classification module are located in a terminal server of each user subsystem, and the anomaly identification module and the parameter execution aggregation module are located in a central server.
Preferably, the user data processing module trains on the local data of the user by using a BiLSTM model.
Preferably, the characteristic information of each traffic data packet includes a number, a timestamp, a source address, a destination address, a protocol, a length, and packet information.
Preferably, the anomaly identification module comprises a network anomaly database for storing feature vectors of various anomalies and processing means corresponding to the feature vectors of each anomaly.
Preferably, the system further comprises a virtualized network traffic monitoring module for monitoring the uploaded traffic data packet, thereby preventing processing of the traffic data packet from being blocked.
Preferably, the virtualized network traffic monitoring module suspends the uploading of traffic packets when the number of monitored traffic packets exceeds 70% of the processing capacity of the central server.
According to another aspect of the present invention, there is provided a user abnormal behavior detection and processing method using the user abnormal behavior detection and processing system, including the following steps:
S1: training by adopting local data of a user, uploading model parameters obtained by training as a flow data packet, detecting, extracting and classifying feature information of the uploaded flow data packet, and marking corresponding feature vectors for different types of feature information;
S2: summarizing the flow data packets uploaded by a plurality of users, identifying whether each flow data packet is abnormal according to the characteristic vector and the characteristic information, storing and processing the abnormal flow data packets, and uploading the normal flow data packets and the abnormal flow data packets which cannot be processed;
S3: identifying and processing abnormal traffic data packets which cannot be processed, aggregating model parameters in all the processed traffic data packets, further forming a safe global model for each user to jointly model, and sending the global model to each user.
Preferably, the specific content of S2 is:
identifying whether each flow data packet is abnormal according to the characteristic vector and the qualification of the characteristic information to obtain an abnormal flow data packet; matching the abnormal characteristic vector in the acquired abnormal traffic data packet with the characteristic vector stored in a network abnormal database, and if the abnormal characteristic vector in the abnormal traffic data packet exists in the network abnormal database, directly calling a processing mode in the abnormal network database for processing; otherwise, uploading the abnormal flow data packet which cannot be processed to the parameter execution aggregation module.
Preferably, the specific content of S3 is:
identifying abnormal flow data packets which cannot be processed, if the abnormal flow data packets are determined, performing corresponding processing and feeding back to a user terminal corresponding to the abnormal flow data packets, and inputting abnormal characteristic vectors of the abnormal flow data packets into a network abnormal database so as to be directly processed next time; if the data packet is determined not to be the abnormal flow data packet, the data packet is recovered to be the normal flow data packet; and aggregating the model parameters in the processed normal flow data packet to form a global model for combined modeling of each user, and sending the global model to each user.
Compared with the prior art, the invention has the following advantages:
1. The invention coordinates a plurality of sub-servers through the central server and combines what is learned from the users' data sets into a general model to realize data interaction. The sub-server of each user performs local training on its own data and uploads the trained model parameters to the central server. The central server collects the different sub-user models and trains a global model. The whole process involves only the parameters of the model and does not reveal any user data.
2. According to the invention, under the condition of not leaking any user data, the flow data packet of the model parameters of the user is uploaded, meanwhile, the abnormal condition of the flow data packet is monitored and analyzed through the network detection and identification classification module and the abnormal identification module, and the abnormal user is rapidly screened and processed by utilizing the network abnormal database, so that a safe global model is formed and sent to each user, and each user can process data by utilizing the global model under the condition of protecting privacy.
3. The invention reads the model of the central server through the virtualized network flow monitoring module, and sets the threshold value according to 70% of the processing capacity of the central server, thereby preventing the uploading of the flow data packet from being blocked or omitted due to the fact that the uploaded flow data packet exceeds the threshold value.
Drawings
Fig. 1 is a schematic structural diagram of a system for detecting and processing abnormal user behavior according to this embodiment.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments.
Referring to fig. 1, the present embodiment provides a system for detecting and processing abnormal user behavior, including: the system comprises a user data processing module M1, a network detection and identification classification module M2, an abnormality identification module M3, a parameter execution aggregation module M4 and a virtualized network traffic monitoring module M5;
the user data processing module M1 and the network detection and identification classification module M2 are located on the terminal servers of the subsystems of each user, and the abnormity identification module M3, the parameter execution aggregation module M4 and the virtualized network traffic monitoring module M5 are located on the central server;
the user data processing module M1: training by using local data of a user, and uploading model parameters obtained by training to a network detection and identification classification module M2 as a traffic data packet;
As an alternative embodiment, the user data processing module M1 uses a BiLSTM model to train on the user's data. The BiLSTM model is formed of a forward LSTM and a backward LSTM. BiLSTM can better capture bidirectional semantic dependencies, thereby further improving prediction accuracy. The dropout layer randomly discards some neurons during training, so that in a given iteration the weights of the discarded neurons remain the same as in the previous step while the other weights are updated. This mechanism reduces the interaction between hidden-layer nodes and the overfitting phenomenon, and helps keep the algorithm model from settling into a locally optimal solution.
Network detection and identification classification module M2: detecting and extracting and classifying feature information of the uploaded flow data packet, and marking corresponding feature vectors for different types of feature information;
A Wireshark tool is used to detect and capture the traffic data packets. The feature information contained in each traffic data packet comprises a serial number, a timestamp, a source address, a destination address, a protocol, a length and data packet information, and corresponding feature vectors are marked for the different types of feature information.
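As an illustration only, the seven fields listed above can be held in a small record and mapped to a feature vector. The following Python sketch is not part of the embodiment: the class name `PacketFeatures`, the `PROTOCOL_INDEX` table and the three-element numeric encoding are hypothetical choices made for the example.

```python
from dataclasses import dataclass

# Hypothetical protocol-to-index table; the embodiment does not define one.
PROTOCOL_INDEX = {"TCP": 0, "UDP": 1, "HTTP": 2, "TLS": 3}


@dataclass(frozen=True)
class PacketFeatures:
    """The seven fields named in the description for one captured packet."""
    number: int
    timestamp: float
    source: str
    destination: str
    protocol: str
    length: int
    info: str

    def to_vector(self) -> tuple:
        """Encode the record as a simple numeric feature vector (illustrative)."""
        proto = PROTOCOL_INDEX.get(self.protocol, -1)  # -1 marks an unknown protocol
        return (proto, self.length, len(self.info))


pkt = PacketFeatures(1, 0.0, "10.0.0.2", "10.0.0.1", "TCP", 1500, "model parameters")
vector = pkt.to_vector()
```

A real deployment would choose the encoding to match whatever representation the network anomaly database stores.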
The abnormality recognition module M3: identifying whether the traffic data packet is abnormal according to the characteristic vector and the characteristic information, storing and processing the abnormal traffic data packet, and sending the normal traffic data packet and the abnormal traffic data packet which cannot be processed to the parameter execution aggregation module M4;
The anomaly identification module M3 further includes a network anomaly database, in which various known abnormal behaviors and the processing mode corresponding to each abnormal behavior are stored.
As an alternative embodiment, the various abnormal behaviors stored in the network abnormality database are marked in the form of feature vectors.
As an optional implementation, the processing manner includes error reporting and discarding the abnormal traffic data packet.
The Wireshark tool automatically identifies whether a traffic data packet is abnormal according to the feature vector and the feature information, and compares the feature vector of each type of abnormality in the abnormal data packet with the feature vectors stored in the network anomaly database. If the abnormal feature vector exists in the network anomaly database, the corresponding processing means is called directly, and the processing result is fed back to the user corresponding to the abnormal traffic data packet. If the abnormal feature vector does not exist in the network anomaly database, the abnormal traffic data packet cannot be processed, so the unprocessable abnormal traffic data packet and the normal traffic data packets are sent to the parameter execution aggregation module M4.
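The lookup against the network anomaly database can be sketched as a dictionary keyed by feature vector. This is a minimal Python sketch under assumptions: the packet representation as a `dict`, the handler `report_and_discard` and the sample vectors are hypothetical, and only the "report an error and discard" handling mentioned in this embodiment is modeled.

```python
from typing import Callable, Dict, List, Tuple

FeatureVector = Tuple[int, ...]


def report_and_discard(packet: dict) -> str:
    """Example handling: report an error and drop the packet."""
    return f"discarded packet {packet['number']}"


# Hypothetical network anomaly database: known anomalous vectors -> handling.
anomaly_db: Dict[FeatureVector, Callable[[dict], str]] = {
    (0, 9000, 3): report_and_discard,
}


def route_abnormal(packets: List[dict]):
    """Handle packets whose vector is in the database; forward the rest to M4."""
    handled, forwarded = [], []
    for packet in packets:
        handler = anomaly_db.get(packet["vector"])
        if handler is not None:
            handled.append(handler(packet))  # known anomaly: process directly
        else:
            forwarded.append(packet)         # unknown anomaly: cannot be processed here
    return handled, forwarded


handled, forwarded = route_abnormal([
    {"number": 1, "vector": (0, 9000, 3)},
    {"number": 2, "vector": (1, 64, 7)},
])
```

Exact-match lookup is the simplest possible matching rule; the embodiment does not state whether matching is exact or approximate.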
The parameter execution aggregation module M4: and identifying and processing abnormal flow data packets which cannot be processed according to the normal data flow packets, and aggregating model parameters in all the processed normal flow data packets to further form a global model for combined modeling of each user, and sending the global model to each user to ensure the safety in the data interaction process.
The traffic data packet information captured by the Wireshark tool also includes the user's terminal information, namely the network access frequency and the access port. The parameter execution aggregation module M4 uses this terminal information to identify whether a traffic data packet that could not be processed is in fact abnormal. If it is determined to be an abnormal traffic data packet, corresponding processing is carried out and fed back to the terminal of the corresponding user, and the abnormal feature vector of the packet is recorded in the network anomaly database so that it can be processed directly next time. If it is determined not to be an abnormal traffic data packet, it is restored as a normal traffic data packet. The model parameters of all normal traffic data packets are then aggregated to form a global model, which is sent to each user. The uploading and downloading process involves only the parameters of the model, so no user data is leaked and security during data interaction is ensured.
Virtualized network traffic monitoring module M5: monitors whether the uploaded traffic data packets exceed a set threshold. If so, uploading of traffic data packets is suspended until processing of the previous batch of traffic data packets is complete.
The virtualized network traffic monitoring module M5 autonomously reads the model of the central server and sets a threshold according to 70% of the processing capacity.
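The threshold behavior of module M5 can be sketched as a small counter. The Python sketch below is illustrative only: the class name `TrafficMonitor`, its method names, and the idea of expressing the central server's processing capacity as a packet count are assumptions not stated in the embodiment.

```python
class TrafficMonitor:
    """Pauses uploads once pending packets exceed a share of server capacity."""

    def __init__(self, capacity: int, percent: int = 70):
        # `capacity` is an assumed packets-per-batch figure for the central server.
        self.threshold = capacity * percent // 100
        self.pending = 0

    def may_upload(self) -> bool:
        """Uploads continue only while the pending count is within the threshold."""
        return self.pending <= self.threshold

    def on_upload(self, count: int = 1) -> None:
        """Record newly uploaded packets that still await processing."""
        self.pending += count

    def on_batch_done(self) -> None:
        """The previous batch finished processing, so uploads can resume."""
        self.pending = 0


monitor = TrafficMonitor(capacity=100)
monitor.on_upload(70)
at_threshold = monitor.may_upload()    # exactly 70% does not exceed the threshold
monitor.on_upload(1)
over_threshold = monitor.may_upload()  # 71 pending packets exceed it
monitor.on_batch_done()
after_batch = monitor.may_upload()
```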
The embodiment also provides a user abnormal behavior detection and processing method applying the user abnormal behavior detection and processing system, and the method comprises the following steps:
S1: training by adopting local data of a user, uploading model parameters obtained by training as a flow data packet, detecting, extracting and classifying feature information of the uploaded flow data packet, and marking corresponding feature vectors for different types of feature information;
S1.1: training on the local data of a user by adopting a BiLSTM model, and uploading model parameters obtained by training as a flow data packet;
S1.2: detecting, extracting and classifying feature information of the uploaded flow data packet, wherein the feature information comprises a serial number, a timestamp, a source address, a destination address, a protocol, a length and data packet information, and corresponding feature vectors are marked for different types of feature information.
S2: summarizing the flow data packets uploaded by a plurality of users, identifying whether each flow data packet is abnormal according to the characteristic vector and the characteristic information, storing and processing the abnormal flow data packets, and uploading the normal flow data packets and the abnormal flow data packets which cannot be processed;
S2.1: identifying whether each flow data packet is abnormal or not by a Wireshark tool according to the characteristic vector and the qualification of the characteristic information to obtain an abnormal flow data packet;
S2.2: matching the abnormal characteristic vector in the acquired abnormal traffic data packet with the characteristic vector stored in a network abnormal database, and if the abnormal characteristic vector in the abnormal traffic data packet exists in the network abnormal database, directly calling a processing mode in the network abnormal database for processing; otherwise, uploading the abnormal flow data packet which cannot be processed to a parameter execution aggregation module M4;
S3: identifying and processing the abnormal flow data packet which cannot be processed according to the terminal information captured by Wireshark, and aggregating model parameters in the processed normal flow data packet to further form a global model for joint modeling of each user, and sending the global model to each user to ensure the safety in the data interaction process.
S3.1: identifying abnormal flow data packets which cannot be processed according to terminal information captured by Wireshark, if the abnormal flow data packets are determined, correspondingly processing and feeding back to the terminals of the users corresponding to the abnormal flow data packets, and recording abnormal characteristic vectors of the abnormal flow data packets into a network abnormal database so as to be directly processed next time; if the data packet is determined not to be the abnormal flow data packet, the data packet is recovered to be the normal flow data packet;
S3.2: aggregating the model parameters in the processed normal flow data packet to form a global model for combined modeling of each user, and sending the global model to each user.
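The aggregation in S3.2 is not specified in detail. As one possible reading, the Python sketch below assumes a plain unweighted element-wise average of the uploaded parameters, in the style of federated averaging; the function name `aggregate` and the representation of parameters as named lists of floats are hypothetical.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def aggregate(client_params: List[Params]) -> Params:
    """Element-wise mean of each named parameter list across clients."""
    if not client_params:
        raise ValueError("no client parameters to aggregate")
    n = len(client_params)
    global_params: Params = {}
    for name in client_params[0]:
        # Pair up the i-th value of this parameter from every client.
        columns = zip(*(params[name] for params in client_params))
        global_params[name] = [sum(col) / n for col in columns]
    return global_params


global_model = aggregate([
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
])
```

Weighting each client by the size of its local data set is a common alternative, but the description does not say which variant is used.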
A specific experiment with the system and method for detecting and processing abnormal user behavior provided by the invention is described below:
The sample data of this embodiment mainly comes from the SEA data set generated by AT&T Shannon Laboratories. A user log in the SEA data set resembles the following command sequence: { cpp, sh, cpp, sh, xrdb, mkpts. The SEA data set covers the behavior logs of more than 70 UNIX users. However, the SEA data set is seriously short of negative samples, and the usable information that can be extracted from it is insufficient, so a more powerful classification algorithm or an enhanced data set is required, as can be seen in the recent studies of kholdik and KudlAcik. Our data set is improved by inserting a certain number of black (malicious) samples into the original data set. Every 50 commands in the data set form a separate command block, with one label per command block. For a command block labeled as a black sample, a corresponding attack is present among its 50 commands in the instruction log. Data blocks are randomly modified so that they contain specific attack situations such as directory traversal, mass reading, file deletion and batch unloading, and these modified data blocks are regarded as black samples.
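The block construction described above can be sketched as follows. This Python sketch is an assumption-laden illustration: the function name `make_blocks`, the use of label 1 for black samples and 0 for normal ones, and the decision to drop a trailing partial block are not stated in the description.

```python
from typing import List, Set, Tuple

BLOCK_SIZE = 50  # commands per block, as stated in the description


def make_blocks(commands: List[str], attack_blocks: Set[int]) -> List[Tuple[List[str], int]]:
    """Split a command log into 50-command blocks, each with one label.

    `attack_blocks` holds indices of blocks known to contain an inserted attack;
    label 1 marks a black (malicious) sample and 0 a normal one.
    """
    blocks = []
    for start in range(0, len(commands) - BLOCK_SIZE + 1, BLOCK_SIZE):
        index = start // BLOCK_SIZE
        label = 1 if index in attack_blocks else 0
        blocks.append((commands[start:start + BLOCK_SIZE], label))
    return blocks


log = ["cpp", "sh"] * 60  # 120 commands: two full blocks, remainder dropped
blocks = make_blocks(log, attack_blocks={1})
```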
S0: the data set is preprocessed and the initial BiLSTM model is deployed on all user servers.
Preprocessing of the data set is primarily done by the tokenizer. The tokenizer is used to vectorize text or convert text into a corresponding sequence. After a Shell command block is fed into the network model, the words in the text are first counted by the tokenizer to generate a dictionary. The incoming Shell command block is then converted to a vector representation according to the order of the dictionary. Inputs shorter than the required length are padded to meet the length requirement.
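The dictionary-building and padding steps can be illustrated without any framework. The Python sketch below is a stand-in written for this explanation and is not the tokenizer used in the experiment: the function names, the frequency-based index order, the choice to drop unknown commands, and padding at the end of the sequence are all assumptions.

```python
from collections import Counter
from typing import Dict, List

PAD = 0  # index reserved for padding


def build_vocab(blocks: List[List[str]]) -> Dict[str, int]:
    """Index commands by descending frequency, ties broken alphabetically."""
    counts = Counter(cmd for block in blocks for cmd in block)
    ordered = sorted(counts, key=lambda cmd: (-counts[cmd], cmd))
    return {cmd: i + 1 for i, cmd in enumerate(ordered)}


def encode(block: List[str], vocab: Dict[str, int], length: int) -> List[int]:
    """Map commands to indices, drop unknown commands, then pad or truncate."""
    ids = [vocab[cmd] for cmd in block if cmd in vocab][:length]
    return ids + [PAD] * (length - len(ids))


vocab = build_vocab([["cpp", "sh", "cpp"], ["sh", "xrdb", "cpp"]])
encoded = encode(["sh", "cpp", "ls"], vocab, length=5)
```

Here `ls` is absent from the dictionary and is dropped, and the remaining two indices are padded with zeros up to the required length of 5.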
Due to the privacy requirements of the user data, it is not possible to construct the bag-of-words model by retrieving the user data. Thus, a large bag-of-words model is built using a separate data set of 10,000 command blocks (500,000 commands). The large bag-of-words model contains most types of Shell commands. The bag of words is sent to each sub-terminal server, independent features (e.g., word order and word frequency) are then established, and finally the vectorized input is fed to the training network through the embedding layer in Keras. For the LSTM framework of this embodiment, a BiLSTM network is employed so that information can be encoded in both directions. It is formed by combining a forward LSTM and a backward LSTM. BiLSTM can better capture bidirectional semantic dependencies, thereby further improving prediction accuracy. The dropout layer randomly discards some neurons during training, so that in a given iteration the weights of the discarded neurons remain the same as in the previous step while the other weights are updated.
S1: training by adopting local data of a user, uploading model parameters obtained by training as a flow data packet, detecting, extracting and classifying feature information of the uploaded flow data packet, and marking corresponding feature vectors for different types of feature information;
Taking N users as an example, the N users are set up as N sub-end servers, and each sub-end server i has its own data set D_i. At the beginning of each communication round, the central server sends the global model M_t to each sub-end server, and each sub-end server trains with its own data set. If a model file exists, the sub-end server loads the global model M_t published by the central server and trains from that file; if no model file exists, a model for training is constructed. Each sub-end server saves the weights obtained from each round of training and uploads them to the central server.
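The load-or-construct step at the start of a round can be sketched as below. This Python sketch makes several assumptions for illustration: weights are stored as JSON, the file name `global_model.json` is hypothetical, and the helper names `load_or_build` and `save_weights` do not come from the embodiment.

```python
import json
import os
import tempfile
from typing import Callable, Dict, List

Weights = Dict[str, List[float]]


def load_or_build(path: str, build: Callable[[], Weights]) -> Weights:
    """Load the published global model if its file exists, else build a new one."""
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as fh:
            return json.load(fh)
    return build()


def save_weights(path: str, weights: Weights) -> None:
    """Persist the weights from this round so they can be uploaded."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(weights, fh)


workdir = tempfile.mkdtemp()
model_path = os.path.join(workdir, "global_model.json")
first = load_or_build(model_path, lambda: {"w": [0.0, 0.0]})   # no file yet: built
save_weights(model_path, {"w": [0.5, -0.5]})
second = load_or_build(model_path, lambda: {"w": [0.0, 0.0]})  # file exists: loaded
```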
S2: summarizing the flow data packets uploaded by a plurality of users, identifying whether each flow data packet is abnormal or not according to the characteristic vector and the characteristic information, merging, storing and processing the abnormal flow data packets, and uploading the normal flow data packets and the abnormal flow data packets which cannot be processed;
Whether a traffic data packet is abnormal is identified according to the feature vector and the feature information, and the feature vector of each type of abnormality in the abnormal data packet is compared with the network anomaly database. If the abnormal feature vector exists in the network anomaly database, the corresponding processing means is called directly, and the result is fed back to the user corresponding to the abnormal traffic data packet; if the abnormal feature vector does not exist in the network anomaly database, the unprocessable abnormal traffic data packet and the normal traffic data packets are uploaded to the parameter execution aggregation module M4.
S3: and identifying and processing abnormal traffic data packets which cannot be processed, aggregating model parameters in all the processed traffic data packets to form a global model for combined modeling of each user, and sending the global model to each user to ensure the safety in the data interaction process.
The abnormal traffic data packets that could not be processed are identified. If a packet is determined to be an abnormal traffic data packet, corresponding processing is performed and fed back to the corresponding user, and the abnormal feature vector of the packet is stored in the network anomaly database; if it is determined not to be an abnormal traffic data packet, it is classified as a normal traffic data packet. The model parameters of all the normal traffic data packets are then aggregated to form a global model, which is sent to each user, thereby achieving encrypted data interaction.
The number of sub-servers is set in the model aggregation algorithm. When the number of models received by the central server reaches the set number, model aggregation can begin. To prevent the parameters of a poorly performing server from degrading the entire model, we send a small test data set during the first communication and use it for a performance test after training. If the test result is below the minimum value we set, the parameters are not uploaded. In addition, we add an MD5 check at upload to ensure the integrity of the model weight parameter information.
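The upload gate and the MD5 check can be sketched with the Python standard library. The sketch is illustrative only: the message layout with `payload` and `md5` keys, the JSON serialization, and the function names are assumptions, and the comparison of a test accuracy against a minimum is one possible form of the performance test.

```python
import hashlib
import json
from typing import Dict, List, Optional

Weights = Dict[str, List[float]]


def pack(weights: Weights) -> dict:
    """Serialize weights and attach an MD5 digest of the payload."""
    payload = json.dumps(weights, sort_keys=True)
    return {"payload": payload, "md5": hashlib.md5(payload.encode("utf-8")).hexdigest()}


def unpack(message: dict) -> Optional[Weights]:
    """Return the weights if the digest matches the payload, else None."""
    digest = hashlib.md5(message["payload"].encode("utf-8")).hexdigest()
    if digest != message["md5"]:
        return None
    return json.loads(message["payload"])


def should_upload(test_accuracy: float, minimum: float) -> bool:
    """Skip the upload when the local test result is below the set minimum."""
    return test_accuracy >= minimum


message = pack({"w": [0.1, 0.2]})
# Simulate corruption in transit by altering the payload but not the digest.
tampered = dict(message, payload=message["payload"].replace("0.1", "0.9"))
```

An MD5 digest of this kind detects accidental corruption of the payload; it is not by itself a defense against deliberate tampering, since an attacker who alters the payload can also recompute the digest.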
Intelligent intrusion detection (IID) methods based on deep learning have received considerable attention in network security. However, these learning models are all trained on a single user server or on a centralized server. On the one hand, it is almost impossible to train a powerful deep learning model on a single user's data. On the other hand, if the data sets are collected from all user servers, the central server is exposed to the risk of intrusion and user privacy is violated.
To address these problems, the present invention builds a federated learning (FL) model. Multiple sub-servers are coordinated by a central server, and the user data sets are combined to build a common model from which all participants benefit. The raw data of each user is stored locally and is neither exchanged nor transmitted, so user data privacy is not put at risk. The data set is adapted from the open-source SEA data set. An attack scenario is set, and the labels on the data set are reset by adding an attack order. Finally, a model performance test is performed using an independent validation data set. The results show that the method learns the characteristics of the data sets on all sub-terminal user servers while preserving user privacy, and achieves relatively high classification accuracy and strong practicability.
FL is based on distributed machine learning and edge computing, and its weight update method follows a principle similar to that of distributed machine learning. Unlike conventional distributed machine learning, however, each participant in FL retains full control over its local data and can decide autonomously whether to join FL for modeling. In addition, FL emphasizes protecting the data privacy of data owners during model training, which makes it an effective measure for data privacy protection. In FL, the data and the model itself are not transmitted. Therefore, leakage at the data level is not possible, and increasingly stringent data protection laws are not violated.
At the same time, the invention detects abnormal user behavior during data upload and handles anomalies promptly, ensuring security during data interaction. Each user encrypts its model data before uploading it; after receiving the ciphertexts, the central server aggregates them by addition and returns the ciphertext result to the users. If the homomorphic encryption used is partially homomorphic, the user must decrypt first and then complete the model update; if it is fully homomorphic, the user does not need to decrypt and can update the model directly on the ciphertext. That is, the central server distributes the current encrypted model to each user, and a user that receives the model first encrypts the model and then performs a biased estimation of the parameters on its local data. The central server then collects the users' encrypted parameter estimates and aggregates them to obtain new model parameters in the encrypted state. Finally, the central server uses the aggregated new encrypted parameters as the current model.
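The additive aggregation over ciphertexts can be illustrated with a toy example. The patent does not name an encryption scheme; the sketch below assumes a Paillier-style additively homomorphic scheme, a key built from two small Mersenne primes (far too small for real use), and a fixed-point encoding of the weights. All names (encrypt, decrypt, encrypt_weights, aggregate, decrypt_mean, SCALE) are hypothetical. The users hold the private values LAM and MU; the central server only multiplies ciphertexts.

```python
import math
import random
from typing import List

# Toy key from two Mersenne primes; illustrative only, not secure.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                                           # public modulus
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # private: lcm(P-1, Q-1)
MU = pow(LAM, -1, N)                                # private: inverse of LAM mod N (Python 3.8+)
SCALE = 10**6                                       # assumed fixed-point scale for float weights


def encrypt(m: int) -> int:
    """Paillier encryption of an integer with generator g = N + 1."""
    r = random.randrange(2, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(2, N)
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """Recover the plaintext and map it back to a signed integer."""
    m = (pow(c, LAM, N2) - 1) // N * MU % N
    return m - N if m > N // 2 else m


def encrypt_weights(weights: List[float]) -> List[int]:
    """User side: encode each weight as a fixed-point integer and encrypt it."""
    return [encrypt(round(w * SCALE)) for w in weights]


def aggregate(uploads: List[List[int]]) -> List[int]:
    """Server side: multiplying ciphertexts adds the underlying plaintexts."""
    total = [1] * len(uploads[0])   # 1 is a valid ciphertext of 0
    for vector in uploads:
        total = [t * c % N2 for t, c in zip(total, vector)]
    return total


def decrypt_mean(aggregated: List[int], num_users: int) -> List[float]:
    """User side: decrypt the encrypted sum and divide by the number of users."""
    return [decrypt(c) / (SCALE * num_users) for c in aggregated]
```

For example, two users holding the weights [0.5, -1.25] and [1.5, 0.25] each call encrypt_weights, the server calls aggregate on the two ciphertext vectors, and a user calling decrypt_mean on the result with num_users=2 obtains the averaged weights without the server ever seeing a plaintext.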
The embodiments above are described so that those skilled in the art can understand and use the invention. It will be readily apparent to those skilled in the art that various modifications may be made to these embodiments and that the generic principles described herein may be applied to other embodiments without inventive effort. Therefore, the present invention is not limited to the above embodiments, and improvements and modifications made by those skilled in the art on the basis of this disclosure should fall within the scope of protection of the present invention.

Claims (10)

1. A user abnormal behavior detection and processing system, comprising: a user data processing module (M1), a network detection and identification classification module (M2), an anomaly identification module (M3) and a parameter execution aggregation module (M4);
user data processing module (M1): training with the user's local data, and uploading the model parameters obtained by training as a traffic data packet;
network detection and identification classification module (M2): detecting the uploaded traffic data packet, extracting and classifying its feature information, and marking corresponding feature vectors for the different types of feature information;
anomaly identification module (M3): identifying whether the traffic data packet is abnormal according to the feature vector and the feature information, storing and processing abnormal traffic data packets, and sending normal traffic data packets and abnormal traffic data packets that cannot be processed to the parameter execution aggregation module (M4);
parameter execution aggregation module (M4): identifying and processing the abnormal traffic data packets that cannot be processed, aggregating the model parameters in the processed normal traffic data packets to form a global model for joint modeling by the users, and sending the global model to each user.
2. The system for detecting and processing abnormal user behavior according to claim 1, characterized in that the user data processing module (M1) and the network detection and identification classification module (M2) are located at the terminal server of each user's subsystem, and the anomaly identification module (M3) and the parameter execution aggregation module (M4) are located at the central server.
3. The system of claim 1, wherein the user data processing module (M1) trains on the user's local data using a BiLSTM model.
4. The system according to claim 1, wherein the feature information of each traffic data packet includes a number, a timestamp, a source address, a destination address, a protocol, a length, and packet information.
5. The system according to claim 4, wherein the anomaly identification module (M3) comprises a network anomaly database for storing the feature vectors of various anomalies and the processing means corresponding to the feature vector of each anomaly.
6. The system according to claim 2, further comprising a virtualized network traffic monitoring module (M5) for monitoring the uploaded traffic data packets and preventing congestion from blocking their processing.
7. The system of claim 6, wherein the virtualized network traffic monitoring module (M5) suspends the uploading of traffic packets when the number of traffic packets monitored exceeds 70% of the processing capacity of the central server.
8. A user abnormal behavior detection and processing method applying the user abnormal behavior detection and processing system according to any one of claims 1 to 7, characterized by comprising the following steps:
S1: training with the user's local data, uploading the model parameters obtained by training as a traffic data packet, detecting the uploaded traffic data packet, extracting and classifying its feature information, and marking corresponding feature vectors for the different types of feature information;
S2: collecting the traffic data packets uploaded by a plurality of users, identifying whether each traffic data packet is abnormal according to the feature vector and the feature information, storing and processing the abnormal traffic data packets, and uploading the normal traffic data packets and the abnormal traffic data packets that cannot be processed;
S3: identifying and processing the abnormal traffic data packets that cannot be processed, aggregating the model parameters in all processed traffic data packets to form a secure global model for joint modeling by the users, and sending the global model to each user.
9. The method for detecting and handling abnormal user behavior according to claim 8, wherein the specific content of S2 is:
identifying whether each traffic data packet is abnormal according to the feature vector and the conformity of the feature information, to obtain the abnormal traffic data packets; matching the abnormal feature vector in each obtained abnormal traffic data packet against the feature vectors stored in the network anomaly database, and if the abnormal feature vector exists in the network anomaly database, directly calling the processing means stored in the network anomaly database for processing; otherwise, uploading the abnormal traffic data packet that cannot be processed to the parameter execution aggregation module (M4).
10. The method for detecting and processing abnormal user behavior according to claim 9, wherein the specific content of S3 is:
identifying the abnormal traffic data packets that cannot be processed; if a packet is confirmed to be an abnormal traffic data packet, performing the corresponding processing, feeding the result back to the user terminal corresponding to the abnormal traffic data packet, and entering its abnormal feature vector into the network anomaly database so that it can be processed directly next time; if the packet is determined not to be an abnormal traffic data packet, restoring it as a normal traffic data packet; and aggregating the model parameters in the processed normal traffic data packets to form a global model for joint modeling by the users, and sending the global model to each user.
CN202111268433.1A 2021-10-29 2021-10-29 System and method for detecting and processing abnormal behaviors of user Active CN113992419B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111268433.1A CN113992419B (en) 2021-10-29 2021-10-29 System and method for detecting and processing abnormal behaviors of user

Publications (2)

Publication Number Publication Date
CN113992419A true CN113992419A (en) 2022-01-28
CN113992419B CN113992419B (en) 2023-09-01

Family

ID=79744041

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115186285A (en) * 2022-09-09 2022-10-14 闪捷信息科技有限公司 Parameter aggregation method and device for federal learning

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108809974A (en) * 2018-06-07 2018-11-13 深圳先进技术研究院 A kind of Network Abnormal recognition detection method and device
CN109818793A (en) * 2019-01-30 2019-05-28 基本立子(北京)科技发展有限公司 For the device type identification of Internet of Things and network inbreak detection method
CN112398779A (en) * 2019-08-12 2021-02-23 中国科学院国家空间科学中心 Network traffic data analysis method and system
US20210126931A1 (en) * 2019-10-25 2021-04-29 Cognizant Technology Solutions India Pvt. Ltd System and a method for detecting anomalous patterns in a network
CN112906903A (en) * 2021-01-11 2021-06-04 北京源堡科技有限公司 Network security risk prediction method and device, storage medium and computer equipment
CN112953924A (en) * 2021-02-04 2021-06-11 西安电子科技大学 Network abnormal flow detection method, system, storage medium, terminal and application
CN113408743A (en) * 2021-06-29 2021-09-17 北京百度网讯科技有限公司 Federal model generation method and device, electronic equipment and storage medium
CN113434859A (en) * 2021-06-30 2021-09-24 平安科技(深圳)有限公司 Intrusion detection method, device, equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
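Wang Rong et al.: "Intrusion detection method based on federated learning and convolutional neural networks", 《信息网络安全》 (Netinfo Security) *
Wang Rong et al.: "Intrusion detection method based on federated learning and convolutional neural networks", 《信息网络安全》 (Netinfo Security), no. 04, 10 April 2020 (2020-04-10) *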

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant