CN116743646B - Tunnel network anomaly detection method based on domain self-adaptive depth self-encoder

Info

Publication number
CN116743646B
Authority
CN
China
Prior art keywords
network
tunnel
data
encoder
network traffic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311023612.8A
Other languages
Chinese (zh)
Other versions
CN116743646A (en)
Inventor
李�浩
李朋
杨路
陆艳铭
陈志涛
李孜
胡皓
马伟任
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan Provincial Transportation Planning And Design Research Institute Co ltd
Original Assignee
Yunnan Provincial Transportation Planning And Design Research Institute Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunnan Provincial Transportation Planning And Design Research Institute Co ltd
Priority to CN202311023612.8A
Publication of CN116743646A
Application granted
Publication of CN116743646B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/50 Testing arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/12 Network monitoring probes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/16 Threshold monitoring
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/12 Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a tunnel network anomaly detection method based on a domain self-adaptive depth self-encoder, and belongs to the technical field of tunnel network anomaly detection. The method comprises data acquisition and preprocessing, training and updating of an anomaly detection source domain model, calculation of a dynamic anomaly detection threshold, and detection of abnormal data. The invention performs preprocessing and anomaly detection for the tunnel network directly at the edge side, which improves the processing speed of the monitoring system and reduces processing latency. In addition, the anomaly threshold range is set from a normal reference value of the current network state, which avoids the missed reports and false alarms caused by a fixed threshold.

Description

Tunnel network anomaly detection method based on domain self-adaptive depth self-encoder
Technical Field
The invention belongs to the technical field of tunnel network anomaly detection and relates to a tunnel network anomaly detection method based on a domain self-adaptive depth self-encoder, in particular to an edge computing method that performs tunnel network anomaly detection using the domain self-adaptive depth self-encoder.
Background
The electromechanical system in a tunnel is large and its equipment distribution is complex, so monitoring and managing the network state of the electromechanical equipment in expressway tunnels is particularly important. At present, in a single expressway tunnel, an area controller is installed every 500 meters to control the surrounding field devices, and the control cabinets are interconnected through switches to form an optical fiber ring network in the two directions of the tunnel portal. The Ethernet switches must not only carry the high-bandwidth data of the video monitoring system, but also be configured as a redundant optical fiber ring network connecting the area controllers that control equipment such as ventilation, lighting and traffic lights in the tunnel. As the number of industrial Ethernet devices increases, the structure of the industrial Ethernet network becomes increasingly complex. In practice, insufficient awareness of the network topology, network storms caused by misoperation, virus infection and similar problems have become important factors affecting network stability and reliability. When a problem occurs in the industrial Ethernet, it tends to spread almost instantly to the whole network and affect a large area. In addition, the area controllers in the tunnel are generally responsible for digital and analog input and output, serial communication and similar functions, but cannot effectively collect the running-state information of network equipment such as switches. That is, although the tunnel network has been built, when network traffic becomes abnormal the system can neither accurately locate the fault nor generate a corresponding record.
Introducing an edge computing architecture into the electromechanical systems of the tunnel is therefore of considerable research value. Edge computing processes data directly at the tunnel electromechanical equipment, avoids relaying it through the cloud or other data centers, improves response speed, and reduces the demand on tunnel network bandwidth. Edge computing nodes deployed in the tunnel manage the large number of front-end devices in that environment. Detecting the state of the tunnel network at the edge helps prevent faults of the electromechanical equipment during operation and improves the reliability and intelligence of the tunnel electromechanical system.
However, for edge computing to be effective in tunnel network monitoring, a suitable edge computing method for tunnel network anomaly detection must be designed. Most anomaly detection methods used in practice still depend on manual inspection and analysis. Among theoretical approaches, the most common is the statistical method, which generally takes a statistical distribution as the criterion for judging anomalies, computes statistical characteristics of the samples, and detects anomalies against a set threshold. The second type of method is based on classification models, but training such models requires a large, well-labeled data set. The third type is distance-based: samples that lie far from the others are regarded as abnormal, and such algorithms are poorly suited to large volumes of high-dimensional data. Moreover, because the electromechanical devices in a tunnel are numerous and diverse, the collected network traffic has a high feature dimension and strongly nonlinear relationships among the data, which makes it difficult to build an effective anomaly detection model. Furthermore, the operating environment of the tunnel network changes over time, so applying fixed monitoring indicators and a fixed anomaly detection model to all operating samples easily leads to false detections. How to overcome these shortcomings of the prior art is therefore a problem to be solved in the field of tunnel network anomaly detection.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art and provides a tunnel network anomaly detection method based on a domain self-adaptive depth self-encoder.
To achieve the above purpose, the invention adopts the following technical scheme:
A tunnel network anomaly detection method based on a domain self-adaptive depth self-encoder comprises the following steps:
Step 1: collecting historical network traffic data from the devices in the equipment layer of the tunnel electromechanical system through edge computing nodes deployed in the tunnel, parsing it to obtain the corresponding raw network data flows, and preprocessing the data to obtain the corresponding network traffic features, i.e. the preprocessed network traffic samples;
Step 2: processing the collected historical network traffic data as in step 1 to obtain the network traffic features corresponding to each item of network traffic data, and using these features as the source domain data set; training an anomaly detection source domain model based on the depth self-encoder algorithm with the source domain data set as the training set, and deploying the trained model on the tunnel edge computing node;
Step 3: preprocessing the network traffic data acquired in real time as in step 1 to obtain the corresponding network traffic features, then constructing an adaptive sliding window algorithm to obtain the target domain data set corresponding to that network traffic; updating the anomaly detection source domain model obtained in step 2 according to the target domain data set;
Step 4: calculating a dynamic threshold for anomaly detection;
Step 5: preprocessing the network traffic data to be detected, acquired in real time, inputting the resulting network traffic features into the updated anomaly detection source domain model, and calculating the reconstruction error;
Step 6: determining whether the network traffic data to be detected is abnormal according to the dynamic threshold obtained in step 4 and the reconstruction error obtained in step 5.
Further, preferably, in step 1, the system architecture for performing tunnel network anomaly detection by using an edge computing node includes a device layer, an edge computing layer, a network layer and a cloud platform layer; the equipment layer, the edge computing layer, the network layer and the cloud platform layer are sequentially connected; each equipment system in the equipment layer comprises a broadcast telephone system, a tunnel monitoring system, a tunnel ventilation lighting system, a tunnel area controller, a tunnel fire protection system, an information release system and a tunnel traffic signal system; the edge computing layer is an edge computing node deployed in the tunnel.
Further, preferably, in step 1, the data preprocessing method includes removing abnormal data, removing meaningless features and normalizing the data;
the network traffic characteristics include data flow duration, number of forward packets, number of reverse packets, total number of bytes of forward packets, total number of bytes of reverse packets, total number of bytes of forward substreams, and total number of bytes of reverse substreams.
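The three preprocessing operations named above can be sketched as follows. This is a minimal illustration, not the patented implementation: the feature names in `FEATURES`, the rule that a record with a missing, non-finite or negative value counts as "abnormal data", the rule that a constant column counts as a "meaningless feature", and the choice of min-max normalization are all assumptions, since the source does not specify them.

```python
import math

# Hypothetical column order, following the seven features listed above.
FEATURES = [
    "duration", "fwd_packets", "bwd_packets",
    "fwd_bytes", "bwd_bytes", "fwd_subflow_bytes", "bwd_subflow_bytes",
]

def preprocess(rows):
    """Drop invalid rows, drop constant columns, then min-max normalise to [0, 1]."""
    # 1) Remove abnormal records: wrong length, non-finite or negative values (assumed rule).
    clean = [r for r in rows
             if len(r) == len(FEATURES)
             and all(isinstance(v, (int, float)) and math.isfinite(v) and v >= 0 for v in r)]
    if not clean:
        return [], []
    # 2) Remove meaningless features: here, columns whose value never changes.
    cols = list(zip(*clean))
    keep = [j for j, c in enumerate(cols) if max(c) > min(c)]
    # 3) Min-max normalisation of the remaining columns.
    lo = [min(cols[j]) for j in keep]
    hi = [max(cols[j]) for j in keep]
    samples = [[(r[j] - l) / (h - l) for j, l, h in zip(keep, lo, hi)] for r in clean]
    return samples, [FEATURES[j] for j in keep]
```

In a deployment, the minimum and maximum values would presumably be fitted once on the historical (source domain) data and reused for real-time samples, so that both domains share one scale; the sketch fits them on whatever batch it receives.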
Further, preferably, in step 2, the anomaly detection source domain model adopts a deep autoencoder (DAE), comprising an encoder and a decoder;
the anomaly detection source domain model has a three-layer neural network, namely an input layer, a hidden layer and an output layer, wherein the input is $X^S = \{x_1^S, x_2^S, \ldots, x_M^S\}$; where $X^S$ denotes the source domain data set and $x_i^S$ denotes the $i$-th preprocessed network traffic sample;
the specific training method of the anomaly detection source domain model is as follows:
step 2.1: the encoder maps the source domain data $X^S$ through an activation function $f$ to obtain the hidden layer data:
$$H^S = \{h_1^S, h_2^S, \ldots, h_M^S\};$$
where $H^S$ denotes the hidden layer vectors and $h_i^S$ denotes the hidden layer vector of the $i$-th preprocessed network traffic sample;
the encoding process is shown as formula (1):
$$h_i^S = f(W_1 x_i^S + b_1) \tag{1}$$
where $W_1$ and $b_1$ denote the network weights and bias vector of the encoder, respectively, and $f$ is the activation function, which in the present invention is the Sigmoid function;
step 2.2: the decoder converts the hidden layer data $H^S$ to the output layer through an activation function $g$ to obtain the output variable:
$$\hat{X}^S = \{\hat{x}_1^S, \hat{x}_2^S, \ldots, \hat{x}_M^S\};$$
where $\hat{X}^S$ denotes the reconstructed output variable and $\hat{x}_i^S$ denotes the $i$-th reconstructed network traffic sample;
the input variable is reconstructed from the hidden layer vector $h_i^S$, and the decoding process is shown as formula (2):
$$\hat{x}_i^S = g(W_2 h_i^S + b_2) \tag{2}$$
where $W_2$ and $b_2$ denote the network weights and bias vector of the decoder, respectively, and $g$ is the activation function;
step 2.3: training the anomaly detection source domain model with a gradient descent algorithm, and obtaining the optimal network parameters by minimizing the reconstruction error; the objective function is shown in formula (3):
$$J_{AE}(\theta) = \frac{1}{M} \sum_{i=1}^{M} \left\| x_i^S - \hat{x}_i^S \right\|^2 \tag{3}$$
where the network parameter set $\theta = \{W_1, b_1, W_2, b_2\}$ denotes the network weights of the encoder, the bias vector of the encoder, the network weights of the decoder and the bias vector of the decoder, respectively; $x_i^S$ and $\hat{x}_i^S$ denote the $i$-th input and reconstructed output variables of the DAE network, respectively; $M$ denotes the total number of preprocessed network traffic samples;
step 2.4: saving the network parameters of the trained anomaly detection source domain model, and deploying the model on the tunnel edge computing node.
Further, preferably, $g$ is the Sigmoid function.
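Steps 2.1 to 2.3 can be illustrated with a small pure-Python autoencoder. This is a sketch under stated assumptions rather than the patent's code: the hidden size, learning rate, epoch count, weight initialization and the per-sample (stochastic) form of gradient descent are all illustrative choices, and the loss is taken to be the mean squared reconstruction error.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, params):
    """Encode then decode one sample: h = f(W1 x + b1), x_hat = g(W2 h + b2), both Sigmoid."""
    W1, b1, W2, b2 = params
    h = [sigmoid(sum(w * v for w, v in zip(row, x)) + b) for row, b in zip(W1, b1)]
    x_hat = [sigmoid(sum(w * v for w, v in zip(row, h)) + b) for row, b in zip(W2, b2)]
    return h, x_hat

def mse(data, params):
    """Mean squared reconstruction error over a data set."""
    return sum(sum((a - b) ** 2 for a, b in zip(x, forward(x, params)[1]))
               for x in data) / len(data)

def init_params(n_in, n_hidden, seed=0):
    """Small random weights and zero biases (an assumed initialisation)."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    return W1, [0.0] * n_hidden, W2, [0.0] * n_in

def train(data, n_hidden=3, lr=0.5, epochs=200, seed=0):
    """Per-sample gradient descent on the squared error; hyperparameters are illustrative."""
    params = init_params(len(data[0]), n_hidden, seed)
    W1, b1, W2, b2 = params
    for _ in range(epochs):
        for x in data:
            h, x_hat = forward(x, params)
            # Output-layer delta: derivative of squared error through the Sigmoid.
            d_out = [2 * (o - t) * o * (1 - o) for o, t in zip(x_hat, x)]
            # Hidden-layer delta, back-propagated through W2 before W2 is changed.
            d_hid = [hj * (1 - hj) * sum(d_out[i] * W2[i][j] for i in range(len(x)))
                     for j, hj in enumerate(h)]
            for i, d in enumerate(d_out):
                for j, hj in enumerate(h):
                    W2[i][j] -= lr * d * hj
                b2[i] -= lr * d
            for j, d in enumerate(d_hid):
                for i, xi in enumerate(x):
                    W1[j][i] -= lr * d * xi
                b1[j] -= lr * d
    return params
```

The returned tuple plays the role of the parameter set saved in step 2.4. On an edge node a vectorized library would normally replace the nested loops; plain lists are used here only to keep the sketch dependency-free.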
Further, it is preferable that the specific method of step 3 is:
step 3.1: assume that at time $t$ the network traffic data acquired in real time by the edge computing node is preprocessed as in step 1 to obtain the corresponding network traffic features, i.e. the preprocessed network traffic sample $x_t$, which is defined as the sample to be detected; a target domain data set $X^T$ is constructed for it; the specific method is as follows:
step 3.1.1: taking the preprocessed network traffic sample $x_t$ at time $t$ as the right boundary of the sliding window, the window is expanded toward the preceding preprocessed network traffic samples, and the samples close to $x_t$ in time order are brought into the sliding window, so that the adaptive sliding window data set $X_w$ is expressed as:
$$X_w = \{x_{t-N+1}, \ldots, x_{t-1}, x_t\};$$
where $X_w$ denotes the adaptive sliding window of length $N$, comprising the preprocessed network traffic samples from time $t-N+1$ to time $t$; $x_{t-N+1}$ is the left boundary sample of the window, i.e. the preprocessed network traffic sample reached after the adaptive sliding window has expanded $N-1$ steps back from time $t$;
step 3.1.2: when the adaptive sliding window determines whether to absorb a preceding sample, assume that the window has already been expanded to time $t-p$, so that the sample to be judged is the one at time $t-p-1$; first, the mean Euclidean distance between this sample and all samples inside the current window is calculated according to the similarity function shown below:
$$S(x_{t-p-1}) = \frac{1}{n} \sum_{j=t-p}^{t} ED(x_{t-p-1}, x_j) \tag{4}$$
ED is calculated as:
$$ED(x_{t-p-1}, x_j) = \left\| x_{t-p-1} - x_j \right\|_2;$$
where $x_j$ denotes any preprocessed network traffic sample from time $t-p$ to time $t$, $n = p+1$ is the number of preprocessed network traffic samples in the current window, and $x_{t-p-1}$ is the preceding preprocessed network traffic sample to be judged;
step 3.1.3: a boundary threshold $\varepsilon$ of the adaptive sliding window is set for the similarity function of step 3.1.2; if $S(x_{t-p-1}) \le \varepsilon$, the sliding window absorbs the preprocessed network traffic sample, i.e. the left boundary sample of the window becomes $x_{t-p-1}$; otherwise, expansion stops and the left boundary sample is $x_{t-p}$;
the sliding window data set $X_w$ is taken as $X^T$, so the target domain data set is expressed as:
$$X^T = X_w = \{x_{t-N+1}, \ldots, x_{t-1}, x_t\};$$
since $X_w$ contains $N$ preprocessed network traffic samples, the target domain data set $X^T$ is expressed as:
$$X^T = \{x_1^T, x_2^T, \ldots, x_N^T\};$$
where $x_j^T$ denotes the $j$-th preprocessed network traffic sample in the data set $X^T$, which contains $N$ preprocessed network traffic samples;
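The window expansion of steps 3.1.1 to 3.1.3 can be sketched as below. The function and parameter names are hypothetical, the comparison is assumed to be "absorb while the mean distance does not exceed the boundary threshold", and the optional `max_len` cap is an addition for bounded memory on an edge node that does not appear in the source.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def adaptive_window(history, epsilon, max_len=None):
    """Return the target-domain window ending at the newest sample history[-1].

    The window grows backwards in time while the mean Euclidean distance
    between the next older sample and all samples already inside the window
    stays within the boundary threshold epsilon.
    """
    if not history:
        return []
    left = len(history) - 1          # index of the current left boundary
    window = [history[left]]         # right boundary: the sample under test
    while left > 0 and (max_len is None or len(window) < max_len):
        candidate = history[left - 1]
        mean_dist = sum(euclidean(candidate, x) for x in window) / len(window)
        if mean_dist > epsilon:      # too dissimilar: stop expanding
            break
        window.insert(0, candidate)  # keep chronological order
        left -= 1
    return window
```

For example, with `history = [[0.9, 0.9], [0.1, 0.1], [0.12, 0.1], [0.1, 0.11]]` and `epsilon = 0.05`, the window absorbs the two samples preceding the newest one and stops at `[0.9, 0.9]`, whose mean distance to the window is far above the threshold.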
step 3.2: using the target domain data set $X^T$, performing domain-adaptive updating of the anomaly detection source domain model trained in step 2; the specific steps are as follows:
step 3.2.1: first, the source domain data set $X^S$ is input into the trained anomaly detection source domain model, and the hidden layer vectors $H^S$ of the source domain data are obtained through forward propagation of formula (1);
step 3.2.2: the target domain data $X^T$ is also input into the anomaly detection source domain model, and the hidden layer vectors $H^T$ of the target domain data are obtained through forward propagation of the following formula:
$$h_j^T = f(W_1 x_j^T + b_1) \tag{5}$$
step 3.2.3: the maximum mean discrepancy (MMD) distance is taken as the objective function, calculated as shown in formula (6):
$$MMD(H^S, H^T) = \sup_{\|\phi\|_{\mathcal{H}} \le 1} \left( \frac{1}{M} \sum_{i=1}^{M} \phi(h_i^S) - \frac{1}{N} \sum_{j=1}^{N} \phi(h_j^T) \right);$$
$$MMD^2(H^S, H^T) = \frac{1}{M^2} \sum_{i=1}^{M} \sum_{i'=1}^{M} k(h_i^S, h_{i'}^S) - \frac{2}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} k(h_i^S, h_j^T) + \frac{1}{N^2} \sum_{j=1}^{N} \sum_{j'=1}^{N} k(h_j^T, h_{j'}^T) \tag{6}$$
where $M$ and $N$ are the numbers of samples in $X^S$ and $X^T$, respectively; $\sup$ denotes the supremum (least upper bound), taken over functions $\phi$ in the unit ball of the reproducing kernel Hilbert space $\mathcal{H}$ induced by $k$; $i$, $i'$, $j$ and $j'$ are arbitrary indices in the data sets, i.e. $h_i^S$ and $h_{i'}^S$ denote the $i$-th and $i'$-th samples in $H^S$, and $h_j^T$ and $h_{j'}^T$ denote the $j$-th and $j'$-th samples in $H^T$; $k(\cdot,\cdot)$ is the Gaussian kernel function, calculated as follows:
$$k(u, v) = \exp\left( -\frac{\|u - v\|^2}{2\sigma^2} \right) \tag{7}$$
where $\sigma$ denotes the bandwidth parameter;
step 3.2.4: the difference between the hidden vectors generated by the source domain data and the target domain data is calculated according to formulas (6)-(7) to construct the DADAE model; the goal is to minimize the DADAE model objective function, which is as follows:
$$J_{DADAE}(\theta') = J_{AE}(\theta') + \lambda \, MMD^2(H^S, H^T) \tag{8}$$
where $J_{AE}$ is the loss function of formula (3); $MMD^2(H^S, H^T)$ is the MMD distance loss function; the network parameter set $\theta' = \{W_1', b_1', W_2', b_2'\}$ denotes the network weights of the encoder, the bias vector of the encoder, the network weights of the decoder and the bias vector of the decoder after the domain-adaptive update, respectively; $\lambda$ is a balance parameter;
step 3.2.5: saving the network parameters of the newly trained anomaly detection source domain model, and deploying the new model on the tunnel edge computing node.
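The distance term used in step 3.2 can be sketched as follows, computing a kernel-based estimate of the squared maximum mean discrepancy between two sets of hidden vectors and adding it to a reconstruction loss. The specific Gaussian kernel form, the biased (all-pairs) estimator, the default bandwidth and the default balance parameter are assumptions, not values given in the source.

```python
import math

def gaussian_kernel(u, v, sigma=1.0):
    """Gaussian kernel in the common form exp(-||u - v||^2 / (2 sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq / (2.0 * sigma ** 2))

def mmd_squared(hs, ht, sigma=1.0):
    """Biased empirical estimate of squared MMD between two sets of hidden vectors."""
    m, n = len(hs), len(ht)
    k_ss = sum(gaussian_kernel(a, b, sigma) for a in hs for b in hs) / (m * m)
    k_tt = sum(gaussian_kernel(a, b, sigma) for a in ht for b in ht) / (n * n)
    k_st = sum(gaussian_kernel(a, b, sigma) for a in hs for b in ht) / (m * n)
    return k_ss + k_tt - 2.0 * k_st

def dadae_objective(recon_loss, hs, ht, lam=0.1, sigma=1.0):
    """Combined objective: reconstruction loss plus lam times the MMD term."""
    return recon_loss + lam * mmd_squared(hs, ht, sigma)
```

When the source and target hidden vectors coincide the MMD term vanishes and the objective reduces to the reconstruction loss; the further apart the two distributions are in hidden space, the more the update is pushed toward aligning them. Minimizing this objective still requires gradients with respect to the encoder parameters, which the sketch leaves out.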
Further, preferably, in step 4, the dynamic threshold for anomaly detection is calculated; the upper limit of the dynamic anomaly threshold is denoted $T_{up}$ and the lower limit $T_{low}$; the specific method is as follows:
step 4.1: first, the target domain data set $X^T$ is passed again through the updated anomaly detection source domain model to obtain the output data set after the encoder and decoder, denoted $\hat{X}^T$; then, the reconstruction error of each item of target domain data is calculated using the following formula:
$$e_j = \left\| x_j^T - \hat{x}_j^T \right\|^2, \quad j = 1, \ldots, N \tag{9}$$
where the reconstruction error set $E$ comprises $N$ elements, denoted $E = \{e_1, e_2, \ldots, e_N\}$; $X^T$ and $\hat{X}^T$ denote the target domain data set containing $N$ network traffic samples and its reconstructed output data set, respectively;
step 4.2: the mean and standard deviation of $E$ are calculated as follows:
$$\mu_E = \frac{1}{N} \sum_{j=1}^{N} e_j \tag{10}$$
$$\sigma_E = \sqrt{\frac{1}{N} \sum_{j=1}^{N} (e_j - \mu_E)^2} \tag{11}$$
where $\mu_E$ denotes the mean of $E$ and $\sigma_E$ is the standard deviation of $E$; the dynamic threshold range is:
$$T_{up} = \mu_E + \alpha \sigma_E \tag{12}$$
$$T_{low} = \mu_E - \alpha \sigma_E \tag{13}$$
where $\alpha$ is the standard deviation coefficient.
Further, preferably, $\alpha$ is 2.
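The threshold calculation of step 4 amounts to a mean-plus-or-minus-a-multiple-of-standard-deviation band over the reconstruction errors of the target domain window. A minimal sketch follows; the squared Euclidean error and the population form of the standard deviation (dividing by the number of samples) are assumptions, while the default coefficient of 2 follows the preferred value stated above.

```python
import math

def reconstruction_errors(samples, reconstructions):
    """Squared Euclidean reconstruction error of each target-domain sample."""
    return [sum((a - b) ** 2 for a, b in zip(x, x_hat))
            for x, x_hat in zip(samples, reconstructions)]

def dynamic_threshold(errors, alpha=2.0):
    """Return (lower, upper) = (mean - alpha * std, mean + alpha * std)."""
    n = len(errors)
    mu = sum(errors) / n
    sigma = math.sqrt(sum((e - mu) ** 2 for e in errors) / n)  # population std (assumed)
    return mu - alpha * sigma, mu + alpha * sigma
```

Because the band is recomputed from the current window each time, it widens when recent traffic is volatile and narrows when it is steady, which is the behaviour the text contrasts with a fixed threshold.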
Further, preferably, in step 5, the reconstruction error is calculated as follows:
$$e_t = \left\| x_t - \hat{x}_t \right\|^2;$$
where $x_t$ is the preprocessed network traffic sample to be detected and $\hat{x}_t$ is its reconstructed output after encoding and decoding by the updated anomaly detection source domain model.
Further, preferably, the detection method in step 6 is:
when the reconstruction error lies within the dynamic threshold range of step 4, i.e. $T_{low} \le e_t \le T_{up}$, the sample is marked as normal; when $e_t > T_{up}$ or $e_t < T_{low}$, the sample is marked as abnormal.
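The decision rule of steps 5 and 6 can be written as a short function. The name `detect` and the use of the squared Euclidean error are illustrative assumptions; flagging errors below the lower limit as well as above the upper limit follows the two-sided rule stated above.

```python
def detect(x, x_hat, lower, upper):
    """Return (error, is_abnormal) for one sample under test.

    x      -- preprocessed sample to be detected
    x_hat  -- its reconstruction from the updated model
    lower, upper -- dynamic threshold limits from step 4
    """
    error = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return error, (error < lower or error > upper)
```

A sample reconstructed almost perfectly can therefore still be flagged when the lower limit is positive, since its error falls outside the band observed for recent normal traffic.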
In the present invention, the value of the boundary threshold $\varepsilon$ of the adaptive sliding window can be selected according to actual conditions, and the invention does not limit it.
The technical problem to be solved by the invention is as follows: in the prior art, tunnel network anomalies are difficult to detect because the Ethernet equipment is complex and the sampling rate of network traffic is high; the invention therefore introduces edge computing nodes into the tunnel network and processes network traffic directly at the equipment side. However, because the electromechanical devices in a tunnel are numerous and diverse, the collected network traffic has a high feature dimension and strongly nonlinear relationships among the data, which makes it difficult to build an effective anomaly detection model. In addition, the operating environment of the tunnel network changes rapidly over time, so the detection performance of an unchanging model tends to degrade continuously, and a fixed anomaly threshold and detection model are not robust for the tunnel network anomaly detection task.
In view of the above, the invention proposes a tunnel network anomaly detection method based on a domain self-adaptive depth self-encoder (Domain Adaptive Deep Autoencoder, DADAE). The method introduces an edge computing architecture into the tunnel electromechanical system, and the edge computing nodes collect the network traffic features corresponding to different services and carry out the anomaly detection task. To address the difficulty of building a tunnel network anomaly detection model, the invention introduces the idea of transfer learning and designs a domain-adaptive deep autoencoder algorithm that updates the detector in real time as the network state changes. From the perspective of transfer learning, a segment of historical network traffic generated by the tunnel electromechanical system is regarded as the source domain. Because the data vary over time, the distribution of the network traffic to be detected, acquired in real time, does not fully match that of the historical traffic; however, network traffic is strongly correlated across adjacent time periods, so the samples within a certain time window are taken as the target domain for the traffic sample to be detected. The invention collects and processes data directly at the tunnel edge and uses transfer learning to improve the adaptability of the anomaly detection model to time-varying traffic samples, thereby improving the processing speed of the monitoring system, reducing processing latency, and improving the robustness and accuracy of the anomaly detection model.
Specifically, first, in view of the high feature dimension of the tunnel network and the nonlinearity among its data, a source domain autoencoder model is built with historical normal network traffic as the source domain data, to establish an initial nonlinear fit of the tunnel network. Then, when a network sample to be detected, collected in real time by the edge node, arrives, the target domain data corresponding to that sample is determined through a sliding window, and the source domain autoencoder model is updated by domain adaptation using the target domain data. Finally, the sample to be detected is input into the updated model to calculate its reconstruction loss, and the anomaly detection module determines whether it is an abnormal network traffic sample.
Compared with the prior art, the invention has the beneficial effects that:
(1) The tunnel network anomaly detection method based on the domain self-adaptive depth self-encoder performs preprocessing and anomaly detection for the tunnel network directly at the edge side, which improves the processing speed of the monitoring system and reduces processing latency.
(2) Because the tunnel network environment changes over time, the proposed domain-adaptive deep autoencoder algorithm lets the anomaly detection algorithm adapt itself to the network traffic sample to be detected, improving its robustness and accuracy.
(3) The proposed method for dynamically determining the anomaly threshold sets the threshold range from a normal reference value of the current network state, which avoids the missed reports and false alarms caused by a fixed threshold.
(4) Applying the method on edge computing nodes inside expressway tunnels can improve the robustness and accuracy of network traffic anomaly detection and reduce the operation and management cost of expressway tunnels.
Drawings
FIG. 1 is a layer architecture diagram of the present invention for domain-adaptive depth self-encoder based tunnel network anomaly detection;
FIG. 2 is a flow chart of a method of detecting tunnel network anomalies based on a domain adaptive depth self-encoder of the present invention;
FIG. 3 is an algorithm diagram of a method for detecting tunnel network anomalies based on a domain-adaptive depth self-encoder according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples.
It will be appreciated by those skilled in the art that the following examples are illustrative of the present invention and should not be construed as limiting its scope. Where specific techniques or conditions are not identified in the examples, they are performed according to the techniques or conditions described in the literature of this field or according to the product specifications. Materials or equipment whose manufacturer is not identified are conventional products available from commercial sources.
Embodiment 1 is a tunnel network anomaly detection method based on a domain adaptive depth self-encoder, comprising the following steps:
step 1: collecting historical network flow data of equipment in a tunnel electromechanical system equipment layer through edge computing nodes deployed in a tunnel, analyzing and obtaining corresponding network original data flow, and preprocessing data to obtain corresponding network flow characteristics, namely preprocessed network flow samples;
step 2: the collected historical network flow data is processed in the step 1 to obtain the network flow characteristics corresponding to each network flow data, and then the network flow characteristics are used as a source domain data set; training an anomaly detection source domain model based on a depth self-encoder algorithm by taking a source domain data set as a training set, and deploying the anomaly detection source domain model in a tunnel edge computing node after training is completed;
step 3: after the network flow data acquired in real time are subjected to the preprocessing mode in the step 1 to obtain the corresponding network flow characteristics, constructing a self-adaptive sliding window algorithm to obtain a target domain data set corresponding to the network flow; updating the anomaly detection source domain model obtained in the step 2 according to the corresponding target domain data set;
step 4: calculating a dynamic threshold for anomaly detection;
Step 5: inputting the network flow characteristics after preprocessing the network flow data to be detected, which are acquired in real time, into an updated anomaly detection source domain model, and calculating a reconstruction error of the anomaly detection source domain model;
step 6: detecting whether the network flow data to be detected acquired in real time is abnormal data according to the dynamic threshold value obtained in step 4 and the reconstruction error obtained in step 5.
Embodiment 2 is a tunnel network anomaly detection method based on a domain adaptive depth self-encoder, comprising the steps of:
step 1: collecting historical network flow data of equipment in a tunnel electromechanical system equipment layer through edge computing nodes deployed in a tunnel, analyzing and obtaining corresponding network original data flow, and preprocessing data to obtain corresponding network flow characteristics, namely preprocessed network flow samples;
step 2: the collected historical network flow data is processed in the step 1 to obtain the network flow characteristics corresponding to each network flow data, and then the network flow characteristics are used as a source domain data set; training an anomaly detection source domain model based on a depth self-encoder algorithm by taking a source domain data set as a training set, and deploying the anomaly detection source domain model in a tunnel edge computing node after training is completed;
Step 3: after the network flow data acquired in real time are subjected to the preprocessing mode in the step 1 to obtain the corresponding network flow characteristics, constructing a self-adaptive sliding window algorithm to obtain a target domain data set corresponding to the network flow; updating the anomaly detection source domain model obtained in the step 2 according to the corresponding target domain data set;
step 4: calculating a dynamic threshold for anomaly detection;
step 5: inputting the network flow characteristics after preprocessing the network flow data to be detected, which are acquired in real time, into an updated anomaly detection source domain model, and calculating a reconstruction error of the anomaly detection source domain model;
step 6: detecting whether the network flow data to be detected acquired in real time is abnormal data according to the dynamic threshold value obtained in step 4 and the reconstruction error obtained in step 5.
In the step 1, a system architecture for detecting tunnel network abnormality by utilizing an edge computing node comprises a device layer, an edge computing layer, a network layer and a cloud platform layer; the equipment layer, the edge computing layer, the network layer and the cloud platform layer are sequentially connected; each equipment system in the equipment layer comprises a broadcast telephone system, a tunnel monitoring system, a tunnel ventilation lighting system, a tunnel area controller, a tunnel fire protection system, an information release system and a tunnel traffic signal system; the edge computing layer is an edge computing node deployed in the tunnel.
In the step 1, the data preprocessing mode comprises abnormal data removal, meaningless characteristic removal and data normalization;
the network traffic characteristics include data flow duration, number of forward packets, number of reverse packets, total number of bytes of forward packets, total number of bytes of reverse packets, total number of bytes of forward substreams, and total number of bytes of reverse substreams.
In the step 2, the anomaly detection source domain model adopts a depth automatic encoder, comprising an encoder and a decoder;
the anomaly detection source domain model is provided with three layers of neural networks, namely an input layer, an implicit layer and an output layer, and the input is $X^{s}=\{x_1^{s},x_2^{s},\dots,x_M^{s}\}$; wherein $X^{s}$ represents the source domain dataset and $x_i^{s}$ represents the $i$-th preprocessed network traffic sample;
the specific training method of the anomaly detection source domain model is as follows:
step 2.1: the encoder maps the source domain data $X^{s}$ via an activation function $f$ to obtain the hidden layer data:
$H^{s}=\{h_1^{s},h_2^{s},\dots,h_M^{s}\}$;
where $H^{s}$ represents the implicit layer vectors and $h_i^{s}$ represents the implicit layer vector of the $i$-th preprocessed network traffic sample;
the encoding process is shown as formula (1):
$h_i^{s}=f(W_1x_i^{s}+b_1)$ (1)
where $W_1$ and $b_1$ represent the network weights and offset vector of the encoder, respectively, and $f$ is the activation function, which in the present invention is a Sigmoid function;
step 2.2: the decoder converts the implicit layer data $H^{s}$ to the output layer via the activation function $g$ to obtain the output variable:
$\hat{X}^{s}=\{\hat{x}_1^{s},\hat{x}_2^{s},\dots,\hat{x}_M^{s}\}$;
where $\hat{X}^{s}$ represents the reconstructed output variable and $\hat{x}_i^{s}$ represents the $i$-th reconstructed network traffic sample;
$\hat{x}_i^{s}$ is the input variable reconstructed from the implicit layer vector $h_i^{s}$, and the decoding process is shown as formula (2):
$\hat{x}_i^{s}=g(W_2h_i^{s}+b_2)$ (2)
where $W_2$ and $b_2$ represent the network weights and offset vector of the decoder, respectively, and $g$ is an activation function;
step 2.3: training the anomaly detection source domain model by using a gradient descent algorithm, and obtaining optimal network parameters by taking the minimum reconstruction error as the target; the objective function is shown in formula (3):
$J_{DAE}(\theta)=\frac{1}{M}\sum_{i=1}^{M}\lVert x_i^{s}-\hat{x}_i^{s}\rVert^{2}$ (3)
where the network parameter set $\theta=\{W_1,b_1,W_2,b_2\}$ represents the network weight of the encoder, the bias vector of the encoder, the network weight of the decoder and the bias vector of the decoder, respectively; $x_i^{s}$ and $\hat{x}_i^{s}$ respectively represent the $i$-th input and reconstruction output variables of the DAE network; $M$ represents the total number of preprocessed network traffic samples;
step 2.4: and saving the network parameters of the trained anomaly detection source domain model, and deploying the model at the tunnel edge computing node.
The activation function $g$ is a Sigmoid function.
The specific method of the step 3 is as follows:
step 3.1: assume that at time $t$, the network flow data acquired by the edge computing node in real time is preprocessed as in step 1 to obtain the corresponding network flow characteristics, namely the preprocessed network traffic sample $x_t$, which is defined as the sample to be measured; a target domain dataset $X^{tar}$ corresponding to $x_t$ is constructed; the specific method comprises the following steps:
step 3.1.1: taking the preprocessed network traffic sample $x_t$ at time $t$ as the right boundary of the sliding window, the window is expanded toward the preceding preprocessed network traffic samples, and the samples adjacent to it in time sequence are incorporated into the sliding window; the adaptive sliding window dataset $S_t$ is expressed as:
$S_t=\{x_{t-L},\dots,x_{t-1},x_t\}$;
where $S_t$ denotes the sliding window of length $L$, comprising the preprocessed network traffic samples from time $t-L$ to time $t$; $x_{t-L}$ is the preprocessed network traffic sample at the left boundary of the window, i.e. the adaptive sliding window expands forward from time $t$ by $L$ preprocessed network traffic samples;
step 3.1.2: when the adaptive sliding window determines whether to incorporate a preceding sample, assume that the sliding window has been expanded to time $t-k$ and that the sample to be judged for incorporation into the window is the one at time $t-k-1$; first, the average Euclidean distance between this sample and all samples inside the current window is calculated according to the similarity function shown below:
$Sim(x_{t-k-1})=\frac{1}{n}\sum_{j=t-k}^{t}ED(x_{t-k-1},x_j)$ (4)
The ED is calculated as follows:
$ED(x_{t-k-1},x_j)=\lVert x_{t-k-1}-x_j\rVert_2$;
where $x_j$ represents any preprocessed network traffic sample from time $t-k$ to time $t$, $n$ is the number of preprocessed network traffic samples in the current window, and $x_{t-k-1}$ is the preceding preprocessed network traffic sample to be judged for incorporation into the window;
step 3.1.3: a boundary threshold $\delta$ of the adaptive sliding window is set for the similarity function of step 3.1.2; if $Sim(x_{t-k-1})\le\delta$, the sliding window incorporates the preprocessed network traffic sample, i.e. the left boundary sample of the window becomes $x_{t-k-1}$; otherwise, expansion stops and the left boundary sample is $x_{t-k}$.
The sliding window dataset $S_t$ is taken as the target domain dataset of $x_t$, expressed as:
$X^{tar}=S_t=\{x_{t-L},\dots,x_{t-1},x_t\}$;
since $S_t$ contains $N$ preprocessed network traffic samples in total, the target domain dataset $X^{tar}$ is expressed as:
$X^{tar}=\{x_1^{tar},x_2^{tar},\dots,x_N^{tar}\}$;
where $x_j^{tar}$ denotes the $j$-th preprocessed network traffic sample in the dataset $X^{tar}$ containing $N$ preprocessed network traffic samples;
step 3.2: the target domain dataset $X^{tar}$ is used to perform domain self-adaptive updating on the anomaly detection source domain model trained in step 2; the specific steps are as follows:
step 3.2.1: first, the source domain dataset $X^{s}$ is input into the trained anomaly detection source domain model, and the implicit layer vectors $H^{s}$ of the source domain data are acquired through forward propagation of formula (1);
Step 3.2.2: the target domain data $X^{tar}$ is likewise input into the anomaly detection source domain model, and the hidden layer vectors $H^{tar}$ of the target domain data are acquired by forward propagation of the following formula:
$h_j^{tar}=f(W_1x_j^{tar}+b_1)$ (5)
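Step 3.2.3: taking the maximum mean discrepancy (MMD) distance as an objective function, the calculation is shown as formula (6):
$MMD(H^{s},H^{tar})=\sup_{\lVert\phi\rVert_{\mathcal{H}}\le 1}\left(\frac{1}{n_s}\sum_{i=1}^{n_s}\phi(h_i^{s})-\frac{1}{n_{tar}}\sum_{j=1}^{n_{tar}}\phi(h_j^{tar})\right)$;
$MMD^{2}(H^{s},H^{tar})=\frac{1}{n_s^{2}}\sum_{i}\sum_{i'}k(h_i^{s},h_{i'}^{s})-\frac{2}{n_sn_{tar}}\sum_{i}\sum_{j}k(h_i^{s},h_j^{tar})+\frac{1}{n_{tar}^{2}}\sum_{j}\sum_{j'}k(h_j^{tar},h_{j'}^{tar})$ (6)
where $n_s$ and $n_{tar}$ are the numbers of samples in $H^{s}$ and $H^{tar}$, respectively; $\sup$ denotes the least upper bound (supremum), taken over functions $\phi$ in the unit ball of the reproducing kernel Hilbert space $\mathcal{H}$; $i$, $i'$, $j$ and $j'$ refer to any index in the dataset, i.e. $h_i^{s}$ and $h_{i'}^{s}$ respectively denote the $i$-th and $i'$-th samples in $H^{s}$, and $h_j^{tar}$ and $h_{j'}^{tar}$ respectively denote the $j$-th and $j'$-th samples in $H^{tar}$; $k(\cdot,\cdot)$ is a Gaussian kernel function, calculated as follows:
$k(u,v)=\exp\left(-\frac{\lVert u-v\rVert^{2}}{2\sigma^{2}}\right)$ (7)
where $\sigma$ represents a bandwidth parameter;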
step 3.2.4: calculating the difference between the implicit vectors generated by the source domain data and the target domain data according to formulas (6) and (7) to construct a DADAE model; with the goal of minimizing the DADAE model objective function, the DADAE model objective function is as follows:
$J_{DADAE}(\theta')=J_{DAE}+\lambda J_{MMD}$ (8)
where $J_{DAE}$ is the loss function represented by formula (3); $J_{MMD}$ is the MMD distance loss function; the network parameter set $\theta'=\{W_1',b_1',W_2',b_2'\}$ represents the network weight of the encoder, the bias vector of the encoder, the network weight of the decoder and the bias vector of the decoder after the domain adaptive update, respectively; $\lambda$ is a balance parameter;
step 3.2.5: and saving the trained network parameters of the new anomaly detection source domain model, and deploying the new anomaly detection source domain model at the tunnel edge computing node.
In step 4, a dynamic threshold for anomaly detection is calculated; the upper limit of the dynamic anomaly threshold is denoted $T_{up}$ and the lower limit is denoted $T_{low}$; the specific method comprises the following steps:
step 4.1: first, the target domain dataset $X^{tar}$ is run again through the updated anomaly detection source domain model to obtain the output dataset after the encoder and decoder, denoted $\hat{X}^{tar}$; then, the reconstruction error of each piece of target domain data is calculated using the following formula:
$\varepsilon_j=\lVert x_j^{tar}-\hat{x}_j^{tar}\rVert^{2},\quad j=1,\dots,N$ (9)
where the reconstruction error set $E$ comprises $N$ elements, denoted $E=\{\varepsilon_1,\varepsilon_2,\dots,\varepsilon_N\}$; $X^{tar}$ and $\hat{X}^{tar}$ respectively denote the target domain dataset containing $N$ network traffic samples and its reconstructed output dataset;
step 4.2: the mean and standard deviation of $E$ are calculated as follows:
$\mu=\frac{1}{N}\sum_{j=1}^{N}\varepsilon_j$ (10)
$\sigma_E=\sqrt{\frac{1}{N}\sum_{j=1}^{N}(\varepsilon_j-\mu)^{2}}$ (11)
where $\mu$ represents the mean of $E$ and $\sigma_E$ is the standard deviation of $E$; the dynamic threshold range is:
$T_{up}=\mu+\alpha\sigma_E$ (12)
$T_{low}=\mu-\alpha\sigma_E$ (13)
where $\alpha$ is the standard deviation coefficient.
$\alpha$ is 2.
In step 5, the reconstruction error is calculated as follows:
$\varepsilon_t=\lVert x_t-\hat{x}_t\rVert^{2}$;
where $x_t$ is the preprocessed network traffic sample to be detected and $\hat{x}_t$ is the reconstructed output obtained after $x_t$ is encoded and then decoded by the updated anomaly detection source domain model.
The detection method in step 6 is as follows:
when $T_{low}\le\varepsilon_t\le T_{up}$, the sample is marked as normal; when $\varepsilon_t>T_{up}$ or $T_{low}>\varepsilon_t$, the sample is marked as abnormal.
Embodiment 3 As shown in fig. 1, to address the difficulty of identifying and the inaccuracy of locating network traffic anomalies in a tunnel, the invention introduces an edge computing architecture and provides a tunnel network anomaly detection method based on a domain adaptive depth self-encoder. The edge computing architecture of the tunnel electromechanical system is divided into a device layer, an edge computing layer, a network layer and a cloud platform layer. The equipment layer mainly comprises tunnel sensing equipment and control equipment, such as a broadcast telephone system, a tunnel monitoring system, a tunnel ventilation lighting system, a tunnel area controller, a tunnel fire protection system, an information release system, a tunnel traffic signal system and the like. To address the difficulty of tunnel network anomaly detection caused in the prior art by complex Ethernet equipment and a high network traffic sampling rate, edge computing nodes are deployed in the tunnel to manage the large number of surrounding front-end devices and to perform data acquisition and processing, including model training, domain self-adaptive updating and network anomaly detection. Because the complexity and diversity of the electromechanical equipment in the tunnel make the acquired network flow characteristics high-dimensional and highly nonlinear, an effective anomaly detection model is difficult to establish.
As shown in the flowchart of fig. 2 and the algorithm chart of fig. 3, taking the detection of network traffic anomalies of the tunnel monitoring system as an example, the method for detecting tunnel network anomalies based on the domain adaptive depth self-encoder according to the present embodiment specifically includes the following steps:
step 1: taking the network traffic anomaly detection of the tunnel monitoring system as an example, collecting historical network traffic data of the tunnel monitoring system through edge computing nodes deployed in a tunnel, analyzing and obtaining corresponding network original data streams by using an existing conventional mode, and preprocessing data to obtain network traffic characteristics corresponding to the service.
The data preprocessing comprises abnormal data removal, meaningless feature removal and data normalization. Abnormal data removal is applied to the collected network traffic of the monitoring system: through manual judgment, only normal historical data is retained for subsequent modeling. Meaningless feature removal deletes features such as IP addresses, port numbers and timestamps, and converts the remaining feature data of the tunnel network into a processable form. The network traffic characteristics include basic characteristics of the data flow, content characteristics of the protocol connection, time-based traffic statistics characteristics, and connection characteristics. Optionally, several features including, but not limited to, data stream duration, number of forward packets, number of reverse packets, total number of bytes of forward packets, total number of bytes of reverse packets, total number of bytes of forward packet headers, total number of bytes of reverse packet headers, total number of bytes of forward substreams, and total number of bytes of reverse substreams are used for modeling and anomaly detection. Data normalization applies max-min normalization to the data, taking the maximum and minimum values of each traffic feature as references, so that all values fall within the interval [0,1].
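The max-min normalization described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `min_max_normalize`, the handling of constant features, and the clipping of out-of-range real-time samples are assumptions added here for the example.

```python
import numpy as np

def min_max_normalize(features, f_min=None, f_max=None):
    """Scale each feature column into [0, 1].

    If f_min / f_max are given (e.g. saved from the historical data),
    they are reused so that real-time samples share the same scale.
    """
    features = np.asarray(features, dtype=float)
    f_min = features.min(axis=0) if f_min is None else np.asarray(f_min, dtype=float)
    f_max = features.max(axis=0) if f_max is None else np.asarray(f_max, dtype=float)
    span = f_max - f_min
    span[span == 0] = 1.0  # assumption: a constant feature maps to 0
    scaled = (features - f_min) / span
    # assumption: values outside the historical range are clipped to [0, 1]
    return np.clip(scaled, 0.0, 1.0), f_min, f_max

# Hypothetical flows: duration, forward packet count, forward byte total
flows = np.array([[1.0, 10.0, 500.0],
                  [3.0, 30.0, 500.0],
                  [2.0, 20.0, 500.0]])
scaled, f_min, f_max = min_max_normalize(flows)
print(scaled)
```

Keeping `f_min` and `f_max` from the historical data lets the edge node normalize each real-time sample on the same scale as the source domain dataset.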
Step 2: the step is an offline part, and the collected historical network flow data of the tunnel monitoring system is used as a source domain data set after the corresponding characteristics of each flow sample are obtained through the preprocessing method in the step 1. And training an anomaly detection source domain model based on a Deep Auto Encoder (DAE) algorithm by taking the source domain data set as a training set, and deploying the DAE in a tunnel edge computing node after training is completed. The depth automatic encoder is an unsupervised neural network model comprising an encoder and a decoder, and can learn implicit characteristics (the encoder) of network traffic input data of the tunnel monitoring system, reconstruct the input characteristics (the decoder) by using the learned implicit characteristics, and the principle of the DAE is that the output of the decoder is restored to be input as much as possible. Assume that the source domain dataset for training is represented as:
$X^{s}=\{x_1^{s},x_2^{s},\dots,x_M^{s}\}$
wherein $X^{s}$ represents the source domain dataset and $x_i^{s}$ represents the $i$-th preprocessed network traffic sample.
The invention does not limit the number of hidden layers or the number of hidden layer neurons of the DAE algorithm. As in the self-encoder structure diagram in the algorithm diagram of FIG. 3, and taking the network traffic of the tunnel monitoring system as an example, the anomaly detection source domain model based on the DAE algorithm is provided with three layers of neural networks, namely an input layer, a hidden layer and an output layer, and the input is $X^{s}$. The specific training procedure of the DAE is as follows:
step 2.1: the encoder maps the source domain data $X^{s}$ one by one via an activation function $f$ to obtain the hidden layer data:
$H^{s}=\{h_1^{s},h_2^{s},\dots,h_M^{s}\}$;
where $H^{s}$ represents the implicit layer vectors of the DAE and $h_i^{s}$ represents the implicit layer vector of the $i$-th network traffic sample;
the encoding process is shown as formula (1):
$h_i^{s}=f(W_1x_i^{s}+b_1)$ (1)
where $W_1$ and $b_1$ represent the network weights and offset vector of the encoder, respectively, and $f$ is the activation function, which in the present invention is a Sigmoid function.
Step 2.2: the decoder converts the implicit layer data $H^{s}$ to the output layer via the activation function $g$ to obtain the output variable:
$\hat{X}^{s}=\{\hat{x}_1^{s},\hat{x}_2^{s},\dots,\hat{x}_M^{s}\}$;
where $\hat{X}^{s}$ represents the reconstructed output variable and $\hat{x}_i^{s}$ represents the $i$-th reconstructed network traffic sample; in this step $\hat{x}_i^{s}$ is the input variable reconstructed from the implicit layer vector $h_i^{s}$, and the decoding process is shown as formula (2):
$\hat{x}_i^{s}=g(W_2h_i^{s}+b_2)$ (2)
where $W_2$ and $b_2$ represent the network weights and offset vector of the decoder, respectively, and $g$ is the activation function, which in the present invention is a Sigmoid function.
Step 2.3: training the DAE by using a gradient descent algorithm, and obtaining optimal network parameters by minimizing the reconstruction error; the objective loss function to be optimized in the training process is shown in formula (3):
$J_{DAE}(\theta)=\frac{1}{M}\sum_{i=1}^{M}\lVert x_i^{s}-\hat{x}_i^{s}\rVert^{2}$ (3)
where the network parameter set $\theta=\{W_1,b_1,W_2,b_2\}$ represents the network weight of the encoder, the bias vector of the encoder, the network weight of the decoder and the bias vector of the decoder, respectively; $x_i^{s}$ and $\hat{x}_i^{s}$ respectively represent the $i$-th input and reconstruction output variables of the DAE network; $M$ represents the total number of network traffic samples;
step 2.4: and saving the trained DAE network parameters, and deploying the model at a tunnel edge computing node to serve as a source domain model for anomaly detection.
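Steps 2.1 to 2.4 can be sketched in NumPy as below. This is an illustrative sketch under stated assumptions, not the patent's implementation: it assumes a mean squared reconstruction error as the loss, full-batch gradient descent, and a row-vector convention (`X @ W1`, equivalent to applying the encoder weights to each sample). The names `train_dae` and `reconstruct`, the hidden size, learning rate, epoch count and the synthetic data are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_dae(X, hidden=4, lr=0.5, epochs=500, seed=0):
    """Train a three-layer autoencoder with Sigmoid activations."""
    rng = np.random.default_rng(seed)
    M, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d)); b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)          # encoder: hidden layer vectors
        X_hat = sigmoid(H @ W2 + b2)      # decoder: reconstructed samples
        err = X_hat - X
        losses.append(float(np.mean(np.sum(err ** 2, axis=1))))
        # Backpropagation of the mean squared reconstruction error
        d_out = (2.0 / M) * err * X_hat * (1.0 - X_hat)
        d_hid = (d_out @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)
    return (W1, b1, W2, b2), losses

def reconstruct(X, params):
    """Encode then decode X with saved parameters."""
    W1, b1, W2, b2 = params
    return sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)

# Synthetic stand-in for normalized source-domain traffic features
rng = np.random.default_rng(1)
X_src = 0.2 + 0.1 * rng.random((200, 7))
params, losses = train_dae(X_src)
print(losses[0], losses[-1])
```

The returned parameter tuple corresponds to what step 2.4 saves and deploys at the edge computing node.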
It should be noted that the DAE model trained on the source domain data in step 2 can already be used to detect anomalies in the network traffic generated by the tunnel monitoring system in real time; however, in the network environment of the terminal devices in the tunnel, the network traffic changes dynamically over time, so a fixed model reduces the robustness of anomaly detection to a certain extent.
Step 3: this step is the online part, in which domain self-adaptive updating is carried out on the DAE model deployed on the tunnel edge computing node. In this example, when the edge computing node acquires network traffic generated by the tunnel monitoring system in real time and anomaly detection is required, the DAE model established in step 2 is updated by the domain adaptive update strategy constructed by the invention; the resulting algorithm is defined as a domain adaptive deep autoencoder (Domain Adaptive Deep Autoencoder, DADAE). Specifically, the DADAE is updated as follows:
Step 3.1: assume that at time $t$, the network traffic sample generated by the tunnel monitoring system and collected in real time by the edge computing node is preprocessed as described in step 1 to obtain the corresponding network traffic characteristics; the sample is expressed as $x_t$, that is, $x_t$ is the preprocessed network traffic sample of the tunnel monitoring system on which anomaly detection is to be performed, defined in this example as the sample to be measured.
Since domain-adaptive updating uses the target domain to update the source domain model, the target domain must be highly similar in data structure and characteristics to the target (i.e. $x_t$) for the updated model to match the target. Based on the strong correlation of network traffic in adjacent time periods, the concept of a sliding window is introduced, and an adaptive sliding window algorithm is constructed to obtain the target domain dataset corresponding to $x_t$; the specific steps are as follows:
step 3.1.1: taking the network traffic sample $x_t$ of the tunnel monitoring system at time $t$ as the right boundary of the sliding window, the window is expanded toward the preceding preprocessed network traffic samples to find the network traffic adjacent to the right boundary in time sequence; the adaptive sliding window dataset $S_t$ can then be expressed as:
$S_t=\{x_{t-L},\dots,x_{t-1},x_t\}$;
where $S_t$ denotes the sliding window of length $L$, comprising the network traffic samples from time $t-L$ to time $t$; the preceding samples are data judged normal by anomaly detection. $x_{t-L}$ is the preprocessed network traffic sample at the left boundary of the window, i.e. the adaptive sliding window expands forward from time $t$ by $L$ network traffic samples.
Step 3.1.2: when the adaptive sliding window determines whether to incorporate a preceding sample, assume that the sliding window has been expanded to time $t-k$ and that the sample to be judged for incorporation into the window is the one at time $t-k-1$; first, the average Euclidean distance (Euclidean Distance, ED) between this sample and all samples inside the current window is calculated according to the similarity function shown below:
$Sim(x_{t-k-1})=\frac{1}{n}\sum_{j=t-k}^{t}ED(x_{t-k-1},x_j)$ (4)
the ED is calculated as follows:
$ED(x_{t-k-1},x_j)=\lVert x_{t-k-1}-x_j\rVert_2$;
where $x_j$ represents any network traffic sample from time $t-k$ to time $t$, $n$ is the number of network traffic samples in the current window, and $x_{t-k-1}$ is the preceding sample to be judged for incorporation into the window;
step 3.1.3: a boundary threshold $\delta$ of the adaptive sliding window is set for the similarity function of step 3.1.2; if $Sim(x_{t-k-1})\le\delta$, the sliding window incorporates the network traffic sample, i.e. the left boundary sample of the window becomes $x_{t-k-1}$; otherwise, expansion stops and the left boundary sample is $x_{t-k}$.
Through step 3.1, the network traffic in the resulting adaptive sliding window is strongly correlated with the sample to be measured $x_t$; the sliding window dataset $S_t$ is therefore taken as the target domain dataset of $x_t$, expressed as:
$X^{tar}=S_t=\{x_{t-L},\dots,x_{t-1},x_t\}$;
since $S_t$ contains $N$ network traffic samples in total, the target domain dataset $X^{tar}$ can also be expressed as:
$X^{tar}=\{x_1^{tar},x_2^{tar},\dots,x_N^{tar}\}$;
where $x_j^{tar}$ denotes the $j$-th network traffic sample in the dataset $X^{tar}$ containing $N$ samples.
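The adaptive sliding window of step 3.1 can be sketched as follows. This is a hedged illustration: it assumes that a preceding sample is incorporated when its average Euclidean distance to the samples already in the window does not exceed the boundary threshold, and the names `adaptive_window`, `delta` and `max_len` (an optional cap on window size, not mentioned in the source) are hypothetical.

```python
import numpy as np

def adaptive_window(history, x_t, delta, max_len=None):
    """Build the target-domain dataset for the sample under test.

    history: array (T, d) of earlier normal samples, oldest first.
    x_t:     array (d,) sample to be detected (right boundary).
    delta:   boundary threshold on the average Euclidean distance.
    Returns an array ordered oldest first and ending with x_t.
    """
    history = np.asarray(history, dtype=float)
    window = [np.asarray(x_t, dtype=float)]
    for cand in history[::-1]:            # walk backwards in time
        if max_len is not None and len(window) >= max_len:
            break
        mean_ed = np.mean([np.linalg.norm(cand - w) for w in window])
        if mean_ed > delta:               # too dissimilar: stop expanding
            break
        window.insert(0, cand)            # candidate becomes the left boundary
    return np.vstack(window)

# Hypothetical normalized traffic samples, oldest first
history = np.array([[0.90, 0.90],
                    [0.50, 0.50],
                    [0.12, 0.10],
                    [0.10, 0.10]])
x_t = np.array([0.11, 0.10])
target = adaptive_window(history, x_t, delta=0.1)
print(target.shape)
```

Expansion stops at the first dissimilar sample, so the window stays contiguous in time, which matches the idea that adjacent traffic is strongly correlated.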
Step 3.2: the target domain dataset $X^{tar}$ is used to perform domain self-adaptive updating on the DAE trained in step 2. Domain adaptation can be described simply as transferring knowledge between a source domain and a target domain that are similar, in order to discover and reduce the differences between the two domains. Thus, under the dynamically changing network environment of the tunnel terminal equipment, the constructed DADAE can adaptively match the network traffic sample to be detected, improving the accuracy and robustness of anomaly detection. The specific steps are as follows:
step 3.2.1: as shown in fig. 3, first, the source domain dataset $X^{s}$ is input into the trained DAE, and the hidden layer vectors $H^{s}$ of the source domain data are obtained through forward propagation of formula (1);
Step 3.2.2: the target domain data $X^{tar}$ is likewise input into the DAE, and the hidden layer vectors $H^{tar}$ of the target domain data are acquired by forward propagation of the following formula:
$h_j^{tar}=f(W_1x_j^{tar}+b_1)$ (5)
Step 3.2.3: the maximum mean discrepancy (Maximum Mean Discrepancy, MMD) distance is introduced into the objective function of the DAE to calculate the data difference between the source domain and the target domain. The MMD between $H^{s}$ and $H^{tar}$ is calculated as shown in formula (6):
$MMD(H^{s},H^{tar})=\sup_{\lVert\phi\rVert_{\mathcal{H}}\le 1}\left(\frac{1}{n_s}\sum_{i=1}^{n_s}\phi(h_i^{s})-\frac{1}{n_{tar}}\sum_{j=1}^{n_{tar}}\phi(h_j^{tar})\right)$;
$MMD^{2}(H^{s},H^{tar})=\frac{1}{n_s^{2}}\sum_{i}\sum_{i'}k(h_i^{s},h_{i'}^{s})-\frac{2}{n_sn_{tar}}\sum_{i}\sum_{j}k(h_i^{s},h_j^{tar})+\frac{1}{n_{tar}^{2}}\sum_{j}\sum_{j'}k(h_j^{tar},h_{j'}^{tar})$ (6)
where $n_s$ and $n_{tar}$ are the numbers of samples in $H^{s}$ and $H^{tar}$, respectively; $\sup$ denotes the least upper bound (supremum), taken over functions $\phi$ in the unit ball of the reproducing kernel Hilbert space $\mathcal{H}$; $i$, $i'$, $j$ and $j'$ refer to any index in the dataset. MMD is a kernel learning method that measures the distance between two domains in a reproducing kernel Hilbert space (Reproducing Kernel Hilbert Space); the smaller the MMD distance, the higher the similarity between the two data domains. $k(\cdot,\cdot)$ is a Gaussian kernel function, calculated as follows:
$k(u,v)=\exp\left(-\frac{\lVert u-v\rVert^{2}}{2\sigma^{2}}\right)$ (7)
where $\sigma$ represents the bandwidth parameter, whose value is proportional to the width of the Gaussian kernel function and is taken as 1.
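The kernel form of the MMD distance can be sketched as below. This is a sketch under assumptions rather than the patent's exact formula: it uses the common biased empirical estimate of the squared MMD and the Gaussian kernel form `exp(-||u - v||^2 / (2 * sigma^2))`; the names `gaussian_kernel` and `mmd2` and the synthetic hidden vectors are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Pairwise Gaussian kernel matrix between rows of A and rows of B."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def mmd2(Hs, Ht, sigma=1.0):
    """Biased empirical estimate of the squared MMD between two sample sets."""
    k_ss = gaussian_kernel(Hs, Hs, sigma).mean()
    k_tt = gaussian_kernel(Ht, Ht, sigma).mean()
    k_st = gaussian_kernel(Hs, Ht, sigma).mean()
    return k_ss + k_tt - 2.0 * k_st

# Synthetic stand-ins for source and target hidden layer vectors
rng = np.random.default_rng(0)
H_src = rng.normal(0.0, 1.0, (100, 4))
H_near = rng.normal(0.0, 1.0, (80, 4))   # same distribution as the source
H_far = rng.normal(2.0, 1.0, (80, 4))    # shifted distribution
print(mmd2(H_src, H_near), mmd2(H_src, H_far))
```

A smaller value indicates hidden representations that are closer in distribution, which is why the quantity can serve as a penalty during the domain-adaptive update.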
Step 3.2.4: the target domain dataset $X^{tar}$ is input directly into the DAE obtained through offline training in step 2, the MMD distance is introduced into the objective function of the DAE, and the difference between the implicit vectors generated by the source domain data and the target domain data is calculated according to formulas (6) and (7) to construct the DADAE model.
Because the network parameters have already been trained in the previous steps, only a small number of iterations (i.e. fine-tuning of the network weights) are required in this step to achieve the domain-adaptive update of the model. The DADAE model objective function constructed by the invention (to be minimized) is as follows:
$J_{DADAE}(\theta')=J_{DAE}+\lambda J_{MMD}$ (8)
where the objective function consists of two losses, namely the objective function loss $J_{DAE}$ of the DAE and the MMD distance loss $J_{MMD}$ between the implicit vectors of the source domain data and the implicit vectors of the target domain data; the network parameter set $\theta'=\{W_1',b_1',W_2',b_2'\}$ represents the network weight of the encoder, the bias vector of the encoder, the network weight of the decoder and the bias vector of the decoder after the domain adaptive update, respectively; $\lambda$ is the balance parameter between the DAE loss and the inter-domain MMD distance loss, its value is generally 0.5 and may be fine-tuned up or down in implementation; the invention does not limit the value of $\lambda$. The training method is still the gradient descent algorithm.
Step 3.2.5: and saving the trained network parameters of the new anomaly detection source domain model, and deploying the new anomaly detection source domain model at the tunnel edge computing node.
It can be seen that the objective function constructed by the invention not only fully utilizes the source domain model information, but also solves the problem that the static model can not adapt to the dynamically changed network environment of the tunnel electromechanical device to a certain extent by minimizing the objective function so that the updated network weight and bias tend to the characteristics of the objective domain data.
In step 4, a dynamic threshold for anomaly detection is calculated based on the target domain dataset of the network traffic to be detected, wherein the upper limit of the dynamic anomaly threshold is denoted $T_{up}$ and the lower limit is denoted $T_{low}$.
Because the network traffic dynamically changes with time in the tunnel electromechanical system network environment, the state of the normal traffic is also updated continuously along with relevant factors such as the network environment. Therefore, for the abnormal judgment of the network flow of the tunnel monitoring system acquired by the edge computing node, the normal reference value of the current network state should be based. The target domain data set determined based on the adaptive sliding window according to said step 3.1 has a strong temporal correlation, the network state of which is less affected by temporal variations. Therefore, the specific steps for determining the dynamic threshold range based on the target domain data of the network traffic to be detected are as follows:
step 4.1: first, the target domain dataset $X^{tar}$ is run again through the updated DADAE model to obtain the output dataset after the encoder and decoder, denoted $\hat{X}^{tar}$. Then, the reconstruction error of each piece of target domain data is calculated using the following formula:
$\varepsilon_j=\lVert x_j^{tar}-\hat{x}_j^{tar}\rVert^{2},\quad j=1,\dots,N$ (9)
where the reconstruction error set $E$ comprises $N$ elements and can be expressed as $E=\{\varepsilon_1,\varepsilon_2,\dots,\varepsilon_N\}$; $X^{tar}$ and $\hat{X}^{tar}$ respectively denote the target domain dataset containing $N$ network traffic samples and its reconstructed output dataset.
Step 4.2: the mean and standard deviation of $E$ are calculated as follows:
$\mu=\frac{1}{N}\sum_{j=1}^{N}\varepsilon_j$ (10)
$\sigma_E=\sqrt{\frac{1}{N}\sum_{j=1}^{N}(\varepsilon_j-\mu)^{2}}$ (11)
where $\mu$ represents the mean of $E$ and $\sigma_E$ is the standard deviation of $E$. The dynamic threshold range set by the present invention is:
$T_{up}=\mu+\alpha\sigma_E$ (12)
$T_{low}=\mu-\alpha\sigma_E$ (13)
where $\alpha$ is the standard deviation coefficient; the invention does not limit the value of $\alpha$; for example, $\alpha$ may be 2.
In step 5, the sample to be measured is input into the DADAE model for inference, and the reconstruction error is calculated as follows:
$\varepsilon_t=\lVert x_t-\hat{x}_t\rVert^{2}$;
where $\hat{x}_t$ is the reconstructed output of the sample to be measured $x_t$ after encoding and decoding by the DADAE.
Step 6: according to the reconstruction error $\varepsilon_t$ of the sample to be measured and the dynamic error threshold range, it is judged whether the real-time network traffic of the tunnel monitoring system to be detected is abnormal data. The judgment criteria are as follows:
when $T_{low}\le\varepsilon_t\le T_{up}$, the sample is marked as normal; when $\varepsilon_t>T_{up}$ or $T_{low}>\varepsilon_t$, the sample is marked as abnormal.
Step 7: after anomaly detection of the current sample to be measured is complete and the edge computing node acquires new network traffic to be detected at the next moment, the above steps are repeated: data preprocessing, re-determination of the sliding window dataset, domain self-adaptive updating of the DAE model with the target domain data, calculation of the dynamic threshold range, and detection of whether the network traffic to be detected is abnormal.
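Steps 4 to 6 can be sketched as below: the threshold range is the mean of the target domain reconstruction errors plus or minus a multiple of their standard deviation, and a sample is flagged when its error falls outside that range. The function names `dynamic_threshold` and `is_anomalous`, the use of the population standard deviation, and the error values are assumptions made for illustration.

```python
import numpy as np

def dynamic_threshold(errors, alpha=2.0):
    """Return (lower, upper) limits from target-domain reconstruction errors."""
    errors = np.asarray(errors, dtype=float)
    mu = errors.mean()
    sd = errors.std()  # assumption: population standard deviation (ddof=0)
    return mu - alpha * sd, mu + alpha * sd

def is_anomalous(err, lower, upper):
    """A sample is abnormal when its error lies outside [lower, upper]."""
    return bool(err < lower or err > upper)

# Hypothetical reconstruction errors of the target-domain samples
target_errors = [0.010, 0.012, 0.011, 0.009, 0.013]
lower, upper = dynamic_threshold(target_errors, alpha=2.0)

print(is_anomalous(0.012, lower, upper))  # inside the range
print(is_anomalous(0.050, lower, upper))  # above the upper limit
```

Because the limits are recomputed from each new sliding window, the range follows the current network state instead of a fixed historical baseline.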
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, and that the above embodiments and descriptions are merely illustrative of the principles of the present invention, and various changes and modifications may be made without departing from the spirit and scope of the invention, which is defined in the appended claims. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (8)

1. The tunnel network anomaly detection method based on the domain adaptive depth self-encoder is characterized by comprising the following steps of:
Step 1: Historical network traffic data of the devices in the device layer of the tunnel electromechanical system are collected by edge computing nodes deployed in the tunnel and parsed to obtain the corresponding raw network data flows, which are then preprocessed to obtain the corresponding network traffic features, i.e. the preprocessed network traffic samples;
Step 2: After the collected historical network traffic data have been processed as in step 1 to obtain the network traffic features corresponding to each item of network traffic data, these features are used as the source domain data set; an anomaly detection source domain model based on the depth self-encoder algorithm is trained with the source domain data set as the training set and, once training is complete, is deployed at the tunnel edge computing node;
Step 3: After the network traffic data acquired in real time have been preprocessed as in step 1 to obtain the corresponding network traffic features, an adaptive sliding window algorithm is constructed to obtain the target domain data set corresponding to the network traffic; the anomaly detection source domain model obtained in step 2 is updated according to this target domain data set;
Step 4: A dynamic threshold for anomaly detection is calculated;
Step 5: The network traffic features obtained by preprocessing the real-time network traffic data under test are input into the updated anomaly detection source domain model, and the corresponding reconstruction error is calculated;
Step 6: Whether the real-time network traffic data under test are abnormal is determined from the dynamic threshold obtained in step 4 and the reconstruction error obtained in step 5;
the specific method of step 3 is as follows:
Step 3.1: Suppose that at time $q$ the network traffic data collected in real time by the edge computing node are preprocessed as in step 1 to obtain the corresponding network traffic features, i.e. the preprocessed network traffic sample $x_q$, which is defined as the sample under test; an adaptive sliding window algorithm is used to construct the target domain data set $X_t$; the specific method is as follows:
Step 3.1.1: The preprocessed network traffic sample $x_{q-1}$ at time $q-1$ is taken as the right boundary of the sliding window, and the window is expanded backward over the preceding preprocessed network traffic samples so that samples close in time are included in the sliding window; the adaptive sliding window data set $D_W$ is expressed as:
$$D_W = \{x_{q-N}, x_{q-N+1}, \ldots, x_{q-1}\};$$
where $D_W$ denotes a sliding window of length $N$ containing the preprocessed network traffic samples from time $q-N$ to time $q-1$; $x_{q-N}$ is the preprocessed network traffic sample at the left boundary of the window, i.e. the adaptive sliding window extends back from time $q-1$ over $N$ preprocessed network traffic samples;
Step 3.1.2: When deciding whether to extend the window to an earlier sample, suppose the adaptive sliding window has already been extended to time $q-n$ and the sample being considered for inclusion is the one at time $q-n-1$; first, the similarity function $S_w$, i.e. the average Euclidean distance (ED) between this sample and all samples inside the current window, is calculated;
the ED is calculated as follows:
$$ED(x_{q-n-1}, x_w) = \lVert x_{q-n-1} - x_w \rVert_2$$
where $x_w$ denotes any preprocessed network traffic sample in the current sliding window, from time $q-n$ to time $q-1$; $n$ is the number of preprocessed network traffic samples in the current window; and $x_{q-n-1}$ is the preceding preprocessed network traffic sample being considered for inclusion in the window;
Step 3.1.3: A boundary threshold $\delta_w$ is set for the adaptive sliding window based on the similarity function of step 3.1.2; if $S_w \ge \delta_w$, the sliding window incorporates the preprocessed network traffic sample, i.e. the left-boundary sample of the window becomes $x_{q-n-1}$; otherwise expansion stops and the left-boundary sample is $x_{q-n}$;
The sliding window data set $D_W$ is taken as the target domain data set of $x_q$, expressed as:
$$X_t = D_W = \{x_{q-N}, x_{q-N+1}, \ldots, x_{q-1}\};$$
since $D_W$ contains $N$ preprocessed network traffic samples in total, the target domain data set $X_t$ is expressed as:
$$X_t = \{x_{t1}, x_{t2}, \ldots, x_{tN}\};$$
where $x_{ti}$ denotes the $i$-th preprocessed network traffic sample in the data set $X_t$ containing $N$ preprocessed network traffic samples;
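The window construction of step 3.1 can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation. The similarity formula is not preserved in this text, so $S_w$ is taken here as the plain mean Euclidean distance between the candidate and the samples already in the window. The inclusion test is a parameter named `include` (hypothetical); its default follows the wording of step 3.1.3, $S_w \ge \delta_w$. If $S_w$ is a raw distance, a caller may prefer the opposite comparison so that only nearby samples are kept. The function names are also hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance ED between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_distance(candidate, window):
    """Assumed S_w: mean ED between the candidate and every sample in the window."""
    return sum(euclidean(candidate, x_w) for x_w in window) / len(window)


def build_window(history, delta_w, include=lambda s_w, d: s_w >= d):
    """Grow the window backward from x_{q-1}; history is ordered oldest to newest."""
    if not history:
        return []
    window = [history[-1]]                    # right boundary x_{q-1}
    for candidate in reversed(history[:-1]):  # x_{q-2}, x_{q-3}, ...
        if not include(mean_distance(candidate, window), delta_w):
            break                             # stop expanding at the first rejection
        window.insert(0, candidate)           # candidate becomes the new left boundary
    return window


history = [(0.0,), (5.0,), (6.0,)]
print(build_window(history, delta_w=2.0, include=lambda s_w, d: s_w <= d))
```

The returned list plays the role of $D_W$, and therefore of the target domain data set $X_t$ for the sample under test.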
Step 3.2: The target domain data set $X_t$ is used to perform a domain-adaptive update of the anomaly detection source domain model trained in step 2; the specific steps are as follows:
Step 3.2.1: First, the source domain data set $X_s$ is input into the trained anomaly detection source domain model, and the hidden layer vector $H_s$ of the source domain data is obtained by forward propagation according to the following formula:
$$H_s = f(W X_s + b)$$
where $W$ and $b$ denote the network weights and bias vector of the encoder, respectively, and $f$ is the activation function;
Step 3.2.2: The target domain data $X_t$ are likewise input into the anomaly detection source domain model, and the hidden layer vector $H_t$ of the target domain data is obtained by forward propagation according to the following formula:
$$H_t = f(W X_t + b)$$
Step 3.2.3: The maximum mean discrepancy (MMD) distance is taken as the objective function, computed as follows:
where $N$ and $M$ are the numbers of samples in $H_t$ and $H_s$, respectively; $\sup\{\cdot\}$ denotes the supremum (least upper bound); $i$ and $j$ are arbitrary indices into a data set, i.e. $h_{ti}$ and $h_{tj}$ denote the $i$-th and $j$-th samples of $H_t$, and $h_{si}$ and $h_{sj}$ denote the $i$-th and $j$-th samples of $H_s$; $G(\cdot)$ is a Gaussian kernel function, and $G(h_{si}, h_{sj})$ is computed as follows:
where $\sigma$ denotes the bandwidth parameter;
Step 3.2.4: The discrepancy between the hidden vectors generated from the source domain data and the target domain data is computed using the formula of step 3.2.3 to construct the DADAE model; the goal is to minimize the DADAE model objective function, which is as follows:
where $J_{DAE}$ is the loss function given by the formula above; $\lambda\,\mathrm{MMD}(H_s, H_t)$ is the MMD distance loss term; the network parameter set comprises, respectively, the network weights of the encoder, the bias vector of the encoder, the network weights of the decoder and the bias vector of the decoder after the domain-adaptive update; $\lambda$ is a balance parameter; $x_{si}$ and $\hat{x}_{si}$ denote the $i$-th input and reconstructed output variables of the DAE network, respectively; $M$ denotes the total number of preprocessed network traffic samples;
Step 3.2.5: The network parameters of the newly trained anomaly detection source domain model are saved, and the new anomaly detection source domain model is deployed at the tunnel edge computing node.
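The domain-adaptation term of step 3.2 can be sketched in Python as follows. The MMD and kernel formulas are not preserved in this text, so the sketch substitutes two common choices and labels them as assumptions: the Gaussian kernel $\exp(-\lVert u - v\rVert^2 / (2\sigma^2))$ and the biased empirical estimate of the squared MMD. All function names are hypothetical, and `j_dae` stands for a reconstruction loss computed elsewhere.

```python
import math


def gaussian_kernel(u, v, sigma):
    """Assumed kernel form: exp(-||u - v||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(h_s, h_t, sigma=1.0):
    """Biased empirical estimate of squared MMD between hidden vectors H_s and H_t."""
    m, n = len(h_s), len(h_t)
    k_ss = sum(gaussian_kernel(a, b, sigma) for a in h_s for b in h_s) / (m * m)
    k_tt = sum(gaussian_kernel(a, b, sigma) for a in h_t for b in h_t) / (n * n)
    k_st = sum(gaussian_kernel(a, b, sigma) for a in h_s for b in h_t) / (m * n)
    return k_ss + k_tt - 2.0 * k_st


def dadae_objective(j_dae, h_s, h_t, lam, sigma=1.0):
    """Reconstruction loss plus lambda times the MMD term."""
    return j_dae + lam * mmd_squared(h_s, h_t, sigma)


h_source = [[0.1, 0.9], [0.2, 0.8]]
h_target = [[0.6, 0.4], [0.7, 0.3]]
print(dadae_objective(0.05, h_source, h_target, lam=0.5))
```

The balance parameter trades reconstruction quality against alignment of the two domains in the hidden space. A value of zero reduces the objective to the plain DAE loss.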
2. The method for detecting tunnel network anomalies based on the domain adaptive depth self-encoder according to claim 1, wherein in step 1, the system architecture for detecting tunnel network anomalies by using edge computing nodes comprises a device layer, an edge computing layer, a network layer and a cloud platform layer; the equipment layer, the edge computing layer, the network layer and the cloud platform layer are sequentially connected; each equipment system in the equipment layer comprises a broadcast telephone system, a tunnel monitoring system, a tunnel ventilation lighting system, a tunnel area controller, a tunnel fire protection system, an information release system and a tunnel traffic signal system; the edge computing layer is an edge computing node deployed in the tunnel.
3. The method for detecting the tunnel network anomaly based on the domain adaptive depth self-encoder according to claim 1, wherein in step 1 the data preprocessing comprises removing abnormal data, removing meaningless features and normalizing the data;
the network traffic features include data flow duration, number of forward packets, number of reverse packets, total bytes of forward packets, total bytes of reverse packets, total bytes of forward subflows and total bytes of reverse subflows.
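The normalization part of the preprocessing can be illustrated with a short Python sketch. The claim does not name a normalization method, so per-feature min-max scaling to [0, 1] is an assumption here. The feature names in `FEATURES` are hypothetical labels for the seven listed features, and the sample values are made up for illustration.

```python
# Hypothetical column labels for the seven traffic features listed in the claim.
FEATURES = [
    "flow_duration", "fwd_packets", "bwd_packets",
    "fwd_bytes", "bwd_bytes", "fwd_subflow_bytes", "bwd_subflow_bytes",
]


def min_max_normalize(rows):
    """Scale every feature column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Illustrative values only, one row per flow, one column per entry of FEATURES.
flows = [
    [1.5, 10, 8, 1200, 900, 600, 450],
    [0.5, 4, 3, 400, 300, 200, 150],
    [1.0, 7, 5, 800, 600, 400, 300],
]
for name, value in zip(FEATURES, min_max_normalize(flows)[2]):
    print(name, round(value, 3))
```

Scaling keeps byte counts, which can be several orders of magnitude larger than packet counts, from dominating distance and reconstruction-error calculations.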
4. The method for detecting the tunnel network anomaly based on the domain adaptive depth self-encoder according to claim 1, wherein in the step 2, the anomaly detection source domain model adopts a depth self-encoder, and comprises an encoder and a decoder;
the anomaly detection source domain model has a three-layer neural network consisting of an input layer, a hidden layer and an output layer, where the input is $X_s$, $X_s = \{x_{s1}, x_{s2}, \ldots, x_{sM}\}$; here $X_s$ denotes the source domain data set and $x_{sM}$ denotes the $M$-th preprocessed network traffic sample;
the specific training method of the anomaly detection source domain model is as follows:
Step 2.1: The encoder maps the source domain data $X_s$ through the activation function $f$ to obtain the hidden layer data:
$$H_s = \{h_{s1}, h_{s2}, \ldots, h_{sM}\};$$
where $H_s$ denotes the hidden layer vectors and $h_{sM}$ denotes the hidden layer vector of the $M$-th preprocessed network traffic sample;
the encoding process is given by the following formula:
$$H_s = f(W X_s + b)$$
where $W$ and $b$ denote the network weights and bias vector of the encoder, respectively, and $f$ is the activation function, which in the present invention is the Sigmoid function;
Step 2.2: The decoder maps the hidden layer data $H_s$ to the output layer through the activation function $f$ to obtain the output variable:
$$\hat{X}_s = \{\hat{x}_{s1}, \hat{x}_{s2}, \ldots, \hat{x}_{sM}\};$$
where $\hat{X}_s$ denotes the reconstructed output variable and $\hat{x}_{sM}$ denotes the $M$-th reconstructed network traffic sample;
the input variable is reconstructed from the hidden layer vector $H_s$, and the decoding process is given by the following formula:
$$\hat{X}_s = f(\hat{W} H_s + \hat{b})$$
where $\hat{W}$ and $\hat{b}$ denote the network weights and bias vector of the decoder, respectively, and $f$ is the activation function;
Step 2.3: The anomaly detection source domain model is trained with a gradient descent algorithm, and the optimal network parameters are obtained by minimizing the reconstruction error; the objective function is given by the following formula:
where the network parameter set comprises, respectively, the network weights of the encoder, the bias vector of the encoder, the network weights of the decoder and the bias vector of the decoder; $x_{si}$ and $\hat{x}_{si}$ denote the $i$-th input and reconstructed output variables of the DAE network, respectively; $M$ denotes the total number of preprocessed network traffic samples;
Step 2.4: The network parameters of the trained anomaly detection source domain model are saved, and the model is deployed at the tunnel edge computing node.
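The encoding and decoding passes of the three-layer model can be sketched in plain Python. The Sigmoid activation is taken from the claim. The mean squared reconstruction error in `reconstruction_loss` is an assumption, because the objective formula is not preserved in this text. All function and parameter names are hypothetical, and the gradient descent training loop is omitted.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, x):
    """Compute weights @ x + bias for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def encode(x, w_enc, b_enc):
    """Hidden vector h = f(W x + b) with f = Sigmoid."""
    return [sigmoid(z) for z in affine(w_enc, b_enc, x)]


def decode(h, w_dec, b_dec):
    """Reconstruction obtained by applying the decoder weights and bias to h."""
    return [sigmoid(z) for z in affine(w_dec, b_dec, h)]


def reconstruction_loss(samples, w_enc, b_enc, w_dec, b_dec):
    """Assumed objective: mean over samples of the squared reconstruction error."""
    total = 0.0
    for x in samples:
        x_hat = decode(encode(x, w_enc, b_enc), w_dec, b_dec)
        total += sum((a - c) ** 2 for a, c in zip(x, x_hat))
    return total / len(samples)


# Two input features, one hidden unit, untrained (all-zero) parameters.
w_enc, b_enc = [[0.0, 0.0]], [0.0]
w_dec, b_dec = [[0.0], [0.0]], [0.0, 0.0]
print(reconstruction_loss([[1.0, 0.0]], w_enc, b_enc, w_dec, b_dec))
```

A training procedure would adjust the four parameter groups to reduce this loss on the source domain data set. At inference time the same forward pass yields the reconstruction whose error is compared against the dynamic threshold.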
5. The method for detecting tunnel network anomalies based on domain adaptive depth self-encoder as claimed in claim 1, wherein in step 4 a dynamic threshold for anomaly detection is calculated, the upper limit of the dynamic anomaly threshold being denoted $U_e$ and the lower limit being denoted $L_e$; the specific method is as follows:
Step 4.1: First, the target domain data set $X_t$ is passed again through the updated anomaly detection source domain model to obtain the output data set after processing by the encoder and the decoder, denoted $\hat{X}_t$; then the reconstruction error of each target domain sample is calculated using the following formula:
where $E_t$ contains $N$ elements, one reconstruction error per target domain sample; $X_t$ and $\hat{X}_t$ denote the target domain data set containing $N$ network flows and the reconstructed output data set, respectively;
Step 4.2: The mean and standard deviation of $E_t$ are calculated as follows:
where $\bar{E}_t$ denotes the mean of $E_t$ and $S_t$ is the standard deviation of $E_t$; the dynamic threshold range is:
$$U_e = \bar{E}_t + \beta S_t, \qquad L_e = \bar{E}_t - \beta S_t$$
where $\beta$ is the standard deviation coefficient.
6. The method for detecting tunnel network anomalies based on the domain adaptive depth self-encoder of claim 5, wherein β is 2.
7. The method for detecting tunnel network anomalies based on the domain adaptive depth self-encoder as claimed in claim 5, wherein in step 5, the calculation method of the reconstruction error is as follows:
where $\hat{x}_q$ is the reconstructed output obtained after the preprocessed network traffic sample under test is encoded and then decoded by the updated anomaly detection source domain model.
8. The method for detecting tunnel network anomalies based on the domain adaptive depth self-encoder as claimed in claim 6, wherein the detection method in step 6 is as follows:
when $L_e \le e_q \le U_e$, the sample is marked as normal; when $e_q > U_e$ or $e_q < L_e$, it is marked as abnormal.
CN202311023612.8A 2023-08-15 2023-08-15 Tunnel network anomaly detection method based on domain self-adaptive depth self-encoder Active CN116743646B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311023612.8A CN116743646B (en) 2023-08-15 2023-08-15 Tunnel network anomaly detection method based on domain self-adaptive depth self-encoder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311023612.8A CN116743646B (en) 2023-08-15 2023-08-15 Tunnel network anomaly detection method based on domain self-adaptive depth self-encoder

Publications (2)

Publication Number Publication Date
CN116743646A CN116743646A (en) 2023-09-12
CN116743646B true CN116743646B (en) 2023-12-19

Family

ID=87904783

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311023612.8A Active CN116743646B (en) 2023-08-15 2023-08-15 Tunnel network anomaly detection method based on domain self-adaptive depth self-encoder

Country Status (1)

Country Link
CN (1) CN116743646B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109858509A (en) * 2018-11-05 2019-06-07 杭州电子科技大学 Based on multilayer stochastic neural net single classifier method for detecting abnormality
CN109948117A (en) * 2019-03-13 2019-06-28 南京航空航天大学 A kind of satellite method for detecting abnormality fighting network self-encoding encoder
CN110992354A (en) * 2019-12-13 2020-04-10 华中科技大学 Abnormal region detection method for countering self-encoder based on introduction of automatic memory mechanism
CN111585997A (en) * 2020-04-27 2020-08-25 国家计算机网络与信息安全管理中心 Network flow abnormity detection method based on small amount of labeled data
CN112994940A (en) * 2019-05-29 2021-06-18 华为技术有限公司 Network anomaly detection method and device
CN114372530A (en) * 2022-01-11 2022-04-19 北京邮电大学 Abnormal flow detection method and system based on deep self-coding convolutional network
CN114742165A (en) * 2022-04-15 2022-07-12 哈尔滨工业大学 Aero-engine gas circuit performance abnormity detection system based on depth self-encoder
CN114783524A (en) * 2022-06-17 2022-07-22 之江实验室 Path abnormity detection system based on self-adaptive resampling depth encoder network
CN115169430A (en) * 2022-04-27 2022-10-11 北京理工大学 Cloud network end resource multidimensional time sequence anomaly detection method based on multi-scale decoding
CN115242556A (en) * 2022-09-22 2022-10-25 中国人民解放军战略支援部队航天工程大学 Network anomaly detection method based on incremental self-encoder
KR102510060B1 (en) * 2022-07-28 2023-03-14 주식회사 어니언소프트웨어 An obtaining method abnormality data through deep learning pump simulation and an abnormality detection model establishment method based on auto-encoder and a system thereof
CN116055413A (en) * 2023-03-07 2023-05-02 云南省交通规划设计研究院有限公司 Tunnel network anomaly identification method based on cloud edge cooperation
CN116385935A (en) * 2023-04-08 2023-07-04 苏州海裕鸿智能科技有限公司 Abnormal event detection algorithm based on unsupervised domain self-adaption

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3885989A1 (en) * 2020-03-26 2021-09-29 Another Brain Anomaly detection based on an autoencoder and clustering
US20230082899A1 (en) * 2021-09-14 2023-03-16 Eduardo CORRAL-SOTO Devices, systems, methods, and media for domain adaptation using hybrid learning

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Anomaly Detection Using LSTM-Based Variational Autoencoder in Unsupervised Data in Power Grid;Dibyajyoti Guha等;《IEEE Systems Journal》;第17卷(第03期);4313-4323 *
基于生成对抗网络与自编码器的网络流量异常检测模型;郭森森等;《信息网络安全》(第12期);7-15 *
基于自编码器的异常检测算法研究;蔚焘;《中国优秀硕士学位论文全文数据库》;I140-985 *
结合二次特征提取和LSTM-Autoencoder的网络流量异常检测方法;孙旭日等;《北京交通大学学报》(第02期);21-30 *

Also Published As

Publication number Publication date
CN116743646A (en) 2023-09-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 650000 Yunnan province Kunming City Road No. 9 Xiang Shi Tuo

Applicant after: Yunnan Provincial Transportation Planning and Design Research Institute Co.,Ltd.

Address before: 650041 No. 9 Shijiaxiang, Tuodong Road, Kunming City, Yunnan Province

Applicant before: BROADVISION ENGINEERING CONSULTANTS

GR01 Patent grant