CN115589608B - Internet of things data anomaly detection model training method, anomaly detection method and system

Info

Publication number: CN115589608B
Application number: CN202211545236.4A
Authority: CN
Inventors: 乔焰; 袁新宇; 张本初; 胡荣耀; 赵培
Original assignee: Hefei University of Technology
Current assignee: Hefei University of Technology
Priority date: 2022-12-05
Filing date: 2022-12-05
Publication date: 2023-03-07
Anticipated expiration: 2042-12-05
Also published as: CN115589608A

Abstract

The invention relates to the technical field of data detection, in particular to a training method, an anomaly detection method and a system for an Internet of things data anomaly detection model. It addresses the problems in the prior art that, when abnormal Internet of things data is detected, the detection effect on high-dimensional data is not ideal, the data processing burden is increased, and real-time detection cannot be achieved. In the training method for the Internet of things data anomaly detection model, a second identification network is added to ensure cycle consistency, and the abnormal data is updated iteratively by an approximate gradient method so as to gradually eliminate abnormal components. In the anomaly detection stage, the reconstruction error and the identification error are considered together, and an evaluation function is designed to evaluate the degree of abnormality of the data, so that the anomaly detection model trained by the method achieves a better F1 score, and its precision and recall are markedly better than those of the prior art.

Description

Internet of things data anomaly detection model training method, anomaly detection method and system

Technical Field

The invention relates to the technical field of data detection, in particular to a training method, an anomaly detection method and a system for an Internet of things data anomaly detection model.

Background

Anomaly detection has been an extremely active research field in recent years. Anomaly detection schemes have appeared in many fields and can be roughly divided into two categories: statistical methods and deep learning methods. Conventional statistics-based anomaly detection methods include distance-based methods, SVM-based methods, and PCA-based methods. These mainstream methods hit a bottleneck when processing high-dimensional data, where their computation time and accuracy are far worse than in low-dimensional anomaly detection scenarios.

With the rapid development of sensor technology, the Internet of things is increasingly applied in industry, agriculture, medical treatment, health care and other sectors. Sensor nodes are scattered over a target area and collect environmental parameters (temperature, humidity, CO2 concentration and the like), so that changes in the environment of the area can be monitored in real time. To find emergencies in the natural environment promptly and accurately, to monitor the health of the sensor network, and to improve the reliability of Internet of things data, it is very important to perform anomaly detection on the data acquired by the sensors.

In Internet of things applications, where data is increasingly complex and high-dimensional, schemes based on deep learning show great advantages. In recent years, more and more deep learning methods have been proposed for anomaly detection on complex high-dimensional data. Their core idea is to first learn the features of normal data automatically and then find abnormal data by comparing features. Among current deep learning models, the generative adversarial network (GAN) is the most competitive and has achieved a number of leading results in the field of anomaly detection.

However, because abnormal data has a non-negligible effect on learning, almost all of these methods require an uncontaminated data set for training in order to learn the true distribution or latent features of the data accurately. Collecting a completely uncontaminated data set from an Internet of things platform is much more difficult than collecting a batch of anomaly-free images: on the one hand, in extreme environments Internet of things sensors inevitably upload abnormal data; on the other hand, current technology cannot take all known and unknown anomalies into account.

In the last decade, a number of scholars have worked on handling contaminated training data. Zhou et al. proposed the Robust Deep Autoencoder (RDA), which filters abnormal instances before learning data features, and Najari et al. later improved the performance of the RDA through a projection strategy. However, in real Internet of things applications, anomaly detection should proceed simultaneously with model training, that is, the network should detect and learn anomalies in real time. Such methods are therefore not suited to the actual situation of the Internet of things. More recently, Du et al. proposed the FGANomaly method for detecting abnormal data in multivariate time series. The method first creates a pseudo label for each data instance and then filters out the data with positive pseudo labels during training. However, because FGANomaly is designed for sequence-to-sequence scenarios, it has not yet been adapted to most high-dimensional anomaly detection cases involving Internet of things data.

Disclosure of Invention

The invention provides a training method for an Internet of things data anomaly detection model, and the anomaly detection model obtained by the invention solves the problems that the detection effect on high-dimensional data is not ideal, the data processing load is increased and the real-time detection cannot be realized in the Internet of things anomaly data detection in the prior art.

The invention adopts the following technical scheme:

a training method for an Internet of things data anomaly detection model comprises the following steps:

St1, acquiring a plurality of sensor detection data samples collected over a continuous time period as historical data X, wherein X comprises normal data L and abnormal data S, namely X = L ∪ S;

constructing and initializing a basic model, wherein the basic model is a neural network model and comprises a coding network, a generating network, a first identification network and a second identification network; the coding network is used for coding the received data and outputting coded data; the generation network is used for reconstructing the received data and outputting reconstructed data; the first authentication network and the second authentication network are used for labeling the received data respectively and outputting corresponding labeled values;

St2, taking a value from the normal data L and assigning it to the sample data x, wherein the initial value of the normal data L is equal to the historical data X;

St3, labeling the data tuples (x, E(x)) and (G(z), z) by means of the first authentication network, wherein the labeled value of (x, E(x)) is denoted as D_xz(x, E(x)) and the labeled value of (G(z), z) is denoted as D_xz(G(z), z); E(x) represents the output of the coding network when the input is x; G(z) represents the output of the generation network when the input is z, and z represents a value taken from the data space Z conforming to a normal distribution;

labeling the data tuples (x, x) and (x, G(E(x))) by means of the second authentication network, wherein the labeled value of (x, x) is denoted as D_xx(x, x) and the labeled value of (x, G(E(x))) is denoted as D_xx(x, G(E(x))); G(E(x)) represents the output of the generation network when the input is E(x);

St4, substituting D_xz(x, E(x)), D_xz(G(z), z), D_xx(x, x) and D_xx(x, G(E(x))) into the set first objective function to calculate a first objective function value, and updating the parameters of the coding network, the generation network, the first authentication network and the second authentication network in the basic model according to the first objective function value;

St5, judging whether the number of updates of the basic model has reached a first set value n1; if not, returning to step St2 and accumulating the number of updates of the basic model again; if yes, executing the following step St6;

St6, judging whether the number of updates of the normal data L has reached a second set value n2; if not, executing the following step St7; if yes, fixing the parameters of the basic model, extracting the coding network, the generation network and the first authentication network from the basic model, and combining the three extracted networks with a set evaluation function to form the anomaly detection model; the evaluation function combines D_xz(x, E(x)) and D_xz(G(E(x)), E(x)) to calculate an evaluation value A(x) for evaluating whether the data x is normal; G(E(x)) represents the output of the generation network when the input is E(x);

St7, updating the abnormal data S by combining the most recently updated basic model with the set abnormal data calculation model, and updating the normal data L with the updated abnormal data S, wherein L = X - S; and then returning to step St2.

Preferably, the first objective function is:

$$\min_{E,G}\ \max_{D_{xz},D_{xx}}\ V_{ALICE}(D_{xz},D_{xx},E,G)$$

which expresses that the parameters of the first authentication network and the second authentication network are updated with the goal of maximizing V_ALICE(D_xz, D_xx, E, G), and that, with the parameters of the first authentication network and the second authentication network fixed, the coding network and the generation network are updated with the goal of minimizing V_ALICE(D_xz, D_xx, E, G); V_ALICE(D_xz, D_xx, E, G) is a transition term;

$$V_{ALICE}(D_{xz},D_{xx},E,G)=V(D_{xz},E,G)+V_{CE}(D_{xx},E,G)$$

$$V(D_{xz},E,G)=E1+E2$$

$$E1=\mathbb{E}_{x\sim p_x}\left[\mathbb{E}_{E(x)\sim p_E(\cdot|x)}\left[\log_k D_{xz}(x,E(x))\right]\right]$$

$$E2=\mathbb{E}_{z\sim p_z}\left[\mathbb{E}_{G(z)\sim p_G(\cdot|z)}\left[\log_k\left(1-D_{xz}(G(z),z)\right)\right]\right]$$

$$V_{CE}(D_{xx},E,G)=\mathbb{E}_{x\sim p_x}\left[\log_k D_{xx}(x,x)\right]+\mathbb{E}_{x\sim p_x}\left[1-\log_k D_{xx}(x,G(E(x)))\right]$$

wherein V(D_xz, E, G), V_CE(D_xx, E, G), E1 and E2 all represent transition terms; k denotes the base of the logarithm, and k > 1; let the output distribution of the coding network when the input is x be p_E(·|x); $\mathbb{E}_{E(x)\sim p_E(\cdot|x)}[\log_k D_{xz}(x,E(x))]$ represents the expectation of log_k D_xz(x, E(x)) when E(x) follows the distribution p_E(·|x); E1 represents the expectation of $\mathbb{E}_{E(x)\sim p_E(\cdot|x)}[\log_k D_{xz}(x,E(x))]$ when x follows the distribution p_x, and p_x represents the true distribution of the sample data x in the normal data L;

let the output distribution of the generation network when the input is z be p_G(·|z); $\mathbb{E}_{G(z)\sim p_G(\cdot|z)}[\log_k(1-D_{xz}(G(z),z))]$ represents the expectation of log_k(1 - D_xz(G(z), z)) when G(z) follows the distribution p_G(·|z); E2 represents the expectation of $\mathbb{E}_{G(z)\sim p_G(\cdot|z)}[\log_k(1-D_{xz}(G(z),z))]$ when z follows a normal distribution;

$\mathbb{E}_{x\sim p_x}[\log_k D_{xx}(x,x)]$ represents the expectation of log_k D_xx(x, x) when x follows the distribution p_x; $\mathbb{E}_{x\sim p_x}[1-\log_k D_{xx}(x,G(E(x)))]$ represents the expectation of 1 - log_k D_xx(x, G(E(x))) when x follows the distribution p_x.

Preferably, in St4, according to the first objective function value, gradient updates are first applied to the first authentication network and the second authentication network, so that the first objective function value is maximized while the coding network and the generation network are fixed; gradient updates are then applied to the coding network and the generation network, so that the first objective function value is minimized while the first authentication network and the second authentication network are fixed.
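As an illustration only, the first objective function value can be estimated on a mini-batch from the four groups of labeled values. The following Python sketch is not part of the claimed method: the function name `v_alice` and its argument names are hypothetical, the labeled values are assumed to lie strictly between 0 and 1, and the second term of V_CE is written as 1 - log_k D_xx(x, G(E(x))), following the definition given above.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def v_alice(d_xz_real, d_xz_fake, d_xx_same, d_xx_rec, k=2.0):
    """Mini-batch estimate of V_ALICE from labeled values in (0, 1).

    d_xz_real: D_xz(x, E(x));  d_xz_fake: D_xz(G(z), z)
    d_xx_same: D_xx(x, x);     d_xx_rec:  D_xx(x, G(E(x)))
    """
    e1 = _mean([math.log(d, k) for d in d_xz_real])
    e2 = _mean([math.log(1.0 - d, k) for d in d_xz_fake])
    v_ce = (_mean([math.log(d, k) for d in d_xx_same])
            + _mean([1.0 - math.log(d, k) for d in d_xx_rec]))
    return e1 + e2 + v_ce


# The authentication networks are updated to increase this value,
# the coding and generation networks to decrease it.
value = v_alice([0.5], [0.5], [0.5], [0.5])
```

In an actual implementation the two update directions would be carried out by gradient steps on the respective network parameters; the sketch only shows how the scalar value is assembled.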

Preferably, the abnormal data calculation model in St7 is:

$$\zeta^{(g-1)}=X-G\left(E\left(L^{(g-1)}\right)\right)$$

$$\zeta^{(g-1)}=\left\{\zeta_{ij}^{(g-1)}\right\}_{1\le i\le I,\ 1\le j\le J}$$

$$s_{ij}^{(g)}=P_a\left(\zeta_{ij}^{(g-1)}\right)$$

$$S^{(g)}=\left\{s_{ij}^{(g)}\right\}_{1\le i\le I,\ 1\le j\le J}$$

wherein L^(g-1) represents the normal data L after g-1 updates, L^(g-1) = X - S^(g-1); E(L^(g-1)) represents the output of the coding network when its input is the normal data L^(g-1); G(E(L^(g-1))) represents the output of the generation network when its input is E(L^(g-1)); ζ represents a transition parameter, and ζ^(g-1) represents the transition parameter ζ corresponding to L^(g-1); the transition parameter ζ has the same data dimensions as the abnormal data S; I represents the number of samples in the abnormal data S, and J represents the data dimension of each sample in the abnormal data; ζ_ij^(g-1) represents the data in the j-th dimension of the i-th sample of ζ^(g-1);

s_ij^(g) represents the data in the j-th dimension of the i-th sample of the abnormal data S^(g), and Pa represents a set function; S^(g) represents the abnormal data after g updates;

in St7, S^(g) is first calculated, then L^(g) = X - S^(g) is calculated; L = L^(g) is then set and the method returns to step St2.

Preferably, the function Pa is:

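$$P_a\left(\zeta_{ij}^{(g-1)}\right)=\begin{cases}\zeta_{ij}^{(g-1)}-a\times\dfrac{\zeta_{ij}^{(g-1)}}{\left\|\zeta_i^{(g-1)}\right\|_2}, & \left\|\zeta_i^{(g-1)}\right\|_2>a\\[2mm] 0, & \left\|\zeta_i^{(g-1)}\right\|_2\le a\end{cases}$$

that is, when ||ζ_i^(g-1)||₂ > a, s_ij^(g) = ζ_ij^(g-1) - a×ζ_ij^(g-1)/||ζ_i^(g-1)||₂; when ||ζ_i^(g-1)||₂ ≤ a, s_ij^(g) = 0;

a is a set value, ζ_i^(g-1) represents the i-th sample of the transition parameter ζ^(g-1), and ||ζ_i^(g-1)||₂ represents the two-norm of ζ_i^(g-1).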
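By way of a non-limiting example, the function Pa can be applied row by row to the transition parameter ζ. The Python sketch below uses a hypothetical function name `shrink_rows` and plain nested lists for the I×J data; it reads the second branch of Pa as setting the abnormal component s_ij^(g) to zero, which is an assumption about the intended meaning of that branch.

```python
import math


def shrink_rows(zeta, a):
    """Apply Pa to every sample (row) of zeta and return S.

    zeta: list of I rows, each a list of J floats.
    a:    the set value used as the norm threshold.
    """
    s = []
    for row in zeta:
        norm = math.sqrt(sum(v * v for v in row))
        if norm > a:
            # Shrink the row toward zero by a along its own direction.
            s.append([v - a * v / norm for v in row])
        else:
            # Small residual: no abnormal component is kept (assumed).
            s.append([0.0] * len(row))
    return s


residual = [[3.0, 4.0], [0.3, 0.4]]  # row norms 5.0 and 0.5
S = shrink_rows(residual, a=1.0)
```

With a = 1.0, the first sample keeps a shrunken abnormal component while the second sample, whose two-norm does not exceed a, is treated as fully normal.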

Preferably, the evaluation function is:

$$A(x)=\lambda\left\|x-G(E(x))\right\|_2+(1-\lambda)\times\sigma\left[D_{xz}(x,E(x)),1\right]+\sigma\left[D_{xz}(G(E(x)),E(x)),1\right]$$

wherein A(x) represents the evaluation value corresponding to the sample data x; λ is a set constant; ||x-G(E(x))||₂ is the two-norm of x-G(E(x)); σ[·] is a cross-entropy function.

Preferably,

$$\sigma\left[D_{xz}(x,E(x)),1\right]=-\log_m\left[\frac{1}{1+e^{-D_{xz}(x,E(x))}}\right]$$

$$\sigma\left[D_{xz}(G(E(x)),E(x)),1\right]=-\log_m\left[\frac{1}{1+e^{-D_{xz}(G(E(x)),E(x))}}\right]$$

m is a set constant, and m > 1.
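For illustration, the evaluation value can be computed from a sample, its reconstruction and the two labeled values as sketched below in Python. The function names `cross_entropy_to_one` and `anomaly_score` are hypothetical, the network outputs are assumed to be already available as plain numbers, and the three terms are combined exactly as written in the evaluation function above.

```python
import math


def cross_entropy_to_one(d, m=2.0):
    """sigma[d, 1] = -log_m(1 / (1 + e^(-d)))."""
    return -math.log(1.0 / (1.0 + math.exp(-d)), m)


def anomaly_score(x, x_rec, d_real, d_rec, lam=0.5, m=2.0):
    """Evaluation value A(x).

    x:      sample data (list of floats)
    x_rec:  reconstruction G(E(x))
    d_real: labeled value D_xz(x, E(x))
    d_rec:  labeled value D_xz(G(E(x)), E(x))
    """
    recon = math.sqrt(sum((xi - ri) ** 2 for xi, ri in zip(x, x_rec)))
    return (lam * recon
            + (1.0 - lam) * cross_entropy_to_one(d_real, m)
            + cross_entropy_to_one(d_rec, m))


score_exact = anomaly_score([1.0, 2.0], [1.0, 2.0], 0.0, 0.0)
score_far = anomaly_score([3.0, 4.0], [0.0, 0.0], 0.0, 0.0)
```

The second call differs from the first only in the reconstruction error, so the difference between the two values is λ times the two-norm of x - G(E(x)).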

An Internet of things data anomaly detection method comprises the following steps:

Sq1, acquiring test data x_test of the sensor to be detected and an anomaly detection model, wherein the anomaly detection model is obtained by the above training method for the Internet of things data anomaly detection model;

Sq2, inputting x_test into the anomaly detection model; the coding network encodes x_test and outputs E(x_test), and the generation network reconstructs E(x_test) and outputs G(E(x_test)); the first authentication network labels the data tuples (x_test, E(x_test)) and (G(E(x_test)), E(x_test)) respectively and outputs the corresponding labeled values D_xz(x_test, E(x_test)) and D_xz(G(E(x_test)), E(x_test)); the evaluation function combines the labeled values D_xz(x_test, E(x_test)) and D_xz(G(E(x_test)), E(x_test)) to calculate an evaluation value A(x_test) for determining whether the sensor test data x_test is abnormal;

Sq3, comparing the evaluation value A(x_test) with a set abnormal threshold θ; if A(x_test) is greater than the set abnormal threshold θ, judging that the state of x_test is normal; if A(x_test) is less than or equal to the abnormal threshold θ, judging that the state of x_test is abnormal.

The data anomaly detection system for the Internet of things comprises a memory, wherein a computer program and an anomaly detection model are stored in the memory, and the computer program is used for realizing the data anomaly detection method for the Internet of things when being executed.

Preferably, the Internet of things data anomaly detection system further comprises a processor, the processor is connected with the memory, and the processor is used for executing the computer program so as to implement the Internet of things data anomaly detection method.

The invention has the advantages that:

(1) In the training method for the Internet of things data anomaly detection model, a second identification network is added to ensure cycle consistency, and the abnormal data is updated iteratively by an approximate gradient method so as to gradually eliminate abnormal components. In the anomaly detection stage, the reconstruction error and the identification error are considered together, and an evaluation function is designed to evaluate the degree of abnormality of the data, so that the anomaly detection model trained by the method achieves a better F1 score, and its precision and recall are markedly better than those of the prior art.

(2) Due to various complex factors existing in the actual environment, the abnormal conditions of the data collected by the sensor are difficult to artificially mark. In the model training process, samples do not need to be marked, an unsupervised model is realized, and the data dimension reduction processing and classification feasibility are ensured.

(3) In the Internet of things data anomaly detection method provided by the invention, the anomaly detection model is first trained on the basis of a bidirectional generative adversarial network. The training method lets the basic model automatically select normal data from the data set for learning, so that the accuracy, robustness and real-time performance of the anomaly detection model on high-dimensional data can be guaranteed while the space and time complexity of processing Internet of things data is greatly reduced, which addresses the curse of dimensionality caused by high-dimensional data.

(4) The Internet of things data anomaly detection system provided by the invention provides a carrier for the Internet of things data anomaly detection method provided by the invention, and facilitates popularization of the Internet of things data anomaly detection method.

Drawings

FIG. 1 is a schematic diagram of an anomaly detection model module;

FIG. 2 is a flow chart of a training method of an Internet of things data anomaly detection model;

FIG. 3 is a flow chart of a method for detecting data anomaly in the Internet of things;

FIG. 4 is a graph of the test effect of the data set FCT on three models;

FIG. 5 is a graph of the effect of the test on three models using the data set GAS;

FIG. 6 is a graph of the effect of the test on three models using a data set DSA;

FIG. 7 is a graph of the effect of the test on three models using the data set HAR.

Detailed Description

Internet of things data anomaly detection model

The data anomaly detection model of the internet of things provided by the embodiment can be used for detecting various data anomalies of the internet of things, namely data distortion conditions, especially the distortion conditions of sensor detection data, such as temperature detection data, humidity detection data and weight detection data.

Referring to fig. 1, the data anomaly detection model of the internet of things provided by the embodiment includes a coding network, a generating network, a first identifying network, and an evaluation function.

The coding network is used for coding the received data, the generating network is used for reconstructing the received data, and the first identifying network is used for marking the received data with real probability. The input of the coding network is the input of the anomaly detection model, the input of the generating network is the output of the coding network, and the input end of the first identification network is respectively connected with the input end of the coding network, the output end of the coding network and the output end of the generating network.

Taking sample data x as an example, when the sample data is fed to the input of the anomaly detection model, the coding network encodes the sample data x and outputs encoded data E(x); the generating network reconstructs the encoded data E(x), and its output is denoted as G(E(x)); the first identifying network labels the data tuples (x, E(x)) and (G(E(x)), E(x)) respectively with the probability that each is real data; the labeled value of (x, E(x)) is denoted as D_xz(x, E(x)), and the labeled value of (G(E(x)), E(x)) is denoted as D_xz(G(E(x)), E(x)).

The evaluation function combines D_xz(x, E(x)) and D_xz(G(E(x)), E(x)) to calculate an evaluation value A(x) for evaluating whether the data x is normal.

The evaluation function is:

$$A(x)=\lambda\left\|x-G(E(x))\right\|_2+(1-\lambda)\times\sigma\left[D_{xz}(x,E(x)),1\right]+\sigma\left[D_{xz}(G(E(x)),E(x)),1\right]$$

wherein A(x) represents the evaluation value corresponding to the sample data x; λ is a set constant; ||x-G(E(x))||₂ is the two-norm of x-G(E(x)); σ[·] is a cross-entropy function;

$$\sigma\left[D_{xz}(x,E(x)),1\right]=-\log_m\left[\frac{1}{1+e^{-D_{xz}(x,E(x))}}\right]$$

$$\sigma\left[D_{xz}(G(E(x)),E(x)),1\right]=-\log_m\left[\frac{1}{1+e^{-D_{xz}(G(E(x)),E(x))}}\right]$$

m is a set constant, and m > 1.

In the anomaly detection model according to the present embodiment, the evaluation function is a set function; the encoding network, the generating network and the first identifying network can be updated by learning from historical data of the same type as the data to be tested. The parameter updates of the encoding network, the generating network, and the first identifying network are explained below.

Training method for data anomaly detection model of Internet of things

Referring to fig. 2, the above-described training method of the abnormality detection model includes the following steps St1 to St7.

St1, acquiring a plurality of sensor data samples collected over a continuous time period as historical data X, wherein X comprises normal data L and abnormal data S, namely X = L ∪ S;

constructing a basic model consisting of a coding network, a generating network, a first identifying network and a second identifying network, and initializing the basic model, namely initializing parameters of the coding network, the generating network, the first identifying network and the second identifying network;

the coding network is used for coding the received data and outputting coded data; the generation network is used for reconstructing the received data and outputting reconstructed data; the first authentication network and the second authentication network are used for labeling the received data and outputting a labeling value corresponding to each received data; the annotation value output by the first authentication network is used for indicating the probability that the sample data corresponding to the data received by the first authentication network is the real data, and the annotation value output by the second authentication network is used for indicating the probability that the sample data corresponding to the data received by the second authentication network is the real data. In the model training process, when the sample data comes from the historical data X, the sample data is represented as real data.

St2, taking values from the normal data L and giving sample data x; the initial value of the normal data L is equal to the history data X.

St3, the data tuples (x, E(x)) and (G(z), z) are labeled by the first authentication network; the labeled value of (x, E(x)) is denoted as D_xz(x, E(x)), and the labeled value of (G(z), z) is denoted as D_xz(G(z), z); E(x) represents the output of the coding network when the input is x; G(z) represents the output of the generation network when the input is z, and z represents a value taken from the data space Z conforming to a normal distribution;

the data tuples (x, x) and (x, G(E(x))) are labeled by the second authentication network; the labeled value of (x, x) is denoted as D_xx(x, x), and the labeled value of (x, G(E(x))) is denoted as D_xx(x, G(E(x))); G(E(x)) represents the output of the generation network when the input is E(x).

St4, calculating a first objective function, and updating parameters of each network in the basic model according to the first objective function;

the first objective function is:

$$V_{ALICE}(D_{xz},D_{xx},E,G)=V(D_{xz},E,G)+V_{CE}(D_{xx},E,G)$$

$$V(D_{xz},E,G)=E1+E2$$

$$E1=\mathbb{E}_{x\sim p_x}\left[\mathbb{E}_{E(x)\sim p_E(\cdot|x)}\left[\log_k D_{xz}(x,E(x))\right]\right]$$

$$E2=\mathbb{E}_{z\sim p_z}\left[\mathbb{E}_{G(z)\sim p_G(\cdot|z)}\left[\log_k\left(1-D_{xz}(G(z),z)\right)\right]\right]$$

$$V_{CE}(D_{xx},E,G)=\mathbb{E}_{x\sim p_x}\left[\log_k D_{xx}(x,x)\right]+\mathbb{E}_{x\sim p_x}\left[1-\log_k D_{xx}(x,G(E(x)))\right]$$

wherein V_ALICE(D_xz, D_xx, E, G), V(D_xz, E, G), V_CE(D_xx, E, G), E1 and E2 all represent transition terms; k denotes the base of the logarithm, and k > 1;

E(x) denotes the output of the coding network when the input is x, and D_xz(x, E(x)) denotes the output of the first authentication network when the input is (x, E(x)); z denotes a value taken from the normal distribution Z, G(z) denotes the output of the generation network when the input is z, and D_xz(G(z), z) denotes the output of the first authentication network when the input is (G(z), z); D_xx(x, x) denotes the output of the second authentication network when the input is (x, x); G(E(x)) denotes the output of the generation network when the input is E(x), and D_xx(x, G(E(x))) denotes the output of the second authentication network when the input is (x, G(E(x)));

let the output distribution of the coding network when the input is x be p_E(·|x), with E(x) sampled from p_E(·|x), i.e. E(x) ~ p_E(·|x); $\mathbb{E}_{E(x)\sim p_E(\cdot|x)}[\log_k D_{xz}(x,E(x))]$ represents the expectation of log_k D_xz(x, E(x)) when E(x) follows the distribution p_E(·|x); E1 represents the expectation of $\mathbb{E}_{E(x)\sim p_E(\cdot|x)}[\log_k D_{xz}(x,E(x))]$ when x follows the distribution p_x; p_x represents the true distribution of the sample data x in the normal data L, i.e., the probability that the sample data x comes from the historical data X;

let the output distribution of the generation network when the input is z be p_G(·|z), with G(z) sampled from p_G(·|z), i.e. G(z) ~ p_G(·|z); $\mathbb{E}_{G(z)\sim p_G(\cdot|z)}[\log_k(1-D_{xz}(G(z),z))]$ represents the expectation of log_k(1 - D_xz(G(z), z)) when G(z) follows the distribution p_G(·|z); E2 represents the expectation of $\mathbb{E}_{G(z)\sim p_G(\cdot|z)}[\log_k(1-D_{xz}(G(z),z))]$ when z follows a normal distribution;

The first objective function expresses that the parameters of the first authentication network and the second authentication network are updated with the goal of maximizing V_ALICE(D_xz, D_xx, E, G), and that, with the parameters of the first authentication network and the second authentication network fixed, the coding network and the generating network are updated with the goal of minimizing V_ALICE(D_xz, D_xx, E, G). In St4, according to the first objective function value, gradient updates are first applied to the first authentication network and the second authentication network, so that the first objective function value is maximized while the coding network and the generating network are fixed; gradient updates are then applied to the coding network and the generating network, so that the first objective function value is minimized while the first authentication network and the second authentication network are fixed.

St5, judging whether the number of updates of the basic model has reached a first set value n1; if not, returning to step St2 and accumulating the number of updates of the basic model again; if yes, executing the following step St6. In a specific implementation, n1 takes a value in the interval [100, 200].

St6, judging whether the number of updates of the normal data L has reached a second set value n2; if not, executing the following step St7; if yes, fixing the parameters of the basic model, extracting the coding network, the generating network and the first identifying network from the basic model, and combining the three extracted networks with the set evaluation function to form the anomaly detection model. In a specific implementation, n2 takes a value in the interval [10, 20].

The update process for the abnormal data S in St7 is as follows:

$$\zeta^{(g-1)}=X-G\left(E\left(L^{(g-1)}\right)\right)$$

$$\zeta^{(g-1)}=\left\{\zeta_{ij}^{(g-1)}\right\}_{1\le i\le I,\ 1\le j\le J}$$

$$s_{ij}^{(g)}=P_a\left(\zeta_{ij}^{(g-1)}\right)$$

namely:

$$s_{ij}^{(g)}=\begin{cases}\zeta_{ij}^{(g-1)}-a\times\dfrac{\zeta_{ij}^{(g-1)}}{\left\|\zeta_i^{(g-1)}\right\|_2}, & \left\|\zeta_i^{(g-1)}\right\|_2>a\\[2mm] 0, & \left\|\zeta_i^{(g-1)}\right\|_2\le a\end{cases}$$

$$S^{(g)}=\left\{s_{ij}^{(g)}\right\}_{1\le i\le I,\ 1\le j\le J}$$

wherein L^(g-1) represents the normal data L after g-1 updates, L^(g-1) = X - S^(g-1); E(L^(g-1)) represents the output of the coding network when its input is the normal data L^(g-1); G(E(L^(g-1))) represents the output of the generating network when its input is E(L^(g-1)); ζ represents a transition parameter, and ζ^(g-1) represents the transition parameter ζ corresponding to L^(g-1); the transition parameter ζ has the same data dimensions as the abnormal data S; I represents the number of samples in the abnormal data S, and J represents the data dimension of each sample in the abnormal data; ζ_ij^(g-1) represents the data in the j-th dimension of the i-th sample of ζ^(g-1);

s_ij^(g) represents the data in the j-th dimension of the i-th sample of the abnormal data S^(g), and Pa represents a set function; S^(g) represents the abnormal data after g updates;

a is a set value, ζ_i^(g-1) represents the i-th sample of the transition parameter ζ^(g-1), and ||ζ_i^(g-1)||₂ represents the two-norm of ζ_i^(g-1).

The updated abnormal data S is then used to update the normal data L, L = X - S; the method then returns to step St2.

In step St7, ζ^(g-1) is first calculated from L^(g-1), S^(g) is then calculated from ζ^(g-1), and L^(g) = X - S^(g) is updated; L = L^(g) is then set and the method returns to step St2, which completes the g-th update of the normal data L.
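The control flow of steps St2 to St7 can be outlined as follows. This Python sketch is illustrative only: `train_outer_loop`, `update_model` and `update_abnormal` are hypothetical names, the networks themselves are replaced by stub callbacks, and it is assumed that the count of basic-model updates restarts after each update of the normal data L.

```python
def train_outer_loop(X, n1, n2, update_model, update_abnormal):
    """Control flow of St2 to St7 with hypothetical callbacks.

    update_model(L): one parameter update of the basic model (St2 to St4).
    update_abnormal(X, L): returns the new abnormal data S (St7).
    """
    L = [row[:] for row in X]      # initial value of L equals X
    model_updates = 0
    data_updates = 0
    while True:
        for _ in range(n1):        # St5: n1 model updates per round
            update_model(L)
            model_updates += 1
        if data_updates >= n2:     # St6: L has been updated n2 times
            break
        S = update_abnormal(X, L)  # St7: recompute S, then L = X - S
        L = [[x - s for x, s in zip(x_row, s_row)]
             for x_row, s_row in zip(X, S)]
        data_updates += 1
    return L, model_updates, data_updates


def no_abnormal(X, L):
    """Stub that reports no abnormal component (for the demo only)."""
    return [[0.0] * len(row) for row in X]


X = [[1.0, 2.0], [3.0, 4.0]]
L, model_updates, data_updates = train_outer_loop(
    X, n1=3, n2=2, update_model=lambda L: None, update_abnormal=no_abnormal)
```

Under this reading, the basic model receives n1 updates before the first update of L and n1 further updates after each of the n2 updates of L.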

In the above anomaly detection model training process, D_xz(x, E(x)), D_xx(x, x) and D_xx(x, G(E(x))) are all used to evaluate the probability that the data x comes from the historical data X, and D_xz(G(z), z) is used to evaluate the probability that the data z comes from the historical data X. The training target is for D_xz(x, E(x)), D_xx(x, x) and D_xx(x, G(E(x))) to tend to 1 and for D_xz(G(z), z) to tend to 0, so that the trained anomaly detection model can use the output of the first identification network to determine whether data is real sensor detection data.

Internet of things data anomaly detection method

Referring to fig. 3, the abnormality detection method in the present embodiment includes the steps of:

Sq1, acquiring test data x_test of the sensor to be detected and the above-described anomaly detection model;

Sq2, inputting x_test into the anomaly detection model; the coding network encodes x_test and outputs E(x_test), and the generating network reconstructs E(x_test) and outputs G(E(x_test)); the first authentication network labels the data tuples (x_test, E(x_test)) and (G(E(x_test)), E(x_test)) respectively, and outputs the labeled value D_xz(x_test, E(x_test)) corresponding to (x_test, E(x_test)) and the labeled value D_xz(G(E(x_test)), E(x_test)) corresponding to (G(E(x_test)), E(x_test)); the evaluation function combines the labeled values D_xz(x_test, E(x_test)) and D_xz(G(E(x_test)), E(x_test)) to calculate an evaluation value A(x_test) for determining whether the sensor test data x_test is abnormal.

Examples

In this embodiment, in order to verify the performance of the anomaly detection model provided by the present invention, the anomaly detection model provided by the present invention is compared with two other existing models in combination with different data sets.

In this example, the four data sets shown in Table 1 below were used.

Table 1: four data sets

The data sets GAS (Gas Sensor Array Drift) and FCT (Forest Cover Type) record environmental data that changes periodically over time, while the data sets HAR (Human Activity Recognition) and DSA (Daily and Sports Activities) record irregular human behavior data. All four are real-world Internet of things data sets.

The two existing models selected in this embodiment are an autoencoder (AE) anomaly detection model and a generative adversarial network (AnoGAN) anomaly detection model.

The AE anomaly detection model detects noise directly with a basic autoencoder. It consists of an encoder and a decoder, learns the hidden features of the training data, and then performs anomaly detection based on the reconstruction error of the test data.

The AnoGAN anomaly detection model trains a standard generative adversarial network on the original data and then judges whether data is abnormal based on the reconstruction error.

For ease of distinction, the anomaly detection model provided by the invention is referred to as the RBIGAN model. The RBIGAN model is trained according to the training method provided by the invention, with n1 = 150, n2 = 15, λ = 0.5, θ = 0.5, k = 2 and m = 2 during training. The AE anomaly detection model is referred to as the AE model for short, and the AnoGAN anomaly detection model as the AnoGAN model.

In this embodiment, each model is verified on a data set as follows: first, the data set is divided into a preliminary training set and a test set, wherein the data proportion of the preliminary training set and the test set is 8; then part of the data in the preliminary training set is replaced with random noise sampled from a Gaussian distribution, and the result is used as the training set; finally, the RBIGAN model, the AE model and the AnoGAN model are each trained and evaluated using the training set and the test set.
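The data preparation described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the shuffling before the split, the zero-mean unit-variance noise, and the `train_fraction` value used in the example are all hypothetical, since the text specifies neither the noise parameters nor (legibly) the split ratio.

```python
import random
from typing import List, Tuple

Sample = List[float]


def split_and_pollute(
    data: List[Sample],
    train_fraction: float,
    noise_ratio: float,
    seed: int = 0,
) -> Tuple[List[Sample], List[Sample]]:
    """Split data, then replace a fraction of the training rows with Gaussian noise."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)  # assumption: the split is random

    n_train = round(len(shuffled) * train_fraction)
    train = [row[:] for row in shuffled[:n_train]]
    test = [row[:] for row in shuffled[n_train:]]

    # noise_ratio is the share of training rows replaced by noise.
    n_noisy = round(len(train) * noise_ratio)
    for i in rng.sample(range(len(train)), n_noisy):
        # assumption: standard normal noise in every dimension
        train[i] = [rng.gauss(0.0, 1.0) for _ in train[i]]
    return train, test


# Illustrative values only; they are not taken from the source.
data = [[float(i), float(i) + 0.5] for i in range(10)]
train, test = split_and_pollute(data, train_fraction=0.8, noise_ratio=0.25)
```

Only the training part is polluted, so the test set still measures how well a model trained on contaminated data detects anomalies.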

In this embodiment, the influence of different noise ratios on the F1 score of each model, for the data sets in Table 1, is shown in Figs. 4 to 7. The noise ratio is the proportion of data in the preliminary training set that is replaced by random noise sampled from a Gaussian distribution. The F1 score is the harmonic mean of the precision and recall of a classification model, so it gives an objective measure of model performance.
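For reference, the F1 score can be computed from binary labels as in the short sketch below. It uses the standard definition (harmonic mean of precision and recall); the function name and the convention that 1 marks an anomaly are assumptions made for illustration.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 score for binary labels, where 1 marks an anomaly (assumed convention)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Two true positives, one false positive, one false negative.
example_f1 = f1_score([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
```

In this example precision and recall are both 2/3, so the F1 score is also 2/3.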

As can be seen from Figs. 4 to 7, the F1 score of every model decreases as the noise ratio increases, and the decline is most pronounced on the high-dimensional data sets. The RBIGAN model of the invention outperforms both the AE model and the AnoGAN model at every noise ratio on every data set. Figs. 4 to 7 also show that the higher the noise ratio, the larger the gap between the RBIGAN model and the AE and AnoGAN models. The RBIGAN model is therefore more robust, and its performance under heavy contamination is far better than that of the prior art.

The present invention is not limited to the above embodiments, and any modifications, equivalent substitutions and improvements made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A training method for an Internet of things data anomaly detection model is characterized by comprising the following steps:

labeling the data tuples $(x, x)$ and $(x, G(E(x)))$ by a second identification network, wherein the labeled value corresponding to $(x, x)$ is denoted $D_{xx}(x, x)$ and the labeled value corresponding to $(x, G(E(x)))$ is denoted $D_{xx}(x, G(E(x)))$; $G(E(x))$ represents the output of the generation network when the input is $E(x)$;

St4: substituting $D_{xz}(x, E(x))$, $D_{xz}(G(z), z)$, $D_{xx}(x, x)$ and $D_{xx}(x, G(E(x)))$ into the set first objective function to calculate a first objective function value, and updating, according to the first objective function value, the parameters of the encoding network, the generation network, the first identification network and the second identification network in the basic model;

St6: judging whether the number of updates of the normal data L has reached a second set value n2; if not, executing the following step St7; if so, fixing the parameters of the basic model, extracting the encoding network, the generation network and the first identification network from the basic model, and combining the three extracted networks with a set evaluation function to form the anomaly detection model; the evaluation function combines $D_{xz}(x, E(x))$ and $D_{xz}(G(E(x)), E(x))$ to calculate an evaluation value $A(x)$ for evaluating whether the data $x$ is normal; $G(E(x))$ represents the output of the generation network when the input is $E(x)$;

2. The internet of things data anomaly detection model training method as claimed in claim 1, wherein the first objective function is:

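$$V_{ALICE}(D_{xz}, D_{xx}, E, G) = V(D_{xz}, E, G) + V_{CE}(D_{xx}, E, G);$$

$$V(D_{xz}, E, G) = E_1 + E_2;$$

$$E_1 = \mathbb{E}_{x \sim p_x}\left[\mathbb{E}_{E(x) \sim p_E(\cdot \mid x)}\left[\log_k D_{xz}(x, E(x))\right]\right];$$

$$E_2 = \mathbb{E}_{z \sim p_z}\left[\mathbb{E}_{G(z) \sim p_G(\cdot \mid z)}\left[\log_k\left(1 - D_{xz}(G(z), z)\right)\right]\right];$$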

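let the output distribution of the generation network when the input is $z$ be $p_G(\cdot \mid z)$; $\mathbb{E}_{G(z) \sim p_G(\cdot \mid z)}\left[\log_k\left(1 - D_{xz}(G(z), z)\right)\right]$ denotes the expectation of $\log_k\left(1 - D_{xz}(G(z), z)\right)$ when $G(z)$ follows the distribution $p_G(\cdot \mid z)$; $\mathbb{E}_{z \sim p_z}\left[\mathbb{E}_{G(z) \sim p_G(\cdot \mid z)}\left[\log_k\left(1 - D_{xz}(G(z), z)\right)\right]\right]$ denotes the expectation of $\mathbb{E}_{G(z) \sim p_G(\cdot \mid z)}\left[\log_k\left(1 - D_{xz}(G(z), z)\right)\right]$ when $z$ follows a normal distribution;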
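As a rough numerical illustration of the term V = E1 + E2 defined above, the expectations can be approximated by averages over a batch of first-identification-network outputs. The function name and the batch-averaging approach are assumptions. The sketch covers only V, because the term V_CE is not spelled out in this text, and k = 2 is the value used in the embodiment.

```python
import math
from typing import Sequence


def v_objective(
    d_real: Sequence[float],
    d_fake: Sequence[float],
    k: float = 2.0,
) -> float:
    """Batch estimate of V = E1 + E2.

    d_real holds values of D_xz(x, E(x)); d_fake holds values of D_xz(G(z), z).
    All values are assumed to lie strictly between 0 and 1 unless noted.
    """
    e1 = sum(math.log(d, k) for d in d_real) / len(d_real)
    e2 = sum(math.log(1.0 - d, k) for d in d_fake) / len(d_fake)
    return e1 + e2


# An undecided network (all outputs 0.5) gives V = -1 + -1 = -2 for k = 2.
v_undecided = v_objective([0.5, 0.5], [0.5, 0.5])
# Outputs at the stated training targets (1 and 0) give the upper bound V = 0.
v_ideal = v_objective([1.0], [0.0])
```

Both terms are logarithms of values no greater than 1, so V is never positive and approaches 0 as $D_{xz}(x, E(x))$ tends to 1 and $D_{xz}(G(z), z)$ tends to 0.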

3. The Internet of things data anomaly detection model training method as claimed in claim 1, wherein in St4, according to the first objective function value, gradient updates are first applied to the first identification network and the second identification network so that the first objective function value is maximized while the encoding network and the generation network are fixed; gradient updates are then applied to the encoding network and the generation network so that the first objective function value is minimized while the first identification network and the second identification network are fixed.

4. The training method for the data anomaly detection model of the Internet of things as claimed in claim 1, wherein the anomaly data calculation model in St7 is as follows:

$$\zeta^{(g-1)} = X - G(E(L^{(g-1)}));$$

$$\zeta^{(g-1)} = \{\zeta_{ij}^{(g-1)}\}_{1 \le i \le I,\ 1 \le j \le J};$$

$$s_{ij}^{(g)} = Pa(\zeta_{ij}^{(g-1)});$$

$$S^{(g)} = \{s_{ij}^{(g)}\}_{1 \le i \le I,\ 1 \le j \le J};$$

wherein $L^{(g-1)}$ represents the normal data L after g-1 iterations, $L^{(g-1)} = X - S^{(g-1)}$; $E(L^{(g-1)})$ represents the output of the encoding network when its input is the normal data $L^{(g-1)}$, and $G(E(L^{(g-1)}))$ represents the output of the generation network when its input is $E(L^{(g-1)})$; $\zeta$ represents a transition parameter, and $\zeta^{(g-1)}$ represents the transition parameter corresponding to $L^{(g-1)}$; the data dimensionality of the transition parameter $\zeta$ is the same as that of the abnormal data S; I represents the number of samples in the abnormal data S, and J represents the data dimension of each sample in the abnormal data; $\zeta_{ij}^{(g-1)}$ represents the data in the j-th dimension of the i-th sample of $\zeta^{(g-1)}$;

in St7, $S^{(g)}$ is calculated first, and then $L^{(g)} = X - S^{(g)}$ is calculated; then let $L = L^{(g)}$ and return to step St2.

5. The Internet of things data anomaly detection model training method as claimed in claim 4, wherein the function Pa is:

when $\|\zeta_i^{(g-1)}\|_2 \le a$, $\zeta_{ij}^{(g)} = 0$;

；

a is a set value, $\zeta_i^{(g-1)}$ represents the i-th sample of the transition parameter $\zeta^{(g-1)}$, and $\|\zeta_i^{(g-1)}\|_2$ represents the two-norm of $\zeta_i^{(g-1)}$.
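One St7 iteration as described in claims 4 and 5 can be sketched as below, under stated assumptions. `reconstruct` is a hypothetical stand-in for G(E(·)) of the basic model, and `row_shrink` is a hypothetical Pa. Its zero branch follows the stated condition on the two-norm of a sample, whereas its other branch (row-wise shrinkage toward zero) is an assumption made only so that the sketch runs, because that case is not given in the text above.

```python
import math
from typing import Callable, List, Tuple

Matrix = List[List[float]]


def row_shrink(zeta: Matrix, a: float) -> Matrix:
    """Hypothetical Pa: zero every row whose two-norm is <= a, shrink the rest."""
    out: Matrix = []
    for row in zeta:
        norm = math.sqrt(sum(v * v for v in row))
        if norm <= a:
            out.append([0.0] * len(row))      # branch stated in claim 5
        else:
            scale = 1.0 - a / norm            # assumed branch (not in the source)
            out.append([v * scale for v in row])
    return out


def st7_update(
    X: Matrix,
    L_prev: Matrix,
    reconstruct: Callable[[Matrix], Matrix],
    pa: Callable[[Matrix], Matrix],
) -> Tuple[Matrix, Matrix]:
    """Compute zeta = X - G(E(L_prev)), then S = Pa(zeta) and L = X - S."""
    rec = reconstruct(L_prev)
    zeta = [[x - r for x, r in zip(x_row, r_row)] for x_row, r_row in zip(X, rec)]
    S = pa(zeta)
    L = [[x - s for x, s in zip(x_row, s_row)] for x_row, s_row in zip(X, S)]
    return S, L


# Toy run: the stand-in reconstruction always returns the pattern [1, 1].
X = [[1.0, 1.0], [4.0, 5.0]]
S, L_new = st7_update(
    X,
    L_prev=X,
    reconstruct=lambda rows: [[1.0, 1.0] for _ in rows],
    pa=lambda z: row_shrink(z, a=1.0),
)
```

In the toy run the first sample is reconstructed exactly, so its row of S is zero and it stays in the normal data unchanged, while the second sample has a large residual and part of it is moved into S.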

6. The internet of things data anomaly detection model training method as claimed in claim 1, wherein the evaluation function is:

wherein $A(x)$ represents the evaluation value corresponding to the sample data $x$; $\lambda$ is a set constant; $\|x - G(E(x))\|_2$ is the two-norm of $x - G(E(x))$; and $\sigma[\cdot]$ is a cross-entropy function.

7. The Internet of things data anomaly detection model training method of claim 6,

$$\sigma[D_{xz}(x, E(x)), 1] = -\log_m\left[\frac{1}{1 + e^{-D_{xz}(x, E(x))}}\right];$$

$$\sigma[D_{xz}(G(E(x)), E(x)), 1] = -\log_m\left[\frac{1}{1 + e^{-D_{xz}(G(E(x)), E(x))}}\right];$$

m is a set constant, and m > 1.
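The cross-entropy term of claim 7 is simple to evaluate numerically. The sketch below is a direct transcription of the formula; the function name `sigma` and the default m = 2 (the value used in the embodiment) are the only choices made here, and the argument `d` stands for either $D_{xz}(x, E(x))$ or $D_{xz}(G(E(x)), E(x))$.

```python
import math


def sigma(d: float, m: float = 2.0) -> float:
    """Cross-entropy term sigma[d, 1] = -log_m(1 / (1 + e^(-d))), with m > 1."""
    if m <= 1.0:
        raise ValueError("m must be greater than 1")
    return -math.log(1.0 / (1.0 + math.exp(-d)), m)


# d = 0 gives -log_2(1/2) = 1 when m = 2.
sigma_at_zero = sigma(0.0)
# Larger identification-network outputs give smaller cross-entropy values.
sigma_at_five = sigma(5.0)
```

The term is always positive and decreases as `d` grows, so a data tuple that the first identification network scores highly contributes less to the evaluation value than one it scores poorly.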

8. An Internet of things data anomaly detection method is characterized by comprising the following steps:

Sq1: acquiring test data $x_{test}$ of the sensor to be detected and an anomaly detection model, wherein the anomaly detection model is obtained by the Internet of things data anomaly detection model training method according to any one of claims 1-7;

Sq2: substituting $x_{test}$ into the anomaly detection model; the encoding network encodes $x_{test}$ and outputs $E(x_{test})$; the generation network reconstructs $E(x_{test})$ and outputs $G(E(x_{test}))$; the first identification network labels the data tuples $(x_{test}, E(x_{test}))$ and $(G(E(x_{test})), E(x_{test}))$ respectively and outputs the corresponding labeled values $D_{xz}(x_{test}, E(x_{test}))$ and $D_{xz}(G(E(x_{test})), E(x_{test}))$; the evaluation function combines the labeled values $D_{xz}(x_{test}, E(x_{test}))$ and $D_{xz}(G(E(x_{test})), E(x_{test}))$ to calculate the evaluation value $A(x_{test})$ used to determine whether the sensor test data $x_{test}$ is abnormal;

9. An Internet of things data anomaly detection system, characterized by comprising a memory, wherein a computer program and an anomaly detection model are stored in the memory, and the computer program, when executed, implements the Internet of things data anomaly detection method of claim 8.

10. The Internet of things data anomaly detection system as claimed in claim 9, further comprising a processor connected to the memory, the processor being configured to execute the computer program to implement the Internet of things data anomaly detection method of claim 8.