CN112257917B - Time sequence abnormal mode detection method based on entropy characteristics and neural network - Google Patents

Publication number
CN112257917B
Authority
CN
China
Prior art keywords
sequence
score
sample
differential rate
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011116876.4A
Other languages
Chinese (zh)
Other versions
CN112257917A (en)
Inventor
苏维均
牛雨晴
于重重
赵霞
韩璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Technology and Business University
Original Assignee
Beijing Technology and Business University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Technology and Business University
Priority to CN202011116876.4A
Publication of CN112257917A
Application granted
Publication of CN112257917B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/02 Agriculture; Fishing; Forestry; Mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Agronomy & Crop Science (AREA)
  • Animal Husbandry (AREA)
  • Marine Sciences & Fisheries (AREA)
  • Mining & Mineral Resources (AREA)
  • Primary Health Care (AREA)
  • Complex Calculations (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a time sequence abnormal mode detection method based on entropy characteristics and a neural network, comprising the following steps: 1) extracting a second-order differential rate sample entropy feature sequence from the time series in the training data set; 2) training a generative adversarial network (GAN) model to obtain a generator and a corresponding discriminator; 3) calculating the anomaly score of the feature sequence and constructing a threshold; 4) judging whether the input data to be detected is abnormal according to the threshold. Extracting features from the time-series data with the differential rate sample entropy makes abnormal modes more distinct, and the new anomaly score calculation method improves the accuracy and generalization of model identification, giving the method high practical and application value.

Description

Time sequence abnormal mode detection method based on entropy characteristics and neural network
Technical Field
The invention relates to the prediction of coal mine thermodynamic composite disasters, in particular to a time sequence abnormal mode detection method based on entropy characteristics and a neural network, and belongs to the field of emergency safety.
Background
Coal is China's main energy source and holds an irreplaceable position in the national energy structure. The area left behind after a coal seam is mined is called a goaf. Ventilation in the goaf is poor and much residual coal remains; this coal oxidizes continuously and releases combustible gas, which readily leads to coal thermodynamic disasters such as spontaneous combustion and gas explosion. The concentration of the released combustible gas follows a certain pattern over time. If the inflection points between the different stages of the monitoring data are detected effectively, a large change in gas concentration can be taken as entry into an abnormal mode, indicating that a disaster such as spontaneous combustion of coal may occur. Gas content differs from mine to mine, so a criterion based on the absolute gas content value can produce large errors when applied to other coal mines. Detecting abnormal modes instead improves the generalization of disaster judgment and offers a new approach to detecting coal composite disasters.
As research on artificial intelligence has deepened, applying time-series prediction methods to coal and gas has become a new trend. These methods have been introduced into the quantitative evaluation and analysis of coal and gas disasters and combined with computer technology, support vector machines, artificial neural networks, and related theory. However, such prediction methods are difficult to apply to complex data, tend to fall into local minima, are prone to overfitting, and have low accuracy, so their applicability is limited.
With advances in information technology, anomaly detection in time series has become a research hotspot. A time-series anomaly generally refers to a run of data that differs significantly from the rest of the data; the difference is not a random deviation but arises from a different underlying mechanism. Detecting abnormal modes in gas time-series data can therefore provide a theoretical basis for predicting coal mine thermodynamic disasters: when an abnormal mode is present, the trend of the data changes sharply, and this change can serve as a basis for judging that a disaster is occurring.
An existing method (CN 201910809956.9) uses a GAN for anomaly detection on time series. It builds the anomaly detection model from an optimized GAN generator and discriminator and uses the generation residual and the discrimination loss output by the model as the basis for identifying abnormal data. However, most time-series changes are not obvious, and when the raw time series is fed directly to the GAN its features are not distinct enough. In addition, how to derive a more effective judgment criterion from the generation residual and the discrimination loss, and how to improve the accuracy and generality of anomaly judgment, remain to be studied.
Disclosure of Invention
The invention aims to provide a time sequence abnormal mode detection method based on entropy characteristics and a neural network. The method is divided into four stages: extracting a second-order differential rate sample entropy feature sequence from the time series in the training data set; training a generative adversarial network (GAN) model to obtain a generator and a corresponding discriminator; calculating the anomaly score of the feature sequence and constructing a threshold; and judging whether the input data to be detected is abnormal according to the threshold. Specifically, the method comprises the following steps:
A. Extracting a second-order differential rate sample entropy feature sequence from the time series in the training data set, specifically as follows:
A1. Dividing the training data set into two sets, training data set 1 and training data set 2;
Training data set 1 contains only normal data, and training data set 2 contains both normal and abnormal data;
A2. The time series of training data set 1
Figure BDA0002730598460000021
is segmented by a sliding window of size w and step size d according to formula (1), giving a set W of sequence segments of length L, in which the i-th time sequence segment is denoted $s_i$:
$$s_i = [x_{1+(i-1)d},\ x_{2+(i-1)d},\ \ldots,\ x_{1+(i-1)d+w}] \quad (1)$$
where $T_{train}$ is the number of data points in the training data set time series and $1 \times T_{train}$ is the dimension of that time series;
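Step A2 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name `sliding_segments` and the parameters `seg_len` and `step` are hypothetical. As printed, formula (1) runs from index 1+(i-1)d to 1+(i-1)d+w, which spans w+1 points, while the text speaks of segments of length L, so the sketch leaves the segment length as an explicit parameter.

```python
from typing import List, Sequence


def sliding_segments(x: Sequence[float], seg_len: int, step: int) -> List[List[float]]:
    """Cut x into overlapping segments of seg_len points, moving `step` points each time.

    seg_len and step are illustrative names; the source calls them window size and step size d.
    """
    if seg_len < 1 or step < 1:
        raise ValueError("seg_len and step must be positive")
    segments = []
    start = 0
    # Keep only full-length segments; a trailing partial window is dropped.
    while start + seg_len <= len(x):
        segments.append(list(x[start:start + seg_len]))
        start += step
    return segments


if __name__ == "__main__":
    series = [1, 2, 3, 4, 5, 6, 7]
    for seg in sliding_segments(series, seg_len=3, step=2):
        print(seg)
```

With `step=1`, consecutive segments overlap in all but one point, which is the setting used in the embodiment.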
A3. Performing the differential rate operation on each sequence segment in the set W to obtain the second-order differential rate sequences of all sequence segments, specifically as follows:
A3.1. For sequence segment $s_i$, calculating its second-order differential rate sequence $G = \{g_1, g_2, \ldots, g_{w'}\}$ using formula (2), and finding its standard deviation std;
Figure BDA0002730598460000022
where
Figure BDA0002730598460000023
is the e-th order difference value at time point u, and
Figure BDA0002730598460000024
is the e-th order difference value at time point u-1;
A3.2. Dividing the second-order differential rate sequence of w' data points into sub-segments of m consecutive data points each, giving w'-m+1 sub-sequence segments in total, denoted $K2_i = \{q_1, q_2, \ldots, q_{w'-m+1}\}$;
A4. Performing sample entropy feature extraction on the second-order differential rate sequences of all sequence segments to obtain their second-order differential rate sample entropy feature sequences, specifically as follows:
A4.1. Calculating the distance $D[q_a, q_b]$ between any two sub-sequence segments $q_a$ and $q_b$; the distance is the maximum difference between the elements at corresponding positions in the two sub-sequence segments;
A4.2. Calculating by formula (3) the similarity probability between sub-sequence segment $q_a$ and the other sub-sequence segments, that is, the proportion of sub-sequence segments whose distance to $q_a$ is smaller than the threshold, and obtaining by formula (4) the average similarity probability of the second-order differential rate sequence;
Figure BDA0002730598460000031
Figure BDA0002730598460000032
where r is the similarity threshold;
A4.3. Following steps A4.1-A4.2, recalculating the average similarity probability $B^{m+1}(r)$ with m+1 as the sub-sequence length, and obtaining the second-order differential rate sample entropy feature SE by formula (5);
Figure BDA0002730598460000033
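Formulas (3)-(5) survive only as image references in this text, so the Python sketch below falls back on the commonly used sample entropy definition, SE = -ln(B^{m+1}(r) / B^m(r)), which matches steps A4.1-A4.3 as described. The function names and the exact normalization (dividing each match count by the number of other sub-segments) are assumptions and may differ from the patent's formulas.

```python
import math
from typing import Sequence


def avg_similarity(g: Sequence[float], m: int, r: float) -> float:
    """Average fraction of other length-m sub-segments within distance r of each sub-segment."""
    subs = [g[i:i + m] for i in range(len(g) - m + 1)]
    n = len(subs)
    if n < 2:
        raise ValueError("need at least two sub-segments")
    total = 0.0
    for a in range(n):
        # Distance is the maximum element-wise difference (step A4.1).
        matches = sum(
            1
            for b in range(n)
            if b != a and max(abs(p - q) for p, q in zip(subs[a], subs[b])) < r
        )
        total += matches / (n - 1)
    return total / n


def sample_entropy(g: Sequence[float], m: int, r: float) -> float:
    """SE = -ln(B^{m+1}(r) / B^m(r)), assuming the standard sample entropy form."""
    b_m = avg_similarity(g, m, r)
    b_m1 = avg_similarity(g, m + 1, r)
    if b_m == 0.0 or b_m1 == 0.0:
        raise ValueError("no matching sub-segments; sample entropy is undefined")
    return -math.log(b_m1 / b_m)


if __name__ == "__main__":
    print(sample_entropy([0, 1, 0, 1, 0, 1], m=2, r=0.5))
```

A perfectly regular sequence yields an entropy at or near zero, while irregular sequences yield larger values, which is why the feature can make a change of regime more visible than the raw series.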
A5. Performing segment-average preprocessing on the differential rate sample entropy sequence to obtain a new differential rate sample entropy sequence, specifically as follows:
A5.1. Starting from $X_t$ ($t = 1, 2, \ldots, t-w$), taking out a sequence segment of length w, $S_t = \{X_t, X_{t+1}, \ldots, X_{w+t-1}\}_{1 \times t}$, summing it according to formula (6), and then averaging it according to formula (7);
$$sum_t = X_t + X_{t+1} + \cdots + X_{w+t-1} \quad (6)$$
$$sum_t = sum_t / w \quad (7)$$
A5.2. Repeating step A5.1 until t-w sequence segments have been taken out, and composing the values $sum_t$ into a new differential rate sample entropy sequence $S_t' = \{sum_1, sum_2, \ldots, sum_{t-w}\}_{1 \times t}$;
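The segment averaging of formulas (6) and (7) amounts to a moving average. A minimal Python sketch follows; the function name `segment_average` is hypothetical, and the number of output values follows the count of t-w segments stated in the text, which is one fewer than the number of full windows that would fit.

```python
from typing import List, Sequence


def segment_average(x: Sequence[float], w: int) -> List[float]:
    """Mean of each length-w window of x, per formulas (6) and (7)."""
    if w < 1 or w > len(x):
        raise ValueError("w must be between 1 and len(x)")
    count = len(x) - w  # the text takes t - w segments
    return [sum(x[t:t + w]) / w for t in range(count)]


if __name__ == "__main__":
    print(segment_average([1, 2, 3, 4, 5], w=2))
```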
B. Training a generative adversarial network (GAN) model to obtain a generator and a corresponding discriminator, specifically as follows:
B1. Randomly sampling noise data $Z = \{z_i,\ i = 1, 2, \ldots, n\}$, where n is the number of samples. The generator model G is built from multiple LSTM memory units, and the number of memory units is set; Z is input into the generator model G to generate reconstructed sample sequence data G(Z);
B2. Inputting the new differential rate sample entropy sequence $S_t'$ and the generated reconstructed sample sequence data G(Z) into the constructed discriminator model D;
B3. Updating the model parameters according to the value of the loss function using a stochastic gradient descent algorithm: the discriminator parameters are updated first, and the generator parameters are then updated from the noise data using the Adam optimization algorithm;
B4. Saving the model parameters and repeating steps B1-B3 iteratively, finally obtaining a trained generator model G that can generate normal time series and a corresponding discriminator model D;
C. Calculating the anomaly score of the feature sequence and constructing a threshold, specifically as follows:
C1. Using the time series in training data set 2
Figure BDA0002730598460000041
repeating steps A2-A5 and extracting features to obtain a new feature sequence
Figure BDA0002730598460000042
C2. Inputting randomly sampled noise data $Z_{val}$ into the trained generator G to generate a reconstructed sample $G(Z_{val})$, and calculating the generation anomaly score $R_{score}$ of the input sample from the generation error, specifically as follows:
C2.1. For the reconstructed sample $G(Z_{val})$ of length n and the new feature sequence of training data set 2
Figure BDA0002730598460000043
sorting the elements of their absolute error E from small to large to obtain the sorted absolute error $E_i' = \{e'_1, e'_2, \ldots, e'_n\}$, and calculating the average value M of $E_i'$;
C2.2. Comparing the elements of $E_i'$ with the average value M and taking out the elements $\{e'_k, e'_{k+1}, \ldots, e'_n\}$ of $E_i'$ that are larger than M, n-k+1 in number; initializing a weight sequence $W_i' = \{w'_1, w'_2, \ldots, w'_n\}^T$ with $w'_{1 \sim n-2} = 0$, setting the weight $w'_n$ corresponding to $x'_n$ to $\lambda$ and the weight $w'_{n-1}$ corresponding to $x'_{n-1}$ to $1-\lambda$; then updating the elements of the weight sequence $W_i'$ by formula (8);
Figure BDA0002730598460000044
C2.3. Using the updated weight sequence $W_i'$ and the sorted absolute error $E_i'$, calculating the generation anomaly score $R_{score}$ of training data set 2 by formula (9):
$$R_{score} = E_i' \cdot W_i' \quad (9)$$
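The Python sketch below illustrates only part of steps C2.1-C2.3: sorting the absolute errors, initializing the weights, and taking the dot product of formula (9). The weight update of formula (8) is available only as an image reference here, so it is deliberately left out; the sketch therefore stops at the initial weights (λ on the largest error, 1-λ on the second largest) and should not be read as the complete scoring rule. The function name and arguments are hypothetical.

```python
from typing import Sequence


def generation_anomaly_score(
    reconstructed: Sequence[float], features: Sequence[float], lam: float
) -> float:
    """Weighted sum of sorted absolute errors using the INITIAL weights only.

    The formula (8) weight update is not reproduced in the source text and is omitted.
    """
    if len(reconstructed) != len(features) or len(features) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    # C2.1: absolute errors sorted from small to large.
    errors = sorted(abs(g - s) for g, s in zip(reconstructed, features))
    # C2.2 (initialization only): all weights zero except the two largest errors.
    weights = [0.0] * len(errors)
    weights[-1] = lam
    weights[-2] = 1.0 - lam
    # C2.3 / formula (9): dot product of sorted errors and weights.
    return sum(e * w for e, w in zip(errors, weights))


if __name__ == "__main__":
    print(generation_anomaly_score([0.1, 0.5, 0.9], [0.0, 0.0, 0.0], lam=0.75))
```

Concentrating the weight on the largest errors means a few badly reconstructed points dominate the score instead of being averaged away.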
C3. Using the discriminator D trained in step B to output the similarity probability P between the generated sample and the new feature sequence
Figure BDA0002730598460000046
and calculating the discrimination anomaly score $D_{score}$ as 1-P;
C4. Using the discrimination anomaly score $D_{score}$ and the generation anomaly score $R_{score}$, calculating the anomaly score O by formula (10) and establishing a threshold from training data set 2, specifically as follows:
$$O = W_D \times D_{score} + W_G \times R_{score} \quad (10)$$
where $W_D$ and $W_G$ are the weights of the discrimination anomaly score and the generation anomaly score, respectively;
C4.1. Taking the maximum and minimum anomaly scores in the results for training data set 2
Figure BDA0002730598460000045
as the upper and lower boundaries, dividing the range between them evenly, and calculating the anomaly score of the q-th segment of training data set 2 by formula (11);
Figure BDA0002730598460000051
C4.2. Taking the anomaly score corresponding to the maximum F1 score as the threshold, where F1 is calculated as in formula (12);
Figure BDA0002730598460000052
Figure BDA0002730598460000053
where Pre is the proportion of true positive samples among all samples predicted positive, and Rec is the proportion of positive samples predicted positive among all positive samples; TP is the number of positive samples predicted positive by the model; FP is the number of negative samples predicted positive by the model; FN is the number of positive samples predicted negative by the model;
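Steps C4-C4.2 can be sketched in Python as below. Formula (10) is taken from the text; formulas (11) and (12) are only image references, so the sketch assumes that candidate thresholds divide the range between the minimum and maximum score evenly and that F1 has its usual form 2·Pre·Rec / (Pre + Rec), with Pre = TP / (TP + FP) and Rec = TP / (TP + FN) as the definitions above imply. All function and parameter names, including `steps`, are hypothetical.

```python
from typing import List, Sequence, Tuple


def combined_score(d_score: float, r_score: float, w_d: float, w_g: float) -> float:
    """Formula (10): O = W_D * D_score + W_G * R_score."""
    return w_d * d_score + w_g * r_score


def f1_score(labels: Sequence[bool], predictions: Sequence[bool]) -> float:
    """Standard F1 from TP, FP and FN counts (assumed form of formula (12))."""
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    fn = sum(1 for y, p in zip(labels, predictions) if y and not p)
    if tp == 0:
        return 0.0
    pre = tp / (tp + fp)
    rec = tp / (tp + fn)
    return 2 * pre * rec / (pre + rec)


def best_threshold(
    scores: Sequence[float], labels: Sequence[bool], steps: int
) -> Tuple[float, float]:
    """Scan evenly spaced candidates between min and max score; return (threshold, F1)."""
    lo, hi = min(scores), max(scores)
    best_t, best_f1 = lo, -1.0
    for q in range(steps + 1):
        t = lo + (hi - lo) * q / steps
        preds: List[bool] = [s > t for s in scores]  # abnormal if score exceeds threshold
        f1 = f1_score(labels, preds)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


if __name__ == "__main__":
    scores = [0.1, 0.2, 0.8, 0.9]
    labels = [False, False, True, True]
    print(best_threshold(scores, labels, steps=4))
```

Because the threshold is chosen on labeled data containing both normal and abnormal samples, training data set 2 is needed in addition to the all-normal set used to train the GAN.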
D. Judging whether the input data to be detected is abnormal according to the threshold, specifically as follows:
D1. Inputting the time series of the data set to be detected
Figure BDA0002730598460000054
repeating steps A1-A5 and extracting differential rate sample entropy features to obtain a new time series
Figure BDA0002730598460000055
D2. Repeating steps C1-C4, inputting
Figure BDA0002730598460000056
into the trained generative adversarial network, and calculating the anomaly score $O_{real}$ of the data to be detected using formula (10);
D3. Comparing the calculated anomaly score $O_{real}$ with the threshold calculated in step C; if the anomaly score is larger than the threshold, the data to be detected is judged to contain an abnormal mode; otherwise, it is judged not to contain an abnormal mode.
The advantages of the method are that feature extraction with the differential rate sample entropy makes abnormal modes in the time-series data more distinct, and that the new anomaly score calculation method improves the accuracy and generalization of time sequence abnormal mode detection, giving the method high practical and application value.
Drawings
Fig. 1: overall flow chart for abnormal pattern detection
Detailed Description
The invention is further described below through an embodiment, with reference to the accompanying drawing: CO time-series prediction is performed on experimental data, and the time sequence abnormal mode detection method based on differential rate entropy features and a generative adversarial network is described in terms of the time-series data volume, the input and output dimensions, and so on.
The overall flow of the method is shown in Fig. 1. The method comprises the following steps: 1) extracting a second-order differential rate sample entropy feature sequence from the time series in the training data set; 2) training a generative adversarial network model to obtain a generator and a corresponding discriminator; 3) calculating the anomaly score of the feature sequence and constructing a threshold; 4) judging whether the input data to be detected is abnormal according to the threshold. The steps are described below in connection with the example:
A. Extracting a second-order differential rate sample entropy feature sequence from the time series in the training data set, specifically as follows:
A1. Selecting experimental data, the research object being a one-dimensional time series of CO gas concentration; selecting a training data set and dividing it into two sets, denoted training data set 1 and training data set 2;
Training data set 1 contains only normal data, and training data set 2 contains both normal and abnormal data;
A2. For training data set 1, which contains only normal data, setting the sliding window size of a sequence segment to 10 and segmenting the data set by sliding with a step size of 1;
A3. Performing the differential rate operation on each sequence segment in the sequence segment set to obtain the second-order differential rate sequences of all sequence segments, specifically as follows:
A3.1. For the CO gas concentration sequence of 348 data points in total, using the formula
Figure BDA0002730598460000061
to obtain a second-order differential rate sequence of 345 data points, $G = \{g_1, g_2, \ldots, g_{w'}\}$, as shown in Table 2; its standard deviation std is found to be 0.11. Part of the data is as follows:
Figure BDA0002730598460000062
Figure BDA0002730598460000071
A3.2. Dividing the second-order differential rate sequence of 345 data points into sub-segments of 6 consecutive data points each, giving 340 sub-sequence segments in total, denoted $K2_i = \{q_1, q_2, \ldots, q_{w'-m+1}\}$. Part of the data is as follows:
Figure BDA0002730598460000072
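The count in step A3.2 can be checked with a few lines of Python: a sequence of w' = 345 points cut into overlapping sub-segments of m = 6 points yields w'-m+1 = 340 sub-segments. The function name `sub_segments` is hypothetical, and the zero-valued list is a placeholder that merely stands in for the real second-order differential rate values.

```python
from typing import List, Sequence


def sub_segments(g: Sequence[float], m: int) -> List[Sequence[float]]:
    """All overlapping sub-segments of m consecutive points (shift of one point)."""
    return [g[i:i + m] for i in range(len(g) - m + 1)]


if __name__ == "__main__":
    placeholder = [0.0] * 345  # placeholder values, not the patent's data
    print(len(sub_segments(placeholder, 6)))  # 340
```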
A4. Performing sample entropy feature extraction on the second-order differential rate sequences of all sequence segments to obtain their second-order differential rate sample entropy feature sequences, specifically as follows:
A4.1. Calculating the second-order differential rate sample entropy feature of each sequence segment to obtain the complete second-order differential rate sample entropy sequence. Part of the data is as follows:
Figure BDA0002730598460000073
A5. Performing segment-average preprocessing on the differential rate sample entropy sequence to obtain a new differential rate sample entropy sequence, specifically as follows:
A5.1. Starting from $X_t$ ($t = 1, 2, \ldots, t-w$), taking out a sequence segment of length w, $S_t = \{X_t, X_{t+1}, \ldots, X_{w+t-1}\}_{1 \times t}$, summing it, and then averaging it;
A5.2. Repeating step A5.1 until t-w sequence segments have been taken out, and composing the values $sum_t$ into a new sequence $S_t' = \{sum_1, sum_2, \ldots, sum_{t-w}\}_{1 \times t}$. Part of the data is as follows:
Figure BDA0002730598460000081
B. Training a generative adversarial network (GAN) model to obtain a generator and a corresponding discriminator, specifically as follows:
B1. Randomly sampling noise data $Z = \{z_i,\ i = 1, 2, \ldots, n\}$, where n is 330. The generator model is built from multiple LSTM memory units, and the number of memory units is set; Z is input into the constructed generator model to generate reconstructed sample sequence data G(Z);
B2. Inputting the new differential rate sample entropy sequence $S_t'$ and the generated reconstructed sample sequence data G(Z) into the constructed discriminator model D. Part of the parameter data is as follows:
Figure BDA0002730598460000082
B3. Updating the model parameters according to the value of the loss function using a stochastic gradient descent algorithm: the discriminator parameters are updated first, and the generator parameters are then updated from the noise data using the Adam optimization algorithm;
B4. Saving the model parameters and returning to B2 for 1000 iterations with the learning rate set to 0.1, finally obtaining a trained generator model G and a trained discriminator model D;
C. Calculating the anomaly score of the feature sequence and constructing a threshold, specifically as follows:
C1. First repeating steps A2-A5 for the time series of training data set 2, which contains both normal and abnormal data,
Figure BDA0002730598460000091
and extracting features to obtain a new feature sequence
Figure BDA0002730598460000092
Part of the data is as follows:
Figure BDA0002730598460000093
C2. Using the discrimination anomaly score $D_{score}$ and the generation anomaly score $R_{score}$ to calculate the anomaly score O;
C2.1. Taking the maximum and minimum anomaly scores in the results for training data set 2
Figure BDA0002730598460000094
as the upper and lower boundaries and dividing the range between them evenly to obtain the anomaly score of the q-th segment of training data set 2
Figure BDA0002730598460000095
C2.2. The maximum F1 score is 0.8916; the corresponding anomaly score O is taken as the threshold, giving a threshold of 0.375;
D. Judging whether the input data to be detected is abnormal according to the threshold, specifically as follows:
D1. Inputting the time-series samples of the data set to be detected
Figure BDA0002730598460000096
repeating steps A2-A5 and extracting differential rate sample entropy features to obtain a new time series
Figure BDA0002730598460000097
Part of the data is as follows:
Figure BDA0002730598460000098
D2. Repeating steps C1-C4, inputting
Figure BDA0002730598460000099
into the trained generative adversarial network, and calculating the anomaly score $O_{real}$ of the actual data sample, which is 0.572;
D3. Comparing the calculated anomaly score $O_{real}$ with the threshold calculated in step C; if the anomaly score is larger than the threshold, the sample is judged to be abnormal. The actual processing results for all samples are as follows:
Figure BDA0002730598460000101
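The decision rule of step D3, applied to the numbers reported in this embodiment (threshold 0.375, sample score 0.572), can be written as a one-line Python check. The function name is hypothetical.

```python
def contains_abnormal_pattern(anomaly_score: float, threshold: float) -> bool:
    """Step D3: the data is judged abnormal when its score exceeds the threshold."""
    return anomaly_score > threshold


if __name__ == "__main__":
    # Values from the embodiment: O_real = 0.572, threshold = 0.375.
    print(contains_abnormal_pattern(0.572, 0.375))  # True
```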
The invention implements a time sequence abnormal mode detection method based on differential rate entropy features and a generative adversarial network that can detect whether a sequence segment contains an abnormal mode, thereby providing a basis for judging the occurrence of coal mine thermodynamic disasters. The new anomaly score calculation method improves the accuracy and generalization of model identification and has high application value.
Finally, it should be noted that the embodiments are disclosed to aid further understanding of the invention, but those skilled in the art will appreciate that various substitutions and modifications are possible without departing from the spirit and scope of the invention and the appended claims. Therefore, the invention should not be limited to the disclosed embodiments; the scope of protection is defined by the appended claims.

Claims (6)

1. A time sequence abnormal mode detection method based on entropy characteristics and a neural network, comprising the following steps:
A. Extracting a second-order differential rate sample entropy feature sequence from the time series in the training data set, specifically as follows:
A1. Dividing the training data set into two sets, training data set 1 and training data set 2;
Training data set 1 contains only normal data, and training data set 2 contains both normal and abnormal data;
A2. The time series of training data set 1
Figure QLYQS_1
is segmented by a sliding window of size w and step size d, giving a set W of sequence segments of length L, in which the i-th time sequence segment is denoted $s_i$ and is calculated as:
$$s_i = [x_{1+(i-1)d},\ x_{2+(i-1)d},\ \ldots,\ x_{1+(i-1)d+w}]$$
where $T_{train}$ is the number of data points in the training data set time series and $1 \times T_{train}$ is the dimension of that time series;
A3. Performing the differential rate operation on each sequence segment in the set W to obtain the second-order differential rate sequences of all sequence segments;
A4. Performing sample entropy feature extraction on the second-order differential rate sequences of all sequence segments to obtain their second-order differential rate sample entropy feature sequences;
A5. Performing segment-average preprocessing on the differential rate sample entropy sequence to obtain a new differential rate sample entropy sequence;
B. Training a generative adversarial network (GAN) model to obtain a generator and a corresponding discriminator, specifically as follows:
B1. Randomly sampling noise data $Z = \{z_i,\ i = 1, 2, \ldots, n\}$, where n is the number of samples; the generator model G is built from multiple LSTM memory units, and the number of memory units is set; Z is input into the generator model G to generate reconstructed sample sequence data G(Z);
B2. Inputting the new differential rate sample entropy sequence $S_t'$ and the generated reconstructed sample sequence data G(Z) into the constructed discriminator model D;
B3. Updating the model parameters according to the value of the loss function using a stochastic gradient descent algorithm: the discriminator parameters are updated first, and the generator parameters are then updated from the noise data using the Adam optimization algorithm;
B4. Saving the model parameters and repeating steps B1-B3 iteratively, finally obtaining a trained generator model G that can generate normal time series and a corresponding discriminator model D;
C. Calculating the anomaly score of the feature sequence and constructing a threshold, specifically as follows:
C1. Using the time series in training data set 2
Figure QLYQS_2
repeating steps A2-A5 and extracting features to obtain a new feature sequence
Figure QLYQS_3
C2. Inputting randomly sampled noise data $Z_{val}$ into the trained generator G to generate a reconstructed sample $G(Z_{val})$, and calculating the generation anomaly score $R_{score}$ of the input sample from the generation error;
C3. Using the discriminator D trained in step B to output the similarity probability P between the generated sample and the new feature sequence
Figure QLYQS_4
and calculating the discrimination anomaly score $D_{score}$ as 1-P;
C4. Using the discrimination anomaly score $D_{score}$ and the generation anomaly score $R_{score}$ to calculate the anomaly score O, and establishing a threshold from training data set 2, with the calculation formula:
$$O = W_D \times D_{score} + W_G \times R_{score}$$
where $W_D$ and $W_G$ are the weights of the discrimination anomaly score and the generation anomaly score, respectively;
D. performing abnormality judgment on the input data to be detected according to the threshold, specifically implemented as follows:
D1. for the input time series (Figure QLYQS_5) of the dataset to be detected, repeating steps A1-A5 to extract the differential rate sample entropy features and obtain a new time series (Figure QLYQS_6);
D2. repeating steps C1-C4, inputting (Figure QLYQS_7) into the trained generative adversarial network, and calculating the anomaly score O_real of the data to be detected by using formula 10;
D3. comparing the calculated anomaly score O_real with the threshold calculated in step C; if the anomaly score is greater than the threshold, the data to be detected is judged to contain an abnormal pattern; otherwise, the data to be detected is judged not to contain an abnormal pattern.
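The scoring and decision rule of steps C3, C4 and D3 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names and the equal default weights are assumptions, since the claim does not state values for W_D and W_G.

```python
def discrimination_score(p):
    """Step C3: D_score = 1 - P, where P is the similarity probability
    output by the discriminator."""
    return 1.0 - p


def anomaly_score(d_score, r_score, w_d=0.5, w_g=0.5):
    """Step C4: O = W_D * D_score + W_G * R_score.
    The default weights of 0.5 are an assumption for illustration only."""
    return w_d * d_score + w_g * r_score


def contains_abnormal_pattern(o_real, threshold):
    """Step D3: abnormal only if the score is strictly greater than the threshold."""
    return o_real > threshold


# Hypothetical values: the discriminator finds the sample 90% similar,
# and the generation anomaly score is 0.4.
o_real = anomaly_score(discrimination_score(0.9), r_score=0.4)
print(contains_abnormal_pattern(o_real, threshold=0.3))
```

Note that a score exactly equal to the threshold is treated as normal, because step D3 only flags scores larger than the threshold.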
2. The time series abnormal pattern detection method based on entropy features and neural network according to claim 1, wherein the differential rate operation is performed on each sequence segment in the sequence segment set W to obtain the second-order differential rate sequences of all sequence segments, specifically implemented as follows:
A3.1. for a sequence segment s_i, calculating its second-order differential rate sequence G = {g_1, g_2, …, g_w′} and its standard deviation std, the calculation formula being:
Figure QLYQS_8
where (Figure QLYQS_9) is the e-th order difference value at time point u, and (Figure QLYQS_10) is the e-th order difference value at time point u−1;
A3.2. dividing the second-order differential rate sequence of w′ data points into sub-segments of m consecutive data points each, and denoting the resulting w′−m+1 subsequence segments as K2_i = {q_1, q_2, …, q_{w′−m+1}}.
3. The time series abnormal pattern detection method based on entropy features and neural network according to claim 1, wherein sample entropy feature extraction is performed on the second-order differential rate sequences of all the sequence segments to obtain the second-order differential rate sample entropy feature sequences of all the sequence segments, specifically implemented as follows:
A4.1. calculating the distance D[q_a, q_b] between any two subsequence segments q_a and q_b, the distance being determined by the maximum difference between the elements at corresponding positions in the two subsequence segments;
A4.2. for each subsequence segment q_a, taking the proportion of the remaining subsequence segments whose distance to q_a is smaller than the threshold as its similarity probability, and using the average similarity probability over the second-order differential rate sequence to compute the second-order differential rate sample entropy, the calculation formulas being:
Figure QLYQS_11
Figure QLYQS_12
where r is the similarity threshold;
A4.3. according to steps A4.1-A4.2, recalculating the average similarity probability B^(m+1)(r) with m+1 as the subsequence length, the second-order differential rate sample entropy feature SE being calculated as:
Figure QLYQS_13
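Steps A3.2 and A4.1-A4.3 can be sketched as follows. Because the formulas in Figures QLYQS_11 to QLYQS_13 are not reproduced in the text, this sketch assumes the standard sample entropy definition SE = −ln(B^(m+1)(r) / B^m(r)); the function names and the handling of the zero-probability case are likewise assumptions.

```python
import math


def sub_segments(seq, m):
    """Step A3.2: all len(seq) - m + 1 sub-segments of m consecutive points."""
    return [seq[i:i + m] for i in range(len(seq) - m + 1)]


def distance(qa, qb):
    """Step A4.1: maximum absolute difference between corresponding elements."""
    return max(abs(x - y) for x, y in zip(qa, qb))


def average_similarity(seq, m, r):
    """Step A4.2: for each sub-segment, the fraction of the remaining
    sub-segments closer than r, averaged over all sub-segments."""
    subs = sub_segments(seq, m)
    if len(subs) < 2:
        raise ValueError("sequence too short for sub-segment length m")
    total = 0.0
    for a, qa in enumerate(subs):
        close = sum(1 for b, qb in enumerate(subs)
                    if b != a and distance(qa, qb) < r)
        total += close / (len(subs) - 1)
    return total / len(subs)


def sample_entropy(seq, m, r):
    """Step A4.3, assuming SE = -ln(B^(m+1)(r) / B^m(r))."""
    b_m = average_similarity(seq, m, r)
    b_m1 = average_similarity(seq, m + 1, r)
    if b_m == 0.0 or b_m1 == 0.0:
        return math.inf  # assumption: no similar sub-segments means unbounded entropy
    return -math.log(b_m1 / b_m)


# A perfectly regular (constant) sequence has zero sample entropy.
print(sample_entropy([1.0] * 10, m=2, r=0.1) == 0.0)
```

The pairwise comparison is quadratic in the number of sub-segments, which is acceptable for the short sequence segments considered here.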
4. The time series abnormal pattern detection method based on entropy features and neural network according to claim 1, wherein segment-average preprocessing is performed on the differential rate sample entropy sequence to obtain the new differential rate sample entropy sequence, specifically implemented as follows:
A5.1. starting from X_t (t = 1, 2, …, t−w), taking out a sequence segment S_t = {X_t, X_{t+1}, …, X_{w+t−1}}_{1×t} of length w, summing it and then averaging, the calculation formulas being:
sum_t = X_t + X_{t+1} + … + X_{w+t−1}
sum_t = sum_t / w;
A5.2. repeating step A5.1 until t−w sequence segments have been taken out, and composing the values sum_t into the new differential rate sample entropy sequence S_t′ = {sum_1, sum_2, …, sum_{t−w}}_{1×t}.
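The segment averaging of steps A5.1-A5.2 amounts to a moving average with window length w. A minimal sketch follows; the function name is hypothetical, and the upper index bound t−w is assumed to mean the sequence length minus w, so the sketch yields len(x) − w averages as the claim's index range suggests.

```python
def segment_average(x, w):
    """Average each length-w segment of x; segment t covers x[t] .. x[t + w - 1].
    Following the claim's index range, len(x) - w segments are produced."""
    if not 0 < w < len(x):
        raise ValueError("w must satisfy 0 < w < len(x)")
    return [sum(x[t:t + w]) / w for t in range(len(x) - w)]


print(segment_average([1, 2, 3, 4, 5], 2))  # [1.5, 2.5, 3.5]
```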
5. The time series abnormal pattern detection method based on entropy features and neural network according to claim 1, wherein the randomly sampled noise data Z_val is input into the trained generator G to generate a reconstructed sample G(Z_val), and the generation anomaly score R_score of the input sample is calculated by using the generation error, specifically implemented as follows:
C2.1. sorting the elements of the absolute error E between the reconstructed sample G(Z_val) of length n and the new feature sequence (Figure QLYQS_14) of training dataset 2 in ascending order to obtain the sorted absolute error sequence E_i′ = {e′_1, e′_2, …, e′_n}, and calculating the average value M of E_i′;
C2.2. comparing the elements of E_i′ with the average value M, and taking out the data elements {e′_k, e′_{k+1}, …, e′_n} of E_i′ that are larger than the average value M, n−k+1 in number; initializing a weight sequence W_i′ = {w′_1, w′_2, …, w′_n}^T with w′_1 to w′_{n−2} equal to 0, setting the weight w′_n corresponding to x′_n to λ and the weight w′_{n−1} corresponding to x′_{n−1} to 1−λ; and updating the elements of the weight sequence W_i′, the calculation formula being:
Figure QLYQS_15
C2.3. calculating the generation anomaly score R_score of training sample set 2 by using the updated weight sequence W_i′ and the sorted sample E_i′, the calculation formula being:
R_score = E_i′ · W_i′.
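The sorting, averaging and weighted dot product of steps C2.1-C2.3 can be sketched as follows. The weight update formula in Figure QLYQS_15 is not reproduced in the text, so this sketch stops at the initial weights (λ on the largest error, 1−λ on the second largest, 0 elsewhere) and should not be read as the complete scoring rule. The function name and the default λ = 0.7 are assumptions.

```python
def generation_score_initial(reconstructed, features, lam=0.7):
    """Returns (score, mean, above_mean) using only the INITIAL weights of
    step C2.2; the update from Figure QLYQS_15 is not applied."""
    # C2.1: absolute errors sorted in ascending order, and their mean M.
    errors = sorted(abs(g - s) for g, s in zip(reconstructed, features))
    n = len(errors)
    if n < 2:
        raise ValueError("need at least two elements")
    mean = sum(errors) / n
    # C2.2: elements larger than the mean, and the initial weight sequence.
    above_mean = [e for e in errors if e > mean]
    weights = [0.0] * n
    weights[-1] = lam          # weight of the largest error
    weights[-2] = 1.0 - lam    # weight of the second-largest error
    # C2.3: R_score as the dot product of sorted errors and weights.
    score = sum(e * w for e, w in zip(errors, weights))
    return score, mean, above_mean


score, mean, above = generation_score_initial([1, 2, 3, 4], [1, 2, 3, 8])
print(score, mean, above)
```

Concentrating the weight on the largest errors makes the score sensitive to a few badly reconstructed points rather than to the average reconstruction quality.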
6. The time series abnormal pattern detection method based on entropy features and neural network according to claim 1, wherein the anomaly score O is calculated by formula 10 using the discrimination anomaly score D_score and the generation anomaly score R_score, and a threshold is established according to training dataset 2, specifically implemented as follows:
C4.1. taking the maximum anomaly score and the minimum anomaly score in the results of the training dataset (Figure QLYQS_16) as the upper and lower bounds, dividing the range between them evenly, and calculating the anomaly score of the q-th segment of training dataset 2, the calculation formula being:
Figure QLYQS_17
C4.2. taking the anomaly score corresponding to the maximum F1 score as the threshold, F1 being calculated as follows:
Figure QLYQS_18
Figure QLYQS_19
where Pre is the proportion of positive samples predicted to be positive among all samples predicted to be positive; Rec is the proportion of positive samples predicted to be positive among all positive samples; TP is the number of positive samples predicted to be positive by the model; FP is the number of negative samples predicted to be positive by the model; and FN is the number of positive samples predicted to be negative by the model.
CN202011116876.4A 2020-10-19 2020-10-19 Time sequence abnormal mode detection method based on entropy characteristics and neural network Active CN112257917B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011116876.4A CN112257917B (en) 2020-10-19 2020-10-19 Time sequence abnormal mode detection method based on entropy characteristics and neural network

Publications (2)

Publication Number Publication Date
CN112257917A (en) 2021-01-22
CN112257917B (en) 2023-05-12

Family

ID=74244702

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011116876.4A Active CN112257917B (en) 2020-10-19 2020-10-19 Time sequence abnormal mode detection method based on entropy characteristics and neural network

Country Status (1)

Country Link
CN (1) CN112257917B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113127705B (en) * 2021-04-02 2022-08-05 西华大学 Heterogeneous bidirectional generation countermeasure network model and time sequence anomaly detection method
CN114386454B (en) * 2021-12-09 2023-02-03 首都医科大学附属北京友谊医院 Medical time sequence signal data processing method based on signal mixing strategy
CN114844796A (en) * 2022-04-29 2022-08-02 济南浪潮数据技术有限公司 Method, device and medium for detecting abnormity of time-series KPI
CN115600116B (en) * 2022-12-15 2023-07-21 西南石油大学 Dynamic detection method, system, storage medium and terminal for time sequence abnormality

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001092990A2 (en) * 2000-06-01 2001-12-06 Variagenics, Inc. Structure-based methods for assessing amino acid variances
CN103886405A (en) * 2014-02-20 2014-06-25 东南大学 Boiler combustion condition identification method based on information entropy characteristics and probability nerve network
CN109035488A (en) * 2018-08-07 2018-12-18 哈尔滨工业大学(威海) Aero-engine time series method for detecting abnormality based on CNN feature extraction
CN110071913A (en) * 2019-03-26 2019-07-30 同济大学 A kind of time series method for detecting abnormality based on unsupervised learning
CN110211114A (en) * 2019-06-03 2019-09-06 浙江大学 A kind of scarce visible detection method of the vanning based on deep learning
CN110598851A (en) * 2019-08-29 2019-12-20 北京航空航天大学合肥创新研究院 Time series data abnormity detection method fusing LSTM and GAN

Also Published As

Publication number Publication date
CN112257917A (en) 2021-01-22

Similar Documents

Publication Publication Date Title
CN112257917B (en) Time sequence abnormal mode detection method based on entropy characteristics and neural network
CN110018670B (en) Industrial process abnormal working condition prediction method based on dynamic association rule mining
CN113434357B (en) Log anomaly detection method and device based on sequence prediction
CN107194524B (en) RBF neural network-based coal and gas outburst prediction method
CN112529341B (en) Drilling well leakage probability prediction method based on naive Bayesian algorithm
CN109918505B (en) Network security event visualization method based on text processing
CN111898639B (en) Dimension reduction-based hierarchical time memory industrial anomaly detection method and device
CN112199670B (en) Log monitoring method for anomaly detection using improved iForest (isolation forest) based on deep learning
CN112761628B (en) Shale gas yield determination method and device based on long-term and short-term memory neural network
CN114281864A (en) Correlation analysis method for power network alarm information
CN109976806B (en) Java statement block clone detection method based on byte code sequence matching
CN113431635B (en) Semi-supervised shield tunnel face geological type estimation method and system
CN108280289B (en) Rock burst danger level prediction method based on local weighted C4.5 algorithm
CN112380274A (en) Control process-oriented anomaly detection system
CN113806889A (en) Processing method, device and equipment of TBM cutter head torque real-time prediction model
CN116090819A (en) Power distribution network risk situation prediction method based on association rule
CN115017206A (en) Mine CO abnormal disturbance intelligent identification and coal spontaneous combustion early warning value determination method
Li et al. A rockburst prediction model based on extreme learning machine with improved Harris Hawks optimization and its application
CN114021620A (en) Electrical submersible pump fault diagnosis method based on BP neural network feature extraction
CN106919650A (en) A kind of textural anomaly detection method of increment parallel type Dynamic Graph
CN110991363B (en) Method for extracting CO emission characteristics of coal mine safety monitoring system in different coal mining processes
CN116050285B (en) Slurry balance shield machine shield tail sealing grease consumption prediction method and system
CN116707918A (en) Network security situation assessment method based on CBAM-EfficientNet anomaly detection
CN113722230B (en) Integrated evaluation method and device for vulnerability mining capability of fuzzy test tool
CN116361640A (en) Multi-variable time sequence anomaly detection method based on hierarchical attention network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant