CN112766429B - Method, device, computer equipment and medium for anomaly detection - Google Patents

Method, device, computer equipment and medium for anomaly detection

Info

Publication number
CN112766429B
Authority
CN
China
Prior art keywords
time sequence
sequence data
historical
monitoring time
trained
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110380361.3A
Other languages
Chinese (zh)
Other versions
CN112766429A (en
Inventor
Inventor not announced
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Real AI Technology Co Ltd
Original Assignee
Beijing Real AI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Real AI Technology Co Ltd
Priority to CN202110380361.3A
Publication of CN112766429A
Application granted
Publication of CN112766429B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/29 Graphical models, e.g. Bayesian networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

The application provides a method, a device, computer equipment and a medium for anomaly detection. The method comprises the following steps: acquiring a training sample set used for training; for each training sample, inputting the historical monitoring time sequence data set in the training sample into a stacking-sharing variational self-encoder model to be trained to obtain a historical reference time sequence data set, and training the stacking-sharing variational self-encoder model to be trained by using the difference between each historical monitoring time sequence data in the historical monitoring time sequence data set and the historical reference time sequence data of the corresponding channel in the historical reference time sequence data set; inputting monitoring time sequence data acquired from target equipment into the trained stacking-sharing variational self-encoder model to obtain reference time sequence data; determining the abnormal condition of the target equipment according to the difference between each monitoring time sequence data and the corresponding reference time sequence data; and sending abnormal alarm information according to the abnormal condition of the target equipment.

Description

Method, device, computer equipment and medium for anomaly detection
Technical Field
The present application relates to the field of data processing, and in particular, to a method, an apparatus, a computer device, and a medium for anomaly detection.
Background
In modern society, with the progress of science, machine equipment, computer systems and the like are widely used in production and daily life, greatly freeing up productive capacity and improving production efficiency. Common equipment in industrial production scenarios includes wind turbines, dams, computing clusters and the like. Such equipment does not last forever: under long-term operation, wear, damaged parts, program crashes and similar conditions can occur.
Disclosure of Invention
In view of the above, an object of the present application is to provide a method, an apparatus, a computer device, and a medium for anomaly detection, which are used to solve the problem of low operation efficiency of a device anomaly detection model in the prior art.
In a first aspect, an embodiment of the present application provides an anomaly detection method, including:
acquiring a training sample set used for training; the set of training samples comprises at least one training sample; the training sample comprises a historical monitoring time sequence data set obtained from each channel of an equipment cluster of a target equipment type, wherein the historical monitoring time sequence data set comprises historical monitoring time sequence data of equipment in the equipment cluster in a normal mode; the device cluster comprises at least one device;
for each training sample, inputting a historical monitoring time sequence data set in the training sample into a stacking-sharing variational self-encoder model to be trained to obtain a historical reference time sequence data set, and training the stacking-sharing variational self-encoder model to be trained by utilizing the difference between each historical monitoring time sequence data in the historical monitoring time sequence data set and the historical reference time sequence data of a corresponding channel in the historical reference time sequence data set;
inputting monitoring time sequence data acquired from each channel of target equipment into a trained stacking-sharing variational self-encoder model to obtain reference time sequence data of each monitoring time sequence data;
determining an abnormal condition of the target equipment according to the difference between each monitoring time sequence data and the corresponding reference time sequence data;
and sending abnormal alarm information according to the abnormal condition of the target equipment.
Optionally, for each training sample, inputting a historical monitoring time sequence data set in the training sample into a stacking-sharing variational self-encoder model to be trained to obtain a historical reference time sequence data set, and training the stacking-sharing variational self-encoder model to be trained by using the difference between each historical monitoring time sequence data in the historical monitoring time sequence data set and the historical reference time sequence data of a corresponding channel in the historical reference time sequence data set, includes:
for the historical monitoring time sequence data corresponding to each channel in each training sample, inputting the historical monitoring time sequence data into the stacking-sharing variational self-encoder model to be trained to obtain historical reference time sequence data, and training the stacking-sharing variational self-encoder model to be trained by utilizing the difference between the historical monitoring time sequence data and the historical reference time sequence data.
Optionally, the inputting the historical monitoring time sequence data into a stacking-sharing variational self-encoder model to be trained to obtain historical reference time sequence data includes:
encoding the historical monitoring time sequence data through an encoder of the stacking-sharing variational self-encoder model to be trained to obtain a historical hidden state of the historical monitoring time sequence data;
and decoding the historical hidden state through a decoder of the stacking-sharing variational self-encoder model to be trained to obtain the historical reference time sequence data.
Optionally, the encoding the historical monitoring time sequence data through an encoder of the stacking-sharing variational self-encoder model to be trained to obtain a historical hidden state of the historical monitoring time sequence data includes:
calculating the probability distribution of the historical monitoring time sequence data through an encoder of a stacking-sharing variational self-encoder model to be trained;
and obtaining the historical hidden state of the historical monitoring time sequence data from the probability distribution through random sampling.
Optionally, determining an abnormal condition of the target device according to a difference between each of the monitoring time series data and the corresponding reference time series data, including:
calculating, for each monitoring time series data, the squared error between the monitoring data at each moment of the monitoring time series data and the reference data at the corresponding moment in the reference time series data;
and determining the abnormal condition of the target equipment according to the squared error corresponding to each moment and a preset abnormal threshold.
In a second aspect, an embodiment of the present application provides an apparatus for anomaly detection, including:
the first acquisition module is used for acquiring a training sample set used for training; the set of training samples comprises at least one training sample; the training sample comprises a historical monitoring time sequence data set obtained from each channel of an equipment cluster of a target equipment type, wherein the historical monitoring time sequence data set comprises historical monitoring time sequence data of equipment in the equipment cluster in a normal mode; the device cluster comprises at least one device;
the training module is used for inputting the historical monitoring time sequence data set in the training samples to a to-be-trained stacking-sharing variational self-encoder model to obtain a historical reference time sequence data set aiming at each training sample, and training the to-be-trained stacking-sharing variational self-encoder model by utilizing the difference between each historical monitoring time sequence data in the historical monitoring time sequence data set and the historical reference time sequence data of a corresponding channel in the historical reference time sequence data set;
the second acquisition module is used for inputting the monitoring time sequence data acquired from each channel of the target equipment into a trained stacking-sharing variational self-encoder model to obtain reference time sequence data of each monitoring time sequence data;
the judging module is used for determining the abnormal condition of the target equipment according to the difference between each monitoring time sequence data and the corresponding reference time sequence data;
and the alarm module is used for sending abnormal alarm information according to the abnormal condition of the target equipment.
Optionally, the training module includes:
and the training unit is used for inputting the historical monitoring time sequence data to a to-be-trained stacking-sharing variational self-encoder model aiming at the historical monitoring time sequence data corresponding to each channel in each training sample to obtain historical reference time sequence data, and training the to-be-trained stacking-sharing variational self-encoder model by utilizing the difference between the historical monitoring time sequence data and the historical reference time sequence data.
Optionally, the training unit includes:
the encoding subunit is used for encoding the historical monitoring time sequence data through an encoder of a stacking-sharing variational self-encoder model to be trained to obtain a historical hidden state of the historical monitoring time sequence data;
and the decoding subunit is used for decoding the historical hidden state through a decoder of the stacking-sharing variational self-encoder model to be trained to obtain the historical reference time sequence data.
In a third aspect, an embodiment of the present application provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor implements the steps of the above method when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, performs the steps of the above method.
In the anomaly detection method provided by the embodiments of the present application, first, a training sample set used for training is acquired; the training sample set comprises at least one training sample; the training sample comprises a historical monitoring time sequence data set obtained from each channel of an equipment cluster of a target equipment type, wherein the historical monitoring time sequence data set comprises historical monitoring time sequence data of equipment in the equipment cluster in a normal mode; the equipment cluster comprises at least one device. Second, for each training sample, the historical monitoring time sequence data set in the training sample is input into a stacking-sharing variational self-encoder model to be trained to obtain a historical reference time sequence data set, and the stacking-sharing variational self-encoder model to be trained is trained by utilizing the difference between each historical monitoring time sequence data in the historical monitoring time sequence data set and the historical reference time sequence data of the corresponding channel in the historical reference time sequence data set. Third, monitoring time sequence data acquired from each channel of the target equipment is input into the trained stacking-sharing variational self-encoder model to obtain reference time sequence data of each monitoring time sequence data. Fourth, the abnormal condition of the target equipment is determined according to the difference between each monitoring time sequence data and the corresponding reference time sequence data. Finally, abnormal alarm information is sent according to the abnormal condition of the target equipment.
In some embodiments, the historical monitoring time sequence data corresponding to each channel in the historical monitoring time sequence data set is used to train the stacking-sharing variational self-encoder model to be trained, so that the encoder and decoder parameters of the model are shared across all channels and the similarity that exists among channels carrying different monitoring data is exploited. As a result, from the historical monitoring time sequence data of one channel the model can learn the data behavior of both that channel and similar channels, which reduces the total number of model parameters and improves the operation efficiency of the model.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a schematic flowchart of a method for anomaly detection according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of a method for determining an abnormal condition according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an anomaly detection apparatus according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
In production and daily life, equipment in operation does not keep running normally forever: under continuous operation, wear or damage can occur and the equipment can start to work abnormally. If such abnormal conditions are discovered manually, detection is delayed and accuracy is low, so automated, intelligent anomaly detection methods have gradually emerged. A typical anomaly detection method is based on a statistical model with a very large number of parameters; the model is complex, training efficiency is low, and the accuracy of the detection results is low.
An embodiment of the present application provides an anomaly detection method, as shown in fig. 1, including:
S101, acquiring a training sample set used for training; the set of training samples comprises at least one training sample; the training sample comprises a historical monitoring time sequence data set obtained from each channel of an equipment cluster of a target equipment type, wherein the historical monitoring time sequence data set comprises historical monitoring time sequence data of equipment in the equipment cluster in a normal mode; the device cluster comprises at least one device;
S102, for each training sample, inputting the historical monitoring time sequence data set in the training sample into a stacking-sharing variational self-encoder model to be trained to obtain a historical reference time sequence data set, and training the stacking-sharing variational self-encoder model to be trained by utilizing the difference between each historical monitoring time sequence data in the historical monitoring time sequence data set and the historical reference time sequence data of a corresponding channel in the historical reference time sequence data set;
S103, inputting monitoring time sequence data acquired from each channel of target equipment into the trained stacking-sharing variational self-encoder model to obtain reference time sequence data corresponding to each monitoring time sequence data;
S104, determining the abnormal condition of the target equipment according to the difference between each monitoring time sequence data and the corresponding reference time sequence data;
and S105, sending abnormal alarm information according to the abnormal condition of the target equipment.
In the above step S101, the training sample set includes at least one training sample. Each training sample comprises a historical monitoring time sequence data set, which comprises a plurality of historical monitoring time sequence data, each obtained from a corresponding channel. Historical monitoring time sequence data comprises not only historical monitoring data but also historical monitoring times, which can take the form of timestamps or time indexes, and the historical monitoring data is sorted by historical monitoring time. The data length of the historical monitoring time sequence data is determined by a preset time length, which is set manually.
The target device type is the device type of the devices on which anomaly detection needs to be performed. Device types may be divided according to any one or more of the following: function type, operation environment type, operation time type, operation form type, and the like. Function types are divided according to the function performed: devices with the same function belong to the same device type, for example an excavator and a bulldozer used in the process of transporting building materials. Operation environment types are divided according to the operating environment: devices in the same operating environment belong to the same device type, for example an intelligent robot and a self-service cash dispenser working in a bank. Operation time types are divided according to operating time: devices that operate at the same time belong to the same device type, for example a machine tool and the computer device controlling it, which operate simultaneously. Operation form types are divided according to the form of the device: hardware devices belong to one device type and software devices belong to another.
The device cluster includes at least one device. The historical monitoring time sequence data set can be obtained from a plurality of channels of the same device, or from the channels corresponding to each of a plurality of devices. The monitoring data in the historical monitoring time sequence data differs depending on the device to which the scheme is applied. Taking equipment in a dam as an example, the historical monitoring time sequence data can be, but is not limited to, environmental quantities such as temperature and humidity, or physical quantities such as horizontal displacement, vertical displacement, osmotic pressure, cracks and stress. For hardware devices, the monitoring data can be acquired through sensors, with each channel corresponding to one sensor. For software devices, the historical monitoring time sequence data is acquired through network nodes, with each channel corresponding to one network node. The historical monitoring time sequence data set comprises historical monitoring time sequence data of the devices in the device cluster in a normal mode.
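The data layout described for step S101 can be illustrated with a minimal sketch. It assumes every channel delivers the same number of time-stamped readings and that the preset time length corresponds to a fixed window of W points; the function name `build_training_samples` and the non-overlapping windowing are illustrative assumptions, not something the application specifies.

```python
import numpy as np

def build_training_samples(timestamps, values, window):
    """Sort each channel by monitoring time and cut it into windows.

    timestamps, values: array-likes of shape (n_channels, n_points)
    window: number of points W per window (assumed preset time length)
    returns: array of shape (n_windows, n_channels, window)
    """
    timestamps = np.asarray(timestamps)
    values = np.asarray(values, dtype=float)
    # Order the monitoring data of every channel by its monitoring time.
    order = np.argsort(timestamps, axis=1, kind="stable")
    ordered = np.take_along_axis(values, order, axis=1)
    # Drop the incomplete tail and split into non-overlapping windows.
    n_windows = ordered.shape[1] // window
    trimmed = ordered[:, : n_windows * window]
    per_channel = trimmed.reshape(values.shape[0], n_windows, window)
    # One training sample = one window across all channels.
    return per_channel.transpose(1, 0, 2)
```

Each entry along the first axis then plays the role of one training sample: a set of per-channel sequences of length W, sorted by monitoring time.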
In step S102, the historical reference time series data set includes historical reference time series data corresponding to each historical monitoring time series data in the historical monitoring time series data set, and the historical reference time series data is used to represent normal fluctuation data of the historical monitoring time series data within a preset time length.
The stacking-sharing variational self-encoder model (hereinafter also referred to simply as the variational self-encoder model) is a variational self-encoder model whose parameters are shared across multiple channels. It is a Bayesian probabilistic model. Existing methods based on variational self-encoders fall into two types. The first is based on a Recurrent Neural Network (RNN) architecture: the multi-channel data acquired at the same moment is treated as a system state, and a plurality of different variational self-encoders reconstruct the multi-channel time sequence data in a preset time period moment by moment. Its drawbacks are that model training takes a long time, is easily disturbed by abnormal points in the training data, and is prone to overfitting. The second does not distinguish the channel dimension from the time-step dimension of the multi-channel time sequence data in a preset time period, and directly uses one variational self-encoder with a large number of parameters to reconstruct the multi-channel time sequence data of the whole time period at once. Its drawbacks are that the processing is coarse and the model has many parameters. The stacking-sharing variational self-encoder model is therefore proposed to address the drawbacks of both schemes.
In a specific implementation, the historical monitoring time sequence data corresponding to each channel in the historical monitoring time sequence data set is used to train the stacking-sharing variational self-encoder model to be trained. The encoder and decoder parameters of the model are thereby shared across all channels, and the similarity that exists among channels carrying different monitoring data is utilized, so that from the historical monitoring time sequence data of one channel the model can learn the data behavior of both that channel and similar channels. During training, the parameters of the encoder and the decoder are adjusted simultaneously by minimizing the difference between each historical monitoring time sequence data in the historical monitoring time sequence data set and the historical reference time sequence data of the corresponding channel in the historical reference time sequence data set, for example by using a gradient descent algorithm or the like.
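The idea of sharing one encoder and one decoder across channels and adjusting both by gradient descent can be sketched as follows. This is a deliberately simplified illustration: it uses a deterministic linear encoder and decoder with no random sampling, a mean squared reconstruction error as the "difference", and hypothetical names and hyperparameters (`train_shared_autoencoder`, `hidden`, `lr`, `epochs`). None of these details are taken from the application.

```python
import numpy as np

def train_shared_autoencoder(samples, hidden=2, lr=0.05, epochs=200, seed=0):
    """Train ONE linear encoder/decoder pair on every channel in turn.

    samples: array of shape (n_samples, n_channels, window)
    returns: (encoder weights, decoder weights, mean loss per epoch)
    """
    rng = np.random.default_rng(seed)
    n_samples, n_channels, window = samples.shape
    enc = 0.1 * rng.standard_normal((window, hidden))   # shared encoder
    dec = 0.1 * rng.standard_normal((hidden, window))   # shared decoder
    losses = []
    for _ in range(epochs):
        total = 0.0
        for sample in samples:
            for channel in sample:          # each channel uses the same weights
                x = channel[None, :]        # shape (1, window)
                z = x @ enc                 # hidden state (compressed data)
                ref = z @ dec               # reference (reconstructed) data
                err = ref - x
                total += float(np.mean(err ** 2))
                # Gradients of the mean squared error w.r.t. both weight sets.
                grad_ref = 2.0 * err / window
                grad_dec = z.T @ grad_ref
                grad_enc = x.T @ (grad_ref @ dec.T)
                # Encoder and decoder are adjusted at the same time.
                enc -= lr * grad_enc
                dec -= lr * grad_dec
        losses.append(total / (n_samples * n_channels))
    return enc, dec, losses
```

Because every channel updates the same `enc` and `dec`, waveforms that are similar across channels reinforce the same parameters, which is the effect the paragraph above attributes to parameter sharing.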
In step S103, the target device is a device or a device cluster that needs to perform anomaly detection, and the device type of the target device is consistent with the target device type. The number of channels of the target device is multiple, the number of channels of the target device is consistent with the number of channels of the device cluster of the target device type mentioned in step S101, and the data acquired by the corresponding channels belong to the same type. The reference time sequence data is normal fluctuation data of the monitoring time sequence data within a preset time length. The preset time period is set manually.
Specifically, the monitoring time series data acquired from the multiple channels of the target device is input into the trained stacking-sharing variational self-encoder model to obtain the reference time series data corresponding to the monitoring time series data of each channel. The calculation is the same as the calculation of historical reference time sequence data during training of the stacking-sharing variational self-encoder model: the reference time series data corresponding to the monitoring time series data is calculated through the encoder and the decoder of the trained stacking-sharing variational self-encoder model.
In step S104, based on the difference between the monitoring time-series data and the reference time-series data, the abnormal condition of the target device within the preset time period may be determined, where the larger the difference between the monitoring time-series data and the reference time-series data is, the higher the abnormal degree of the target device is, and the smaller the difference between the monitoring time-series data and the reference time-series data is, the lower the abnormal degree of the target device is.
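A minimal sketch of this comparison, using the per-moment squared error and preset threshold mentioned in the optional steps of the disclosure. The function name `detect_anomalies` and the choice to flag a moment when its squared error exceeds the threshold are assumptions made for illustration.

```python
import numpy as np

def detect_anomalies(monitored, reference, threshold):
    """Compare monitored data with reference data moment by moment.

    monitored, reference: array-likes of shape (n_channels, n_moments)
    threshold: preset abnormal threshold on the squared error (assumed scalar)
    returns: (squared_error, flags), both of shape (n_channels, n_moments)
    """
    monitored = np.asarray(monitored, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Larger squared error means a higher degree of abnormality.
    squared_error = (monitored - reference) ** 2
    flags = squared_error > threshold
    return squared_error, flags
```

The squared error array can also serve directly as a per-moment abnormality score when a graded result is preferred over a yes/no flag.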
In the step S105, abnormality warning information is sent according to the abnormality of the target device. Different abnormal alarm information can be generated according to the abnormal degree of the abnormal condition, the higher the abnormal degree is, the stronger the reminding strength of the abnormal alarm information is, and the lower the abnormal degree is, the weaker the reminding strength of the abnormal alarm information is.
For example, when the degree of abnormality is relatively low, the abnormality alarm information only includes an alarm bell prompt, and when the degree of abnormality is relatively high, the abnormality alarm information includes an alarm bell prompt and a short message prompt.
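The graded alarm in this example can be sketched as a simple mapping from abnormality degree to notification channels. The cut-off values `low` and `high`, the function name `build_alarm`, and the channel labels are hypothetical; the application only states that a higher degree leads to a stronger reminder.

```python
def build_alarm(degree, low=1.0, high=10.0):
    """Return the notification channels for a given abnormality degree.

    degree: abnormality degree, e.g. the largest squared error observed
    low, high: hypothetical cut-offs separating the alarm levels
    """
    if degree < low:
        return []                      # normal: no alarm
    if degree < high:
        return ["bell"]                # relatively low degree: alarm bell only
    return ["bell", "sms"]             # relatively high degree: bell and short message
```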
In the above five steps, the historical monitoring time sequence data corresponding to each channel in the historical monitoring time sequence data set is used to train the stacking-sharing variational self-encoder model to be trained, so that the encoder and decoder parameters of the model are shared across all channels and the similarity that exists among channels carrying different monitoring data is exploited. As a result, from the historical monitoring time sequence data of one channel the model can learn the data behavior of both that channel and similar channels, which reduces the total number of model parameters and improves the operation efficiency of the model.
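The claimed reduction in parameter count can be made concrete with a rough calculation. The sketch below assumes a single dense layer for the encoder and one for the decoder, and assumes that a model taking all channels at once scales its hidden size with the number of channels; both are simplifications chosen for illustration and are not architectures given in the application.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

def autoencoder_params(width, hidden):
    """One dense encoder layer plus one dense decoder layer."""
    return dense_params(width, hidden) + dense_params(hidden, width)

def compare_param_counts(n_channels, window, hidden):
    # One encoder/decoder pair reused by every channel.
    shared = autoencoder_params(window, hidden)
    # A separate encoder/decoder pair for each channel.
    per_channel = n_channels * shared
    # One model over all channels and time steps flattened together
    # (hidden size scaled by the channel count, an assumption).
    flattened = autoencoder_params(n_channels * window, n_channels * hidden)
    return {"shared": shared, "per_channel": per_channel, "flattened": flattened}
```

With 10 channels, a window of 100 points and a hidden size of 16, this gives 3,316 parameters for the shared model, 33,160 for one model per channel, and 321,160 for the flattened model under the stated assumptions.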
When training the stacking-sharing variational self-encoder model, to reduce the risk of overfitting, the model is trained on each historical monitoring time sequence data in the historical monitoring time sequence data set separately. Thus, step S102 includes:
step 1021, aiming at the historical monitoring time sequence data corresponding to each channel in each training sample, inputting the historical monitoring time sequence data into the stacking-sharing variational self-encoder model to be trained to obtain historical reference time sequence data, and training the stacking-sharing variational self-encoder model to be trained by utilizing the difference between the historical monitoring time sequence data and the historical reference time sequence data.
Specifically, the detailed training process comprises the following steps:
in step 1021, inputting the historical monitoring time sequence data into a stacking-sharing variational self-encoder model to be trained to obtain historical reference time sequence data, including:
step 10211, encoding the historical monitoring time sequence data through an encoder of a stacking-sharing variational self-encoder model to be trained to obtain a historical hidden state of the historical monitoring time sequence data;
and step 10212, decoding the historical hidden state by a decoder of the stacking-sharing variational self-encoder model to be trained to obtain the historical reference time sequence data.
In the above step 10211, the historical hidden state is a compressed representation of the historical monitoring time sequence data that retains the main waveform features of the historical monitoring time sequence data.
In specific implementation, the historical monitoring time sequence data is encoded through an encoder of a stacking-sharing variational self-encoder model to be trained, namely, the historical monitoring time sequence data is compressed, main waveform characteristics in the historical monitoring time sequence data are reserved, and noise interference in the historical monitoring time sequence data is eliminated to a certain extent.
Specifically, the calculation of the history hidden state further includes the following steps, that is, step 10211 includes:
step 102111, calculating the probability distribution of the historical monitoring time sequence data through an encoder of a stacking-sharing variational self-encoder model to be trained;
and step 102112, obtaining the historical hidden state of the historical monitoring time sequence data from the probability distribution through random sampling.
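Steps 102111 and 102112 can be sketched as follows, assuming that the probability distribution output by the encoder is a diagonal Gaussian described by a mean and a log-variance per hidden dimension. This Gaussian form is a common choice for variational autoencoders and is an assumption here; the function name `sample_hidden_state` is likewise hypothetical.

```python
import math
import random
from typing import List, Sequence


def sample_hidden_state(mean: Sequence[float], log_var: Sequence[float],
                        rng: random.Random) -> List[float]:
    """Draw one hidden state z from N(mean, diag(exp(log_var))).

    Each component is mean + standard_deviation * standard_normal_noise.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed so the sketch is repeatable
    z = sample_hidden_state([0.5, -1.0], [-2.0, -2.0], rng)
    print(len(z))
```

A smaller log-variance concentrates the samples around the mean, which is one way to picture the hidden state retaining the main waveform features while small random fluctuations are tolerated.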
In the above step 102111, in a specific implementation, each of a plurality of channels collects a segment of time sequence data of length $W$ within a preset time period, and the historical monitoring time sequence data set composed of the time sequence data of the plurality of channels is recorded as $X = \{x_1, x_2, \ldots, x_N\}$, where $x_n$ is a vector containing the values of the $n$-th channel at the $W$ consecutive moments of the preset time period.
In the process of calculating the probability distribution of the historical monitoring time sequence data, the historical monitoring time sequence data of channels 1 to $N$ are encoded in sequence. For the historical monitoring time sequence data of the $n$-th channel, the encoding process of the stack-sharing variational autoencoder model learns the mapping from $x_n$ to the corresponding historical hidden state, denoted as $f$. The encoding process of the encoder can be calculated by the following formula:
$$z_n \sim q_\phi(z_n \mid x_n)$$
where $q_\phi$ represents the function or family of functions defined by the encoder, $q_\phi(z_n \mid x_n)$ represents the probability distribution of the historical hidden state, namely the probability that the historical hidden state occurs given the historical monitoring time sequence data $x_n$, and $z_n$ represents the historical hidden state of the historical monitoring time sequence data.
In the step 102112, a random variable z (i.e., the historical hidden variable) is introduced in advance to represent the compressed state of the historical time sequence data x, and a mapping between the two (i.e., f) is established, so that the stack-shared variational self-encoder can correctly map x to its hidden state representation, i.e., the hidden variable z, while tolerating a certain amount of noise. The compression is mainly realized by random sampling; it retains the important shape characteristics of the historical monitoring time sequence data and, to a certain extent, ignores the abnormal noise in the historical monitoring time sequence data.
In the step 10212, the decoding process of the stacking-sharing variational self-encoder corresponds to the encoding process: the hidden states of the historical monitoring time sequence data of channels 1 to $N$ are decoded in sequence. For the hidden state $z_n$ corresponding to the $n$-th channel, the decoding process uses $z_n$ to generate the normal mode $\hat{x}_n$ corresponding to $x_n$, which is recorded as:
$$\hat{x}_n \sim p_\theta(\hat{x}_n \mid z_n)$$
where $\hat{x}_n$ represents the reference time sequence data, i.e., the reference time sequence data corresponding to the $n$-th historical monitoring time sequence data $x_n$, and $p_\theta$ represents the function or family of functions defined by the decoder.
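The per-channel use of one shared encoder and one shared decoder in steps 10211 and 10212 can be sketched as below. The sketch only computes the squared difference between the historical monitoring time sequence data and the historical reference time sequence data, which is the difference the application trains on; any additional regularization term commonly used with variational autoencoders is omitted. The names `reconstruction_loss`, `encode` and `decode`, and the toy mean-based encoder in the usage part, are hypothetical.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def reconstruction_loss(channels: Sequence[Vector],
                        encode: Callable[[Vector], Vector],
                        decode: Callable[[Vector], Vector]) -> float:
    """Mean over channels of the summed squared reconstruction difference.

    The same encode/decode pair is applied to every channel, which is what
    sharing the encoder and decoder parameters across channels amounts to.
    """
    total = 0.0
    for x in channels:                      # channels 1..N, processed in order
        z = encode(x)                       # historical hidden state
        x_hat = decode(z)                   # historical reference time sequence data
        total += sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return total / len(channels)


if __name__ == "__main__":
    window = 2
    # Toy stand-ins: compress a window to its mean, then repeat the mean.
    toy_encode = lambda x: [sum(x) / len(x)]
    toy_decode = lambda z: [z[0]] * window
    print(reconstruction_loss([[1.0, 1.0], [0.0, 2.0]], toy_encode, toy_decode))
```

In an actual embodiment the two callables would be the trainable encoder and decoder, and this quantity would be reduced by adjusting their shared parameters.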
When the trained stacking-sharing variational self-encoder model is applied, the abnormal condition is judged in substantially the same way as in model training. The present application provides a method for judging the abnormal condition; as shown in fig. 2, step 104 includes:
S1041, for each monitoring time series data, calculating a square error between the monitoring data at each moment of the monitoring time series data and the reference data at the corresponding moment in the reference time series data;
and S1042, determining the abnormal condition of the target equipment according to the square error corresponding to each moment and a preset abnormal threshold.
In the above steps S1041 and S1042, for the monitoring time series data corresponding to each channel, the square error between the monitoring data at each moment of the monitoring time series data and the reference data at the corresponding moment in the reference time series data is calculated. The square errors of all channels at a given moment are then used to calculate the abnormal score (that is, the square error) of the device at that moment, and the abnormal score at each moment is compared with the preset difference threshold: if the abnormal score is within the preset difference threshold, the target device has no abnormal condition; if the abnormal score exceeds the preset difference threshold, the target device has an abnormal condition.
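Steps S1041 and S1042 can be sketched as follows. The sketch assumes that the per-moment abnormal score is the sum of the square errors of all channels at that moment and that a moment is abnormal when its score is strictly greater than the threshold; the function names `squared_errors` and `abnormal_moments` are hypothetical.

```python
from typing import List, Sequence


def squared_errors(monitored: Sequence[float],
                   reference: Sequence[float]) -> List[float]:
    """Square error between monitoring data and reference data at each moment."""
    return [(x - r) ** 2 for x, r in zip(monitored, reference)]


def abnormal_moments(monitored_channels: Sequence[Sequence[float]],
                     reference_channels: Sequence[Sequence[float]],
                     threshold: float) -> List[int]:
    """Indices of the moments whose summed square error exceeds the threshold."""
    per_channel = [squared_errors(m, r)
                   for m, r in zip(monitored_channels, reference_channels)]
    # Sum across channels for each moment to obtain the device-level score.
    scores = [sum(column) for column in zip(*per_channel)]
    return [t for t, score in enumerate(scores) if score > threshold]


if __name__ == "__main__":
    monitored = [[1.0, 2.0, 3.0], [1.0, 1.0, 1.0]]
    reference = [[1.0, 2.0, 0.0], [1.0, 0.0, 1.0]]
    print(abnormal_moments(monitored, reference, threshold=2.0))
```

The returned indices are zero-based positions within the monitored window.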
The preset difference threshold is predetermined, and the target device is detected only after the preset difference threshold is determined, and in order to more accurately determine the difference threshold capable of determining the device abnormality, the preset difference threshold may be determined by using the verification data acquired from the target device. The process of determining the preset difference threshold comprises the following steps:
step 201, obtaining a verification data set and an alternative difference threshold; the verification data set comprises a plurality of verification samples, and the verification samples comprise verification data and verification identifications; the verification identification comprises an abnormal identification or a non-abnormal identification; the verification data is historical verification monitoring time sequence data acquired from target equipment;
step 202, inputting historical verification monitoring time sequence data in verification samples into a trained stacking-sharing variational self-encoder model aiming at each verification sample to obtain a verification square error corresponding to the historical verification monitoring time sequence data;
step 203, determining a first number of verification samples corresponding to each alternative difference threshold according to a verification square error corresponding to the historical verification monitoring time sequence data of each verification sample and each alternative difference threshold;
step 204, counting a second number of abnormal identifications according to the verification identification of each verification sample aiming at each alternative difference threshold, and determining the accuracy of the alternative difference threshold according to the ratio of the second number to the first number;
step 205, according to the accuracy of each candidate difference threshold, determining the candidate difference threshold with the highest accuracy as a preset difference threshold.
In step 201, the alternative difference threshold includes a plurality of difference thresholds, which are prepared according to working experience. The verification data is historical verification monitoring time sequence data acquired from a channel of the target device, and may be acquired from the channel of the target device in different time periods. The verification identification is used for representing whether the target equipment is abnormal: the abnormal identification represents that the target equipment is abnormal, and the non-abnormal identification represents that the target equipment is not abnormal.
In step 202, for each verification sample, the historical verification monitoring time sequence data is input to the trained stack-shared variational self-encoder model, historical verification reference time sequence data corresponding to the historical verification monitoring time sequence data can be obtained through calculation, and a verification square error is calculated by using the historical verification monitoring time sequence data and the historical verification reference time sequence data.
Specifically, the historical verification monitoring time sequence data includes the historical verification monitoring data acquired by each channel at a plurality of consecutive moments within a preset time period. A verification square error can therefore be calculated from the difference between the historical verification monitoring data at each moment and the corresponding historical verification reference monitoring data. Whether the target equipment may have an abnormal condition at a specific moment is determined from the sum of the square errors between the historical verification monitoring data and the historical verification reference monitoring data of the plurality of channels at that moment. That is, the sum of the square errors is calculated for each moment of the preset time period, and whether the target equipment has an abnormal condition in the preset time period is determined by comparing the sum of the square errors with a preset threshold. The process of calculating the sum of the verification square errors includes the following steps:
Step 2021, for the historical verification monitoring time sequence data $x_n$ of the $n$-th channel, the verification square error corresponding to each moment is recorded as:
$$e_{n,t} = (x_{n,t} - \hat{x}_{n,t})^2$$
where $\hat{x}_{n,t}$ is the historical verification reference monitoring data corresponding to the $t$-th time in the historical verification reference time sequence data, $x_{n,t}$ is the historical verification monitoring data corresponding to the $t$-th time in the historical verification monitoring time sequence data, and $e_{n,t}$ is the verification square error of the $n$-th channel corresponding to the $t$-th time.
Step 2022, the abnormal situation of the target device at time $t$ is the sum of the verification square errors of all channels:
$$E_t = \sum_{n=1}^{N} e_{n,t}$$
where $E_t$ is the verification square error of the target device at time $t$, $N$ is the number of channels, and $e_{n,t}$ is the verification square error of the $n$-th channel corresponding to the $t$-th time.
For the calculation, in step S1041, of the square error between the monitoring data at each moment of the monitoring time series data and the reference data at the corresponding moment of the reference time series data, reference may be made to the calculation of the verification square error in step 2021.
In step 203, the verification square errors of all channels at each moment of the historical verification monitoring time series data are summed to obtain the sum of verification square errors of the target device at that moment; for this calculation, reference may be made to step 2022. Then, the first number of verification samples corresponding to each alternative difference threshold is determined according to the sum of verification square errors of each historical verification monitoring time series data and each alternative difference threshold. That is, if the sum of verification square errors of the historical verification monitoring time series data of a verification sample matches an alternative difference threshold, the verification sample is associated with that alternative difference threshold, and the number of verification samples associated with each alternative difference threshold is then counted.
In step 204, for each alternative difference threshold, the second number of abnormal identifications is determined according to the verification identifications of the verification samples associated with the alternative difference threshold, and the accuracy of the alternative difference threshold is calculated as the ratio of the second number to the first number. The accuracy represents how accurately the alternative difference threshold determines whether the target device is abnormal.
In step 205, according to the accuracy of each alternative difference threshold, the alternative difference threshold with the highest accuracy is determined as the preset difference threshold. Predicting whether the target equipment is abnormal by using the preset difference threshold determined in this way improves the accuracy of determining the abnormality.
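Steps 203 to 205 can be sketched as follows. The application does not spell out what it means for a verification sample to "match" an alternative difference threshold; the sketch assumes that a sample matches when its sum of verification square errors is strictly greater than the threshold. The function name `select_threshold` and the rule of skipping thresholds that match no sample are likewise assumptions.

```python
from typing import Optional, Sequence, Tuple


def select_threshold(scores: Sequence[float],
                     is_abnormal: Sequence[bool],
                     candidates: Sequence[float]) -> Tuple[Optional[float], float]:
    """Return the candidate threshold with the highest accuracy and that accuracy.

    scores      -- sum of verification square errors of each verification sample
    is_abnormal -- True where the verification identification is "abnormal"
    candidates  -- the alternative difference thresholds
    """
    best_threshold, best_accuracy = None, -1.0
    for threshold in candidates:
        # Verification identifications of the samples associated with this threshold.
        matched = [label for score, label in zip(scores, is_abnormal)
                   if score > threshold]
        first_number = len(matched)          # samples associated with the threshold
        if first_number == 0:
            continue                         # no sample matched; ratio undefined
        second_number = sum(matched)         # of those, samples marked abnormal
        accuracy = second_number / first_number
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold, best_accuracy


if __name__ == "__main__":
    print(select_threshold([0.5, 2.0, 5.0, 9.0],
                           [False, False, True, True],
                           [1.0, 3.0, 10.0]))
```

When several candidates reach the same accuracy, this sketch keeps the first one encountered; a tie-breaking rule is not given in the application.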
In a second aspect, an embodiment of the present application provides an apparatus for anomaly detection, as shown in fig. 3, including:
a first obtaining module 301, configured to obtain a training sample set used for training; the set of training samples comprises at least one training sample; the training sample comprises a historical monitoring time sequence data set obtained from each channel of an equipment cluster of a target equipment type, wherein the historical monitoring time sequence data set comprises historical monitoring time sequence data of equipment in the equipment cluster in a normal mode; the device cluster comprises at least one device;
a training module 302, configured to, for each training sample, input a historical monitoring time sequence data set in the training sample into a to-be-trained stack-shared variational self-encoder model to obtain a historical reference time sequence data set, and train the to-be-trained stack-shared variational self-encoder model by using a difference between each historical monitoring time sequence data in the historical monitoring time sequence data set and historical reference time sequence data of a corresponding channel in the historical reference time sequence data set;
a second obtaining module 303, configured to input the monitoring time series data obtained from each channel of the target device into a trained stacking-sharing variational self-encoder model, so as to obtain reference time series data of each monitoring time series data;
a determining module 304, configured to determine an abnormal condition of the target device according to a difference between each monitored time series data and the corresponding reference time series data;
and an alarm module 305, configured to send abnormal alarm information according to the abnormal condition of the target device.
Optionally, the training module includes:
and the training unit is used for inputting the historical monitoring time sequence data to a to-be-trained stacking-sharing variational self-encoder model aiming at the historical monitoring time sequence data corresponding to each channel in each training sample to obtain historical reference time sequence data, and training the to-be-trained stacking-sharing variational self-encoder model by utilizing the difference between the historical monitoring time sequence data and the historical reference time sequence data.
Optionally, the training unit includes:
the encoding subunit is used for encoding the historical monitoring time sequence data through an encoder of a stacking-sharing variational self-encoder model to be trained to obtain a historical hidden state of the historical monitoring time sequence data;
and the decoding subunit is used for decoding the historical hidden state through a decoder of the stacking-sharing variational self-encoder model to be trained to obtain the historical reference time sequence data.
Optionally, the coding subunit includes:
the calculating subunit is used for calculating the probability distribution of the historical monitoring time sequence data through an encoder of a stacking-sharing variational self-encoder model to be trained;
and the sampling subunit is used for obtaining the historical hidden state of the historical monitoring time sequence data from the probability distribution through random sampling.
Optionally, the determining module includes:
a second square error calculation unit configured to calculate, for each of the monitoring time series data, a square error between the monitoring data at each time of the monitoring time series data and reference data at a corresponding time in the reference time series data;
and the abnormality determining unit is used for determining the abnormal condition of the target equipment according to the square error corresponding to each moment and a preset abnormality threshold value.
Corresponding to the method of anomaly detection in fig. 1, an embodiment of the present application further provides a computer device 400, as shown in fig. 4, the device includes a memory 401, a processor 402, and a computer program stored in the memory 401 and executable on the processor 402, where the processor 402 implements the method of anomaly detection when executing the computer program.
Specifically, the memory 401 and the processor 402 can be general memories and processors, which are not limited in particular, and when the processor 402 runs a computer program stored in the memory 401, the method for detecting the abnormality can be executed, so that the problem of low running efficiency of the device abnormality detection model in the prior art is solved.
Corresponding to the method of anomaly detection in fig. 1, the present application further provides a computer-readable storage medium, on which a computer program is stored, and the computer program is executed by a processor to perform the steps of the above-mentioned method of anomaly detection.
Specifically, the storage medium can be a general storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is run, the above method of anomaly detection can be executed, which solves the problem of the low operating efficiency of equipment abnormality detection models in the prior art. In the present application, the historical monitoring time sequence data corresponding to each channel in the historical monitoring time sequence data set is used to train the stack-sharing variational self-encoder model to be trained, so that the encoder and decoder parameters of the model are shared across all channels. This exploits the similarity that exists among channels carrying different monitoring data: from the historical monitoring time sequence data of a given channel, the model can learn the data behavior of that channel and of channels similar to it, so the total number of model parameters is reduced and the operating efficiency of the model is improved.
In the embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments provided in the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, or the portion thereof that contributes over the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and various other media capable of storing program code.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus once an item is defined in one figure, it need not be further defined and explained in subsequent figures, and moreover, the terms "first", "second", "third", etc. are used merely to distinguish one description from another and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that: the above-mentioned embodiments are only specific embodiments of the present application, used for illustrating the technical solutions of the present application rather than limiting them, and the scope of the present application is not limited thereto. Although the present application is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person skilled in the art can, within the technical scope disclosed in the present application, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features; such modifications, changes or substitutions do not depart from the spirit and scope of the present disclosure and are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method of anomaly detection, comprising:
acquiring a training sample set used for training; the set of training samples comprises at least one training sample; the training sample comprises a historical monitoring time sequence data set obtained from each channel of an equipment cluster of a target equipment type, wherein the historical monitoring time sequence data set comprises historical monitoring time sequence data of equipment in the equipment cluster in a normal mode; the device cluster comprises at least one device;
for each training sample, inputting a historical monitoring time sequence data set in the training sample into a stack-sharing variational self-encoder model to be trained to obtain a historical reference time sequence data set, and training the stack-sharing variational self-encoder model to be trained by utilizing the difference between each historical monitoring time sequence data in the historical monitoring time sequence data set and the historical reference time sequence data of a corresponding channel in the historical reference time sequence data set;
inputting monitoring time sequence data acquired from each channel of target equipment into a trained stacking-sharing variational self-encoder model to obtain reference time sequence data of each monitoring time sequence data;
determining an abnormal condition of the target equipment according to the difference between each monitoring time sequence data and the corresponding reference time sequence data;
and sending abnormal alarm information according to the abnormal condition of the target equipment.
2. The method according to claim 1, wherein for each training sample, inputting a historical monitoring time sequence data set in the training sample into a stack-shared variational self-encoder model to be trained to obtain a historical reference time sequence data set, and training the stack-shared variational self-encoder model to be trained by using a difference between each historical monitoring time sequence data in the historical monitoring time sequence data set and the historical reference time sequence data of a corresponding channel in the historical reference time sequence data set, the method comprising:
and aiming at the historical monitoring time sequence data corresponding to each channel in each training sample, inputting the historical monitoring time sequence data to a to-be-trained stacking-sharing variational self-encoder model to obtain historical reference time sequence data, and training the to-be-trained stacking-sharing variational self-encoder model by utilizing the difference between the historical monitoring time sequence data and the historical reference time sequence data of the corresponding channel.
3. The method of claim 2, wherein inputting the historical monitoring timing data into a stack-sharing variational self-encoder model to be trained, obtaining historical reference timing data, comprises:
encoding the historical monitoring time sequence data through an encoder of a stacking-sharing variational self-encoder model to be trained to obtain a historical hidden state of the historical monitoring time sequence data;
and decoding the historical hidden state through a decoder of a stacking-sharing variational self-encoder model to be trained to obtain the historical reference time sequence data.
4. The method of claim 3, wherein encoding the historical monitoring time series data by an encoder of a stack-sharing variational self-encoder model to be trained to obtain a historical hidden state of the historical monitoring time series data comprises:
calculating the probability distribution of the historical monitoring time sequence data through an encoder of a stacking-sharing variational self-encoder model to be trained;
and obtaining the historical hidden state of the historical monitoring time sequence data from the probability distribution through random sampling.
5. The method of claim 1, wherein determining an abnormal condition of the target device based on a difference between each of the monitored time series data and the corresponding reference time series data comprises:
calculating a square error between the monitoring data of each moment of the monitoring time series data and the reference data of the corresponding moment in the reference time series data aiming at each monitoring time series data;
and determining the abnormal condition of the target equipment according to the square error corresponding to each moment and a preset abnormal threshold.
6. An apparatus for anomaly detection, comprising:
the first acquisition module is used for acquiring a training sample set used for training; the set of training samples comprises at least one training sample; the training sample comprises a historical monitoring time sequence data set obtained from each channel of an equipment cluster of a target equipment type, wherein the historical monitoring time sequence data set comprises historical monitoring time sequence data of equipment in the equipment cluster in a normal mode; the device cluster comprises at least one device;
the training module is used for inputting the historical monitoring time sequence data set in the training samples to a to-be-trained stacking-sharing variational self-encoder model to obtain a historical reference time sequence data set aiming at each training sample, and training the to-be-trained stacking-sharing variational self-encoder model by utilizing the difference between each historical monitoring time sequence data in the historical monitoring time sequence data set and the historical reference time sequence data of a corresponding channel in the historical reference time sequence data set;
the second acquisition module is used for inputting the monitoring time sequence data acquired from each channel of the target equipment into a trained stacking-sharing variational self-encoder model to obtain reference time sequence data of each monitoring time sequence data;
the judging module is used for determining the abnormal condition of the target equipment according to the difference between each monitoring time sequence data and the corresponding reference time sequence data;
and the alarm module is used for sending abnormal alarm information according to the abnormal condition of the target equipment.
7. The apparatus of claim 6, wherein the training module comprises:
and the training unit is used for inputting the historical monitoring time sequence data to a to-be-trained stacking-sharing variational self-encoder model aiming at the historical monitoring time sequence data corresponding to each channel in each training sample to obtain historical reference time sequence data, and training the to-be-trained stacking-sharing variational self-encoder model by utilizing the difference between the historical monitoring time sequence data and the historical reference time sequence data of the corresponding channel.
8. The apparatus of claim 7, wherein the training unit comprises:
the encoding subunit is used for encoding the historical monitoring time sequence data through an encoder of a stacking-sharing variational self-encoder model to be trained to obtain a historical hidden state of the historical monitoring time sequence data;
and the decoding subunit is used for decoding the historical hidden state through a decoder of the stacking-sharing variational self-encoder model to be trained to obtain the historical reference time sequence data.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of the preceding claims 1-5 are implemented when the computer program is executed by the processor.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of the claims 1-5.
CN202110380361.3A 2021-04-09 2021-04-09 Method, device, computer equipment and medium for anomaly detection Active CN112766429B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110380361.3A CN112766429B (en) 2021-04-09 2021-04-09 Method, device, computer equipment and medium for anomaly detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110380361.3A CN112766429B (en) 2021-04-09 2021-04-09 Method, device, computer equipment and medium for anomaly detection

Publications (2)

Publication Number Publication Date
CN112766429A (en) 2021-05-07
CN112766429B (en) 2021-06-29

Family

ID=75691372

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110380361.3A Active CN112766429B (en) 2021-04-09 2021-04-09 Method, device, computer equipment and medium for anomaly detection

Country Status (1)

Country Link
CN (1) CN112766429B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113468820A (en) * 2021-07-21 2021-10-01 上海眼控科技股份有限公司 Data training method, device, equipment and storage medium
CN113780387B (en) * 2021-08-30 2024-10-01 桂林电子科技大学 Time sequence anomaly detection method based on shared self-encoder
CN114137421B (en) * 2021-11-30 2023-09-19 章鱼博士智能技术(上海)有限公司 Battery abnormality detection method, device, equipment and storage medium
CN115309736B (en) * 2022-10-10 2023-03-24 北京航空航天大学 Time sequence data anomaly detection method based on self-supervision learning multi-head attention network
WO2024152381A1 (en) * 2023-01-20 2024-07-25 Siemens Aktiengesellschaft Method for detecting data drift, computer device, and computer-readable storage medium
CN117874687B (en) * 2024-03-12 2024-05-31 深圳市格瑞邦科技有限公司 Data interaction method of industrial tablet personal computer

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10977574B2 (en) * 2017-02-14 2021-04-13 Cisco Technology, Inc. Prediction of network device control plane instabilities
CN111258863B (en) * 2018-12-03 2023-09-22 北京嘀嘀无限科技发展有限公司 Data anomaly detection method, device, server and computer readable storage medium
CN111949496B (en) * 2019-05-15 2022-06-07 华为技术有限公司 Data detection method and device
CN112101554B (en) * 2020-11-10 2024-01-23 北京瑞莱智慧科技有限公司 Abnormality detection method and apparatus, device, and computer-readable storage medium

Also Published As

Publication number Publication date
CN112766429A (en) 2021-05-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant