CN109492767A - An autoencoder-based anomaly detection method for unsupervised settings - Google Patents
- Publication number
- CN109492767A (application number CN201811330477.0A)
- Authority
- CN
- China
- Prior art keywords
- data
- autoencoder
- threshold value
- training
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0709—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Biomedical Technology (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Computer Hardware Design (AREA)
- Biophysics (AREA)
- Quality & Reliability (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Testing And Monitoring For Control Systems (AREA)
Abstract
The present invention provides an autoencoder-based anomaly detection method for unsupervised settings and belongs to the technical field of anomaly detection. The method uses the neural network in an autoencoder to perform unsupervised training on historical data. The resulting model can be used to compress newly input data, and the compressed data are compared with the previously compressed training data (normal data). If the error after compression exceeds a threshold, the data are judged to be abnormal. Compressed, encoded data better reflect the essential characteristics of the data and capture its feature patterns, so detection is more accurate.
Description
Technical field
The present invention relates to anomaly detection technology, and in particular to an autoencoder-based anomaly detection method for unsupervised settings.
Background art
When processing large volumes of high-dimensional data, the time cost is very high because the data volume is large and the variables are numerous. In addition, because there are so many variables, certain key variable features may be masked by a large number of other variable features, so that the key features needed for anomaly handling cannot play their due role.
Anomaly detection is a widely used class of algorithm. It is mainly used to determine whether a data item is abnormal. Many anomaly detection algorithms exist.
Anomaly detection is a research direction with very broad application prospects, with well-suited use cases in fault detection in engineering, fraud detection in finance, and intrusion detection in security. Anomaly detection identifies data or behavior that does not conform to expectations. In the Internet era, however, information is complex and varied, and a single data record may have hundreds of variables, which makes anomaly detection many times harder. The time cost is also very high, which is a significant obstacle to handling anomalies promptly and may lead to large losses.
An autoencoder is an unsupervised deep learning method that is also often used to compress data. Unlike classical principal component analysis (PCA), an autoencoder is a nonlinear compression method and can extract nonlinear information from data. In most autoencoder applications, the compression and decompression functions are implemented by neural networks.
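To make the compress/decompress idea concrete, the following is a minimal sketch of a single-hidden-layer autoencoder trained by full-batch gradient descent on synthetic data. The layer sizes, the tanh activation, the learning rate, the synthetic data, and the names `train_autoencoder`, `encode`, and `decode` are all illustrative assumptions; the patent does not specify an architecture.

```python
import numpy as np

def encode(X, params):
    """Compress: map inputs to low-dimensional codes (nonlinear, via tanh)."""
    W1, b1, _, _ = params
    return np.tanh(X @ W1 + b1)

def decode(Z, params):
    """Decompress: map codes back to the input space (linear output layer)."""
    _, _, W2, b2 = params
    return Z @ W2 + b2

def train_autoencoder(X, code_dim=2, lr=0.1, epochs=2000, seed=0):
    """Train a tiny autoencoder; returns (params, per-epoch MSE losses)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, code_dim))
    b1 = np.zeros(code_dim)
    W2 = rng.normal(0.0, 0.1, (code_dim, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        Z = np.tanh(X @ W1 + b1)          # encoder forward pass
        err = (Z @ W2 + b2) - X           # reconstruction minus input
        losses.append(float(np.mean(err ** 2)))
        g_out = 2.0 * err / err.size      # gradient of the mean squared error
        gW2, gb2 = Z.T @ g_out, g_out.sum(axis=0)
        g_pre = (g_out @ W2.T) * (1.0 - Z ** 2)   # back through tanh
        gW1, gb1 = X.T @ g_pre, g_pre.sum(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return (W1, b1, W2, b2), losses

# Synthetic "normal" data: 6 observed variables driven by 2 hidden factors.
rng = np.random.default_rng(1)
t = rng.uniform(-1.0, 1.0, (200, 2))
X = np.hstack([t, t ** 2, np.sin(t)]) + 0.01 * rng.normal(size=(200, 6))

params, losses = train_autoencoder(X)
codes = encode(X, params)       # compressed representation, shape (200, 2)
X_hat = decode(codes, params)   # lossy reconstruction, shape (200, 6)
```

No labels are used anywhere in training, which is what makes the method unsupervised: the only training signal is the difference between each input and its own reconstruction.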
Setting the error threshold is the key to anomaly detection. If the threshold is set too low, many normal data may be mistaken for abnormal data; conversely, if it is set too high, some abnormal data may be mistaken for normal data.
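The trade-off can be illustrated with a small sketch on synthetic error values. The half-normal error distribution, the three made-up abnormal error values, and the `mean + k * std` form of the threshold are assumptions for illustration; k = 1 is included only as a deliberately low setting.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic errors for normal data and a few made-up errors for abnormal data.
normal_errors = np.abs(rng.normal(0.0, 1.0, 10_000))
abnormal_errors = np.array([3.5, 5.0, 9.0])

mu, sigma = normal_errors.mean(), normal_errors.std()
results = {}
for k in (1, 3, 6):
    threshold = mu + k * sigma
    # Normal data wrongly flagged as abnormal (threshold too low).
    false_alarms = int((normal_errors > threshold).sum())
    # Abnormal data wrongly passed as normal (threshold too high).
    missed = int((abnormal_errors <= threshold).sum())
    results[k] = (false_alarms, missed)
    print(f"k={k}: threshold={threshold:.2f}, "
          f"false alarms={false_alarms}, missed={missed}")
```

Raising k can only reduce the number of false alarms and can only increase the number of missed anomalies, which is the tension the paragraph above describes.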
Summary of the invention
Based on the above, the invention proposes an autoencoder-based anomaly detection method for unsupervised settings. It is better suited to anomaly detection in unsupervised environments where the data have many variables and no labels.
In the present invention, the autoencoder's algorithm parameters can be left at their defaults or tuned empirically. The autoencoder also has many derivative algorithms, and the method described here can likewise be applied with them.
Using an autoencoder, data are encoded and then decoded, and the result is compared with the original data. When the error reaches the threshold, the data differ substantially from the majority of the data on which the autoencoder was built and can be judged to be abnormal.
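In symbols, this reconstruction-error criterion can be written as follows. The notation is introduced here as an assumption: $f$ is the encoder, $g$ the decoder, $x$ an input, and $\tau$ the threshold. The Euclidean norm is one possible error measure; the paragraph above does not fix one.

```latex
e(x) = \lVert x - g(f(x)) \rVert_2, \qquad
x \text{ is judged abnormal if } e(x) \ge \tau .
```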
Further,
the neural network in the autoencoder is first used to perform unsupervised training on the original normal data.
The resulting model can be used to compress newly input data, and the compressed data are compared with the compressed training data (normal data).
If the error after compression exceeds the threshold, the data are judged to be abnormal. Compressed, encoded data better reflect the essential characteristics of the data and capture its feature patterns, so detection is more accurate.
Further,
the operating process is:
1) First take a portion of historical normal data to train the autoencoder model;
2) Use the trained model to perform anomaly detection on the data to be tested, and output the result.
The operating process falls into two main parts: 1) setting the error threshold, and 2) the detection criterion.
Setting the error threshold proceeds as follows. After the model has been trained, the encoding operation is applied to each training sample to obtain its corresponding encoded data. An average encoded vector is computed from these encoded data. The Euclidean distance between each training sample's encoded data and this average is then computed, giving a set of distance values equal in number to the training samples. The mean and standard deviation of these distances are then computed, and the threshold is finally obtained: the threshold is the mean plus 3 or 6 standard deviations.
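The threshold computation described above can be sketched in a few lines of NumPy. The function name `fit_threshold`, the array layout (one encoded sample per row), and the use of the population standard deviation (NumPy's default) are assumptions; the text does not say which standard deviation estimator is intended.

```python
import numpy as np

def fit_threshold(codes, k=3.0):
    """Return (mean_code, threshold) from encoded training samples.

    codes: array of shape (n_samples, code_dim), the encoder output for
           each normal training sample.
    k:     standard-deviation multiplier (3 or 6 in the text).
    """
    codes = np.asarray(codes, dtype=float)
    mean_code = codes.mean(axis=0)                         # average encoded vector
    distances = np.linalg.norm(codes - mean_code, axis=1)  # one distance per sample
    threshold = distances.mean() + k * distances.std()
    return mean_code, threshold

# Toy encoded data: four codes at the corners of a square.
codes = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
mean_code, threshold = fit_threshold(codes, k=3.0)
```

In this toy case every code is equally far from the average code, so the standard deviation of the distances is zero and the threshold equals that common distance regardless of k.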
The detection criterion is as follows. Once the threshold has been obtained, the next step is to judge whether new data are abnormal. The model applies the encoding operation to each newly arriving sample, and the Euclidean distance between the resulting encoded data and the previously obtained average encoded data is computed. This distance is compared with the threshold to obtain the result.
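The detection step can be sketched as a single comparison. The function name `is_abnormal`, the example average code, and the example threshold value are hypothetical; in practice the average code and threshold would come from the training data and the new code from the trained encoder.

```python
import numpy as np

def is_abnormal(code, mean_code, threshold):
    """Return (flag, distance) for one encoded sample.

    The sample is flagged when its Euclidean distance to the average
    encoded vector exceeds the threshold.
    """
    diff = np.asarray(code, dtype=float) - np.asarray(mean_code, dtype=float)
    distance = float(np.linalg.norm(diff))
    return distance > threshold, distance

# Hypothetical values standing in for quantities learned from training data.
mean_code = np.array([1.0, 1.0])
threshold = 2.0

near_flag, near_dist = is_abnormal([1.5, 1.0], mean_code, threshold)
far_flag, far_dist = is_abnormal([4.0, 5.0], mean_code, threshold)
```

Because only one encoding pass and one distance computation are needed per sample, the check is cheap enough to run on each new record as it arrives.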
The beneficial effects of the invention are as follows:
The invention improves on the anomaly detection models currently used in industry. By using deep learning, anomaly detection can be better applied in big data scenarios.
The algorithm can be implemented in mainstream programming languages. Anomaly detection is one of the most important applications in Industry 4.0 and the industrial Internet, and provides important technical support for companies' industrial Internet applications.
Brief description of the drawings
Fig. 1 is a schematic diagram of the workflow of the invention.
Detailed description of embodiments
The content of the invention is elaborated in more detail below.
The invention's application scenarios are unsupervised, so the threshold parameter needs to be adjusted step by step according to actual conditions.
An autoencoder is a data compression algorithm in which the compression and decompression functions are data-specific, lossy, and learned automatically from samples. In most contexts where autoencoders are mentioned, the compression and decompression functions are implemented by neural networks.
1) Autoencoders are data-specific (data-dependent), which means an autoencoder can only compress data similar to its training data. For example, an autoencoder trained on faces performs poorly when compressing other images, such as trees, because the features it has learned are face-related.
2) Autoencoders are lossy, meaning the decompressed output is degraded compared with the original input, as with compression algorithms such as MP3 and JPEG. This differs from lossless compression algorithms.
3) Autoencoders learn automatically from data samples, which means it is easy to train a specialized encoder for a given class of input without any new engineering work.
Anomaly detection is performed using this unsupervised deep learning method, the autoencoder. The method can be implemented at the software level or solidified in hardware. Applying it at the edge computing end or in embedded models is an innovative application.
The operating steps are:
1: First take a portion of historical normal data to train the autoencoder model. These data do not need labels.
2: Use the trained model to perform anomaly detection on the data to be tested, and output the result.
The specific judgment criterion is as follows:
After the model has been trained, the encoding (dimensionality reduction) operation is applied to each training sample (the historical normal data used for training) to obtain the corresponding encoded data. An average encoded vector (a column or row vector) is computed from these encoded data. The Euclidean distance between each training sample's encoded data and this average is then computed, giving a set of distance values (equal in number to the training samples). The mean and standard deviation of these distances are then computed. The threshold is finally obtained as the mean plus 3 (or 6) standard deviations. This threshold is used to judge whether future data are abnormal: data exceeding the threshold are abnormal.
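Written out, the quantities in this paragraph are as follows. The symbols are notation introduced here as an assumption: $z_i$ is the encoded data of the $i$-th of $N$ training samples, $\bar{z}$ the average encoded vector, $d_i$ the distances, and $\tau$ the threshold. The population form of the standard deviation is also an assumption, since the text does not specify the estimator.

```latex
\bar{z} = \frac{1}{N}\sum_{i=1}^{N} z_i, \qquad
d_i = \lVert z_i - \bar{z} \rVert_2, \qquad
\mu_d = \frac{1}{N}\sum_{i=1}^{N} d_i, \qquad
\sigma_d = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (d_i - \mu_d)^2}, \qquad
\tau = \mu_d + k\,\sigma_d, \quad k \in \{3, 6\}.
```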
Once the threshold has been obtained, the next step is to judge whether new data are abnormal. The model applies the encoding (dimensionality reduction) operation to each newly arriving sample, and the Euclidean distance between the resulting encoded data and the previously obtained average encoded data is computed. This distance is compared with the threshold to obtain the result.
The foregoing are merely preferred embodiments of the invention and are not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.
Claims (8)
1. An autoencoder-based anomaly detection method for unsupervised settings, characterized in that
using an autoencoder, data are encoded and then decoded, and the result is compared with the original data; when the error reaches a threshold, the data differ substantially from the majority of the data on which the autoencoder was built and are judged to be abnormal.
2. The method according to claim 1, characterized in that
the neural network in the autoencoder is first used to perform unsupervised training on the original normal data.
3. The method according to claim 2, characterized in that
the model obtained after training is used to compress newly input data to obtain new compressed data, and the compressed data are compared with the compressed training data; if the error after compression exceeds the threshold, the data are judged to be abnormal.
4. The method according to claim 3, characterized in that
the operating process is:
1) first take a portion of historical normal data to train the autoencoder model;
2) use the trained model to perform anomaly detection on the data to be tested, and output the result.
5. The method according to claim 4, characterized in that
the operating process falls into two main parts: 1) setting the error threshold, and 2) the detection criterion.
6. The method according to claim 5, characterized in that
setting the error threshold means that, after the model has been trained, the encoding operation is applied to each training sample to obtain its corresponding encoded data; an average encoded vector is computed from these encoded data; the Euclidean distance between each training sample's encoded data and this average is then computed, giving a set of distance values equal in number to the training samples; the mean and standard deviation of these distances are then computed, and the threshold is finally obtained.
7. The method according to claim 6, characterized in that
the threshold is the mean plus 3 or 6 standard deviations.
8. The method according to claim 7, characterized in that
the detection criterion means that, once the threshold has been obtained, the next step is to judge whether new data are abnormal; the model applies the encoding operation to each newly arriving sample, and the Euclidean distance between the resulting encoded data and the previously obtained average encoded data is computed; this distance is compared with the threshold to obtain the result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811330477.0A CN109492767A (en) | 2018-11-09 | 2018-11-09 | An autoencoder-based anomaly detection method for unsupervised settings |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811330477.0A CN109492767A (en) | 2018-11-09 | 2018-11-09 | An autoencoder-based anomaly detection method for unsupervised settings |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109492767A (en) | 2019-03-19 |
Family
ID=65694191
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811330477.0A Pending CN109492767A (en) | 2018-11-09 | 2018-11-09 | An autoencoder-based anomaly detection method for unsupervised settings |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109492767A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110502895A (en) * | 2019-08-27 | 2019-11-26 | 中国工商银行股份有限公司 | Interface exception call determines method and device |
CN110796497A (en) * | 2019-10-31 | 2020-02-14 | 支付宝(杭州)信息技术有限公司 | Method and device for detecting abnormal operation behaviors |
CN111104241A (en) * | 2019-11-29 | 2020-05-05 | 苏州浪潮智能科技有限公司 | Server memory anomaly detection method, system and equipment based on self-encoder |
CN111241688A (en) * | 2020-01-15 | 2020-06-05 | 北京百度网讯科技有限公司 | Method and device for monitoring composite production process |
CN111538614A (en) * | 2020-04-29 | 2020-08-14 | 济南浪潮高新科技投资发展有限公司 | Method for detecting time sequence abnormal operation behavior of operating system |
CN112395382A (en) * | 2020-11-23 | 2021-02-23 | 武汉理工大学 | Ship abnormal track data detection method and device based on variational self-encoder |
CN113632140A (en) * | 2019-06-17 | 2021-11-09 | 乐人株式会社 | Automatic learning method and system for product inspection |
CN115293663A (en) * | 2022-10-10 | 2022-11-04 | 国网山东省电力公司滨州供电公司 | Bus unbalance rate abnormity detection method, system and device |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113632140A (en) * | 2019-06-17 | 2021-11-09 | 乐人株式会社 | Automatic learning method and system for product inspection |
CN110502895A (en) * | 2019-08-27 | 2019-11-26 | 中国工商银行股份有限公司 | Interface exception call determines method and device |
CN110796497A (en) * | 2019-10-31 | 2020-02-14 | 支付宝(杭州)信息技术有限公司 | Method and device for detecting abnormal operation behaviors |
CN111104241A (en) * | 2019-11-29 | 2020-05-05 | 苏州浪潮智能科技有限公司 | Server memory anomaly detection method, system and equipment based on self-encoder |
CN111241688A (en) * | 2020-01-15 | 2020-06-05 | 北京百度网讯科技有限公司 | Method and device for monitoring composite production process |
CN111241688B (en) * | 2020-01-15 | 2023-08-25 | 北京百度网讯科技有限公司 | Method and device for monitoring composite production process |
CN111538614A (en) * | 2020-04-29 | 2020-08-14 | 济南浪潮高新科技投资发展有限公司 | Method for detecting time sequence abnormal operation behavior of operating system |
CN111538614B (en) * | 2020-04-29 | 2024-04-05 | 山东浪潮科学研究院有限公司 | Time sequence abnormal operation behavior detection method of operating system |
CN112395382A (en) * | 2020-11-23 | 2021-02-23 | 武汉理工大学 | Ship abnormal track data detection method and device based on variational self-encoder |
CN115293663A (en) * | 2022-10-10 | 2022-11-04 | 国网山东省电力公司滨州供电公司 | Bus unbalance rate abnormity detection method, system and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109492767A (en) | A kind of method for detecting abnormality applied to unsupervised field based on self-encoding encoder | |
CN109413028B (en) | SQL injection detection method based on convolutional neural network algorithm | |
CN109408389B (en) | Code defect detection method and device based on deep learning | |
CN109034140B (en) | Industrial control network signal abnormity detection method based on deep learning structure | |
CN113242207B (en) | Iterative clustering network flow abnormity detection method | |
CN115033895A (en) | Binary program supply chain safety detection method and device | |
Soukup et al. | Reliably decoding autoencoders’ latent spaces for one-class learning image inspection scenarios | |
US11727052B2 (en) | Inspection systems and methods including image retrieval module | |
CN114626426A (en) | Industrial equipment behavior detection method based on K-means optimization algorithm | |
CN117574383A (en) | Feature fusion and code visualization technology-based software vulnerability detection model method | |
CN116597635B (en) | Wireless communication intelligent gas meter controller and control method thereof | |
CN115017015B (en) | Method and system for detecting abnormal behavior of program in edge computing environment | |
CN114936615B (en) | Small sample log information anomaly detection method based on characterization consistency correction | |
CN116680639A (en) | Deep-learning-based anomaly detection method for sensor data of deep-sea submersible | |
Akavalappil et al. | A convolutional neural network (CNN)‐based direct method to detect stiction in control valves | |
CN117333726B (en) | Quartz crystal cutting abnormality monitoring method, system and device based on deep learning | |
CN117521042B (en) | High-risk authorized user identification method based on ensemble learning | |
CN118094549B (en) | Malicious behavior identification method based on bimodal fusion of source program and executable code | |
CN117574782B (en) | Method, device, system and medium for judging winding materials based on transformer parameters | |
CN117237165B (en) | Method for detecting fake data | |
CN116384797A (en) | Digital infrastructure health assessment method oriented to data fusion | |
CN117892777A (en) | Decision risk assessment method and system for target detection model | |
Abdurrazaq et al. | Improving performance of network scanning detection through PCA-based feature selection | |
Wu et al. | Automated Anomaly Detection Assisted by Discrimination Model for Time Series | |
CN118070273A (en) | Webshell attack detection method based on graph semantic analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190319 |