CN109508733A - A kind of method for detecting abnormality based on distribution probability measuring similarity - Google Patents
A kind of method for detecting abnormality based on distribution probability measuring similarity Download PDFInfo
- Publication number
- CN109508733A CN109508733A CN201811233705.2A CN201811233705A CN109508733A CN 109508733 A CN109508733 A CN 109508733A CN 201811233705 A CN201811233705 A CN 201811233705A CN 109508733 A CN109508733 A CN 109508733A
- Authority
- CN
- China
- Prior art keywords
- data
- sample
- training
- node
- test point
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
Abstract
The embodiment of the present invention proposes a kind of method for detecting abnormality based on distribution probability measuring similarity, it include: multiple subsets that multiple stochastical sampling obtains normal sample data, the random isolation processes that each subset is saved with full binary tree structure delimit the threshold depth of backtracking according to drift ratio;External leaf node position and the threshold depth that each tree is fallen according to test point, leaf node where it trace back to the ancestor node of threshold depth, extract training data of all data as measurement and test point similarity under the node;With certain point in test point and training dataset for endpoint, the probability that remaining data points occur between this two o'clock is calculated separately in each attribute dimensions and obtains the exceptional value of the point in conjunction with the dissimilar degree of all the points in Min Shi distance calculating test point and data set.Technical solution provided in an embodiment of the present invention can effectively solve training data and concentrate data without exception and local anomaly test problems.
Description
[technical field]
The present invention relates to machine learning field method for detecting abnormality, more particularly to one kind to be based on distribution probability measuring similarity
Method for detecting abnormality.
[background technique]
When solving the problems, such as abnormality detection using machine learning method, there are no abnormal datas to be trained, part is different
Often, each dimension dimension different distributions of data differ larger problem.The unsupervised part of data is solved according to suitable sorting algorithm
Abnormality detection problem is one of the hot spot studied now to improve model to normal and whole exceptional sample discrimination.It solves
At present for abnormality detection problem, conventional method is broadly divided into three types.The first is the method by mathematical statistics, is passed through
The probability size that each dimension in statistical number strong point or each dimension various combination of data point occur judges whether exception.Second for based on
The method for detecting abnormality of distance mainly judges to be by calculating test point with the degree of closeness of normal data is locally or globally gone up
No is abnormal.The third is to judge whether it is abnormal by clustering or calculating the modes such as distribution relative density based on data distribution.
But these methods have certain supposed premise: abnormal point and normal distribution cluster meet farther out, abnormal data density it is far low
In normal data density etc..But in actual application environment, being distributed with for abnormal data may very be concentrated or away from normal number
It is closer according to distribution cluster, or even have abnormal data and be wrapped in inside normal data cluster.The hypotheses that above-mentioned algorithm is done
It excessively idealizes, this does not always set up even possibility very little in actual application environment, causes model recognition effect unstable
It is fixed.In addition, will appear the situation of cluster exception in the events such as Epidemic outbreak of disease, network attack.It is a large amount of different when occurring extremely
Regular data distribution concentrates on one or more distributed areas, that is to say, that spatial abnormal feature feature is more concentrated, is densely distributed, base
Correct judgement can not be made to this situation in the serial algorithm of density.And in a practical situation, the case where anomalous concentration is that have
Very big researching value, discrete abnormal point illustrates that such abnormal probability of happening is lower, and for the region of anomalous concentration
Probability of happening is relatively high, detects that this exception can reduce loss to the full extent.Such as in network attack, if can be in time
It detects cluster exception, finds out its attack mode, then can provide effective Informational support for network O&M worker, avoid system
It is broken.
In conclusion for abnormality detection problem, there are following difficult points at present: abnormal data acquisition is more difficult;Abnormal number
More closely there is local anomaly according to away from normal data, existing major part algorithm only judges abnormal journey by global or local data distribution
Degree, can not comprehensively consider global and local distributed intelligence;Abnormal data distribution is more intensive, and distribution density is close with normal data
Even higher, highdensity abnormal data cluster is easily judged as normal data by the algorithm based on relative density;Each attribute of data point
Cloth range difference is larger, and by traditional distance calculating method such as Euclidean distance, weight has big difference between different dimensions, and early period
Normalization, standardization can adapt data original distribution state again;Abnormal data is distributed in inside normal data, and is distributed
Range has certain intersection;Have label training data concentrate data point distribution will not completely high density concentrate on a region, need to sieve
The training data for having information redundancy is selected, guarantees that subsequent processing is interference-free.
[summary of the invention]
In view of this, the embodiment of the present invention proposes a kind of method for detecting abnormality based on distribution probability measuring similarity,
To improve disaggregated model to the discrimination of positive negative sample entirety.
A kind of method for detecting abnormality based on distribution probability measuring similarity that the embodiment of the present invention proposes, comprising:
Multiple stochastical sampling obtains multiple subsets of normal sample data, complete with binary tree structure save each subset with
Machine isolation processes delimit the threshold depth of backtracking according to drift ratio;
External leaf node position and the threshold depth that each tree is fallen according to test point, leaf node is recalled where it
To the ancestor node of threshold depth, training data of all data as measurement and test point similarity under the node is extracted;
Remaining data points appearance is calculated separately in each attribute dimensions for endpoint with certain point in test point and training dataset
Probability between this two o'clock calculates the dissimilar degree of all the points in test point and data set in conjunction with Min Shi distance, obtains this
The exceptional value of point.
In the above method, multiple stochastical sampling obtains multiple subsets of normal sample data, with the preservation of full binary tree structure
The random isolation processes of each subset, according to drift ratio delimit backtracking threshold depth method are as follows: by training dataset D with
Machine samples to obtain several training subsets X_all, and each subset X contains m sample X={ X1, X2..., Xm, m is less than training
The positive integer of data set D size, can select appropriate value according to the actual situation, and each sample point contains n dimension, i.e., i-th
SampleRandomly select dimension and isolation threshold, isolation threshold be subset in certain dimension between
Random value between its maximum value and minimum value;Continuous iteration, until meet following three conditions one of them, then terminate and change
Generation: (1) spatially only one sample of each isolation;(2) spatially each sample point is identical in the dimension values;(3) reach
Iteration limit number;By this process record in a tree structure, a complete binary tree is formed, each node can contain
There is zero or two child node, what is saved in leaf node is the sample in each insulating space, and what internal node saved is isolation
Dimension and corresponding threshold value, depth threshold of the Dt as the retrospective search neighbours training points in tree need according to each training points
The average value E (h (x)) of depth h (x) is determined in each tree, as follows:
Wherein, E (h (x)) is the mean depth after sample x is traversed on all t isolation trees, and t is selected according to the actual situation
Select suitable positive integer, liIt (x) is the pathdepth of i-th tree;
Need to be arranged a drift ratio r, 0≤r≤1, the i.e. relative depature in all normal training dataset D for Dt
The ratio data of normal data distribution, r setting need to according in the dispersion degree and actual conditions of data distribution to model
Indices demand is measured, and is selected before each training sample mean depth in ((1-r) * 100) % according to the drift ratio r of setting
Minimum value as tree in retrospective search part training points depth threshold Dt.
In the above method, external leaf node position and the threshold depth of each tree are fallen according to test point, where it
Leaf node traces back to the ancestor node of threshold depth, extracts all data under the node as measurement and test point similarity
The method of training data are as follows: test point is sent into every one tree, if test sample falls in certain node under Dt depth, by test point
Place node is recalled upwards, and until forefathers' node of Dt depth, training sample Ltd all under forefathers' node are extracted work
For the data for next calculating test sample intensity of anomaly, the data extracted in all trees are incorporated as next instruction
Practice sample, if test sample more than Dt depth, using data all in Ltd as next training sample, is surveyed this
For sample sheet, farther out, local training data in its vicinity is less for the distribution in the overall situation away from normal data, therefore by Dt
Under all training samples extract as next training sample, the intensity of anomaly of further validation test data.
In the above method, it is calculated separately in each attribute dimensions for endpoint with certain point in test point and training dataset
There is the probability between this two o'clock in remainder strong point, and the dissmilarity of all the points in test point and data set is calculated in conjunction with Min Shi distance
Degree, the method for obtaining the exceptional value of the point are as follows: firstly, by Ri(x, y) is defined as sample x and sample y in i-th dimension xiAnd yiTwo
Region between value, at this time x ∈ Ltd.If S is the subspace set where all data of Ltd, SiFor space S i-th dimension sky
Between distribution, ifSelect SiMost value be boundary, then Ri(x, y) is converted to Ri(x, S), Numi(x, y, d, S) is
I-th dimension diWhether in RiBoolean in (x, y) range, wherein d is other samples in Ltd in addition to x, Mi(x,y|Ltd,S)
It is as follows for the training points number in Ltd in i-th dimension between x and y:
Wherein, I () is indicator function, and condition in bracket is otherwise 0 if true, its value is 1;Then with Mi(x,y|
Ltd, S) different degree of the ratio as x and y in i-th dimension in Ltd is accounted for, calculate different degree of the x and y in all dimensions
D ' (x, y), as follows:
Wherein, p is the index value in Minkowski Distance, finally, the abnormality score p (y) of test point y is as follows:
Wherein, p (y) be the similarity at test point y and all midpoints Ltd and, test point is ranked up, p (y) is got over
Greatly, intensity of anomaly is higher.
[Detailed description of the invention]
In order to illustrate the technical solution of the embodiments of the present invention more clearly, below will be to needed in the embodiment attached
Figure is briefly described, it should be apparent that, drawings in the following description are only some embodiments of the invention, for this field
For those of ordinary skill, without any creative labor, it can also be obtained according to these attached drawings other attached
Figure.
Fig. 1 is that the process for the method for detecting abnormality based on distribution probability measuring similarity that the embodiment of the present invention is proposed is shown
It is intended to.
[specific embodiment]
For a better understanding of the technical solution of the present invention, being retouched in detail to the embodiment of the present invention with reference to the accompanying drawing
It states.
It will be appreciated that described embodiments are only a part of the embodiments of the present invention, instead of all the embodiments.Base
Embodiment in the present invention, it is obtained by those of ordinary skill in the art without making creative efforts it is all its
Its embodiment, shall fall within the protection scope of the present invention.
The embodiment of the present invention provides the method for detecting abnormality based on distribution probability measuring similarity, referring to FIG. 1, it is this
The flow diagram for the method for detecting abnormality based on distribution probability measuring similarity that inventive embodiments are proposed, as shown in Figure 1,
Method includes the following steps:
Step 101, multiple stochastical sampling obtains multiple subsets of normal sample data, is saved with full binary tree structure each
The random isolation processes of subset delimit the threshold depth of backtracking according to drift ratio.
Specifically, obtaining several training subsets X_all by training dataset D stochastical sampling, each subset X contains m
Sample X={ X1, X2..., Xm, m is the positive integer less than training dataset D size, can select suitable number according to the actual situation
Value, each sample point contain n dimension, i.e. i-th of sample Randomly select dimension and isolation threshold
Value, isolation threshold are random value of the subset in certain dimension between its maximum value and minimum value;Continuous iteration, until meeting
Below three conditions one of them, then terminate iteration: (1) spatially only one sample of each isolation;(2) spatially each
Sample point is identical in the dimension values;(3) reach iteration limit number;By this process record in a tree structure, formed
One complete binary tree, each node can contain zero or two child node, and what is saved in leaf node is each insulating space
In sample, what internal node saved is the dimension and corresponding threshold value of isolation, and Dt is used as the retrospective search neighbours training points in tree
Depth threshold, need the average value E (h (x)) of the depth h (x) in each tree according to each training points to determine, following institute
Show:
Wherein, E (h (x)) is the mean depth after sample x is traversed on all t isolation trees, and t is selected according to the actual situation
Select suitable positive integer, liIt (x) is the pathdepth of i-th tree;
Need to be arranged a drift ratio r, 0≤r≤1, the i.e. relative depature in all normal training dataset D for Dt
The ratio data of normal data distribution, r setting need to according in the dispersion degree and actual conditions of data distribution to model
Indices demand is measured, and is selected before each training sample mean depth in ((1-r) * 100) % according to the drift ratio r of setting
Minimum value as tree in retrospective search part training points depth threshold Dt.
Algorithm 1 and algorithm 2 are the isolation processes of step 101 and the pseudocode of depth threshold setting method:
Step 102, external leaf node position and the threshold depth that each tree is fallen according to test point, leaf where it
Node traces back to the ancestor node of threshold depth, extracts training of all data as measurement and test point similarity under the node
Data.
Specifically, test point is sent into every one tree, if test sample falls in certain node under Dt depth, by test point institute
Recall upwards in node, until forefathers' node of Dt depth, training sample Ltd all under forefathers' node are extracted into conduct
The data extracted in all trees are incorporated as next training by the data for next calculating test sample intensity of anomaly
Sample, if test sample more than Dt depth, using data all in Ltd as next training sample, tests this
For sample, farther out, local training data in its vicinity is less, therefore will be under Dt for the distribution in the overall situation away from normal data
All training samples are extracted as next training sample, the intensity of anomaly of further validation test data.
Algorithm 3 and algorithm 4 are the pseudocode that local training data method is extracted in step 102:
Step 103, its remainder is calculated separately for endpoint in each attribute dimensions with certain point in test point and training dataset
There is the probability between this two o'clock in strong point, and the dissimilar journey of all the points in test point and data set is calculated in conjunction with Min Shi distance
Degree, obtains the exceptional value of the point.
Specifically, firstly, by Ri(x, y) is defined as sample x and sample y in i-th dimension xiAnd yiRegion between two values, this
When x ∈ Ltd.If S is the subspace set where all data of Ltd, SiFor space S i-th dimension spatial distribution range, ifSelect SiMost value be boundary, then Ri(x, y) is converted to Ri(x, S), Numi(x, y, d, S) is i-th dimension diWhether
RiBoolean in (x, y) range, wherein d is other samples in Ltd in addition to x, Mi(x, y | Ltd, S) it is in Ltd i-th
The training points number between x and y is tieed up, as follows:
Wherein, I () is indicator function, and condition in bracket is otherwise 0 if true, its value is 1;Then with Mi(x,y|
Ltd, S) different degree of the ratio as x and y in i-th dimension in Ltd is accounted for, calculate different degree of the x and y in all dimensions
D ' (x, y), as follows:
Wherein, p is the index value in Minkowski Distance, finally, the abnormality score p (y) of test point y is as follows:
Wherein, p (y) be the similarity at test point y and all midpoints Ltd and, test point is ranked up, p (y) is got over
Greatly, intensity of anomaly is higher.
Table solves 10 public affairs first is that the embodiment of the present invention provides the method for detecting abnormality based on distribution probability measuring similarity
When opening data set abnormality detection task, the pretreatment of data set is divided into normal data and abnormal data for all kinds of in data set.
Table one
Table solves 10 public affairs second is that the embodiment of the present invention provides the method for detecting abnormality based on distribution probability measuring similarity
When opening data set abnormality detection task, AUC value (ranking of random selection positive sample is higher than the probability of random selection negative sample)
Contrast and experiment, wherein in the embodiment of the present invention control methods be the type solution never KNN of balanced sort problem,
Eight kinds of methods of iForest, SCiForest, iNNE, ALSH, L1SH, L2SH, KLSH.By table one, it can be concluded that, the present invention is mentioned
Method DPSM out is concentrated in public data and is significantly improved in AUC value compared to control methods.It is especially real in the first seven group
Have greatly improved in testing, proposition method is highest level in eight groups of methods, remaining two groups also close with highest level.
The method that the embodiment of the present invention is proposed achieves certain breakthrough in method for detecting abnormality.
Table two
In conclusion the embodiment of the present invention has the advantages that
In the technical solution that the present invention is implemented, multiple stochastical sampling obtains multiple subsets of normal sample data, with complete two
Fork tree construction saves the random isolation processes of each subset, and the threshold depth of backtracking delimited according to drift ratio;According to test point
External leaf node position and the threshold depth for falling in each tree, the ancestors that leaf node where it traces back to threshold depth save
Point extracts training data of all data as measurement and test point similarity under the node;With test point and training dataset
Certain interior point is endpoint, probability of the remaining data points appearance between this two o'clock is calculated separately in each attribute dimensions, in conjunction with Min Shi
Distance calculates the dissimilar degree of all the points in test point and data set, obtains the exceptional value of the point.According to embodiments of the present invention
The technical solution of offer can effectively solve local anomaly test problems, can be using original training data according to where test point
The distribution of regional area normal data obtains its intensity of anomaly, improve the local anomaly of abnormality detection model detectability and its
Overall target.
The foregoing is merely illustrative of the preferred embodiments of the present invention, is not intended to limit the invention, all in essence of the invention
Within mind and principle, any modification, equivalent substitution, improvement and etc. done be should be included within the scope of the present invention.
Claims (4)
1. a kind of method for detecting abnormality based on distribution probability measuring similarity, which is characterized in that the method step includes:
(1) repeatedly stochastical sampling obtain normal sample data multiple subsets, complete with binary tree structure save each subset with
Machine isolation processes delimit the threshold depth of backtracking according to drift ratio;
(2) external leaf node position and the threshold depth that each tree is fallen according to test point, leaf node is recalled where it
To the ancestor node of threshold depth, training data of all data as measurement and test point similarity under the node is extracted;
(3) remaining data points appearance is calculated separately for endpoint in each attribute dimensions with certain point in test point and training dataset
Probability between this two o'clock calculates the dissimilar degree of all the points in test point and data set in conjunction with Min Shi distance, obtains this
The exceptional value of point.
2. the method according to claim 1, wherein repeatedly stochastical sampling obtains multiple sons of normal sample data
Collection, the random isolation processes of each subset are saved with full binary tree structure, and the threshold depth of backtracking, tool delimited according to drift ratio
Body is described as follows: obtaining several training subsets X_all by training dataset D stochastical sampling, each subset X contains m sample X
={ X1, X2..., Xm, m is the positive integer less than training dataset D size, can select appropriate value according to the actual situation, each
Sample point contains n dimension, i.e. i-th of sampleDimension and isolation threshold are randomly selected, is isolated
Threshold value is random value of the subset in certain dimension between its maximum value and minimum value;Continuous iteration, until meeting following three
A condition one of them, then terminate iteration: (1) spatially only one sample of each isolation;(2) spatially each sample point
It is identical in the dimension values;(3) reach iteration limit number;By this process record in a tree structure, formation one is complete
Full binary tree, each node can contain zero or two child node, and what is saved in leaf node is the sample in each insulating space
This, what internal node saved is the dimension and corresponding threshold value of isolation, depth of the Dt as the retrospective search neighbours training points in tree
Threshold value needs the average value E (h (x)) of the depth h (x) in each tree according to each training points to determine, as follows:
Wherein, E (h (x)) is the mean depth after sample x is traversed on all t isolation trees, and t selects to close according to the actual situation
Suitable positive integer, liIt (x) is the pathdepth of i-th tree;
Need to be arranged a drift ratio r for Dt, 0≤r≤1, i.e., relative depature is normal in all normal training dataset D
The setting of the ratio data of data distribution range, r need to be according to every to model in the dispersion degree and actual conditions of data distribution
Index demand is measured, and is selected before each training sample mean depth in ((1-r) * 100) % most according to the drift ratio r of setting
Depth threshold Dt of the small value as retrospective search part training points in tree.
3. the method according to claim 1, wherein falling in the external leaf node position of each tree according to test point
It sets and threshold depth, leaf node where it traces back to the ancestor node of threshold depth, extract all data under the node and make
To measure the training data with test point similarity, illustrate are as follows: test point is sent into every one tree, if test sample is fallen in
Certain node under Dt depth, the node where test point are recalled upwards, will be under forefathers' node until forefathers' node of Dt depth
All training sample Ltd are extracted as the following data for calculating test sample intensity of anomaly, will be extracted in all trees
To data be incorporated as next training sample, if test sample more than Dt depth, using data all in Ltd as
Next training sample, for this test sample, the distribution in the overall situation away from normal data farther out, in its vicinity
Local training data is less, therefore training samples all under Dt are extracted as next training sample, further tests
Demonstrate,prove the intensity of anomaly of test data.
4. the method according to claim 1, wherein with certain point in test point and training dataset for endpoint,
Calculate separately remaining data points in each attribute dimensions and probability between this two o'clock occur, in conjunction with Min Shi distance calculate test point with
The dissimilar degree of all the points, obtains the exceptional value of the point, illustrates in data set are as follows: firstly, by Ri(x, y) is defined as sample
This x and sample y are in i-th dimension xiAnd yiRegion between two values, at this time x ∈ Ltd.If S is the subspace where all data of Ltd
Set, SiFor space S i-th dimension spatial distribution range, ifSelect SiMost value be boundary, then Ri(x, y) turns
It is changed to Ri(x, S), Numi(x, y, d, S) is i-th dimension diWhether in RiBoolean in (x, y) range, wherein d is that y is removed in Ltd
Other samples in addition, Mi(x, y | Ltd, S) is the training points number in Ltd in i-th dimension positioned at x and y between, as follows:
Wherein, I () is indicator function, and condition in bracket is otherwise 0 if true, its value is 1;Then with Mi(x, y | Ltd, S) it accounts for
Different degree of the ratio as x and y in i-th dimension in Ltd calculates different degree D ' (x, y) of the x and y in all dimensions,
It is as follows:
Wherein, p is the index value in Minkowski Distance, finally, the abnormality score p (y) of test point y is as follows:
Wherein, p (y) be the similarity at test point y and all midpoints Ltd and, test point is ranked up, p (y) is bigger, different
Chang Chengdu is higher.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811233705.2A CN109508733A (en) | 2018-10-23 | 2018-10-23 | A kind of method for detecting abnormality based on distribution probability measuring similarity |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811233705.2A CN109508733A (en) | 2018-10-23 | 2018-10-23 | A kind of method for detecting abnormality based on distribution probability measuring similarity |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109508733A true CN109508733A (en) | 2019-03-22 |
Family
ID=65745932
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811233705.2A Pending CN109508733A (en) | 2018-10-23 | 2018-10-23 | A kind of method for detecting abnormality based on distribution probability measuring similarity |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109508733A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110266680A (en) * | 2019-06-17 | 2019-09-20 | 辽宁大学 | A kind of industrial communication method for detecting abnormality based on dual similarity measurement |
CN110377828A (en) * | 2019-07-22 | 2019-10-25 | 腾讯科技(深圳)有限公司 | Information recommendation method, device, server and storage medium |
CN110781433A (en) * | 2019-10-11 | 2020-02-11 | 腾讯科技(深圳)有限公司 | Data type determination method and device, storage medium and electronic device |
CN111639680A (en) * | 2020-05-09 | 2020-09-08 | 西北工业大学 | Identity recognition method based on expert feedback mechanism |
CN111784966A (en) * | 2020-06-15 | 2020-10-16 | 武汉烽火众智数字技术有限责任公司 | Personnel management and control method and system based on machine learning |
CN112085053A (en) * | 2020-07-30 | 2020-12-15 | 济南浪潮高新科技投资发展有限公司 | Data drift discrimination method and device based on nearest neighbor method |
CN112181706A (en) * | 2020-10-23 | 2021-01-05 | 北京邮电大学 | Power dispatching data anomaly detection method based on logarithmic interval isolation |
CN113204542A (en) * | 2021-04-22 | 2021-08-03 | 武汉大学 | Abnormal electricity sample cleaning and behavior recognition method |
WO2022134578A1 (en) * | 2020-12-22 | 2022-06-30 | 深圳壹账通智能科技有限公司 | Method and apparatus for determining answer sequence |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040064029A1 (en) * | 2002-09-30 | 2004-04-01 | The Government Of The Usa As Represented By The Secretary Of The Dept. Of Health & Human Services | Computer-aided classification of anomalies in anatomical structures |
CN102664961A (en) * | 2012-05-04 | 2012-09-12 | 北京邮电大学 | Method for anomaly detection in MapReduce environment |
CN103400152A (en) * | 2013-08-20 | 2013-11-20 | 哈尔滨工业大学 | High sliding window data stream anomaly detection method based on layered clustering |
CN103473540A (en) * | 2013-09-11 | 2013-12-25 | 天津工业大学 | Vehicle track incremental modeling and on-line abnormity detection method of intelligent traffic system |
CN104317681A (en) * | 2014-09-02 | 2015-01-28 | 上海交通大学 | Behavioral abnormality automatic detection method and behavioral abnormality automatic detection system aiming at computer system |
WO2015167562A1 (en) * | 2014-04-30 | 2015-11-05 | Hewlett-Packard Development Company, L.P. | Using local memory nodes of a multicore machine to process a search query |
US20150332523A1 (en) * | 2014-05-19 | 2015-11-19 | EpiSys Science, Inc. | Method and apparatus for biologically inspired autonomous infrastructure monitoring |
CN106503086A (en) * | 2016-10-11 | 2017-03-15 | 成都云麒麟软件有限公司 | The detection method of distributed local outlier |
CN106598822A (en) * | 2015-10-15 | 2017-04-26 | 华为技术有限公司 | Abnormal data detection method and device applied to capacity estimation |
CN107426207A (en) * | 2017-07-21 | 2017-12-01 | 哈尔滨工程大学 | A kind of network intrusions method for detecting abnormality based on SA iForest |
CN107657288A (en) * | 2017-10-26 | 2018-02-02 | 国网冀北电力有限公司 | A kind of power scheduling flow data method for detecting abnormality based on isolated forest algorithm |
CN108333314A (en) * | 2018-04-02 | 2018-07-27 | 深圳凯达通光电科技有限公司 | A kind of air pollution intelligent monitor system |
-
2018
- 2018-10-23 CN CN201811233705.2A patent/CN109508733A/en active Pending
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040064029A1 (en) * | 2002-09-30 | 2004-04-01 | The Government Of The Usa As Represented By The Secretary Of The Dept. Of Health & Human Services | Computer-aided classification of anomalies in anatomical structures |
CN102664961A (en) * | 2012-05-04 | 2012-09-12 | 北京邮电大学 | Method for anomaly detection in MapReduce environment |
CN103400152A (en) * | 2013-08-20 | 2013-11-20 | 哈尔滨工业大学 | High sliding window data stream anomaly detection method based on layered clustering |
CN103473540A (en) * | 2013-09-11 | 2013-12-25 | 天津工业大学 | Vehicle track incremental modeling and on-line abnormity detection method of intelligent traffic system |
WO2015167562A1 (en) * | 2014-04-30 | 2015-11-05 | Hewlett-Packard Development Company, L.P. | Using local memory nodes of a multicore machine to process a search query |
US20150332523A1 (en) * | 2014-05-19 | 2015-11-19 | EpiSys Science, Inc. | Method and apparatus for biologically inspired autonomous infrastructure monitoring |
CN104317681A (en) * | 2014-09-02 | 2015-01-28 | 上海交通大学 | Behavioral abnormality automatic detection method and behavioral abnormality automatic detection system aiming at computer system |
CN106598822A (en) * | 2015-10-15 | 2017-04-26 | 华为技术有限公司 | Abnormal data detection method and device applied to capacity estimation |
CN106503086A (en) * | 2016-10-11 | 2017-03-15 | 成都云麒麟软件有限公司 | The detection method of distributed local outlier |
CN107426207A (en) * | 2017-07-21 | 2017-12-01 | 哈尔滨工程大学 | A kind of network intrusions method for detecting abnormality based on SA iForest |
CN107657288A (en) * | 2017-10-26 | 2018-02-02 | 国网冀北电力有限公司 | A kind of power scheduling flow data method for detecting abnormality based on isolated forest algorithm |
CN108333314A (en) * | 2018-04-02 | 2018-07-27 | 深圳凯达通光电科技有限公司 | A kind of air pollution intelligent monitor system |
Non-Patent Citations (2)
Title |
---|
NISHAD P ET AL: "Anomaly detection for IGBTs using Mahalanobis distance", 《MICROELECTRONICS RELIABILITY》 * |
董国宾等: "基于RFID路径数据的异常路径检测", 《计算机应用研究》 * |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110266680B (en) * | 2019-06-17 | 2021-08-24 | 辽宁大学 | Industrial communication anomaly detection method based on dual similarity measurement |
CN110266680A (en) * | 2019-06-17 | 2019-09-20 | 辽宁大学 | A kind of industrial communication method for detecting abnormality based on dual similarity measurement |
CN110377828B (en) * | 2019-07-22 | 2023-05-26 | 腾讯科技(深圳)有限公司 | Information recommendation method, device, server and storage medium |
CN110377828A (en) * | 2019-07-22 | 2019-10-25 | 腾讯科技(深圳)有限公司 | Information recommendation method, device, server and storage medium |
CN110781433B (en) * | 2019-10-11 | 2023-06-02 | 腾讯科技(深圳)有限公司 | Data type determining method and device, storage medium and electronic device |
CN110781433A (en) * | 2019-10-11 | 2020-02-11 | 腾讯科技(深圳)有限公司 | Data type determination method and device, storage medium and electronic device |
CN111639680A (en) * | 2020-05-09 | 2020-09-08 | 西北工业大学 | Identity recognition method based on expert feedback mechanism |
CN111639680B (en) * | 2020-05-09 | 2022-08-09 | 西北工业大学 | Identity recognition method based on expert feedback mechanism |
CN111784966A (en) * | 2020-06-15 | 2020-10-16 | 武汉烽火众智数字技术有限责任公司 | Personnel management and control method and system based on machine learning |
CN112085053B (en) * | 2020-07-30 | 2022-08-26 | 山东浪潮科学研究院有限公司 | Data drift discrimination method and device based on nearest neighbor method |
CN112085053A (en) * | 2020-07-30 | 2020-12-15 | 济南浪潮高新科技投资发展有限公司 | Data drift discrimination method and device based on nearest neighbor method |
CN112181706A (en) * | 2020-10-23 | 2021-01-05 | 北京邮电大学 | Power dispatching data anomaly detection method based on logarithmic interval isolation |
CN112181706B (en) * | 2020-10-23 | 2023-09-22 | 北京邮电大学 | Power dispatching data anomaly detection method based on logarithmic interval isolation |
WO2022134578A1 (en) * | 2020-12-22 | 2022-06-30 | 深圳壹账通智能科技有限公司 | Method and apparatus for determining answer sequence |
CN113204542A (en) * | 2021-04-22 | 2021-08-03 | 武汉大学 | Abnormal electricity sample cleaning and behavior recognition method |
CN113204542B (en) * | 2021-04-22 | 2023-08-22 | 武汉大学 | Abnormal electricity consumption sample cleaning and behavior recognition method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109508733A (en) | A kind of method for detecting abnormality based on distribution probability measuring similarity | |
US20200374720A1 (en) | Method for Detecting Abnormal Data in Sensor Network | |
CN111833172A (en) | Consumption credit fraud detection method and system based on isolated forest | |
US20080306715A1 (en) | Detecting Method Over Network Intrusion | |
Arbin et al. | Comparative analysis between k-means and k-medoids for statistical clustering | |
CN111562108A (en) | Rolling bearing intelligent fault diagnosis method based on CNN and FCMC | |
CN103473540B (en) | The modeling of intelligent transportation system track of vehicle increment type and online method for detecting abnormality | |
CN113887616A (en) | Real-time abnormity detection system and method for EPG (electronic program guide) connection number | |
CN109039503A (en) | A kind of frequency spectrum sensing method, device, equipment and computer readable storage medium | |
CN107579846B (en) | Cloud computing fault data detection method and system | |
CN112735097A (en) | Regional landslide early warning method and system | |
CN106792883A (en) | Sensor network abnormal deviation data examination method and system | |
CN113516228B (en) | Network anomaly detection method based on deep neural network | |
CN110297207A (en) | Method for diagnosing faults, system and the electronic device of intelligent electric meter | |
CN101738998A (en) | System and method for monitoring industrial process based on local discriminatory analysis | |
CN111445147A (en) | Generative confrontation network model evaluation method for mechanical fault diagnosis | |
CN111950645A (en) | Method for improving class imbalance classification performance by improving random forest | |
CN112756759A (en) | Spot welding robot workstation fault judgment method | |
CN114707571A (en) | Credit data anomaly detection method based on enhanced isolation forest | |
CN108508297A (en) | A kind of fault arc detection method based on change coefficient and SVM | |
CN113822336A (en) | Cloud hard disk fault prediction method, device and system and readable storage medium | |
CN114756420A (en) | Fault prediction method and related device | |
CN106991171A (en) | Topic based on Intelligent campus information service platform finds method | |
CN110472188A (en) | A kind of abnormal patterns detection method of facing sensing data | |
CN106683263B (en) | The defect management method and system of valuable bills |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20190322 |
|
WD01 | Invention patent application deemed withdrawn after publication |