CN109101661A - Method and device for detecting abnormal points in a data sample set
- Publication number
- CN109101661A CN201811069817.9A
- Authority
- CN
- China
- Prior art keywords
- data sample
- target
- cluster
- exceptional value
- current
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a method and device for detecting abnormal points in a data sample set. A quantitative calculation is performed on each data sample in the data sample set to obtain the target exceptional value of that data sample, and whether each data sample is an abnormal point in the data sample set is then determined from its target exceptional value. The target exceptional value of each data sample is determined according to the spatial distance between the data sample and the other data samples and/or the target class cluster in which the data sample is located in the target cluster analysis result for the data sample set. This effectively solves the problem that abnormal points detected solely through the subjective judgment of technical staff are not accurate enough: by quantifying the target exceptional value of each data sample, the abnormal points in the data sample set can be determined accurately, which makes subsequent data analysis results for the data sample set more reliable and effective.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a method and device for detecting abnormal points in a data sample set.
Background art
In a data sample set, if the features of some data samples differ significantly from the features of most other data samples, those data samples are the abnormal points in the data sample set. The preprocessing performed before data analysis usually needs to detect the abnormal points in the data sample set, and the accuracy of abnormal point detection has a very important influence on the accuracy of the data analysis result.
At present, abnormal point detection is largely carried out through the subjective judgment of technical staff. For example, the data samples in a data sample set are displayed as an image by means of data visualization, and technical staff rely on their own subjective experience and knowledge to judge how the features of the data samples in the image differ, so as to identify the abnormal points in the data sample set. However, when the feature differences between data samples are determined mainly by the subjective judgment of technical staff, it is difficult to guarantee that those differences are assessed objectively and scientifically, and therefore difficult to ensure the accuracy of abnormal point detection.
Summary of the invention
The technical problem to be solved by the invention is to provide a method and device for detecting abnormal points in a data sample set, so that the feature difference of each data sample in the data sample set relative to the other data samples can be quantified objectively and scientifically, and the abnormal points in the data sample set can therefore be detected accurately.
In a first aspect, an embodiment of the present invention provides a method for detecting abnormal points in a data sample set, comprising:
obtaining a target data sample in the data sample set;
determining a target exceptional value of the target data sample according to the spatial distance between the target data sample and other data samples and/or the target class cluster in which the target data sample is located in a target cluster analysis result for the data sample set, wherein the other data samples are the data samples in the data sample set other than the target data sample;
determining, according to the target exceptional value of the target data sample, whether the target data sample is an abnormal point in the data sample set.
Optionally, determining the target exceptional value of the target data sample according to the spatial distance between the target data sample and other data samples and/or the target class cluster in which the target data sample is located in the target cluster analysis result for the data sample set comprises:
calculating the spatial distance between the target data sample and each of the other data samples, and calculating a first exceptional value of the target data sample according to those spatial distances;
performing cluster analysis on the data sample set to obtain the target cluster analysis result, and calculating a second exceptional value of the target data sample according to the target class cluster in which the target data sample is located in the target cluster analysis result;
fusing, according to a fusion weight, the first exceptional value and the second exceptional value of the target data sample into the target exceptional value of the target data sample.
Optionally, calculating the first exceptional value of the target data sample according to the spatial distances between the target data sample and each of the other data samples comprises:
determining, from the spatial distance between the target data sample and each of the other data samples, the superposition distance between the target data sample and each of the other data samples;
superimposing the superposition distances between the target data sample and each of the other data samples to obtain the first exceptional value of the target data sample.
Optionally, the superposition distance is specifically a non-linear distance obtained by applying a non-linear mapping to the spatial distance.
Optionally, the spatial distance is specifically a third-order Minkowski distance.
Optionally, performing cluster analysis on the data sample set to obtain the target cluster analysis result comprises:
choosing data samples in the data sample set as initial cluster centers;
taking the initial cluster centers as current cluster centers, and performing cluster analysis on the data sample set using the current cluster centers to obtain a current cluster analysis result;
if an iteration stopping condition is not met, redetermining the current cluster centers according to the current class clusters in the current cluster analysis result, and then returning to the step of performing cluster analysis on the data sample set using the current cluster centers;
if the iteration stopping condition is met, determining the current cluster analysis result as the target cluster analysis result.
Optionally, the first initial cluster center selected from the data sample set is the data sample with the smallest first exceptional value in the data sample set.
Optionally, redetermining the current cluster center according to the current class cluster in the current cluster analysis result comprises:
calculating, according to the current class cluster, the first exceptional value of each data sample in the current class cluster;
calculating the class cluster center of the current class cluster based on the retained data samples in the current class cluster, wherein the first exceptional value of every retained data sample is smaller than the first exceptional values of the data samples in the current class cluster other than the retained data samples;
redetermining the current cluster center so that the class cluster center of the current class cluster serves as the current cluster center.
Optionally, calculating the second exceptional value of the target data sample according to the target class cluster in which the target data sample is located in the target cluster analysis result comprises:
determining, according to the target class cluster, the number of data samples in the target class cluster and the spatial distance between the target data sample and the class cluster center of the target class cluster;
calculating the second exceptional value of the target data sample according to the number of data samples in the target class cluster and the spatial distance between the target data sample and the class cluster center of the target class cluster.
In a second aspect, an embodiment of the present invention further provides a device for detecting abnormal points in a data sample set, comprising:
an obtaining module, configured to obtain a target data sample in the data sample set;
a first determining module, configured to determine a target exceptional value of the target data sample according to the spatial distance between the target data sample and other data samples and/or the target class cluster in which the target data sample is located in a target cluster analysis result for the data sample set, wherein the other data samples are the data samples in the data sample set other than the target data sample;
a second determining module, configured to determine, according to the target exceptional value of the target data sample, whether the target data sample is an abnormal point in the data sample set.
Optionally, the first determining module comprises:
a first computational submodule, configured to calculate the spatial distance between the target data sample and each of the other data samples, and to calculate the first exceptional value of the target data sample according to those spatial distances;
a second computational submodule, configured to perform cluster analysis on the data sample set to obtain the target cluster analysis result, and to calculate the second exceptional value of the target data sample according to the target class cluster in which the target data sample is located in the target cluster analysis result;
a fusion submodule, configured to fuse, according to a fusion weight, the first exceptional value and the second exceptional value of the target data sample into the target exceptional value of the target data sample.
Optionally, the first computational submodule comprises:
a first determination unit, configured to determine, from the spatial distance between the target data sample and each of the other data samples, the superposition distance between the target data sample and each of the other data samples;
a superposition unit, configured to superimpose the superposition distances between the target data sample and each of the other data samples to obtain the first exceptional value of the target data sample.
Optionally, the superposition distance is specifically a non-linear distance obtained by applying a non-linear mapping to the spatial distance.
Optionally, the spatial distance is specifically a third-order Minkowski distance.
Optionally, the second computational submodule comprises:
a selection unit, configured to choose data samples in the data sample set as initial cluster centers;
a cluster unit, configured to take the initial cluster centers as current cluster centers and to perform cluster analysis on the data sample set using the current cluster centers to obtain a current cluster analysis result;
a second determination unit, configured to, if an iteration stopping condition is not met, redetermine the current cluster centers according to the current class clusters in the current cluster analysis result, and then return to performing cluster analysis on the data sample set using the current cluster centers;
a third determination unit, configured to, if the iteration stopping condition is met, determine the current cluster analysis result as the target cluster analysis result.
Optionally, the first initial cluster center selected from the data sample set is the data sample with the smallest first exceptional value in the data sample set.
Optionally, the second determination unit comprises:
a first computation subunit, configured to calculate, according to the current class cluster, the first exceptional value of each data sample in the current class cluster;
a second computation subunit, configured to calculate the class cluster center of the current class cluster based on the retained data samples in the current class cluster, wherein the first exceptional value of every retained data sample is smaller than the first exceptional values of the data samples in the current class cluster other than the retained data samples;
a determination subunit, configured to redetermine the current cluster center so that the class cluster center of the current class cluster serves as the current cluster center.
Optionally, the second computational submodule comprises:
a fourth determination unit, configured to determine, according to the target class cluster, the number of data samples in the target class cluster and the spatial distance between the target data sample and the class cluster center of the target class cluster;
a computing unit, configured to calculate the second exceptional value of the target data sample according to the number of data samples in the target class cluster and the spatial distance between the target data sample and the class cluster center of the target class cluster.
In a third aspect, an embodiment of the present invention further provides equipment for detecting abnormal points in a data sample set, the equipment comprising a processor and a memory:
the memory is configured to store program code and to transfer the program code to the processor;
the processor is configured to execute, according to instructions in the program code, the method for detecting abnormal points in a data sample set provided by the first aspect of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a storage medium, the storage medium being configured to store program code, and the program code being used to execute the method for detecting abnormal points in a data sample set provided by the first aspect of the present invention.
Compared with the prior art, the embodiments of the present invention have the following advantages:
In the embodiments of the present invention, a quantitative calculation is performed on each data sample in the data sample set to obtain the target exceptional value of that data sample, and whether each data sample is an abnormal point in the data sample set can be determined from its target exceptional value. Specifically, the target exceptional value of each data sample may be determined according to the spatial distance between the data sample and the other data samples and/or the target class cluster in which the data sample is located in the target cluster analysis result for the data sample set. In this way, the features of the data samples in the data sample set are quantified accurately and scientifically, which effectively solves the problem that abnormal points detected solely through the subjective judgment of technical staff are not accurate or reliable enough. With the quantified target exceptional value of each data sample, the abnormal points in the data sample set can be determined accurately, which makes subsequent data analysis results for the data sample set more reliable and effective.
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in the present invention, and those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a schematic flowchart of a method for detecting abnormal points in a data sample set provided by an embodiment of the present invention;
Fig. 2 is an example flowchart of one implementation of step 102 provided by an embodiment of the present invention;
Fig. 3 is an example flowchart of another implementation of step 102 provided by an embodiment of the present invention;
Fig. 4 is an example flowchart of a further implementation of step 102 provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a device for detecting abnormal points in a data sample set provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a detection device for abnormal points in a data sample set provided by an embodiment of the present invention.
Detailed description of the embodiments
To enable those skilled in the art to better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
At present, in the preprocessing performed before data analysis, abnormal point detection on the data samples in a data sample set usually depends on the subjective judgment of technical staff. For example, the data samples in the data sample set are shown to technical staff as an image by means of data visualization, and the technical staff then rely on their own subjective experience and knowledge to judge whether each data sample in the image is an abnormal point. However, when abnormal points are detected solely through the subjective judgment of technical staff, it is difficult to guarantee that the abnormal points in the data sample set are determined and excluded objectively and scientifically. The detection of abnormal points is therefore not accurate or reliable enough, which is likely to seriously affect subsequent data analysis results for the data sample set.
To solve the above problem, an embodiment of the present invention provides a method for detecting abnormal points in a data sample set. A quantitative calculation is performed on each data sample in the data sample set to obtain the target exceptional value of that data sample. Specifically, the target exceptional value of each data sample may be determined according to the spatial distance between the data sample and the other data samples and/or the target class cluster in which the data sample is located in the target cluster analysis result for the data sample set. Once the target exceptional value of each data sample has been quantified, whether each data sample is an abnormal point in the data sample set can be determined from its target exceptional value. This effectively solves the problem that abnormal points detected solely through the subjective judgment of technical staff are not accurate or reliable enough: the target exceptional values of the data samples in the data sample set are quantified accurately and scientifically, so the abnormal points in the data sample set can be determined accurately, which in turn makes subsequent data analysis results for the data sample set more reliable and effective.
Various non-limiting embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Refer to Fig. 1, which is a schematic flowchart of a method for detecting abnormal points in a data sample set provided by an embodiment of the present invention. In this embodiment, the method may specifically include the following steps 101 to 103:
Step 101: obtain a target data sample in the data sample set.
It can be understood that a data sample set is a set containing multiple data samples, where each data sample can be expressed as a vector of N-dimensional data (N being a positive integer). For example, a data sample set X includes data samples x1, x2, …, xi, …, where data sample xi can be expressed as: xi = {xi1, xi2, …, xiN}.
In a specific implementation, any one of the data samples in the data sample set can be taken as the target data sample. The target data sample is the object to be detected, and the subsequent steps are executed to detect whether it is an abnormal point. Note that every data sample in the data sample set may need to be treated as an object of abnormal point detection in this embodiment; that is, this embodiment is executed once for each data sample taken as the target data sample.
Step 102: determine the target exceptional value of the target data sample according to the spatial distance between the target data sample and other data samples and/or the target class cluster in which the target data sample is located in the target cluster analysis result for the data sample set, where the other data samples are the data samples in the data sample set other than the target data sample.
It can be understood that the target exceptional value measures how much the features of any one data sample in the data sample set differ from the features of the other data samples in the data sample set. It indicates the degree of abnormality of that data sample and is the quantity on which abnormal point detection is based. For any data sample in the data sample set, a larger target exceptional value indicates a larger difference between the features of that data sample and the features of the other data samples, and hence a higher likelihood that the data sample is an abnormal point; a smaller target exceptional value indicates a smaller difference, and hence a lower likelihood that the data sample is an abnormal point.
In a specific implementation, step 102 has three concrete implementations. In the first implementation, the target exceptional value of the target data sample is determined according to the spatial distance between the target data sample and the other data samples. As an example, this implementation may specifically include: first, calculating the spatial distance between the target data sample and each of the other data samples; then, calculating the first exceptional value of the target data sample according to those spatial distances and using it as the target exceptional value of the target data sample.
In the second implementation, the target exceptional value of the target data sample is determined according to the target class cluster in which the target data sample is located in the target cluster analysis result for the data sample set. As an example, this implementation may specifically include: first, performing cluster analysis on the data sample set to obtain the target cluster analysis result; then, calculating the second exceptional value of the target data sample according to the target class cluster in which the target data sample is located in the target cluster analysis result and using it as the target exceptional value of the target data sample.
In the third implementation, the target exceptional value of the target data sample is determined according to both the spatial distance between the target data sample and the other data samples and the target class cluster in which the target data sample is located in the target cluster analysis result for the data sample set. As an example, this implementation may include: first, obtaining the first exceptional value of the target data sample in the manner shown in the example of the first implementation; then, obtaining the second exceptional value of the target data sample in the manner shown in the example of the second implementation; finally, fusing, according to a fusion weight, the first exceptional value and the second exceptional value of the target data sample into the target exceptional value of the target data sample.
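The fusion step of the third implementation can be illustrated with a small Python sketch. The text only states that the two values are fused "according to a fusion weight"; the convex combination used here, the function name `fuse_exceptional_values`, and the default weight of 0.5 are assumptions made for illustration, not the fusion rule of the invention.

```python
def fuse_exceptional_values(first: float, second: float, weight: float = 0.5) -> float:
    """Fuse the first and second exceptional values into a target exceptional value.

    Assumption: the fusion is a convex combination, with `weight` applied to the
    first (local) value and `1 - weight` applied to the second (whole) value.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * first + (1.0 - weight) * second


# Equal weighting of a first exceptional value 2.0 and a second exceptional value 4.0.
target_value = fuse_exceptional_values(2.0, 4.0, weight=0.5)
```

Under this assumption, a weight of 1 reduces to the first implementation and a weight of 0 reduces to the second. In practice the two values would probably need to be brought to a comparable scale before fusing, which the sketch does not attempt.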
It can be understood that, in this embodiment, the first exceptional value is the exceptional value calculated for any data sample in the data sample set using the first implementation above. It can be called the local exceptional value and is one quantization parameter for abnormal point detection on data samples. The second exceptional value is the exceptional value calculated for any data sample in the data sample set using the second implementation above. It can be called the whole exceptional value and is another quantization parameter for abnormal point detection on data samples. Note that the target exceptional value used to determine whether a data sample is an abnormal point may be the first exceptional value, the second exceptional value, or the exceptional value obtained by fusing the first exceptional value and the second exceptional value.
For the first implementation, the spatial distance is the distance between the target data sample and another data sample. In one case, the spatial distance may be the Euclidean distance. In another case, so that the difference between abnormal data samples and normal data samples shows up more clearly in the spatial distance and the accuracy of abnormal point detection is improved, the spatial distance may be the third-order Minkowski distance.
For example, for data samples x1 and x2, the Euclidean distance between data sample x1 and data sample x2 is calculated as:
$$ d_{12} = \sqrt{\sum_{k=1}^{N} (x_{1k} - x_{2k})^{2}} \qquad (1) $$
The third-order Minkowski distance between data sample x1 and data sample x2 is calculated as:
$$ d_{12} = \left( \sum_{k=1}^{N} \left| x_{1k} - x_{2k} \right|^{3} \right)^{1/3} \qquad (2) $$
It can be understood that, compared with the Euclidean distance, the third-order Minkowski distance reflects more clearly the difference between abnormal data samples and normal data samples in terms of spatial distance, so abnormal points can be detected more accurately.
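As a minimal sketch of the two spatial distances named above, the following Python code computes the Minkowski distance of order p between two N-dimensional data samples, with p = 2 giving the Euclidean distance and p = 3 giving the third-order Minkowski distance. The function names are illustrative and do not come from the original text.

```python
from typing import Sequence


def minkowski_distance(x: Sequence[float], y: Sequence[float], p: float) -> float:
    """Minkowski distance of order p between two samples of equal dimension."""
    if len(x) != len(y):
        raise ValueError("samples must have the same dimension")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def euclidean_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Euclidean distance, i.e. the Minkowski distance of order 2."""
    return minkowski_distance(x, y, 2)


def third_order_minkowski_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Third-order Minkowski distance, i.e. the Minkowski distance of order 3."""
    return minkowski_distance(x, y, 3)


# Distances between two 2-dimensional data samples.
d_euclid = euclidean_distance([0.0, 0.0], [3.0, 4.0])
d_cubic = third_order_minkowski_distance([0.0, 0.0], [3.0, 4.0])
```

Raising the per-dimension differences to the third power gives the largest coordinate difference more influence on the result than squaring does.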
In a specific implementation, as shown in Fig. 2, step 102 may specifically include:
Step 201: calculate the spatial distance between the target data sample and each of the other data samples.
Step 202: determine, from the spatial distance between the target data sample and each of the other data samples, the superposition distance between the target data sample and each of the other data samples.
Step 203: superimpose the superposition distances between the target data sample and each of the other data samples to obtain the first exceptional value of the target data sample.
Here, a superposition distance is a distance that is used for superposition, not the sum obtained after the spatial distances have been superimposed. Superposition distances correspond one-to-one with spatial distances: each spatial distance corresponds to one superposition distance.
As an example, the superposition distance between the target data sample and each of the other data samples can directly take the spatial distance itself between the target data sample and that data sample. In that case, following the approach shown in Fig. 2, the first exceptional value of the target data sample x_i can be calculated as:
$$ L(x_i, X) = \sum_{j=1}^{M} d_{ij} \qquad (3) $$
Here, in one case, M can be the number of other data samples in the data sample set apart from the target data sample; in another case, M can be the number of all data samples in the data sample set. Because the spatial distance between the target data sample and itself is 0, formula (3) gives exactly the same result in both cases.
For example, assume that data sample set X includes data samples x1, x2, x3, x4 and x5, and that the target data sample obtained in step 101 is x1. The spatial distances d12, d13, d14 and d15 between target data sample x1 and data samples x2, x3, x4 and x5 can then be calculated according to formula (1) or formula (2) above, and according to formula (3) above, L(x1, X) = d12 + d13 + d14 + d15.
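The worked example above, where the first exceptional value of x1 is the sum of its spatial distances to x2 through x5, can be sketched in Python as follows. The Euclidean distance is used through the standard-library `math.dist`; the function name `first_exceptional_value` and the sample coordinates are made up for illustration.

```python
import math
from typing import Sequence


def first_exceptional_value(samples: Sequence[Sequence[float]], i: int) -> float:
    """Sum of the Euclidean distances from samples[i] to every other sample."""
    target = samples[i]
    return sum(
        math.dist(target, other)
        for j, other in enumerate(samples)
        if j != i  # the distance from a sample to itself is 0, so it can be skipped
    )


# Four samples close together and one far away (hypothetical data).
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = [first_exceptional_value(samples, i) for i in range(len(samples))]
most_abnormal = scores.index(max(scores))  # index of the sample with the largest value
```

Each data sample takes a turn as the target data sample, and the isolated sample ends up with the largest first exceptional value.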
As another example, so that a large feature difference between the target data sample and the other data samples is reflected more prominently in the first exceptional value of the target data sample, the spatial distance between the target data sample and each of the other data samples can be processed to obtain the corresponding superposition distance. For instance, a non-linear mapping is applied to the spatial distance, and the resulting non-linear distance is used as the superposition distance corresponding to that spatial distance. In this way, a larger spatial distance yields a larger superposition distance and a smaller spatial distance yields a smaller superposition distance, so that the first exceptional value highlights the abnormal state of the data sample. In this case, following the approach shown in Fig. 2, the first exceptional value of the target data sample x_i can be calculated as:
$$ L(x_i, X) = \sum_{j=1}^{M} S(d_{ij}) \qquad (4) $$
Here, S is the non-linear mapping, for which the Sigmoid function can specifically be used, that is:
$$ S(d) = \frac{1}{1 + e^{-d}} $$
This function is an S-shaped growth function that is monotonically increasing, has a monotonically increasing inverse function, and takes values between 0 and 1. The larger the argument of the function (that is, the spatial distance), the closer the dependent variable (that is, the superposition distance) is to 1; conversely, the smaller the argument, the closer the dependent variable is to 0.
For example, assume that the target data sample in data sample set X is x1, and that the spatial distances between target data sample x1 and data samples x2, x3, x4 and x5 are d12, d13, d14 and d15. Applying the non-linear mapping of the Sigmoid function to the spatial distances d12, d13, d14 and d15 gives the corresponding superposition distances S12, S13, S14 and S15, and according to formula (4) above, L(x1, X) = S12 + S13 + S14 + S15.
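The Sigmoid-based variant can be sketched in the same way. The code below applies the standard Sigmoid function to each Euclidean distance before summing; the function names and the sample coordinates are illustrative assumptions. One point worth noting: because spatial distances are never negative, the standard Sigmoid maps them to values from 0.5 upward, so any shifting or scaling of the distances before the mapping, which the text does not specify, is left out here.

```python
import math
from typing import Sequence


def sigmoid(d: float) -> float:
    """Standard Sigmoid function, monotonically increasing with values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-d))


def first_exceptional_value_nonlinear(samples: Sequence[Sequence[float]], i: int) -> float:
    """Sum of Sigmoid-mapped Euclidean distances from samples[i] to the other samples."""
    target = samples[i]
    return sum(
        sigmoid(math.dist(target, other))  # superposition distance for one spatial distance
        for j, other in enumerate(samples)
        if j != i
    )


# Four samples close together and one far away (hypothetical data).
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = [first_exceptional_value_nonlinear(samples, i) for i in range(len(samples))]
most_abnormal = scores.index(max(scores))
```

Since every superposition distance is below 1, the first exceptional value computed this way is bounded by the number of other data samples, whatever the scale of the raw distances.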
In this way, the first exceptional value of the target data sample can be determined from the spatial distances between the target data sample and the other data samples in the data sample set. As a quantification of the local degree of abnormality of the target data sample, it objectively reflects how likely the target data sample is to be an abnormal point and overcomes the subjectivity of abnormal point detection in a data sample set. The first implementation therefore provides a scientific and reliable strategy for assessing abnormal points.
For the second implementation, referring to Fig. 3, step 102 may specifically include:
Step 301, perform cluster analysis on the data sample set to obtain a target cluster analysis result;
Step 302, calculate the second exceptional value of the target data sample according to the target class cluster in which the target data sample is located in the target cluster analysis result.
It can be understood that cluster analysis refers to the process of grouping physical or abstract data samples so that similar data samples form multiple class clusters. In this embodiment, cluster analysis may be performed on the data sample set using, for example, the K-means++ clustering algorithm. After cluster analysis, the data samples in the data sample set are divided into multiple class clusters; these class clusters, together with the related information of each class cluster, are recorded as the target cluster analysis result.
Taking the K-means++ clustering algorithm as an example, step 301 may be implemented as follows. First, initial cluster centers are selected from the data sample set: a first initial cluster center is chosen, the spatial distance between it and each data sample is calculated, and a data sample with a larger spatial distance is selected as the second initial cluster center using, for example, a roulette wheel mechanism, and so on until K cluster centers have been selected. Second, for each data sample in the data sample set, its spatial distance to each of the K initial cluster centers is calculated, and the data sample is assigned to the class cluster corresponding to the initial cluster center nearest to it. Third, for each class cluster, the cluster center of that class cluster is recalculated. Fourth, it is judged whether the iteration stopping condition is met; if so, the class clusters obtained by this round of cluster analysis and their related information are taken as the cluster analysis result; otherwise, the second and third steps are repeated until the iteration stopping condition is met.
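The roulette wheel mechanism mentioned in the first step can be sketched as follows. This is an illustrative sketch only: the function name `roulette_pick` is hypothetical, and the choice of weights (for example, each sample's spatial distance to the nearest center already chosen) is an assumption, since the text does not specify them.

```python
import random


def roulette_pick(weights, rng):
    """Return an index chosen with probability proportional to its weight."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    r = rng.random() * total  # a point on the wheel, in [0, total)
    running = 0.0
    for index, weight in enumerate(weights):
        running += weight
        if r < running:
            return index
    return len(weights) - 1  # guard against floating-point round-off


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical distances of four samples to the centers chosen so far.
    distances = [0.0, 1.0, 1.0, 14.0]
    print(roulette_pick(distances, rng))
```

A sample with zero weight (for instance, a sample that is already a center) can never be picked, while distant samples are picked more often without being picked every time.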
In a specific implementation, step 301 may include:
Step 3011, select data samples from the data sample set as initial cluster centers.
As an example, when selecting initial cluster centers, one data sample may first be selected at random from the data sample set as the first initial cluster center; data samples far from the initial cluster centers already selected are then chosen from the data sample set as the other initial cluster centers.
As another example, to make the selected initial cluster centers more suitable, so that cluster analysis uses fewer computing resources and converges faster, the first initial cluster center selected from the data sample set may be the data sample with the smallest first exceptional value in the data sample set; data samples far from the initial cluster centers already selected are then chosen from the data sample set as the other initial cluster centers. Because the first selected data sample has the smallest first exceptional value, its overall spatial distance to the other data samples in the data sample set is the smallest; in other words, it lies in the region where data samples are most densely distributed. An initial cluster center chosen in this way can effectively reduce the number of subsequent cluster analysis iterations.
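The selection rule of this second example can be sketched as follows. The sketch makes two assumptions that are not in the text: the remaining centers are chosen deterministically as the sample farthest from its nearest chosen center (the text only says "far from"), and Euclidean distance stands in for the spatial distance. The function name and parameters are hypothetical.

```python
import math


def choose_initial_centers(samples, first_values, k):
    """Return indices of k initial centers.

    The first center is the sample with the smallest first exceptional value;
    each further center is the sample farthest from its nearest chosen center.
    """
    if not 1 <= k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")
    chosen = [min(range(len(samples)), key=lambda i: first_values[i])]
    while len(chosen) < k:
        remaining = [i for i in range(len(samples)) if i not in chosen]
        farthest = max(
            remaining,
            key=lambda i: min(math.dist(samples[i], samples[c]) for c in chosen),
        )
        chosen.append(farthest)
    return chosen


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
    vals = [1.0, 2.0, 2.0, 3.0, 3.5]  # hypothetical first exceptional values
    print(choose_initial_centers(pts, vals, 2))
```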
Step 3012, take the initial cluster centers as the current cluster centers, and perform cluster analysis on the data sample set using the current cluster centers to obtain a current cluster analysis result.
It can be understood that the current cluster analysis result may include multiple current class clusters, each formed around one of the current cluster centers. Each current class cluster may include multiple data samples, and the abnormal points in the data sample set may all lie in one current class cluster or be spread across several. In one case, if a current class cluster contains both normal data samples and abnormal points, the features of an abnormal point differ markedly from those of the other data samples in that class cluster, so the abnormal point is relatively far from the class cluster center of that class cluster. In another case, if a current class cluster contains only a few data samples, all data samples in it may be abnormal points; for example, if a current class cluster contains only one data sample, that data sample can be regarded as an abnormal point in the data sample set.
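The assignment performed in step 3012 can be sketched as follows. This is a minimal sketch assuming Euclidean distance as the spatial distance; the function name is hypothetical.

```python
import math


def assign_to_clusters(samples, centers):
    """Group sample indices by the nearest current cluster center."""
    clusters = [[] for _ in centers]
    for index, sample in enumerate(samples):
        nearest = min(
            range(len(centers)),
            key=lambda c: math.dist(sample, centers[c]),
        )
        clusters[nearest].append(index)
    return clusters


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (50.0, 50.0)]
    ctrs = [(0.0, 0.0), (10.0, 10.0), (50.0, 50.0)]
    print(assign_to_clusters(pts, ctrs))
```

A cluster that ends up with a single member corresponds to the second case described above, where the lone data sample can be regarded as an abnormal point.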
Step 3013, judge whether the iteration stopping condition is met; if not, execute step 3014; otherwise, execute step 3015.
It can be understood that the iteration stopping condition indicates whether the cluster analysis of the data sample set is complete. If it is met, no further iteration of cluster analysis is performed on the data sample set; if it is not met, new current cluster centers must be determined and step 3012 executed again. The iteration stopping condition may include, but is not limited to, the following: in one case, the number of cluster analysis iterations reaches a preset number; in another case, when the current cluster analysis result is compared with the result of the previous cluster analysis, the mean of the spatial distances of the data samples is less than a preset threshold.
It should be noted that the iteration stopping condition is checked each time a current cluster analysis result is obtained; once it is met, the cluster analysis of the data sample set is regarded as complete.
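The two stopping criteria can be sketched as a single check. The second criterion is worded loosely in the text; this sketch reads it as the mean displacement of the cluster centers between two consecutive iterations, which is only one possible interpretation. The function name, parameters and the use of Euclidean distance are assumptions.

```python
import math


def should_stop(iteration, max_iterations, previous_centers, centers, tolerance):
    """Return True when either assumed stopping criterion is met."""
    # Criterion 1: the preset number of iterations has been reached.
    if iteration >= max_iterations:
        return True
    # No previous result to compare with on the first pass.
    if previous_centers is None:
        return False
    # Criterion 2 (assumed reading): mean movement of the centers is small.
    mean_shift = sum(
        math.dist(old, new) for old, new in zip(previous_centers, centers)
    ) / len(centers)
    return mean_shift < tolerance


if __name__ == "__main__":
    print(should_stop(3, 100, [(0.0, 0.0)], [(0.0, 0.0005)], 1e-3))
```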
Step 3014, redetermine the current cluster centers according to the current class clusters in the current cluster analysis result, and return to step 3012.
It can be understood that, in the current cluster analysis result obtained in step 3012 with the initial cluster centers as the current cluster centers, each current class cluster includes multiple data samples, so the centroid of a current class cluster (that is, its class cluster center) is often not the current cluster center that corresponds to it. The current cluster centers therefore need to be redetermined for the next iteration; for example, the class cluster center of each current class cluster, which better reflects the features of that class cluster, may be taken as the redetermined current cluster center.
As an example, redetermining the current cluster centers in step 3014 may specifically include:
S1, calculate, according to the current class cluster, the first exceptional value of each data sample in the current class cluster.
In a specific implementation, each data sample in the data sample set may be taken in turn as the target data sample, and its first exceptional value calculated in the manner shown in Fig. 2 above, specifically using formula (3) or formula (4).
It can be understood that, whether the abnormal points in the data sample set lie in one current class cluster or in several, abnormal points and normal data samples can be accurately distinguished: a data sample that is more likely to be abnormal yields a larger first exceptional value, and a data sample that is less likely to be abnormal yields a correspondingly smaller one.
S2, calculate the class cluster center of the current class cluster based on the retained data samples in the current class cluster, where the first exceptional value of every retained data sample is less than the first exceptional value of every data sample in the current class cluster other than the retained data samples.
It can be understood that, in a current class cluster made up mostly of normal data samples, the abnormal points should not influence the current cluster center that is redetermined for the next round of cluster analysis. To this end, part of the data samples in the current class cluster may be taken as retained data samples, and the class cluster center of the current class cluster recalculated from them alone.
The retained data samples should cover as many of the normal data samples in the current class cluster as possible while excluding its abnormal points. Accordingly, the data samples with smaller first exceptional values are kept as retained data samples and those with larger first exceptional values are rejected. Moreover, so that cluster analysis based on the retained data samples still converges quickly and effectively, the number of retained data samples must not be less than 90 percent of the number of data samples in the current class cluster.
As an example, the data samples in the current class cluster may be sorted by first exceptional value in ascending order, and the first 90% taken as retained data samples. For example, suppose the current class cluster C includes data samples x1, x2, ……, x100, and sorting them by first exceptional value in ascending order gives x100, x99, ……, x2, x1. The retained data samples are then x100, x99, ……, x12, x11, and the rejected data samples are x1 to x10.
It should be noted that 90 percent is a preset lower bound: the proportion of data samples retained in the current class cluster may be set as required, but it must be greater than or equal to 90 percent.
The class cluster center of the current class cluster is calculated from the retained data samples according to the following formula:
$$\mu_i = \frac{1}{N_i} \sum_{x \in C_i'} x \qquad \text{formula (5)}$$
where $\mu_i$ is the class cluster center of the i-th current class cluster, $C_i$ is the set of data samples included in the i-th current class cluster, $C_i' \subseteq C_i$ is the set of retained data samples in the i-th current class cluster, and $N_i$ is the number of retained data samples in the i-th current class cluster.
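S2 can be sketched as a trimmed mean over the retained data samples. The function name and the `keep_percent` parameter are hypothetical; integer arithmetic is used so that the retained count never falls below the requested proportion.

```python
def retained_center(cluster, first_values, keep_percent=90):
    """Mean of the samples with the smallest first exceptional values.

    cluster: list of equal-length coordinate tuples in one current class cluster.
    first_values: first exceptional value of each sample, in the same order.
    keep_percent: proportion retained; the text requires at least 90.
    """
    if not cluster:
        raise ValueError("cluster must not be empty")
    if not 90 <= keep_percent <= 100:
        raise ValueError("keep_percent must be between 90 and 100")
    n = len(cluster)
    n_keep = (keep_percent * n + 99) // 100  # ceiling of keep_percent% of n
    order = sorted(range(n), key=lambda i: first_values[i])
    kept = [cluster[i] for i in order[:n_keep]]
    dims = len(kept[0])
    return tuple(sum(p[d] for p in kept) / len(kept) for d in range(dims))


if __name__ == "__main__":
    pts = [(float(i), 0.0) for i in range(9)] + [(1000.0, 0.0)]
    vals = [float(i) for i in range(10)]  # the far point has the largest value
    print(retained_center(pts, vals))
```

With the ten samples above, the distant point is rejected and the center is computed from the remaining nine, so it is not dragged toward the abnormal point.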
S3, redetermine the current cluster centers so that the class cluster center of each current class cluster serves as a current cluster center.
It can be understood that S1 and S2 above yield the class cluster center of each current class cluster in the data sample set; taking these class cluster centers as the new current cluster centers completes the redetermination of the current cluster centers for the data sample set.
It should be noted that each time new current cluster centers have been determined, they are fed back to step 3012; step 3012 is executed, followed by the judgment in step 3013, and so on until the iteration stopping condition is met. The cluster analysis of the data sample set is then regarded as complete, and step 3015 below can be executed.
Step 3015, determine the current cluster analysis result as the target cluster analysis result.
It can be understood that when the current cluster analysis result obtained in step 3012 meets the iteration stopping condition (for example, the number of cluster analysis iterations reaches the preset number, or, compared with the result of the previous cluster analysis, the mean of the spatial distances of the data samples is less than the preset threshold), no further iteration of cluster analysis is needed for the data sample set. The current cluster analysis result is taken directly as the target cluster analysis result, which provides the data basis for the subsequent calculation of the second exceptional value of the target data sample.
With the specific implementation of step 301 described, step 302 then calculates the second exceptional value of the target data sample according to the target class cluster in which the target data sample is located in the target cluster analysis result determined in step 301.
As an example, step 302 may specifically include:
Step 3021, according to the target class cluster, determine the number of data samples in the target class cluster and the spatial distance between the target data sample and the class cluster center of the target class cluster;
Step 3022, calculate the second exceptional value of the target data sample according to the number of data samples in the target class cluster and the spatial distance between the target data sample and the class cluster center of the target class cluster.
In a specific implementation, the second exceptional value of each data sample may be determined from the following two factors: first, the number of data samples included in the current class cluster to which the data sample belongs; second, the spatial distance between the data sample and the class cluster center of the current class cluster to which it belongs.
As an example, the calculation formula in step 3022 is as follows:
where C is the current class cluster to which xi belongs, the formula uses the normalized number of data samples included in the current class cluster C, and ρ(xi, C) reflects the spatial distance between xi and the class cluster center of the current class cluster C. It can be understood that, in one case, ρ(xi, C) may be the spatial distance itself between xi and the class cluster center of the current class cluster C. In another case, to reflect the second exceptional value of each data sample more intuitively, ρ(xi, C) may be a rank coefficient: the rank of the spatial distance between xi and the class cluster center of C among the spatial distances of all data samples in C to that class cluster center, which indicates the position of xi within C. Specifically, ρ(xi, C) may be obtained by normalizing this rank, so that its value ranges from 0 to 1. The closer ρ(xi, C) is to 0, the nearer the data sample is to the class cluster center of the current class cluster C and the earlier it ranks; conversely, the closer ρ(xi, C) is to 1, the farther the data sample is from the class cluster center of the current class cluster C and the later it ranks.
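The rank-coefficient form of ρ(xi, C) can be sketched as follows. The normalization used here, the zero-based rank divided by the cluster size minus one, is an assumption chosen so that values run from 0 to 1; the text does not state the exact normalization. Euclidean distance again stands in for the spatial distance, and the function name is hypothetical.

```python
import math


def rank_coefficients(cluster, center):
    """Normalized rank of each sample's distance to the class cluster center.

    Returns one value per sample: 0.0 for the nearest sample and 1.0 for the
    farthest (a single-sample cluster yields 0.0).
    """
    n = len(cluster)
    distances = [math.dist(point, center) for point in cluster]
    order = sorted(range(n), key=lambda i: distances[i])
    rho = [0.0] * n
    for rank, index in enumerate(order):
        rho[index] = rank / (n - 1) if n > 1 else 0.0
    return rho


if __name__ == "__main__":
    print(rank_coefficients([(0.0, 0.0), (3.0, 0.0), (1.0, 0.0)], (0.0, 0.0)))
```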
As an example, suppose the data sample set includes 100 data samples and, in the target cluster analysis result, target data sample x1 is located in target class cluster C1, which includes data samples x1, x2, ……, x5. According to step 3021, the number of data samples in the target class cluster is 5 and the spatial distance between target data sample x1 and the class cluster center of target class cluster C1 is dx; the second exceptional value of target data sample x1 can then be calculated according to formula (6) above as:
In this way, cluster analysis is performed on the data sample set to obtain the target cluster analysis result, and the second exceptional value of the target data sample is calculated according to the target class cluster in which the target data sample is located in the target cluster analysis result. The second exceptional value quantifies the global degree of abnormality of the target data sample and objectively reflects how likely the target data sample is to be an abnormal point, which overcomes the subjectivity of abnormal point detection in a data sample set. The second implementation therefore provides a scientific and reliable strategy for assessing abnormal points.
For the third implementation, to reflect the degree of abnormality of the target data sample in the data sample set more comprehensively, the first and second implementations above may be combined and the two exceptional values fused. The likelihood that the target data sample is an abnormal point is then analyzed from both the local and the global perspective, giving a more comprehensive and scientific target exceptional value for the target data sample.
In a specific implementation, referring to Fig. 4, step 102 may specifically include:
Step 401, calculate the spatial distance between the target data sample and each of the other data samples, and calculate the first exceptional value of the target data sample according to these spatial distances.
Step 402, perform cluster analysis on the data sample set to obtain the target cluster analysis result, and calculate the second exceptional value of the target data sample according to the target class cluster in which the target data sample is located in the target cluster analysis result.
Step 403, fuse the first exceptional value and the second exceptional value of the target data sample into the target exceptional value of the target data sample according to a fusion weight.
It can be understood that step 401 may be implemented as described for the first implementation shown in Fig. 2, and step 402 as described for the second implementation shown in Fig. 3; the details are not repeated here.
It can be understood that the fusion weight may be an empirical value obtained by a technician through statistics and analysis of data from multiple experiments. It is used to fuse the first exceptional value and the second exceptional value of the target data sample, and the resulting target exceptional value reflects the degree of abnormality of the target data sample.
As an example, step 403 may specifically be calculated according to the following formula:
$$I(x_i, X) = L(x_i, X) + \alpha\, G(x_i, X) \qquad \text{formula (7)}$$
where $L(x_i, X)$ is the first exceptional value of target data sample $x_i$, $G(x_i, X)$ is the second exceptional value of target data sample $x_i$, and $\alpha$ is the fusion weight.
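Formula (7) translates directly into code. In this sketch the function name is hypothetical and the numeric inputs are made up purely for illustration.

```python
def fuse_exceptional_values(first_value, second_value, alpha):
    """Target exceptional value I = L + alpha * G, as in formula (7)."""
    return first_value + alpha * second_value


if __name__ == "__main__":
    # Hypothetical values: L = 2.0, G = 0.5, fusion weight alpha = 4.0.
    print(fuse_exceptional_values(2.0, 0.5, 4.0))
```

Since the first exceptional value is a sum over all other samples while the second depends on a single cluster, the fusion weight also serves to bring the two terms onto a comparable scale.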
In this way, by combining the first and second implementations above and fusing the two exceptional values, a more comprehensive and scientific target exceptional value is obtained for the target data sample, which overcomes the subjectivity of abnormal point detection in a data sample set. The third implementation therefore provides a scientific, comprehensive, accurate and objective strategy for assessing abnormal points.
Through any of the three specific implementations of step 102 above, the target exceptional value of the target data sample can be determined, which provides the numerical basis for judging whether the target data sample in the data sample set is an abnormal point.
Step 103, determine whether the target data sample is an abnormal point in the data sample set according to the target exceptional value of the target data sample.
It can be understood that the criterion for determining whether the target data sample is an abnormal point in the data sample set can be set flexibly. In a specific implementation, the criterion includes, but is not limited to, the following two approaches:
In one example, step 103 may be completed using a preset number of abnormal points to be detected. In a specific implementation: first, each data sample in the data sample set is taken as the target data sample, and the target exceptional value of each target data sample is obtained; second, the target exceptional values are sorted in ascending or descending order; third, among the sorted target data samples, those with the largest target exceptional values, up to the preset number of abnormal points to be detected, are taken as abnormal points. Thus, if the target exceptional value of a target data sample does not rank among the largest preset number, the target data sample is determined not to be an abnormal point in the data sample set; otherwise, the target data sample is determined to be an abnormal point in the data sample set.
In another example, step 103 may be completed using a preset exceptional value threshold. In a specific implementation, it is judged whether the target exceptional value of the target data sample is less than the preset exceptional value threshold; if so, the target data sample is determined not to be an abnormal point in the data sample set; otherwise, the target data sample is determined to be an abnormal point in the data sample set.
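The two decision rules described for step 103 can be sketched side by side. The function names are hypothetical, and the target exceptional values in the usage example are made up for illustration.

```python
def abnormal_by_count(target_values, count):
    """Indices of the `count` samples with the largest target exceptional values."""
    order = sorted(
        range(len(target_values)),
        key=lambda i: target_values[i],
        reverse=True,
    )
    return set(order[:count])


def abnormal_by_threshold(target_values, threshold):
    """Indices of samples whose target exceptional value is not below the threshold."""
    return {i for i, value in enumerate(target_values) if value >= threshold}


if __name__ == "__main__":
    values = [0.1, 5.0, 0.3, 4.0]
    print(abnormal_by_count(values, 2))
    print(abnormal_by_threshold(values, 4.5))
```

The count-based rule always reports a fixed number of abnormal points, whereas the threshold-based rule may report none or many, depending on the data.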
It can be seen that, in the embodiments of the present invention, a quantitative calculation is performed on each data sample in the data sample set to obtain the target exceptional value of each data sample, and whether each data sample is an abnormal point in the data sample set can be determined from its target exceptional value. Specifically, the target exceptional value of each data sample may be determined according to the spatial distance between that data sample and the other data samples and/or the target class cluster in which the data sample is located in the target cluster analysis result for the data sample set. In this way, the features of the data samples in the data sample set are quantified accurately and scientifically, which effectively solves the problem that abnormal points detected solely by a technician's subjective judgment are insufficiently accurate and reliable. With the quantified target exceptional value of each data sample, the abnormal points in the data sample set can be determined accurately, making subsequent data analysis results for the data sample set more reliable and effective.
Correspondingly, an embodiment of the present invention further provides a device for detecting abnormal points in a data sample set. As shown in Fig. 5, the device may specifically include:
an obtaining module 501, configured to obtain the target data sample in the data sample set;
a first determining module 502, configured to determine the target exceptional value of the target data sample according to the spatial distance between the target data sample and other data samples and/or the target class cluster in which the target data sample is located in the target cluster analysis result for the data sample set, where the other data samples are the data samples in the data sample set other than the target data sample;
a second determining module 503, configured to determine, according to the target exceptional value of the target data sample, whether the target data sample is an abnormal point in the data sample set.
Optionally, the first determining module 502 may specifically include:
a first calculation submodule, configured to calculate the spatial distance between the target data sample and each of the other data samples, and to calculate the first exceptional value of the target data sample according to these spatial distances;
a second calculation submodule, configured to perform cluster analysis on the data sample set to obtain the target cluster analysis result, and to calculate the second exceptional value of the target data sample according to the target class cluster in which the target data sample is located in the target cluster analysis result;
a fusion submodule, configured to fuse the first exceptional value and the second exceptional value of the target data sample into the target exceptional value of the target data sample according to a fusion weight.
Optionally, the first calculation submodule includes:
a first determining unit, configured to determine the superposition distance between the target data sample and each of the other data samples from the spatial distance between the target data sample and each of the other data samples;
a superposition unit, configured to sum the superposition distances between the target data sample and each of the other data samples to obtain the first exceptional value of the target data sample.
Optionally, the superposition distance is specifically a nonlinear distance obtained by applying a nonlinear mapping to the spatial distance.
Optionally, the spatial distance is specifically a third-order Minkowski distance.
Optionally, the second calculation submodule includes:
a selection unit, configured to select data samples from the data sample set as initial cluster centers;
a clustering unit, configured to take the initial cluster centers as current cluster centers and perform cluster analysis on the data sample set using the current cluster centers to obtain a current cluster analysis result;
a second determining unit, configured to, if the iteration stopping condition is not met, redetermine the current cluster centers according to the current class clusters in the current cluster analysis result, and then return to performing cluster analysis on the data sample set using the current cluster centers;
a third determining unit, configured to, if the iteration stopping condition is met, determine the current cluster analysis result as the target cluster analysis result.
Optionally, the first initial cluster center selected from the data sample set is the data sample with the smallest first exceptional value in the data sample set.
Optionally, the second determining unit includes:
a first calculation subunit, configured to calculate, according to the current class cluster, the first exceptional value of each data sample in the current class cluster;
a second calculation subunit, configured to calculate the class cluster center of the current class cluster based on the retained data samples in the current class cluster, where the first exceptional value of every retained data sample is less than the first exceptional value of every data sample in the current class cluster other than the retained data samples;
a determining subunit, configured to redetermine the current cluster centers so that the class cluster center of the current class cluster serves as the current cluster center.
Optionally, the second calculation submodule includes:
a fourth determining unit, configured to determine, according to the target class cluster, the number of data samples in the target class cluster and the spatial distance between the target data sample and the class cluster center of the target class cluster;
a calculation unit, configured to calculate the second exceptional value of the target data sample according to the number of data samples in the target class cluster and the spatial distance between the target data sample and the class cluster center of the target class cluster.
The above is a description of the device for detecting abnormal points in a data sample set; for its specific implementation and effects, refer to the description of the embodiment of the method for detecting abnormal points in a data sample set shown in Fig. 1, which is not repeated here.
In addition, an embodiment of the present invention further provides a device for detecting abnormal points in a data sample set. As shown in Fig. 6, the device includes a processor 601 and a memory 602:
the memory 602 is configured to store program code and transmit the program code to the processor 601;
the processor 601 is configured to execute, according to instructions in the program code, the method for detecting abnormal points in a data sample set provided by the embodiment shown in Fig. 1.
For the specific implementation and effects of this device, refer to the description of the embodiment of the method for detecting abnormal points in a data sample set shown in Fig. 1, which is not repeated here.
In addition, an embodiment of the present invention further provides a storage medium configured to store program code, the program code being used to execute the method for detecting abnormal points in a data sample set provided by the embodiment shown in Fig. 1.
It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. The terms "include", "comprise" or any other variant are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes the element.
As the device embodiments basically correspond to the method embodiments, refer to the description of the method embodiments for related details. The method and device embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
The above are only specific embodiments of the present application. It should be noted that those of ordinary skill in the art may make several improvements and modifications without departing from the principle of the present application, and such improvements and modifications shall also be regarded as falling within the protection scope of the present application.
Claims (10)
1. A method for detecting abnormal points in a data sample set, characterized by comprising:
obtaining a target data sample in the data sample set;
determining a target exceptional value of the target data sample according to the spatial distance between the target data sample and other data samples and/or the target class cluster in which the target data sample is located in a target cluster analysis result for the data sample set, wherein the other data samples are the data samples in the data sample set other than the target data sample;
determining, according to the target exceptional value of the target data sample, whether the target data sample is an abnormal point in the data sample set.
2. The detection method according to claim 1, characterized in that determining the target exceptional value
of the target data sample according to the spatial distance between the target data sample and other data
samples and/or the target class cluster in which the target data sample is located in the target cluster
analysis result for the data sample set comprises:
calculating the spatial distance between the target data sample and each of the other data samples, and
calculating a first exceptional value of the target data sample according to those spatial distances;
performing cluster analysis on the data sample set to obtain the target cluster analysis result, and
calculating a second exceptional value of the target data sample according to the target class cluster in
which the target data sample is located in the target cluster analysis result; and
fusing, according to a fusion weight, the first exceptional value and the second exceptional value of the
target data sample into the target exceptional value of the target data sample.
3. The detection method according to claim 2, characterized in that calculating the first exceptional value
of the target data sample according to the spatial distance between the target data sample and each of the
other data samples comprises:
determining, from the spatial distance between the target data sample and each of the other data samples, a
superimposed distance between the target data sample and each of the other data samples; and
summing the superimposed distances between the target data sample and each of the other data samples to
obtain the first exceptional value of the target data sample.
4. The detection method according to claim 2, characterized in that performing cluster analysis on the data
sample set to obtain the target cluster analysis result comprises:
selecting, from the data sample set, data samples to serve as initial cluster centers;
taking the initial cluster centers as current cluster centers, and performing cluster analysis on the data
sample set using the current cluster centers to obtain a current cluster analysis result;
if an iteration stopping condition is not satisfied, redefining the current cluster centers according to the
current class clusters in the current cluster analysis result, and then returning to the step of performing
cluster analysis on the data sample set using the current cluster centers; and
if the iteration stopping condition is satisfied, determining the current cluster analysis result to be the
target cluster analysis result.
5. The detection method according to claim 4, characterized in that the first initial cluster center selected
from the data sample set is the data sample with the smallest first exceptional value in the data sample set.
6. The detection method according to claim 4, characterized in that redefining the current cluster center
according to the current class cluster in the current cluster analysis result comprises:
calculating, according to the current class cluster, the first exceptional value of each data sample in the
current class cluster;
calculating the class cluster center of the current class cluster based on retained data samples in the
current class cluster, wherein the first exceptional value of each retained data sample is less than the
first exceptional value of every data sample in the current class cluster other than the retained data
samples; and
redefining the current cluster center so that the class cluster center of the current class cluster serves
as the current cluster center.
7. The detection method according to claim 2, characterized in that calculating the second exceptional value
of the target data sample according to the target class cluster in which the target data sample is located in
the target cluster analysis result comprises:
determining, according to the target class cluster, the number of data samples in the target class cluster
and the spatial distance between the target data sample and the class cluster center of the target class
cluster; and
calculating the second exceptional value of the target data sample according to the number of data samples
in the target class cluster and the spatial distance between the target data sample and the class cluster
center of the target class cluster.
8. A device for detecting abnormal points in a data sample set, characterized by comprising:
an obtaining module, configured to obtain a target data sample in the data sample set;
a first determining module, configured to determine a target exceptional value of the target data sample
according to the spatial distance between the target data sample and other data samples and/or the target
class cluster in which the target data sample is located in a target cluster analysis result for the data
sample set, wherein the other data samples are the data samples in the data sample set other than the target
data sample; and
a second determining module, configured to determine, according to the target exceptional value of the target
data sample, whether the target data sample is an abnormal point in the data sample set.
9. A device for detecting abnormal points in a data sample set, the device comprising a processor and a
memory, wherein:
the memory is configured to store program code and to transmit the program code to the processor; and
the processor is configured to execute, according to instructions in the program code, the method for
detecting abnormal points in a data sample set according to any one of claims 1 to 7.
10. A storage medium, configured to store program code, wherein the program code is used to execute the
method for detecting abnormal points in a data sample set according to any one of claims 1 to 7.
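The steps recited in claims 2 to 7 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The claims do not fix the distance metric, the form of the superimposed distance, the formula for the second exceptional value, the iteration stopping condition, the fusion formula, or the decision threshold, so every such choice below is an assumption and is marked as one in the comments. The names `dist`, `first_values`, `assign`, `trimmed_center`, `cluster`, `detect`, `keep_ratio`, `weight`, and `threshold` are hypothetical.

```python
import math


def dist(a, b):
    # ASSUMPTION: Euclidean distance stands in for the "spatial distance".
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def first_values(points):
    # Claim 3 sketch: sum each sample's distances to all other samples.
    # ASSUMPTION: the "superimposed distance" is the plain distance itself.
    return [sum(dist(p, q) for j, q in enumerate(points) if j != i)
            for i, p in enumerate(points)]


def assign(points, centers):
    # Label every sample with the index of its nearest center.
    return [min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            for p in points]


def trimmed_center(members, keep_ratio):
    # Claim 6 sketch: average only the retained samples, i.e. those with the
    # smallest first exceptional values computed inside the cluster.
    # ASSUMPTION: a fixed fraction keep_ratio of the cluster is retained.
    scores = first_values(members)
    order = sorted(range(len(members)), key=lambda i: scores[i])
    n_keep = max(1, math.ceil(keep_ratio * len(members)))
    kept = [members[i] for i in order[:n_keep]]
    return tuple(sum(axis) / len(kept) for axis in zip(*kept))


def cluster(points, k, keep_ratio=0.8, max_iter=20):
    first = first_values(points)
    # Claim 5: the first initial center has the smallest first exceptional value.
    centers = [points[first.index(min(first))]]
    # ASSUMPTION: the remaining initial centers are picked farthest-first.
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = assign(points, centers)
    # ASSUMPTION: stop when labels no longer change or after max_iter rounds.
    for _ in range(max_iter):
        centers = [
            trimmed_center([p for p, l in zip(points, labels) if l == c], keep_ratio)
            if c in labels else centers[c]  # keep the old center of an empty cluster
            for c in range(len(centers))
        ]
        new_labels = assign(points, centers)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centers


def detect(points, k=2, weight=0.5, threshold=0.5):
    first = first_values(points)
    labels, centers = cluster(points, k)
    sizes = [labels.count(c) for c in range(len(centers))]
    # Claim 7 sketch. ASSUMPTION: the second value grows with the distance to
    # the cluster center and shrinks with the cluster size.
    second = [(1.0 + dist(p, centers[l])) / sizes[l] for p, l in zip(points, labels)]
    # ASSUMPTION: both values are scaled by their maximum before fusion.
    f_max, s_max = max(first) or 1.0, max(second) or 1.0
    target = [weight * f / f_max + (1.0 - weight) * s / s_max
              for f, s in zip(first, second)]
    # ASSUMPTION: a sample is an abnormal point when its fused value exceeds threshold.
    return [t > threshold for t in target]
```

For example, calling `detect` on the four corners of the unit square, the point (0.5, 0.5), and the distant point (10, 10), with the default parameters, flags only the distant point. Different assumed formulas or thresholds would change which samples are flagged.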
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811069817.9A CN109101661A (en) | 2018-09-13 | 2018-09-13 | The detection method and device of abnormal point in a kind of data sample set |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811069817.9A CN109101661A (en) | 2018-09-13 | 2018-09-13 | The detection method and device of abnormal point in a kind of data sample set |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109101661A (en) | 2018-12-28 |
Family
ID=64866275
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811069817.9A Pending CN109101661A (en) | 2018-09-13 | 2018-09-13 | The detection method and device of abnormal point in a kind of data sample set |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109101661A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111079653A (en) * | 2019-12-18 | 2020-04-28 | 中国工商银行股份有限公司 | Automatic database sorting method and device |
CN111612055A (en) * | 2020-05-15 | 2020-09-01 | 北京中科三清环境技术有限公司 | Weather situation typing method, air pollution condition prediction method and device |
CN112052057A (en) * | 2020-08-12 | 2020-12-08 | 北京科技大学 | Data visualization method and system for optimizing color chart based on spring model |
CN112542026A (en) * | 2020-12-03 | 2021-03-23 | 武汉理工大学 | Multifunctional health index detection cloud system |
CN116416577A (en) * | 2023-05-06 | 2023-07-11 | 苏州开普岩土工程有限公司 | Abnormality identification method for construction monitoring system |
CN117595464A (en) * | 2024-01-18 | 2024-02-23 | 深圳创芯技术股份有限公司 | Battery charger charging detection control method and system |
2018-09-13: CN application CN201811069817.9A (CN109101661A), status Pending
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111079653A (en) * | 2019-12-18 | 2020-04-28 | 中国工商银行股份有限公司 | Automatic database sorting method and device |
CN111079653B (en) * | 2019-12-18 | 2024-03-22 | 中国工商银行股份有限公司 | Automatic database separation method and device |
CN111612055A (en) * | 2020-05-15 | 2020-09-01 | 北京中科三清环境技术有限公司 | Weather situation typing method, air pollution condition prediction method and device |
CN112052057A (en) * | 2020-08-12 | 2020-12-08 | 北京科技大学 | Data visualization method and system for optimizing color chart based on spring model |
CN112052057B (en) * | 2020-08-12 | 2021-10-22 | 北京科技大学 | Data visualization method and system for optimizing color chart based on spring model |
CN112542026A (en) * | 2020-12-03 | 2021-03-23 | 武汉理工大学 | Multifunctional health index detection cloud system |
CN116416577A (en) * | 2023-05-06 | 2023-07-11 | 苏州开普岩土工程有限公司 | Abnormality identification method for construction monitoring system |
CN116416577B (en) * | 2023-05-06 | 2023-12-26 | 苏州开普岩土工程有限公司 | Abnormality identification method for construction monitoring system |
CN117595464A (en) * | 2024-01-18 | 2024-02-23 | 深圳创芯技术股份有限公司 | Battery charger charging detection control method and system |
CN117595464B (en) * | 2024-01-18 | 2024-04-12 | 深圳创芯技术股份有限公司 | Battery charger charging detection control method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109101661A (en) | The detection method and device of abnormal point in a kind of data sample set | |
Tuncer et al. | Online diagnosis of performance variation in HPC systems using machine learning | |
CN106529569B (en) | Threedimensional model triangular facet feature learning classification method and device based on deep learning | |
CN107301118B (en) | A kind of fault indices automatic marking method and system based on log | |
CN107260161B (en) | A kind of electrocardio dynamics data quantitative analysis method | |
CN110147732A (en) | Refer to vein identification method, device, computer equipment and storage medium | |
US8140915B2 (en) | Detecting apparatus, system, program, and detecting method | |
CN108595585A (en) | Sample data sorting technique, model training method, electronic equipment and storage medium | |
CN103218689B (en) | The analysis method for reliability and device of operator's state estimation | |
JP6952660B2 (en) | Update support device, update support method and program | |
CN108647707B (en) | Probabilistic neural network creation method, failure diagnosis method and apparatus, and storage medium | |
CN109817339A (en) | Patient's group technology and device based on big data | |
CN110516535A (en) | A kind of mouse liveness detection method and system and hygienic appraisal procedure based on deep learning | |
CN109711707B (en) | Comprehensive state evaluation method for ship power device | |
CN114387201A (en) | Cytopathic image auxiliary diagnosis system based on deep learning and reinforcement learning | |
CN110363228A (en) | Noise label correcting method | |
Gupta et al. | A supervised deep learning framework for proactive anomaly detection in cloud workloads | |
CN109753408A (en) | A kind of process predicting abnormality method based on machine learning | |
CN112597921A (en) | Human behavior recognition method based on attention mechanism GRU deep learning | |
CN110175100A (en) | A kind of storage dish failure prediction method and forecasting system | |
CN110007764A (en) | A kind of gesture skeleton recognition methods, device, system and storage medium | |
CN106407910A (en) | Multi-instance learning-based video target tracking method | |
CN109584267A (en) | A kind of dimension self-adaption correlation filtering tracking of combination background information | |
Hou et al. | r-HUMO: A risk-aware human-machine cooperation framework for entity resolution with quality guarantees | |
CN102546235A (en) | Performance diagnosis method and system of web-oriented application under cloud computing environment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181228 |