CN109739904A - Labeling method, device, equipment and storage medium for time series
- Publication number
- CN109739904A (CN201811648187.0A)
- Authority
- CN
- China
- Prior art keywords
- sequence
- points
- result
- point
- abnormal point
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention discloses a detection method, device, equipment and storage medium for time series. The method comprises: acquiring the sequence points in a time series; obtaining, through a pre-built statistical model, a first determination result of whether a sequence point is an abnormal point, and obtaining, through a pre-built unsupervised learning model, a second determination result of whether the sequence point is an abnormal point; if the first determination result is consistent with the second determination result, taking the sequence points determined to be normal points as normal samples and the sequence points determined to be abnormal points as abnormal samples; and obtaining a detection result for each sequence point in the time series through a classification model, and marking the abnormal points in the time series according to the detection results. The technical solution provided by the embodiments of the present invention avoids the missed detections and false detections that occur when a single statistical model or a single unsupervised learning model is used to detect the sequence points in a time series, and improves the accuracy and reliability of abnormal point labeling in time series.
Description
Technical field
Embodiments of the present invention relate to the field of Internet technology, and in particular to a labeling method, device, equipment and storage medium for time series.
Background art
A time series is an ordered set of observations, associated with time order, that is obtained for a specific indicator in a given application scenario. With the rapid development of Internet technology, the time series data corresponding to various indicators must be predicted and analyzed in order to determine whether abnormal indicator values exist in the time series.
In existing approaches, anomalies in a time series are mostly detected and labeled manually by engineers, or the time series is checked with a linear regression model and the corresponding abnormal points are then labeled. Manual labeling requires engineers who have the business background of the application scenario of the time series, and because the volume of sequence data to be detected and labeled is large, it consumes a great deal of labor. A linear regression model, meanwhile, has certain limitations and poor real-time performance, so the anomaly labeling results obtained for the time series are not very reliable.
Summary of the invention
Embodiments of the present invention provide a labeling method, device, equipment and storage medium for time series, which label anomalies in a time series and improve the accuracy and reliability of the anomaly labeling results.
In a first aspect, an embodiment of the present invention provides a labeling method for time series, the method comprising:
acquiring the sequence points in a time series;
obtaining, through a pre-built statistical model, a first determination result of whether a sequence point is an abnormal point, and obtaining, through a pre-built unsupervised learning model, a second determination result of whether the sequence point is an abnormal point;
if the first determination result is consistent with the second determination result, taking the sequence point determined to be a normal point as a normal sample, and taking the sequence point determined to be an abnormal point as an abnormal sample;
obtaining a detection result for each sequence point in the time series through a classification model, and marking the abnormal points in the time series according to the detection results, wherein the classification model is trained on the abnormal samples and the down-sampled normal samples.
Further, obtaining, through the pre-built statistical model, the first determination result of whether a sequence point is an abnormal point comprises:
where the statistical model includes two or more statistical sub-models, obtaining from each statistical sub-model an initial detection result of whether the sequence point is an abnormal point;
if the initial detection results indicate that the sequence point is a normal point, obtaining a first determination result that the sequence point is a normal point;
if the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset statistical threshold, obtaining a first determination result that the sequence point is an abnormal point, wherein the preset statistical threshold is determined by the number of statistical sub-models.
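The sub-model rule above can be sketched as a simple vote. This is a minimal illustration under stated assumptions, not the patented implementation: the function names are hypothetical, the normal-point rule is read as "no sub-model flags the point", the case in which neither rule applies is returned as `None`, and the majority-based threshold is only one possible way of deriving the threshold from the number of sub-models.

```python
from typing import Optional, Sequence


def combine_votes(votes: Sequence[bool], threshold: int) -> Optional[bool]:
    """Combine the initial detection results of several sub-models for one point.

    votes[i] is True when sub-model i flags the point as abnormal.
    Returns False (normal), True (abnormal), or None when neither rule applies.
    """
    abnormal_count = sum(votes)
    if abnormal_count == 0:
        # Assumption: "normal" means that no sub-model flagged the point.
        return False
    if abnormal_count >= threshold:
        return True
    return None


def default_threshold(n_submodels: int) -> int:
    # Hypothetical choice: a strict majority of the sub-models.
    return n_submodels // 2 + 1
```

With three sub-models, `default_threshold(3)` gives 2, so a point flagged by two of the three sub-models is treated as abnormal in this sketch.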
Further, obtaining, through the pre-built unsupervised learning model, the second determination result of whether a sequence point is an abnormal point comprises:
where there is one unsupervised learning model, taking the initial detection result of whether the sequence point is an abnormal point, obtained from the unsupervised learning model, as the second determination result.
Further, obtaining, through the pre-built unsupervised learning model, the second determination result of whether a sequence point is an abnormal point comprises:
where the unsupervised learning model includes two or more unsupervised learning sub-models, obtaining from each unsupervised learning sub-model an initial detection result of whether the sequence point is an abnormal point;
if the initial detection results indicate that the sequence point is a normal point, obtaining a second determination result that the sequence point is a normal point;
if the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset unsupervised threshold, obtaining a second determination result that the sequence point is an abnormal point, wherein the preset unsupervised threshold is determined by the number of unsupervised learning sub-models.
Further, obtaining the detection result for each sequence point in the time series through the classification model comprises:
inputting each sequence point in the time series into the classification model to obtain the abnormality probability of the sequence point;
sorting the sequence points by abnormality probability, determining a target sequence point among the sorted sequence points using a Top algorithm, and taking the abnormality probability of the target sequence point as the classification threshold of the classification model;
determining, according to the abnormality probability of each sequence point and the classification threshold, the detection result of whether each sequence point in the time series is an abnormal point.
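The thresholding steps above can be sketched as follows. This passage does not define the Top algorithm, so the sketch assumes it selects the point ranked k-th by abnormality probability; the parameter `k`, the function names, and the choice to treat a probability equal to the threshold as abnormal are all hypothetical.

```python
from typing import List, Sequence


def top_k_threshold(probs: Sequence[float], k: int) -> float:
    """Return the k-th largest abnormality probability (k is 1-based)."""
    if not 1 <= k <= len(probs):
        raise ValueError("k must be between 1 and len(probs)")
    ranked = sorted(probs, reverse=True)  # highest probability first
    return ranked[k - 1]


def label_points(probs: Sequence[float], k: int) -> List[bool]:
    """Mark a point as abnormal when its probability reaches the threshold."""
    threshold = top_k_threshold(probs, k)
    return [p >= threshold for p in probs]
```

Deriving the threshold from a rank rather than fixing a probability value keeps the number of labeled points roughly stable when the classifier's probability scale shifts; whether the source intends this property is not stated.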
In a second aspect, an embodiment of the present invention provides a labeling device for time series, the device comprising:
a sequence point acquisition module, configured to acquire the sequence points in a time series;
a determination result acquisition module, configured to obtain, through a pre-built statistical model, a first determination result of whether a sequence point is an abnormal point, and to obtain, through a pre-built unsupervised learning model, a second determination result of whether the sequence point is an abnormal point;
a sample determination module, configured to, if the first determination result is consistent with the second determination result, take the sequence point determined to be a normal point as a normal sample and take the sequence point determined to be an abnormal point as an abnormal sample;
an abnormal point labeling module, configured to obtain a detection result for each sequence point in the time series through a classification model and to mark the abnormal points in the time series according to the detection results, wherein the classification model is trained on the abnormal samples and the down-sampled normal samples.
Further, the determination result acquisition module comprises:
a statistical result acquisition unit, configured to, where there is one statistical model, take the initial detection result of whether the sequence point is an abnormal point, obtained from the statistical model, as the first determination result.
Further, the statistical result acquisition unit is specifically configured to:
where the statistical model includes two or more statistical sub-models, obtain from each statistical sub-model an initial detection result of whether the sequence point is an abnormal point;
if the initial detection results indicate that the sequence point is a normal point, obtain a first determination result that the sequence point is a normal point;
if the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset statistical threshold, obtain a first determination result that the sequence point is an abnormal point, wherein the preset statistical threshold is determined by the number of statistical sub-models.
Further, the determination result acquisition module comprises:
an unsupervised result acquisition unit, configured to, where there is one unsupervised learning model, take the initial detection result of whether the sequence point is an abnormal point, obtained from the unsupervised learning model, as the second determination result.
Further, the unsupervised result acquisition unit is specifically configured to:
where the unsupervised learning model includes two or more unsupervised learning sub-models, obtain from each unsupervised learning sub-model an initial detection result of whether the sequence point is an abnormal point;
if the initial detection results indicate that the sequence point is a normal point, obtain a second determination result that the sequence point is a normal point;
if the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset unsupervised threshold, obtain a second determination result that the sequence point is an abnormal point, wherein the preset unsupervised threshold is determined by the number of unsupervised learning sub-models.
Further, the abnormal point labeling module comprises:
an abnormality probability acquisition unit, configured to input each sequence point in the time series into the classification model to obtain the abnormality probability of the sequence point;
a classification threshold determination unit, configured to sort the sequence points by abnormality probability, determine a target sequence point among the sorted sequence points using a Top algorithm, and take the abnormality probability of the target sequence point as the classification threshold of the classification model;
a detection result determination unit, configured to determine, according to the abnormality probability of each sequence point and the classification threshold, the detection result of whether each sequence point in the time series is an abnormal point.
In a third aspect, an embodiment of the present invention provides equipment, the equipment comprising:
one or more processors;
a storage device, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the labeling method for time series described in any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the labeling method for time series described in any embodiment of the present invention.
Embodiments of the present invention provide a labeling method, device, equipment and storage medium for time series. The pre-built statistical model and the pre-built unsupervised learning model each perform an initial detection of whether the sequence points in a time series are abnormal points, which avoids the missed and false detections that occur when only a single statistical model or unsupervised learning model is used, and improves the accuracy of anomaly detection for the sequence points in the time series. The sequence points that both the statistical model and the unsupervised learning model determine to be normal points are taken as normal samples, and the sequence points that both determine to be abnormal points are taken as abnormal samples. The classification model is then trained on the abnormal samples and the down-sampled normal samples, which improves its classification accuracy. Each sequence point in the time series is subsequently detected again with the classification model, and the abnormal points in the time series are marked accurately according to the detection results. This solves the problems of the prior art, in which manual detection consumes a great deal of labor and the linear regression model has certain limitations and poor real-time performance, and improves the accuracy and reliability of the anomaly labeling results for time series.
Brief description of the drawings
Other features, objects and advantages of the present invention will become more apparent upon reading the following detailed description of non-limiting embodiments with reference to the accompanying drawings:
Fig. 1 is a flowchart of a labeling method for time series provided by Embodiment One of the present invention;
Fig. 2A, Fig. 2B, Fig. 2C and Fig. 2D are schematic diagrams of the principle of detecting a time series under different model architectures, provided by Embodiment Two of the present invention;
Fig. 3A is a flowchart of a labeling method for time series provided by Embodiment Three of the present invention;
Fig. 3B is a schematic diagram of the principle of the time series labeling process provided by Embodiment Three of the present invention;
Fig. 4 is a structural schematic diagram of a labeling device for time series provided by Embodiment Four of the present invention;
Fig. 5 is a structural schematic diagram of equipment provided by Embodiment Five of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Embodiment One
Fig. 1 is a flowchart of a labeling method for time series provided by Embodiment One of the present invention. This embodiment can be applied to any equipment that performs anomaly detection and labeling on the sequence points in a time series. The technical solution of the embodiments of the present invention is suitable for situations in which the abnormal points of a time series need to be labeled accurately. The labeling method for time series provided by this embodiment can be executed by the labeling device for time series provided by the embodiments of the present invention; the device can be implemented in software and/or hardware and integrated into the equipment that executes the method.
Specifically, with reference to Fig. 1, the method may include the following steps:
S110: acquire the sequence points in a time series.
Here, a time series is the sequence formed by arranging, in chronological order, the values that a detection indicator of some phenomenon takes at different times; it describes how the detection indicator develops and changes within that phenomenon. An example is the time series formed, in chronological order, by the number of visits within one day to a specific piece of information on a website. This embodiment is mainly concerned with how to detect whether the sequence point corresponding to each moment in a time series is an abnormal point. To detect the abnormal points in a time series accurately, the large number of sequence points in the time series are used as training samples to train, offline, a detection model that can accurately detect whether each sequence point in a time series is an abnormal point. This embodiment mainly describes the training process of that detection model.
Optionally, the time series in this embodiment refers, depending on the business scenario of the detection indicator contained in the time series to be detected, to the historical time series formed in chronological order by that detection indicator during the historical execution of the corresponding business, as well as to the time series currently running online. At this point it is unknown whether the value of the detection indicator in the sequence point for each moment is abnormal; that is, it cannot be known whether each sequence point in the time series is an abnormal point. Because the detection model is subsequently trained by supervised learning in order to improve the detection accuracy of the trained model, anomaly detection must be performed on each sequence point in the time series so as to mark all abnormal points present in it. The sequence points for which a clear detection result of whether they are abnormal points has thus been determined in advance are used as training samples for the subsequent supervised model training process.
Specifically, when the detection model for time series is trained in this embodiment, the training samples needed in the training process must first be obtained, namely the sequence points in the time series formed in chronological order by a detection indicator during the execution of the corresponding business, and the detection result of whether each of these sequence points is an abnormal point must be determinable in advance. This embodiment therefore first obtains the sequence points in the unprocessed time series of the detection indicator during the execution of the corresponding business. At this point it is unknown whether each sequence point is an abnormal point, so anomaly detection is subsequently performed on each sequence point in the time series in order to select the sequence points determined to be abnormal points and normal points as training samples. For this reason, this embodiment first acquires the large number of sequence points in the time series and then performs the subsequent abnormal point detection operations.
S120: obtain, through the pre-built statistical model, a first determination result of whether a sequence point is an abnormal point, and obtain, through the pre-built unsupervised learning model, a second determination result of whether the sequence point is an abnormal point.
Here, the statistical model is a pre-built model that uses a set statistical decision method to detect whether each sequence point in the time series is an abnormal point. The statistical decision method in this embodiment may be any decision method based on statistical analysis, such as the normal distribution method based on the 3-sigma rule or Tukey's test. The unsupervised learning model is a pre-built model that uses a set unsupervised learning method to detect whether each sequence point in the time series is an abnormal point; the unsupervised learning method judges whether a sequence point is an abnormal point by rewarding behavior that yields correct detection results. The unsupervised learning method in this embodiment may be any machine-learning-based algorithm, such as the isolation forest algorithm (Isolation Forest, iForest) or the one-class support vector machine (One Class Support Vector Machine, One Class SVM).
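The two statistical decision methods named above can be sketched with Python's standard `statistics` module. This is a minimal illustration, assuming each method is applied to the whole series at once; the function names and the fence multiplier `k = 1.5` (the conventional value for Tukey's fences) are assumptions that do not come from the source.

```python
import statistics
from typing import List, Sequence


def three_sigma_flags(values: Sequence[float]) -> List[bool]:
    """Flag points lying more than three standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    return [abs(v - mean) > 3 * std for v in values]


def tukey_flags(values: Sequence[float], k: float = 1.5) -> List[bool]:
    """Flag points outside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]
```

Each function returns one Boolean per sequence point, so its output can serve as the initial detection result of one statistical sub-model.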
In addition, when the statistical model or the unsupervised learning model detects whether a sequence point in the time series is an abnormal point in this embodiment, it can judge whether the sequence point is an abnormal point by analyzing the correlation between the sequence point and the other sequence points before its moment, or by performing correlation analysis on the sequence points at the same moment as this sequence point and on the other sequence points in other time series before the moment of this sequence point.
Optionally, once the sequence points in the time series have been acquired, this embodiment can use the pre-built statistical model and unsupervised learning model to detect whether each sequence point in the time series is an abnormal point. The statistical model yields the first determination result of whether a sequence point is an abnormal point, and the unsupervised learning model yields the second determination result, so that it can be clearly known whether each sequence point in the time series is an abnormal point.
For example, after the sequence points in the time series have been acquired, each sequence point can be input into the pre-built statistical model and into the pre-built unsupervised learning model, which use the set statistical decision method and the set unsupervised learning method, respectively, to detect whether each sequence point is an abnormal point; the statistical model and the unsupervised learning model thereby yield the first determination result and the second determination result of whether the sequence point is an abnormal point. The statistical model tends to detect sharp fluctuations in a stationary time series, whereas the unsupervised learning model tends to detect outliers in the time series, so detecting abnormal points with only a single statistical model or a single unsupervised learning model leads to a certain number of missed or false detections. In this embodiment, therefore, the pre-built statistical model and unsupervised learning model each detect whether a sequence point is an abnormal point, and the first determination result obtained from the statistical model is compared with the second determination result obtained from the unsupervised learning model to judge whether each sequence point is an abnormal point. The probability of a false or missed detection is then extremely low, which improves the accuracy of the anomaly detection result for each sequence point in the time series.
S130: if the first determination result is consistent with the second determination result, take the sequence point determined to be a normal point as a normal sample and take the sequence point determined to be an abnormal point as an abnormal sample.
Optionally, once the first determination result of whether a sequence point is an abnormal point has been obtained from the statistical model and the second determination result has been obtained from the unsupervised learning model, the two results still need to be compared in this embodiment in order to judge accurately whether the sequence point is an abnormal point. This is because the statistical model tends to detect sharp fluctuations in a stationary time series while the unsupervised learning model tends to detect outliers in the time series, so detecting whether the original time series contains abnormal points with only a single statistical model or unsupervised learning model leads to a certain number of missed or false detections. If the first determination result is consistent with the second determination result, the probability that the result for this sequence point is a false or missed detection is extremely low. The sequence points determined to be normal points are then taken as normal samples, and the sequence points determined to be abnormal points are taken as abnormal samples. The normal samples and abnormal samples subsequently yield a labeled intermediate sample training library, which is used to train a detection model that can accurately detect whether a sequence point in a time series is an abnormal point.
In addition, when this embodiment uses the pre-built statistical model and unsupervised learning model to detect whether each sequence point in the time series is an abnormal point, an uncertain result of whether the sequence point is an abnormal point may also be obtained. That is, the pre-built statistical model may also yield a first uncertain result of whether the sequence point is an abnormal point, and the pre-built unsupervised learning model may also yield a second uncertain result of whether the sequence point is an abnormal point.
Specifically, an uncertain result of whether a sequence point is an abnormal point means that, when the statistical model or the unsupervised learning model detects whether each sequence point is an abnormal point, it cannot accurately obtain a detection result for that sequence point; that is, there are cases in which it cannot be judged whether a sequence point is an abnormal point. Optionally, each sequence point in the time series is input into the pre-built statistical model and into the pre-built unsupervised learning model, which use the set statistical decision method and the set unsupervised learning method, respectively, to detect whether each sequence point is an abnormal point. When the statistical model or the unsupervised learning model cannot judge whether a given sequence point is an abnormal point, the first uncertain result or the second uncertain result, respectively, is obtained for that sequence point.
When judging whether the first determination result is consistent with the second determination result, there are also cases in which the two results are inconsistent, or in which at least one of the statistical model and the unsupervised learning model yields an uncertain result of whether the sequence point is an abnormal point. In either case, that is, when the first determination result and the second determination result are inconsistent, or when at least one of the statistical model and the unsupervised learning model yields the first uncertain result or the second uncertain result, it cannot be judged accurately whether the sequence point is an abnormal point. The sequence point is therefore not used as a training sample for subsequent model training, which improves the accuracy of model training.
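The sample selection rule described for S130 can be sketched as follows. This is a minimal illustration, assuming each model's output for a point is encoded as `True` (abnormal), `False` (normal) or `None` (uncertain result); the encoding and the function name are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def select_samples(
    points: Sequence[float],
    first_results: Sequence[Optional[bool]],
    second_results: Sequence[Optional[bool]],
) -> Tuple[List[float], List[float]]:
    """Split points into (normal_samples, abnormal_samples).

    A point is kept only when both models give a determination result
    and the two results agree; every other point is discarded.
    """
    normal: List[float] = []
    abnormal: List[float] = []
    for point, first, second in zip(points, first_results, second_results):
        if first is None or second is None:
            continue  # at least one model returned an uncertain result
        if first != second:
            continue  # the two determination results are inconsistent
        (abnormal if first else normal).append(point)
    return normal, abnormal
```

Discarding ambiguous points trades training set size for label quality, which matches the stated aim of improving the accuracy of model training.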
Further, because the statistical model and the unsupervised learning model can each correspond to multiple statistical decision methods or unsupervised learning methods, the number of statistical models and of unsupervised learning models can be set independently in this embodiment. The statistical model may be a single model, or may include two or more statistical sub-models corresponding to different statistical decision methods; the unsupervised learning model may be a single model, or may include two or more unsupervised learning sub-models corresponding to different unsupervised learning methods. Optionally, when there is one statistical model or one unsupervised learning model, the first determination result or the second determination result of whether the sequence point is an abnormal point can be obtained directly from that model, and no first or second uncertain result arises. When the statistical model or the unsupervised learning model includes two or more statistical sub-models or unsupervised learning sub-models, the initial detection results of whether the sequence point is an abnormal point obtained from the individual sub-models can be compared in order to decide the determination result or uncertain result that the statistical model or unsupervised learning model gives for the sequence point. The specific judgment process is explained in detail in the following embodiments and is not described further in this embodiment.
S140: obtain the detection result of each sequence point in the time series through the classification model, and mark the abnormal points in the time series according to the detection results.
Here, the classification model is trained on the abnormal samples and the down-sampled normal samples. In real business, most of the sequence points in a time series are normal points and only a minority are abnormal points, so the number of normal samples determined by the statistical model and the unsupervised learning model is much larger than the number of abnormal samples. The normal samples therefore first need to be down-sampled, and the abnormal samples and the down-sampled normal samples form the corresponding sample training library, in which the number of abnormal samples and the number of normal samples are approximately equal, to ensure the accuracy of subsequent model training. The classification model is the model, obtained by training with a supervised learning method on the down-sampled normal samples and the abnormal samples in the sample training library, that can accurately detect whether a sequence point is an abnormal point, that is, the detection model referred to above. The classification model in this embodiment may be a neural network model.
Specifically, when the normal samples and abnormal samples are obtained, the number of normal samples is far greater than the number of abnormal samples, so the normal samples are first down-sampled and the abnormal samples and down-sampled normal samples form the corresponding sample training library. To improve the accuracy of model training, the number of training samples in the sample training library must be sufficiently large. The time series is therefore required to contain a large number of sequence points. Each sequence point is detected by the statistical model and by the unsupervised learning model, so that the sequence points that can be clearly determined to be abnormal points or not are selected. These are the normal samples and abnormal samples of this embodiment, from which a labeled sample training library is built.
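The down-sampling step described above can be illustrated with a minimal sketch. The function name `downsample_normals`, the random sampling strategy, the fixed seed and the 0/1 labels are assumptions made for illustration only; the embodiment does not prescribe a particular down-sampling method.

```python
import random

def downsample_normals(normal_samples, abnormal_samples, seed=0):
    """Build a balanced, labeled sample training library (illustrative only)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    # Keep roughly as many normal samples as there are abnormal samples.
    keep = min(len(normal_samples), len(abnormal_samples))
    kept_normals = rng.sample(normal_samples, keep)
    # Assumed labels: 0 = normal sample, 1 = abnormal sample.
    library = [(x, 0) for x in kept_normals] + [(x, 1) for x in abnormal_samples]
    rng.shuffle(library)
    return library

# Hypothetical data: many normal values, few abnormal ones.
normals = [float(i % 10) for i in range(1000)]
abnormals = [95.0, 102.5, -40.0]
library = downsample_normals(normals, abnormals)
```

With three abnormal samples, the resulting library holds three normal and three abnormal samples, matching the requirement that the two quantities be approximately equal.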
Optionally, when the classification model is trained by supervised learning, the normal samples and abnormal samples in the sample training library may be input into a preset detection model to obtain a detection result indicating whether each sample is an abnormal point. This detection result is a predicted value. It is compared with the corresponding sample label, that is, the prediction for each sample is compared with its actual status as a normal sample or an abnormal sample, to obtain the classification loss of the current training round. The classification loss indicates how accurately the currently trained detection model classifies sequence points. The classification loss is then compared with a preset loss threshold. If the classification loss exceeds the preset loss threshold, the detection model from this round does not yet detect sequence points accurately enough and must be trained again. In that case the classification loss is back-propagated through the model, the training parameters in the preset detection model are corrected according to the classification loss, and new training samples, that is, new normal samples or abnormal samples, are obtained. The corrected detection model again detects whether the new samples are abnormal points, and a new classification loss is obtained. This cycle repeats until the classification loss falls below the preset loss threshold, which indicates that the trained detection model has reached the required accuracy in detecting whether a sequence point is an abnormal point and needs no further training. The detection model obtained from the current training is then taken as the final classification model and is used subsequently to detect whether sequence points are abnormal points.
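The training cycle described above, in which parameters are corrected until the classification loss falls below a preset loss threshold, can be sketched as follows. This is a simplified illustration, not the embodiment's neural network: a one-feature logistic regression stands in for the detection model, the same samples are reused in every round instead of drawing new ones, and the names `train_until_threshold`, `loss_threshold`, `lr` and `max_rounds` are assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_until_threshold(samples, loss_threshold=0.1, lr=0.1, max_rounds=10000):
    """samples: list of (feature, label), label 0 = normal, 1 = abnormal."""
    w, b = 0.0, 0.0
    loss = float("inf")
    n = len(samples)
    for _ in range(max_rounds):
        loss = grad_w = grad_b = 0.0
        for x, y in samples:
            # Predicted probability that the sample is abnormal (clamped for log).
            p = min(max(sigmoid(w * x + b), 1e-12), 1.0 - 1e-12)
            # Cross-entropy classification loss against the sample label.
            loss -= (y * math.log(p) + (1 - y) * math.log(1.0 - p)) / n
            grad_w += (p - y) * x / n
            grad_b += (p - y) / n
        if loss < loss_threshold:
            break  # accurate enough: stop training
        # Otherwise correct the parameters using the loss gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss

# Hypothetical labeled samples from a sample training library.
samples = [(0.1, 0), (0.3, 0), (0.5, 0), (0.2, 0),
           (4.0, 1), (4.5, 1), (5.0, 1), (4.2, 1)]
w, b, loss = train_until_threshold(samples)
```

The `max_rounds` cap is a safeguard added in this sketch so that the loop terminates even if the threshold is never reached.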
Optionally, once the corresponding classification model has been trained on the abnormal samples and the down-sampled normal samples, it can detect whether a sequence point is an abnormal point with assured accuracy. Each sequence point in the time series may then be input into the classification model to obtain a detection result indicating whether that sequence point is an abnormal point. When a sequence point in the time series is determined to be an abnormal point, it is marked in the time series, so that every abnormal point in the time series is marked according to the detection results and the accuracy of anomaly detection is improved.
In the technical solution provided by this embodiment, the statistical model and the unsupervised learning model constructed in advance each perform an initial detection of whether a sequence point in the time series is an abnormal point. This avoids the missed detections and false detections that occur when only a single statistical model or a single unsupervised learning model is used to detect the sequence points, and improves the accuracy of anomaly detection for the sequence points in the time series. Sequence points determined to be normal points by both the statistical model and the unsupervised learning model are taken as normal samples, and sequence points determined to be abnormal points by both models are taken as abnormal samples. The classification model is then trained on the abnormal samples and the down-sampled normal samples, which improves its classification accuracy. Each sequence point in the time series is subsequently detected again by the classification model, and the abnormal points in the time series are accurately marked according to the detection results. This solves the problems in the prior art that manual detection consumes substantial labor and that linear regression models have certain limitations and low real-time performance, and improves the accuracy and reliability of the abnormal-point marking results for time series.
Embodiment two
Because the statistical decision methods corresponding to the statistical model and the unsupervised learning methods corresponding to the unsupervised learning model may each be of several kinds, the number of statistical models and the number of unsupervised learning models can be set independently in this embodiment; that is, the combination of statistical models and unsupervised learning models can be chosen as appropriate. Fig. 2A, Fig. 2B, Fig. 2C and Fig. 2D are schematic diagrams of the principle of marking the sequence points in a time series under the different model architectures provided by Embodiment two of the present invention. This embodiment is an optimization on the basis of the above embodiment. Specifically, it explains in detail the anomaly detection process for the sequence points in the time series under different combinations of the statistical model and the unsupervised learning model.
In this embodiment, the combinations of the statistical model and the unsupervised learning model fall into the following four cases: 1) there is one statistical model and one unsupervised learning model; 2) the statistical model includes two or more statistical sub-models, and there is one unsupervised learning model; 3) there is one statistical model, and the unsupervised learning model includes two or more unsupervised learning sub-models; 4) the statistical model includes two or more statistical sub-models, and the unsupervised learning model includes two or more unsupervised learning sub-models. This embodiment describes each of these four cases in turn.
Optionally, as shown in Fig. 2A, for the case in which there is one statistical model and one unsupervised learning model, the labeling method of the time series may include the following steps:
S201, acquire a sequence point in the time series.
S202, obtain, by means of the statistical model, an initial detection result indicating whether the sequence point is an abnormal point and take it as the first definitive result; and obtain, by means of the unsupervised learning model, an initial detection result indicating whether the sequence point is an abnormal point and take it as the second definitive result.
Specifically, when there is one statistical model, this embodiment uses only one statistical decision method to detect whether each sequence point in the time series is an abnormal point. That method yields a clear initial detection result for each sequence point, and the initial detection result obtained by the statistical model is taken directly as the first definitive result. Likewise, when there is one unsupervised learning model, this embodiment uses only one unsupervised learning method to detect whether each sequence point in the time series is an abnormal point. That method also yields a clear initial detection result for each sequence point, and the initial detection result obtained by the unsupervised learning model is taken directly as the second definitive result. It should be noted that when there is one statistical model and one unsupervised learning model, a single statistical decision method and a single unsupervised learning method each give a clear judgment of whether a sequence point is an abnormal point. The statistical model and the unsupervised learning model therefore always produce definitive results, and no uncertain result as to whether a sequence point is an abnormal point arises.
S203, judge whether the first definitive result is consistent with the second definitive result; if so, execute S204; if not, execute S205.
In this embodiment, after the first definitive result as to whether the sequence point is an abnormal point is obtained by the statistical model and the second definitive result is obtained by the unsupervised learning model, it is further judged whether the first definitive result and the second definitive result are consistent, in order to further improve the detection accuracy for the sequence point. If the first definitive result is consistent with the second definitive result, the statistical model and the unsupervised learning model agree on whether the sequence point is an abnormal point. A sequence point determined to be a normal point is then taken as a normal sample, and a sequence point determined to be an abnormal point is taken as an abnormal sample.
S204, take a sequence point determined to be a normal point as a normal sample, and a sequence point determined to be an abnormal point as an abnormal sample.
S205, return to S201 and obtain the first definitive result and the second definitive result of the next sequence point in the time series, until all sequence points in the time series have been detected.
S206, obtain the detection result of each sequence point in the time series by means of the classification model, and mark the abnormal points in the time series according to the detection results.
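The sample selection of S201 to S205 can be sketched as follows for the case of one statistical model and one unsupervised learning model. The two detectors are placeholders chosen for illustration and are assumptions, not the models of the embodiment: a three-sigma rule stands in for the statistical model, and a median-absolute-deviation rule stands in for the unsupervised learning model. The function names and the constants `k` are likewise hypothetical.

```python
from statistics import mean, median, pstdev

def zscore_flags(series, k=3.0):
    """Placeholder statistical model: flag points beyond k standard deviations."""
    mu, sigma = mean(series), pstdev(series)
    return [sigma > 0 and abs(x - mu) > k * sigma for x in series]

def mad_flags(series, k=3.5):
    """Placeholder for the unsupervised model: flag points far from the median."""
    med = median(series)
    mad = median([abs(x - med) for x in series])
    return [mad > 0 and abs(x - med) > k * mad for x in series]

def select_samples(series, first_flags, second_flags):
    """Keep only the points on which both definitive results agree."""
    normal, abnormal = [], []
    for x, first, second in zip(series, first_flags, second_flags):
        if first != second:
            continue  # inconsistent results: the point is not used as a sample
        (abnormal if first else normal).append(x)
    return normal, abnormal

# Hypothetical series: 100 is flagged by both detectors, 15 by only one.
series = [10, 11, 9, 10, 12, 10, 9, 11, 10, 15,
          10, 11, 9, 10, 100, 10, 11, 9, 10, 12]
normal, abnormal = select_samples(series, zscore_flags(series), mad_flags(series))
```

In this example the value 15 is flagged only by the second placeholder detector, so it is excluded from both the normal samples and the abnormal samples.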
Optionally, as shown in Fig. 2B, for the case in which the statistical model includes two or more statistical sub-models and there is one unsupervised learning model, the labeling method of the time series may include the following steps:
S211, acquire a sequence point in the time series.
S212, obtain, by means of each statistical sub-model, an initial detection result indicating whether the sequence point is an abnormal point, and determine from these initial detection results the first definitive result, obtained by the statistical model, as to whether the sequence point is an abnormal point.
Specifically, when the statistical model includes two or more statistical sub-models, each statistical sub-model corresponds to a different statistical decision method. Each statistical sub-model detects whether each sequence point in the time series is an abnormal point and yields its own initial detection result. The detection result of the statistical model for each sequence point is then judged from the results contained in the initial detection results of the individual statistical sub-models, which determines the first definitive result, obtained by the statistical model, as to whether the sequence point is an abnormal point. Depending on the initial detection results, the statistical model yields either a first definitive result or a first uncertain result as to whether the sequence point is an abnormal point; these cases are described separately in this embodiment.
Optionally, in this embodiment the initial detection results obtained by the two or more statistical sub-models as to whether a sequence point is an abnormal point fall into the following three cases:
1) All initial detection results indicate that the sequence point is a normal point; the first definitive result that the sequence point is a normal point is then obtained.
Optionally, if the initial detection results obtained by all statistical sub-models indicate that the sequence point is a normal point, the detection results of the statistical sub-models are consistent, and the statistical model yields the first definitive result that the sequence point is a normal point.
2) The number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset statistical threshold; the first definitive result that the sequence point is an abnormal point is then obtained.
The preset statistical threshold is determined by the number of statistical sub-models. In this embodiment the preset statistical threshold may be the median of the number of statistical sub-models, or it may be set according to business detection requirements; this is not limited here.
Specifically, if, among the initial detection results obtained by the statistical sub-models, the number of results indicating that the sequence point is an abnormal point is greater than or equal to the preset statistical threshold, that is, the number of statistical sub-models that determine the sequence point to be an abnormal point reaches the preset statistical threshold, the statistical model yields the first definitive result that the sequence point is an abnormal point.
3) At least one initial detection result indicates that the sequence point is an abnormal point, but the number of such results is less than the preset statistical threshold; a first uncertain result as to whether the sequence point is an abnormal point is then obtained.
Optionally, if, among the initial detection results obtained by the statistical sub-models, some indicate that the sequence point is an abnormal point but their number is less than the preset statistical threshold, that is, fewer statistical sub-models than the preset statistical threshold determine the sequence point to be an abnormal point, then more than the preset statistical threshold of statistical sub-models determine that the sequence point is not an abnormal point. The statistical model cannot clearly determine whether the sequence point is an abnormal point, and therefore yields a first uncertain result. When a first uncertain result is obtained, the sequence point is not used as a training sample in the sample training library, whatever detection result the unsupervised learning model gives for it. The sequence point is simply ignored, and the detection result of the next sequence point in the time series is judged.
S213, obtain, by means of the unsupervised learning model, an initial detection result indicating whether the sequence point is an abnormal point and take it as the second definitive result.
S214, judge whether the first definitive result is consistent with the second definitive result; if so, execute S215; if not, execute S216.
S215, take a sequence point determined to be a normal point as a normal sample, and a sequence point determined to be an abnormal point as an abnormal sample.
S216, return to S211 and obtain the first definitive result and the second definitive result of the next sequence point in the time series, until all sequence points in the time series have been detected.
S217, obtain the detection result of each sequence point in the time series by means of the classification model, and mark the abnormal points in the time series according to the detection results.
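The three cases described under S212, and the rule that a sequence point with an uncertain result is ignored, can be sketched as follows. The function names `combine_votes` and `training_label` are hypothetical, and the default threshold of half the number of sub-models, rounded up, is only one assumed reading of "the median of the number of statistical sub-models"; the threshold can be passed explicitly instead.

```python
def combine_votes(votes, threshold=None):
    """votes: one bool per sub-model, True = the sub-model flags an abnormal point."""
    if threshold is None:
        # Assumed default: half of the sub-models, rounded up.
        threshold = (len(votes) + 1) // 2
    abnormal_votes = sum(1 for v in votes if v)
    if abnormal_votes == 0:
        return "normal"      # case 1: every sub-model reports a normal point
    if abnormal_votes >= threshold:
        return "abnormal"    # case 2: enough sub-models report an abnormal point
    return "uncertain"       # case 3: some, but too few, report an abnormal point

def training_label(first_result, second_result):
    """Return the sample label, or None when the point must be ignored."""
    if "uncertain" in (first_result, second_result):
        return None          # uncertain result: never used as a training sample
    if first_result != second_result:
        return None          # the two definitive results are inconsistent
    return first_result

# Hypothetical votes from four and from three statistical sub-models.
uncertain_case = combine_votes([True, False, False, False])
agreed_case = training_label(combine_votes([True, True, False]), "abnormal")
ignored_case = training_label(uncertain_case, "abnormal")
```

The same `combine_votes` logic applies unchanged to unsupervised learning sub-models with a preset unsupervised threshold.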
As shown in Fig. 2C, for the case in which there is one statistical model and the unsupervised learning model includes two or more unsupervised learning sub-models, the labeling method of the time series may include the following steps:
S221, acquire a sequence point in the time series.
S222, obtain, by means of the statistical model, an initial detection result indicating whether the sequence point is an abnormal point and take it as the first definitive result.
S223, obtain, by means of each unsupervised learning sub-model, an initial detection result indicating whether the sequence point is an abnormal point, and determine from these initial detection results the second definitive result, obtained by the unsupervised learning model, as to whether the sequence point is an abnormal point.
Specifically, when the unsupervised learning model includes two or more unsupervised learning sub-models, each unsupervised learning sub-model corresponds to a different unsupervised learning method. Each unsupervised learning sub-model detects whether each sequence point in the time series is an abnormal point and yields its own initial detection result, which contains the result determined for each sequence point. From these results the second definitive result, obtained by the unsupervised learning model, as to whether the sequence point is an abnormal point is determined. Depending on the initial detection results, the unsupervised learning model yields either a definitive result or an uncertain result as to whether the sequence point is an abnormal point; these cases are described separately in this embodiment.
Optionally, in this embodiment the initial detection results obtained by the two or more unsupervised learning sub-models as to whether a sequence point is an abnormal point fall into the following three cases:
1) All initial detection results indicate that the sequence point is a normal point; the second definitive result that the sequence point is a normal point is then obtained.
Optionally, if the initial detection results obtained by all unsupervised learning sub-models indicate that the sequence point is a normal point, the detection results of the unsupervised learning sub-models are consistent, and the unsupervised learning model yields the second definitive result that the sequence point is a normal point.
2) The number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset unsupervised threshold; the second definitive result that the sequence point is an abnormal point is then obtained.
The preset unsupervised threshold is determined by the number of unsupervised learning sub-models. In this embodiment the preset unsupervised threshold may be the median of the number of unsupervised learning sub-models, or it may be set according to business detection requirements; this is not limited here.
Specifically, if, among the initial detection results obtained by the unsupervised learning sub-models, the number of results indicating that the sequence point is an abnormal point is greater than or equal to the preset unsupervised threshold, that is, the number of unsupervised learning sub-models that determine the sequence point to be an abnormal point reaches the preset unsupervised threshold, the unsupervised learning model yields the second definitive result that the sequence point is an abnormal point.
3) At least one initial detection result indicates that the sequence point is an abnormal point, but the number of such results is less than the preset unsupervised threshold; a second uncertain result as to whether the sequence point is an abnormal point is then obtained.
Optionally, if, among the initial detection results obtained by the unsupervised learning sub-models, some indicate that the sequence point is an abnormal point but their number is less than the preset unsupervised threshold, that is, fewer unsupervised learning sub-models than the preset unsupervised threshold determine the sequence point to be an abnormal point, then more than the preset unsupervised threshold of unsupervised learning sub-models determine that the sequence point is not an abnormal point. The unsupervised learning model cannot clearly determine whether the sequence point is an abnormal point, and therefore yields a second uncertain result. When a second uncertain result is obtained, the sequence point is not used as a training sample in the sample training library, whatever detection result the statistical model gives for it. The sequence point is simply ignored, and the detection result of the next sequence point in the time series is judged.
S224, judge whether the first definitive result is consistent with the second definitive result; if so, execute S225; if not, execute S226.
S225, take a sequence point determined to be a normal point as a normal sample, and a sequence point determined to be an abnormal point as an abnormal sample.
S226, return to S221 and obtain the first definitive result and the second definitive result of the next sequence point in the time series, until all sequence points in the time series have been detected.
S227, obtain the detection result of each sequence point in the time series by means of the classification model, and mark the abnormal points in the time series according to the detection results.
As shown in Fig. 2D, for the case in which the statistical model includes two or more statistical sub-models and the unsupervised learning model includes two or more unsupervised learning sub-models, the labeling method of the time series may include the following steps:
S231, acquire a sequence point in the time series.
S232, obtain, by means of each statistical sub-model, an initial detection result indicating whether the sequence point is an abnormal point, and determine from these initial detection results the first definitive result, obtained by the statistical model, as to whether the sequence point is an abnormal point.
Optionally, in this embodiment the initial detection results obtained by the two or more statistical sub-models as to whether a sequence point is an abnormal point fall into the following three cases:
1) All initial detection results indicate that the sequence point is a normal point; the first definitive result that the sequence point is a normal point is then obtained.
2) The number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset statistical threshold; the first definitive result that the sequence point is an abnormal point is then obtained. The preset statistical threshold is determined by the number of statistical sub-models.
3) At least one initial detection result indicates that the sequence point is an abnormal point, but the number of such results is less than the preset statistical threshold; a first uncertain result as to whether the sequence point is an abnormal point is then obtained.
S233, obtain, by means of each unsupervised learning sub-model, an initial detection result indicating whether the sequence point is an abnormal point, and determine from these initial detection results the second definitive result, obtained by the unsupervised learning model, as to whether the sequence point is an abnormal point.
Optionally, in this embodiment the initial detection results obtained by the two or more unsupervised learning sub-models as to whether a sequence point is an abnormal point fall into the following three cases:
1) All initial detection results indicate that the sequence point is a normal point; the second definitive result that the sequence point is a normal point is then obtained.
2) The number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset unsupervised threshold; the second definitive result that the sequence point is an abnormal point is then obtained.
3) At least one initial detection result indicates that the sequence point is an abnormal point, but the number of such results is less than the preset unsupervised threshold; a second uncertain result as to whether the sequence point is an abnormal point is then obtained.
S234, judge whether the first definitive result is consistent with the second definitive result; if so, execute S235; if not, execute S236.
S235, take a sequence point determined to be a normal point as a normal sample, and a sequence point determined to be an abnormal point as an abnormal sample.
S236, return to S232 and obtain the first definitive result and the second definitive result of the next sequence point, until all sequence points in the time series have been detected.
S237, obtain the detection result of each sequence point in the time series by means of the classification model, and mark the abnormal points in the time series according to the detection results.
In the technical solution provided by this embodiment, different combinations of statistical models and unsupervised learning models are selected independently to perform an initial detection of whether each sequence point in the time series is an abnormal point. This avoids the missed detections and false detections that occur when only a single statistical model or a single unsupervised learning model is used for anomaly detection, and improves the detection accuracy for the sequence points. Sequence points determined to be normal points are taken as normal samples, and sequence points determined to be abnormal points are taken as abnormal samples. The classification model is then trained on the down-sampled normal samples and the abnormal samples, which improves its classification accuracy. Each sequence point in the time series is subsequently detected again by the classification model, so that an accurate detection result as to whether each sequence point in the time series is an abnormal point is obtained, improving the accuracy and reliability of the abnormal-point marking results for time series.
Embodiment three
Fig. 3 A is a kind of flow chart of the labeling method for time series that the embodiment of the present invention three provides, and Fig. 3 B is the present invention
The schematic illustration of the detection process for the time series that embodiment three provides.The present embodiment be on the basis of the above embodiments into
Row optimization.Specifically, the present embodiment is mainly to the training process of disaggregated model and according to trained disaggregated model to the time
The process that each sequence of points is detected in sequence carries out detailed explanation.
Optionally, as shown in Fig. 3A, the method may specifically include the following steps:
S310, acquire a sequence point in the time series.
S320, obtain, by means of the statistical model constructed in advance, the first definitive result as to whether the sequence point is an abnormal point, and obtain, by means of the unsupervised learning model constructed in advance, the second definitive result as to whether the sequence point is an abnormal point.
S330, if the first definitive result is consistent with the second definitive result, take a sequence point determined to be a normal point as a normal sample, and a sequence point determined to be an abnormal point as an abnormal sample.
S340, input each sequence point in the time series into the classification model to obtain the abnormal probability of the sequence point, the classification model being trained on the abnormal samples and the down-sampled normal samples.
Specifically, once the corresponding classification model has been trained on the abnormal samples and the down-sampled normal samples, each sequence point in the time series may be input into the classification model. The trained classification model judges whether each sequence point in the time series is an abnormal point and yields the abnormal probability of each sequence point, which indicates how likely the sequence point is to be an abnormal point.
S350, sort the sequence points by abnormal probability, determine a target sequence point among the sorted sequence points using a Top algorithm, and take the abnormal probability of the target sequence point as the classification threshold of the classification model.
Optionally, after the abnormal probability of each sequence point is obtained by the classification model, the sequence points may be sorted in order of abnormal probability, and a preset Top algorithm selects from the sorted sequence points the target sequence point ranked N. The abnormal probability of that target sequence point is taken as the classification threshold of the classification model. The rank N used by the Top algorithm can be set according to the specific business detection requirements, so that the classification threshold in this embodiment can be adjusted flexibly to different business detection standards, which improves the detection accuracy for the time series.
S360, determine, according to the abnormal probability of each sequence point and the classification threshold, the detection result as to whether each sequence point in the time series is an abnormal point.
Specifically, after obtaining the classification thresholds of disaggregated model, it can be to the abnormal general of sequence of points each in time series
Rate is compared with the classification thresholds, when the abnormal probability of a certain sequence of points is more than or equal to the classification thresholds, determines the sequence
Point is abnormal point;If the abnormal probability of the sequence of points is less than the classification thresholds, it is determined that the sequence of points is normal point;And then judge
Whether each sequence of points for including in time series is abnormal point, obtains whether each sequence of points in time series is abnormal point
Testing result, and corresponding abnormal point is marked in time series, it can be intuitively displayed the abnormality detection feelings of the time series
Condition prompts administrative staff to carry out artificial correction for the abnormal conditions to carry out alarm and abnormal show.
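The comparison in S360 can be sketched as below, again only as an assumed illustration. The function name `label_anomalies`, the sample probabilities and the threshold value are hypothetical; the only rule taken from the text is that a point whose abnormal probability is greater than or equal to the classification threshold is an abnormal point.

```python
def label_anomalies(probabilities, threshold):
    """Return one boolean per sequence point: True marks an abnormal point."""
    # Greater than or equal to the threshold means abnormal, per S360.
    return [p >= threshold for p in probabilities]


# Assumed example values: abnormal probabilities and a threshold chosen in S350.
probs = [0.05, 0.91, 0.12, 0.67, 0.33, 0.08]
labels = label_anomalies(probs, threshold=0.67)

# Positions of the sequence points that would be marked in the time series.
anomaly_indices = [i for i, is_abnormal in enumerate(labels) if is_abnormal]
print(anomaly_indices)
```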
In the technical solution provided by this embodiment, the abnormal probability of each sequence point is obtained from the pre-trained classification model, and the classification threshold of the classification model is determined from those abnormal probabilities using the Top algorithm. The classification proportion in the Top algorithm can be selected according to the applicable service detection standard, so that the classification threshold can be adjusted to different service detection standards. The classification model performs real-time anomaly detection on the time series, which improves the accuracy and reliability of anomaly detection for the time series.
Embodiment four
Fig. 4 is a schematic structural diagram of a time series labeling apparatus provided by embodiment four of the present invention. Specifically, as shown in Fig. 4, the apparatus may include:
a sequence point obtaining module 410, configured to obtain the sequence points in a time series;
a determination result obtaining module 420, configured to obtain, through a pre-constructed statistical model, a first determination result of whether a sequence point is an abnormal point, and to obtain, through a pre-constructed unsupervised learning model, a second determination result of whether the sequence point is an abnormal point;
a sample determining module 430, configured to, if the first determination result is consistent with the second determination result, take the sequence points determined to be normal points as normal samples and the sequence points determined to be abnormal points as abnormal samples;
an abnormal point marking module 440, configured to obtain the detection result of each sequence point in the time series through a classification model and to mark the abnormal points in the time series according to the detection result, the classification model being trained on the abnormal samples and the down-sampled normal samples.
In the technical solution provided by this embodiment, the pre-constructed statistical model and the pre-constructed unsupervised learning model each perform an initial detection of whether the sequence points in the time series are abnormal points. This avoids the missed detections and false detections that occur when only a single statistical model or a single unsupervised learning model is used, and improves the accuracy of anomaly detection for the sequence points. The sequence points determined to be normal points by both the statistical model and the unsupervised learning model are taken as normal samples, and those determined to be abnormal points by both models are taken as abnormal samples. The classification model is then trained on the abnormal samples and the down-sampled normal samples, which improves its classification accuracy. Each sequence point in the time series is subsequently detected again by the classification model, so that the abnormal points in the time series are marked accurately according to the detection result. This solves the problems in the prior art that manual detection consumes substantial labor cost and that linear regression models have certain limitations and poor real-time performance, and it improves the accuracy and reliability of the anomaly marking result for the time series.
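The sample construction described above can be sketched as follows, under stated assumptions. The helper names `select_samples` and `downsample`, the sample values, the two flag lists and the one-to-one ratio of normal to abnormal samples are all hypothetical. The embodiment states only that points on which both models agree become samples and that the normal samples are down-sampled; training of the classification model itself is not shown.

```python
import random


def select_samples(points, stat_flags, unsup_flags):
    """Keep only points on which both detectors agree.

    stat_flags / unsup_flags: True means "abnormal point" according to the
    statistical model / unsupervised learning model respectively.
    """
    normal, abnormal = [], []
    for point, stat_abnormal, unsup_abnormal in zip(points, stat_flags, unsup_flags):
        if stat_abnormal != unsup_abnormal:
            continue  # the two determination results disagree: not used as a sample
        (abnormal if stat_abnormal else normal).append(point)
    return normal, abnormal


def downsample(samples, k, seed=0):
    """Randomly keep at most k samples (seeded so the sketch is repeatable)."""
    if k >= len(samples):
        return list(samples)
    return random.Random(seed).sample(samples, k)


# Assumed example data.
points = [1.0, 1.1, 9.5, 0.9, 1.2, 8.7, 1.0, 1.05]
stat_flags = [False, False, True, False, False, True, True, False]
unsup_flags = [False, False, True, False, True, True, False, False]

normal, abnormal = select_samples(points, stat_flags, unsup_flags)
# Assumed ratio: keep as many normal samples as there are abnormal samples.
normal_downsampled = downsample(normal, k=len(abnormal))
print(len(normal), len(abnormal), len(normal_downsampled))
```

Down-sampling the normal samples in this way reduces the imbalance between the many normal points and the few abnormal points before the classification model is trained.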
Further, the determination result obtaining module 420 may include:
a statistical result obtaining unit, configured to, when the number of statistical models is one, obtain through the statistical model an initial detection result of whether the sequence point is an abnormal point and take it as the first determination result.
Further, the statistical result obtaining unit may be specifically configured to:
when the statistical model includes two or more statistical submodels, obtain through each statistical submodel an initial detection result of whether the sequence point is an abnormal point;
if the initial detection results are that the sequence point is a normal point, obtain the first determination result that the sequence point is a normal point;
if the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset statistical threshold, obtain the first determination result that the sequence point is an abnormal point, the preset statistical threshold being determined by the number of statistical submodels.
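A minimal sketch of how the initial detection results of several submodels might be combined is given below. The function name `combine_submodel_results`, the string return values and the threshold of 2 for three submodels are assumptions. The text does not state what happens when some, but fewer than the threshold number of, submodels report an abnormal point, so the sketch returns `None` in that case rather than guessing.

```python
def combine_submodel_results(submodel_flags, threshold):
    """Combine per-submodel results (True = abnormal point) into one determination.

    Returns "normal", "abnormal", or None when the text leaves the case open.
    """
    abnormal_votes = sum(1 for flag in submodel_flags if flag)
    if abnormal_votes == 0:
        return "normal"      # every submodel reports a normal point
    if abnormal_votes >= threshold:
        return "abnormal"    # enough submodels report an abnormal point
    return None              # case not specified in the embodiment


# Assumed example: three submodels and a preset threshold of 2.
print(combine_submodel_results([False, False, False], threshold=2))
print(combine_submodel_results([True, True, False], threshold=2))
print(combine_submodel_results([True, False, False], threshold=2))
```

Because the unsupervised learning submodels are combined by the same kind of rule with their own preset threshold, the same hypothetical helper could be reused for the second determination result.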
Further, the determination result obtaining module 420 may include:
an unsupervised result obtaining unit, configured to, when the number of unsupervised learning models is one, obtain through the unsupervised learning model an initial detection result of whether the sequence point is an abnormal point and take it as the second determination result.
Further, the unsupervised result obtaining unit may be specifically configured to:
when the unsupervised learning model includes two or more unsupervised learning submodels, obtain through each unsupervised learning submodel an initial detection result of whether the sequence point is an abnormal point;
if the initial detection results are that the sequence point is a normal point, obtain the second determination result that the sequence point is a normal point;
if the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset unsupervised threshold, obtain the second determination result that the sequence point is an abnormal point, the preset unsupervised threshold being determined by the number of unsupervised learning submodels.
Further, the abnormal point marking module 440 may include:
an abnormal probability obtaining unit, configured to input each sequence point in the time series into the classification model to obtain the abnormal probability of the sequence point;
a classification threshold determining unit, configured to sort the sequence points by abnormal probability, use a Top algorithm to determine a target sequence point among the sorted sequence points, and take the abnormal probability of the target sequence point as the classification threshold of the classification model;
a detection result determining unit, configured to determine, according to the abnormal probability of each sequence point and the classification threshold, the detection result of whether each sequence point in the time series is an abnormal point.
The time series labeling apparatus provided by this embodiment is applicable to the time series labeling method provided by any of the above embodiments, and has the corresponding functions and beneficial effects.
Embodiment five
Fig. 5 is a schematic structural diagram of a device provided by embodiment five of the present invention. As shown in Fig. 5, the device includes a processor 50, a storage device 51 and a communication device 52. The device may contain one or more processors 50; one processor 50 is taken as an example in Fig. 5. The processor 50, the storage device 51 and the communication device 52 in the device may be connected by a bus or in other ways; connection by a bus is taken as an example in Fig. 5.
As a computer-readable storage medium, the storage device 51 can be used to store software programs, computer-executable programs and modules, such as the program instructions/modules corresponding to the time series labeling method described in any embodiment of the present invention. By running the software programs, instructions and modules stored in the storage device 51, the processor 50 executes the various functional applications and data processing of the device, that is, implements the above time series labeling method.
The storage device 51 may mainly include a program storage area and a data storage area. The program storage area can store the operating system and the application programs required by at least one function; the data storage area can store data created through use of the terminal, and so on. In addition, the storage device 51 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, flash memory device or other non-volatile solid-state storage device. In some examples, the storage device 51 may further include memory located remotely from the processor 50, and such remote memory can be connected to the device through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks and combinations thereof.
The communication device 52 can be used to implement network connections or mobile data connections between devices.
The device provided by this embodiment can be used to execute the time series labeling method provided by any of the above embodiments, and has the corresponding functions and beneficial effects.
Embodiment six
Embodiment six of the present invention further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program can implement the time series labeling method of any of the above embodiments. The method may specifically include:
obtaining the sequence points in a time series;
obtaining, through a pre-constructed statistical model, a first determination result of whether a sequence point is an abnormal point, and obtaining, through a pre-constructed unsupervised learning model, a second determination result of whether the sequence point is an abnormal point;
if the first determination result is consistent with the second determination result, taking the sequence points determined to be normal points as normal samples and the sequence points determined to be abnormal points as abnormal samples;
obtaining the detection result of each sequence point in the time series through a classification model, and marking the abnormal points in the time series according to the detection result, the classification model being trained on the abnormal samples and the down-sampled normal samples.
Of course, in the storage medium containing computer-executable instructions provided by the embodiment of the present invention, the computer-executable instructions are not limited to the method operations described above and can also perform related operations in the time series labeling method provided by any embodiment of the present invention.
From the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software together with the necessary general-purpose hardware, and of course also by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as a computer floppy disk, read-only memory (ROM), random access memory (RAM), flash memory (FLASH), hard disk or optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in the embodiments of the present invention.
It is worth noting that, in the above embodiment of the time series labeling apparatus, the units and modules are divided only according to functional logic; the division is not limited to the one described, as long as the corresponding functions can be realized. In addition, the specific names of the functional units are only for ease of distinguishing them from one another and are not intended to limit the protection scope of the present invention.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.
Claims (14)
1. A labeling method for a time series, characterized by comprising:
obtaining the sequence points in a time series;
obtaining, through a pre-constructed statistical model, a first determination result of whether a sequence point is an abnormal point, and obtaining, through a pre-constructed unsupervised learning model, a second determination result of whether the sequence point is an abnormal point;
if the first determination result is consistent with the second determination result, taking the sequence points determined to be normal points as normal samples and the sequence points determined to be abnormal points as abnormal samples;
obtaining the detection result of each sequence point in the time series through a classification model, and marking the abnormal points in the time series according to the detection result, the classification model being trained on the abnormal samples and the down-sampled normal samples.
2. The method according to claim 1, characterized in that obtaining, through the pre-constructed statistical model, the first determination result of whether a sequence point is an abnormal point comprises:
when the number of statistical models is one, obtaining through the statistical model an initial detection result of whether the sequence point is an abnormal point and taking it as the first determination result.
3. The method according to claim 1, characterized in that obtaining, through the pre-constructed statistical model, the first determination result of whether a sequence point is an abnormal point comprises:
when the statistical model includes two or more statistical submodels, obtaining through each statistical submodel an initial detection result of whether the sequence point is an abnormal point;
if the initial detection results are that the sequence point is a normal point, obtaining the first determination result that the sequence point is a normal point;
if the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset statistical threshold, obtaining the first determination result that the sequence point is an abnormal point, the preset statistical threshold being determined by the number of statistical submodels.
4. The method according to claim 1, characterized in that obtaining, through the pre-constructed unsupervised learning model, the second determination result of whether a sequence point is an abnormal point comprises:
when the number of unsupervised learning models is one, obtaining through the unsupervised learning model an initial detection result of whether the sequence point is an abnormal point and taking it as the second determination result.
5. The method according to claim 1, characterized in that obtaining, through the pre-constructed unsupervised learning model, the second determination result of whether a sequence point is an abnormal point comprises:
when the unsupervised learning model includes two or more unsupervised learning submodels, obtaining through each unsupervised learning submodel an initial detection result of whether the sequence point is an abnormal point;
if the initial detection results are that the sequence point is a normal point, obtaining the second determination result that the sequence point is a normal point;
if the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset unsupervised threshold, obtaining the second determination result that the sequence point is an abnormal point, the preset unsupervised threshold being determined by the number of unsupervised learning submodels.
6. The method according to claim 1, characterized in that obtaining the detection result of each sequence point in the time series through the classification model comprises:
inputting each sequence point in the time series into the classification model to obtain the abnormal probability of the sequence point;
sorting the sequence points by the abnormal probability, using a Top algorithm to determine a target sequence point among the sorted sequence points, and taking the abnormal probability of the target sequence point as the classification threshold of the classification model;
determining, according to the abnormal probability of each sequence point and the classification threshold, the detection result of whether each sequence point in the time series is an abnormal point.
7. A labeling apparatus for a time series, characterized by comprising:
a sequence point obtaining module, configured to obtain the sequence points in a time series;
a determination result obtaining module, configured to obtain, through a pre-constructed statistical model, a first determination result of whether a sequence point is an abnormal point, and to obtain, through a pre-constructed unsupervised learning model, a second determination result of whether the sequence point is an abnormal point;
a sample determining module, configured to, if the first determination result is consistent with the second determination result, take the sequence points determined to be normal points as normal samples and the sequence points determined to be abnormal points as abnormal samples;
an abnormal point marking module, configured to obtain the detection result of each sequence point in the time series through a classification model and to mark the abnormal points in the time series according to the detection result, the classification model being trained on the abnormal samples and the down-sampled normal samples.
8. The apparatus according to claim 7, characterized in that the determination result obtaining module comprises:
a statistical result obtaining unit, configured to, when the number of statistical models is one, obtain through the statistical model an initial detection result of whether the sequence point is an abnormal point and take it as the first determination result.
9. The apparatus according to claim 7, characterized in that the statistical result obtaining unit is specifically configured to:
when the statistical model includes two or more statistical submodels, obtain through each statistical submodel an initial detection result of whether the sequence point is an abnormal point;
if the initial detection results are that the sequence point is a normal point, obtain the first determination result that the sequence point is a normal point;
if the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset statistical threshold, obtain the first determination result that the sequence point is an abnormal point, the preset statistical threshold being determined by the number of statistical submodels.
10. The apparatus according to claim 7, characterized in that the determination result obtaining module comprises:
an unsupervised result obtaining unit, configured to, when the number of unsupervised learning models is one, obtain through the unsupervised learning model an initial detection result of whether the sequence point is an abnormal point and take it as the second determination result.
11. The apparatus according to claim 7, characterized in that the unsupervised result obtaining unit is specifically configured to:
when the unsupervised learning model includes two or more unsupervised learning submodels, obtain through each unsupervised learning submodel an initial detection result of whether the sequence point is an abnormal point;
if the initial detection results are that the sequence point is a normal point, obtain the second determination result that the sequence point is a normal point;
if the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset unsupervised threshold, obtain the second determination result that the sequence point is an abnormal point, the preset unsupervised threshold being determined by the number of unsupervised learning submodels.
12. The apparatus according to claim 7, characterized in that the abnormal point marking module comprises:
an abnormal probability obtaining unit, configured to input each sequence point in the time series into the classification model to obtain the abnormal probability of the sequence point;
a classification threshold determining unit, configured to sort the sequence points by the abnormal probability, use a Top algorithm to determine a target sequence point among the sorted sequence points, and take the abnormal probability of the target sequence point as the classification threshold of the classification model;
a detection result determining unit, configured to determine, according to the abnormal probability of each sequence point and the classification threshold, the detection result of whether each sequence point in the time series is an abnormal point.
13. A device, characterized in that the device comprises:
one or more processors;
a storage device, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the labeling method for a time series according to any one of claims 1 to 6.
14. A computer-readable storage medium on which a computer program is stored, characterized in that, when executed by a processor, the program implements the labeling method for a time series according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811648187.0A CN109739904B (en) | 2018-12-30 | 2018-12-30 | Time sequence marking method, device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109739904A (en) | 2019-05-10 |
CN109739904B (en) | 2021-08-10 |
Family
ID=66362886
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811648187.0A Active CN109739904B (en) | 2018-12-30 | 2018-12-30 | Time sequence marking method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109739904B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104794192A (en) * | 2015-04-17 | 2015-07-22 | 南京大学 | Multi-level anomaly detection method based on exponential smoothing and integrated learning model |
CN105678409A (en) * | 2015-12-31 | 2016-06-15 | 哈尔滨工业大学 | Adaptive and distribution-free time series abnormal point detection method |
CN106411597A (en) * | 2016-10-14 | 2017-02-15 | 广东工业大学 | Network traffic abnormality detection method and system |
CN106872657A (en) * | 2017-01-05 | 2017-06-20 | 河海大学 | A kind of multivariable water quality parameter time series data accident detection method |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111178456A (en) * | 2020-01-15 | 2020-05-19 | 腾讯科技(深圳)有限公司 | Abnormal index detection method and device, computer equipment and storage medium |
CN111178456B (en) * | 2020-01-15 | 2022-12-13 | 腾讯科技(深圳)有限公司 | Abnormal index detection method and device, computer equipment and storage medium |
CN111277459A (en) * | 2020-01-16 | 2020-06-12 | 新华三信息安全技术有限公司 | Equipment anomaly detection method and device and machine-readable storage medium |
CN111325260A (en) * | 2020-02-14 | 2020-06-23 | 北京百度网讯科技有限公司 | Data processing method and device, electronic equipment and computer readable medium |
CN111325260B (en) * | 2020-02-14 | 2023-10-27 | 北京百度网讯科技有限公司 | Data processing method and device, electronic equipment and computer readable medium |
CN111353890A (en) * | 2020-03-30 | 2020-06-30 | 中国工商银行股份有限公司 | Application log-based application anomaly detection method and device |
CN111614578A (en) * | 2020-05-09 | 2020-09-01 | 北京邮电大学 | Network resource allocation method and device based on exponential weighting and inflection point detection |
CN112070155A (en) * | 2020-09-07 | 2020-12-11 | 常州微亿智造科技有限公司 | Time series data labeling method and device |
CN112328425A (en) * | 2020-12-04 | 2021-02-05 | 杭州谐云科技有限公司 | Anomaly detection method and system based on machine learning |
CN113409560A (en) * | 2021-07-30 | 2021-09-17 | 佛山市墨纳森智能科技有限公司 | Monitoring method and device of electronic equipment, storage medium and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109739904B (en) | 2021-08-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109739904A (en) | A kind of labeling method of time series, device, equipment and storage medium | |
US20230162096A1 (en) | Characterizing failures of a machine learning model based on instance features | |
JP2022105263A (en) | Multi-source timing data fault diagnosis method based on graph neural network, and medium | |
CN113282461B (en) | Alarm identification method and device for transmission network | |
CN111915020B (en) | Updating method and device of detection model and storage medium | |
EP3422518B1 (en) | A method for recognizing contingencies in a power supply network | |
CN105372581A (en) | Flexible circuit board manufacturing process automatic monitoring and intelligent analysis system and method | |
CN103617469A (en) | Equipment failure prediction method and system of electrical power system | |
CN111459700A (en) | Method and apparatus for diagnosing device failure, diagnostic device, and storage medium | |
CN107203467A (en) | The reference test method and device of supervised learning algorithm under a kind of distributed environment | |
CN113377567A (en) | Distributed system fault root cause tracing method based on knowledge graph technology | |
US11783474B1 (en) | Defective picture generation method and apparatus applied to industrial quality inspection | |
CN111340054A (en) | Data labeling method and device and data processing equipment | |
CN112859822A (en) | Equipment health analysis and fault diagnosis method and system based on artificial intelligence | |
US20190004490A1 (en) | Method for recognizing contingencies in a power supply network | |
CN111158964B (en) | Disk failure prediction method, system, device and storage medium | |
CN112419268A (en) | Method, device, equipment and medium for detecting image defects of power transmission line | |
CN112036249B (en) | Method, system, medium and terminal for end-to-end pedestrian detection and attribute identification | |
CN115456107A (en) | Time series abnormity detection system and method | |
CN115187772A (en) | Training method, device and equipment of target detection network and target detection method, device and equipment | |
CN115047322A (en) | Method and system for identifying fault chip of intelligent medical equipment | |
CN117034143B (en) | Distributed system fault diagnosis method and device based on machine learning | |
CN115171045A (en) | YOLO-based power grid operation field violation identification method and terminal | |
CN113592939B (en) | Deep learning method for judging size of narrow blood vessel based on coronary angiography image | |
Li et al. | Signal anomaly detection of bridge SHM system based on two-stage deep convolutional neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||