CN109739904A - Time series labeling method, apparatus, device, and storage medium - Google Patents

Time series labeling method, apparatus, device, and storage medium

Info

Publication number
CN109739904A
Authority
CN
China
Prior art keywords
sequence
points
result
point
abnormal point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811648187.0A
Other languages
Chinese (zh)
Other versions
CN109739904B (en)
Inventor
战泓升
龚诚
张昕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Urban Network Neighbor Information Technology Co Ltd
Beijing City Network Neighbor Technology Co Ltd
Original Assignee
Beijing City Network Neighbor Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing City Network Neighbor Technology Co Ltd
Priority to CN201811648187.0A
Publication of CN109739904A
Application granted
Publication of CN109739904B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a time series labeling method, apparatus, device, and storage medium. The method comprises: acquiring sequence points in a time series; obtaining, through a pre-built statistical model, a first determination result indicating whether a sequence point is an abnormal point, and obtaining, through a pre-built unsupervised learning model, a second determination result indicating whether the sequence point is an abnormal point; if the first determination result is consistent with the second determination result, taking sequence points determined to be normal points as normal samples and sequence points determined to be abnormal points as abnormal samples; and obtaining a detection result for each sequence point in the time series through a classification model, and marking the abnormal points in the time series according to the detection results. The technical solution provided by the embodiments of the invention avoids the missed and false detections that occur when sequence points are checked with a single statistical model or a single unsupervised learning model alone, and improves the accuracy and reliability of abnormal point labeling in time series.

Description

Time series labeling method, apparatus, device, and storage medium
Technical field
Embodiments of the present invention relate to the field of Internet technology, and in particular to a time series labeling method, apparatus, device, and storage medium.
Background
A time series is an ordered set of observations, associated with a time sequence, that is obtained for a specific index in a given application scenario. With the rapid development of Internet technology, the time series data corresponding to various indexes needs to be predicted and analyzed in order to judge whether abnormal index values exist in the time series.
In existing approaches, anomalies in a time series are mostly detected and labeled manually by engineers, or detected with a linear regression model so that the corresponding abnormal points can be marked. Manual labeling requires engineers to know the business background of the application scenario to which the time series belongs, and because the volume of sequence data to be checked and labeled is large, it consumes substantial labor cost. Linear regression models, meanwhile, have certain limitations and poor real-time performance, so the resulting abnormal labels are not very reliable.
Summary of the invention
Embodiments of the invention provide a time series labeling method, apparatus, device, and storage medium that label anomalies in a time series and improve the accuracy and reliability of the labeling results.
In a first aspect, an embodiment of the invention provides a time series labeling method, comprising:
acquiring sequence points in a time series;
obtaining, through a pre-built statistical model, a first determination result indicating whether a sequence point is an abnormal point, and obtaining, through a pre-built unsupervised learning model, a second determination result indicating whether the sequence point is an abnormal point;
if the first determination result is consistent with the second determination result, taking sequence points determined to be normal points as normal samples and sequence points determined to be abnormal points as abnormal samples;
obtaining a detection result for each sequence point in the time series through a classification model, and marking the abnormal points in the time series according to the detection results, wherein the classification model is trained on the abnormal samples and the down-sampled normal samples.
Further, obtaining, through the pre-built statistical model, the first determination result indicating whether a sequence point is an abnormal point comprises:
when the statistical model includes two or more statistical sub-models, obtaining from each statistical sub-model an initial detection result indicating whether the sequence point is an abnormal point;
if the initial detection results indicate that the sequence point is a normal point, obtaining a first determination result that the sequence point is a normal point;
if the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset statistical threshold, obtaining a first determination result that the sequence point is an abnormal point, wherein the preset statistical threshold is determined by the number of statistical sub-models.
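The sub-model voting rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `combine_votes` and `majority_threshold` are hypothetical, the strict-majority threshold is one possible way to derive the threshold from the number of sub-models (the text only says the threshold is determined by that number), and returning an uncertain result when some but too few sub-models report an abnormal point is an assumption based on the uncertain results discussed in Embodiment 1.

```python
from typing import List, Optional


def majority_threshold(n_submodels: int) -> int:
    # Hypothetical rule: a strict majority of the sub-models.
    # The source only states that the threshold depends on the sub-model count.
    return n_submodels // 2 + 1


def combine_votes(votes: List[bool], threshold: int) -> Optional[bool]:
    """Combine per-sub-model initial results (True = abnormal point).

    Returns False for a normal point, True for an abnormal point,
    and None when the sub-models do not support either conclusion.
    """
    abnormal_count = sum(votes)
    if abnormal_count == 0:
        return False  # every sub-model reports a normal point
    if abnormal_count >= threshold:
        return True   # enough sub-models report an abnormal point
    return None       # assumption: too few abnormal votes means "uncertain"
```

With three sub-models and the assumed majority rule, two abnormal votes are enough to mark a point as abnormal, while a single abnormal vote leaves the point undecided.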
Further, obtaining, through the pre-built unsupervised learning model, the second determination result indicating whether a sequence point is an abnormal point comprises:
when the number of unsupervised learning models is one, taking the initial detection result obtained through the unsupervised learning model, indicating whether the sequence point is an abnormal point, as the second determination result.
Further, obtaining, through the pre-built unsupervised learning model, the second determination result indicating whether a sequence point is an abnormal point comprises:
when the unsupervised learning model includes two or more unsupervised learning sub-models, obtaining from each unsupervised learning sub-model an initial detection result indicating whether the sequence point is an abnormal point;
if the initial detection results indicate that the sequence point is a normal point, obtaining a second determination result that the sequence point is a normal point;
if the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset unsupervised threshold, obtaining a second determination result that the sequence point is an abnormal point, wherein the preset unsupervised threshold is determined by the number of unsupervised learning sub-models.
Further, obtaining the detection result for each sequence point in the time series through the classification model comprises:
inputting each sequence point in the time series into the classification model to obtain the abnormal probability of the sequence point;
sorting the sequence points by abnormal probability, determining a target sequence point among the sorted sequence points using a Top algorithm, and taking the abnormal probability of the target sequence point as the classification threshold of the classification model;
determining, according to the abnormal probability of each sequence point and the classification threshold, the detection result indicating whether each sequence point in the time series is an abnormal point.
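The thresholding steps above can be illustrated with a short Python sketch. The source does not spell out the Top algorithm, so this example assumes a simple top-k selection: the sequence point ranked k-th by abnormal probability is the target sequence point, and its probability becomes the classification threshold. The function names and the parameter `k` are hypothetical.

```python
from typing import List


def top_k_threshold(probs: List[float], k: int) -> float:
    """Return the k-th largest abnormal probability (assumed Top rule)."""
    if not 1 <= k <= len(probs):
        raise ValueError("k must be between 1 and len(probs)")
    ranked = sorted(probs, reverse=True)  # highest abnormal probability first
    return ranked[k - 1]                  # probability of the target sequence point


def label_points(probs: List[float], k: int) -> List[bool]:
    """Mark a point as abnormal (True) when its probability reaches the threshold."""
    threshold = top_k_threshold(probs, k)
    return [p >= threshold for p in probs]


# Abnormal probabilities as they might come out of the classification model.
probs = [0.02, 0.91, 0.10, 0.75, 0.05]
labels = label_points(probs, k=2)
```

Because the comparison uses "greater than or equal to", points that tie with the target sequence point are also marked as abnormal, so slightly more than k points may be labeled.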
In a second aspect, an embodiment of the invention provides a time series labeling apparatus, comprising:
a sequence point acquisition module, configured to acquire sequence points in a time series;
a determination result acquisition module, configured to obtain, through a pre-built statistical model, a first determination result indicating whether a sequence point is an abnormal point, and to obtain, through a pre-built unsupervised learning model, a second determination result indicating whether the sequence point is an abnormal point;
a sample determination module, configured to, if the first determination result is consistent with the second determination result, take sequence points determined to be normal points as normal samples and sequence points determined to be abnormal points as abnormal samples;
an abnormal point marking module, configured to obtain a detection result for each sequence point in the time series through a classification model and to mark the abnormal points in the time series according to the detection results, wherein the classification model is trained on the abnormal samples and the down-sampled normal samples.
Further, the determination result acquisition module comprises:
a statistical result acquisition unit, configured to, when the number of statistical models is one, take the initial detection result obtained through the statistical model, indicating whether the sequence point is an abnormal point, as the first determination result.
Further, the statistical result acquisition unit is specifically configured to:
when the statistical model includes two or more statistical sub-models, obtain from each statistical sub-model an initial detection result indicating whether the sequence point is an abnormal point;
if the initial detection results indicate that the sequence point is a normal point, obtain a first determination result that the sequence point is a normal point;
if the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset statistical threshold, obtain a first determination result that the sequence point is an abnormal point, wherein the preset statistical threshold is determined by the number of statistical sub-models.
Further, the determination result acquisition module comprises:
an unsupervised result acquisition unit, configured to, when the number of unsupervised learning models is one, take the initial detection result obtained through the unsupervised learning model, indicating whether the sequence point is an abnormal point, as the second determination result.
Further, the unsupervised result acquisition unit is specifically configured to:
when the unsupervised learning model includes two or more unsupervised learning sub-models, obtain from each unsupervised learning sub-model an initial detection result indicating whether the sequence point is an abnormal point;
if the initial detection results indicate that the sequence point is a normal point, obtain a second determination result that the sequence point is a normal point;
if the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset unsupervised threshold, obtain a second determination result that the sequence point is an abnormal point, wherein the preset unsupervised threshold is determined by the number of unsupervised learning sub-models.
Further, the abnormal point marking module comprises:
an abnormal probability acquisition unit, configured to input each sequence point in the time series into the classification model to obtain the abnormal probability of the sequence point;
a classification threshold determination unit, configured to sort the sequence points by abnormal probability, determine a target sequence point among the sorted sequence points using a Top algorithm, and take the abnormal probability of the target sequence point as the classification threshold of the classification model;
a detection result determination unit, configured to determine, according to the abnormal probability of each sequence point and the classification threshold, the detection result indicating whether each sequence point in the time series is an abnormal point.
In a third aspect, an embodiment of the invention provides a device, comprising:
one or more processors;
a storage apparatus, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the time series labeling method described in any embodiment of the invention.
In a fourth aspect, an embodiment of the invention provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the time series labeling method described in any embodiment of the invention.
Embodiments of the invention provide a time series labeling method, apparatus, device, and storage medium. A pre-built statistical model and a pre-built unsupervised learning model each perform an initial check of whether the sequence points in a time series are abnormal points. This avoids the missed and false detections that occur when sequence points are checked with only a single statistical model or a single unsupervised learning model, and improves the accuracy of anomaly detection for sequence points in the time series. Sequence points that both the statistical model and the unsupervised learning model determine to be normal points are taken as normal samples, and sequence points that both models determine to be abnormal points are taken as abnormal samples. The classification model is then trained on the abnormal samples and the down-sampled normal samples, which improves its classification accuracy. Each sequence point in the time series is subsequently checked again with the classification model, and the abnormal points in the time series are accurately marked according to the detection results. This solves the problems in the prior art that manual detection consumes substantial labor cost and that linear regression models have certain limitations and poor real-time performance, and it improves the accuracy and reliability of time series abnormal labeling results.
Brief description of the drawings
Other features, objects, and advantages of the invention will become more apparent upon reading the following detailed description of non-limiting embodiments with reference to the accompanying drawings:
Fig. 1 is a flowchart of a time series labeling method provided by Embodiment 1 of the invention;
Fig. 2A, Fig. 2B, Fig. 2C, and Fig. 2D are schematic diagrams of time series detection under different model frameworks provided by Embodiment 2 of the invention;
Fig. 3A is a flowchart of a time series labeling method provided by Embodiment 3 of the invention;
Fig. 3B is a schematic diagram of the time series labeling process provided by Embodiment 3 of the invention;
Fig. 4 is a structural diagram of a time series labeling apparatus provided by Embodiment 4 of the invention;
Fig. 5 is a structural diagram of a device provided by Embodiment 5 of the invention.
Detailed description
The invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the invention rather than the entire structure.
Embodiment 1
Fig. 1 is a flowchart of a time series labeling method provided by Embodiment 1 of the invention. This embodiment can be applied to any device that performs anomaly detection and labeling on the sequence points in a time series, and its technical solution is suited to cases where the abnormal points in a time series need to be labeled accurately. The time series labeling method provided by this embodiment can be executed by the time series labeling apparatus provided by the embodiments of the invention. The apparatus can be implemented in software and/or hardware and is integrated into the device that executes the method.
Specifically, with reference to Fig. 1, the method may include the following steps:
S110: acquire the sequence points in a time series.
A time series is the sequence formed by arranging, in chronological order, the values that a detection index of some phenomenon takes at different times; it describes how the detection index develops and changes within that phenomenon. For example, a time series may be formed by arranging, in chronological order, the number of visits within one day to a specific piece of information on a website. This embodiment is mainly concerned with how to detect whether the sequence point at each moment in the time series is an abnormal point. To detect the abnormal points in the time series accurately, the large number of sequence points contained in the time series are used as training samples to train, offline, a detection model that can accurately detect whether each sequence point in a time series is an abnormal point. This embodiment mainly describes the training process of that detection model.
Optionally, depending on the business scenario of the detection index contained in the time series to be detected, the time series in this embodiment refers to the historical time series formed by that detection index in chronological order during past execution of the corresponding business, together with the time series currently running online. At this stage it is not known whether the value of the detection index at each moment, that is, each sequence point, is abnormal. Because the detection model is later trained with a supervised learning method to improve its detection accuracy, anomaly detection must first be performed on each sequence point in the time series to mark all abnormal points present in it. The sequence points whose detection results have been clearly determined in advance are then used as training samples for the subsequent supervised model training.
Specifically, when training the detection model for a time series in this embodiment, the training samples required by the training process must first be obtained, namely the sequence points contained in the time series formed in chronological order by a given detection index during execution of the corresponding business, where the detection result of whether each sequence point is an abnormal point must be determined in advance. This embodiment therefore first obtains the sequence points contained in the unprocessed time series of the detection index during execution of the corresponding business. At this stage it is not known whether each sequence point is an abnormal point, so anomaly detection must subsequently be performed on each sequence point in order to select the sequence points determined to be abnormal points and normal points as training samples. Accordingly, this embodiment first acquires a large number of sequence points contained in the time series and then performs the subsequent abnormal point detection operations.
S120: obtain, through a pre-built statistical model, a first determination result indicating whether a sequence point is an abnormal point, and obtain, through a pre-built unsupervised learning model, a second determination result indicating whether the sequence point is an abnormal point.
The statistical model is a pre-built model that uses a configured statistical decision method to detect whether each sequence point contained in the time series is an abnormal point. The statistical decision method in this embodiment may be a normal distribution method based on the 3-sigma principle, Tukey's test, or another decision method based on statistical analysis. The unsupervised learning model is a pre-built model that uses a configured unsupervised learning method to detect whether each sequence point contained in the time series is an abnormal point; the unsupervised learning method judges whether a sequence point is an abnormal point by rewarding correct detection behavior. The unsupervised learning method in this embodiment may be the isolation forest algorithm (Isolation Forest, iForest), the one-class support vector machine (One Class Support Vector Machine, One Class SVM), or another algorithm based on machine learning.
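The two statistical decision methods named above can be sketched with the Python standard library. This is a minimal illustration, not the patent's implementation: the function names, the `n_sigma` and `k` parameters, the use of the population standard deviation, and the conventional Tukey fence factor of 1.5 are all assumptions.

```python
import statistics
from typing import List


def three_sigma_flags(values: List[float], n_sigma: float = 3.0) -> List[bool]:
    """Flag values more than n_sigma standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [False] * len(values)  # constant series: nothing stands out
    return [abs(v - mean) > n_sigma * std for v in values]


def tukey_flags(values: List[float], k: float = 1.5) -> List[bool]:
    """Flag values outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


# Hypothetical hourly visit counts with one spike at the end.
values = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10,
          11, 9, 10, 12, 10, 9, 11, 10, 10, 60]
sigma_result = three_sigma_flags(values)
tukey_result = tukey_flags(values)
```

Each function plays the role of one statistical sub-model: it returns one initial detection result per sequence point, and the results of several such sub-models can then be combined.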
In addition, when the statistical model or the unsupervised learning model in this embodiment detects whether a sequence point in the time series is an abnormal point, it can analyze the correlation between the sequence point and the other sequence points that precede its moment, or it can analyze the correlation between the sequence point, the sequence points at the same moment in other time series, and the sequence points that precede that moment in those other time series, in order to judge whether the sequence point is an abnormal point.
Optionally, once the sequence points in the time series have been acquired, the pre-built statistical model and unsupervised learning model can each detect whether each sequence point in the time series is an abnormal point. The statistical model yields the first determination result and the unsupervised learning model yields the second determination result, so that it can be clearly known whether each sequence point contained in the time series is an abnormal point.
For example, after the sequence points in the time series are acquired, each sequence point can be input into the pre-built statistical model and the pre-built unsupervised learning model. The configured statistical decision method and unsupervised learning method then each detect whether each sequence point is an abnormal point, and the statistical model and unsupervised learning model yield the first determination result and the second determination result respectively. Statistical models tend to detect sharp fluctuations in a stationary time series, whereas unsupervised learning models tend to detect outliers in the time series, so detecting abnormal points with only a single statistical model or a single unsupervised learning model leads to a certain amount of missed or false detection. In this embodiment, therefore, the pre-built statistical model and unsupervised learning model each detect whether a sequence point is an abnormal point, and the first determination result obtained through the statistical model is compared with the second determination result obtained through the unsupervised learning model to judge whether each sequence point is an abnormal point. The probability of a false or missed detection is then very low, which improves the accuracy of the anomaly detection result for each sequence point in the time series.
S130: if the first determination result is consistent with the second determination result, take sequence points determined to be normal points as normal samples and sequence points determined to be abnormal points as abnormal samples.
Optionally, once the first determination result has been obtained through the statistical model and the second determination result through the unsupervised learning model, the two results still need to be compared to judge accurately whether the sequence point is an abnormal point. The reason is that statistical models tend to detect sharp fluctuations in a stationary time series while unsupervised learning models tend to detect outliers, so checking the original time series for abnormal points with only a single statistical model or a single unsupervised learning model leads to a certain amount of missed or false detection. If the first determination result is consistent with the second determination result, the probability that the result for this sequence point is a false or missed detection is very low. Sequence points determined to be normal points are then taken as normal samples and sequence points determined to be abnormal points are taken as abnormal samples. From the normal and abnormal samples a labeled sample training library can be obtained, which is used to train a detection model that can accurately detect whether a sequence point in a time series is an abnormal point.
In addition, when the pre-built statistical model and unsupervised learning model each detect whether a sequence point in the time series is an abnormal point, they may also produce an uncertain result as to whether the sequence point is an abnormal point. That is, the pre-built statistical model may yield a first uncertain result, and the pre-built unsupervised learning model may yield a second uncertain result.
Specifically, an uncertain result means that the statistical model or the unsupervised learning model, when detecting whether a sequence point is an abnormal point, cannot produce a definite detection result for it; in other words, the model cannot judge whether the sequence point is an abnormal point. Optionally, each sequence point in the time series is input into the pre-built statistical model and the pre-built unsupervised learning model, and the configured statistical decision method and unsupervised learning method each detect whether the sequence point is an abnormal point. When the statistical model or the unsupervised learning model cannot judge whether a given sequence point is an abnormal point, the first uncertain result or the second uncertain result is obtained respectively.
Consequently, when judging whether the first determination result is consistent with the second determination result, it can also happen that the two determination results are inconsistent, or that at least one of the statistical model and the unsupervised learning model yields an uncertain result. In either case it cannot be judged accurately whether the sequence point is an abnormal point, so the sequence point is not used as a training sample for the subsequent model training, which improves the accuracy of model training.
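The sample selection described in S130, including the handling of inconsistent and uncertain results, can be sketched as follows. The function name `select_samples` and the encoding of results (True for an abnormal point, False for a normal point, None for an uncertain result) are assumptions chosen for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Result = Optional[bool]  # True = abnormal, False = normal, None = uncertain


def select_samples(
    points: Sequence[float],
    first: Sequence[Result],
    second: Sequence[Result],
) -> Tuple[List[float], List[float]]:
    """Keep only the points on which both models give the same definite result."""
    normal: List[float] = []
    abnormal: List[float] = []
    for point, r1, r2 in zip(points, first, second):
        if r1 is None or r2 is None or r1 != r2:
            continue  # uncertain or conflicting: not used as a training sample
        (abnormal if r1 else normal).append(point)
    return normal, abnormal


points = [1.0, 2.0, 3.0, 4.0]
first = [False, True, True, None]      # results from the statistical model
second = [False, True, False, False]   # results from the unsupervised model
normal_samples, abnormal_samples = select_samples(points, first, second)
```

Only the first two points survive: the third is dropped because the two models disagree, and the fourth because the statistical model could not decide.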
In addition, the statistical model and the unsupervised learning model may each correspond to several statistical decision methods or unsupervised learning methods, so the number of statistical models and the number of unsupervised learning models can be set independently in this embodiment. The number of statistical models may be one, or the statistical model may include two or more statistical sub-models corresponding to different statistical decision methods. Likewise, the number of unsupervised learning models may be one, or the unsupervised learning model may include two or more unsupervised learning sub-models corresponding to different unsupervised learning methods. Optionally, when the number of statistical models or unsupervised learning models is one, the first or second determination result can be obtained directly from that model, and no first or second uncertain result arises. When the statistical model or the unsupervised learning model includes two or more sub-models, the initial detection results obtained from the individual statistical sub-models or unsupervised learning sub-models can be compared to decide whether the model yields a determination result or an uncertain result. The specific judgment process is described in detail in the following embodiments and is not elaborated here.
S140: obtain the detection result of each sequence point in the time series through the classification model, and mark the abnormal points in the time series according to the detection results.
The classification model is trained on the abnormal samples and the down-sampled normal samples. Specifically, in actual business the sequence points in a time series are mostly normal points and only a few are abnormal points, so the number of normal samples determined by the statistical model and the unsupervised learning model is much larger than the number of abnormal samples. The normal samples therefore first need to be down-sampled, and the abnormal samples together with the down-sampled normal samples form the corresponding sample training library, in which the number of abnormal samples and the number of normal samples are approximately equal, so as to ensure the accuracy of the subsequent model training. The classification model is a model, obtained by supervised training on the down-sampled normal samples and the abnormal samples in the sample training library, that can accurately detect whether a sequence point is an abnormal point, that is, the detection model mentioned above. The classification model in this embodiment may be a neural network model.
In addition, to improve the accuracy of model training, the number of training samples in the sample training library must be sufficiently large, which requires the time series to contain a large number of sequence points. Each sequence point is detected by the statistical model and by the unsupervised learning model respectively, and the sequence points that can be clearly determined to be, or not to be, abnormal points are selected as the normal samples and abnormal samples of this embodiment, thereby constructing a labeled sample training library.
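The down-sampling step described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify how the normal samples are down-sampled, so uniform random sampling is assumed here, and the function name `downsample_normals`, the `ratio` parameter and the example values are hypothetical.

```python
import random

def downsample_normals(normal_samples, abnormal_samples, ratio=1.0, seed=0):
    """Build a labeled sample training library with roughly balanced classes.

    Assumption: normal samples are down-sampled by uniform random sampling
    to about ratio * len(abnormal_samples) items.
    """
    target = min(len(normal_samples),
                 max(1, round(ratio * len(abnormal_samples))))
    rng = random.Random(seed)
    kept = rng.sample(normal_samples, target)
    # Label 0 marks a normal sample, label 1 marks an abnormal sample.
    library = [(x, 0) for x in kept] + [(x, 1) for x in abnormal_samples]
    rng.shuffle(library)
    return library

# Hypothetical data: many normal points, few abnormal points.
normals = [float(i % 10) for i in range(1000)]
abnormals = [95.0, 102.5, -40.0, 88.1, 120.3]
library = downsample_normals(normals, abnormals)
```

With `ratio=1.0` the library holds as many normal samples as abnormal samples, which matches the "approximately equal" condition stated above.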
Optionally, when the classification model is trained by supervised learning, the normal samples and abnormal samples in the sample training library are input into a preset detection model to obtain a detection result as to whether each sample is an abnormal point. This detection result is a predicted value, which is compared with the label of the corresponding sample, that is, the prediction for each sample is compared with its true status as a normal sample or an abnormal sample, so as to obtain the classification loss of the current training round. The classification loss indicates how accurately the detection model currently being trained classifies sequence points. The classification loss is then compared with a preset loss threshold. If the classification loss exceeds the preset loss threshold, the detection model obtained in this round does not yet detect sequence points accurately enough and must be trained further: the classification loss is back-propagated through the model, and the training parameters of the preset detection model are corrected according to the classification loss. New training samples, that is, new normal samples or abnormal samples, are then obtained, the corrected detection model detects whether these new samples are abnormal points, and a new classification loss is obtained. This loop is repeated until the classification loss falls below the preset loss threshold, which indicates that the detection model has reached a sufficient accuracy in detecting whether a sequence point is an abnormal point and needs no further training. The detection model obtained in the current round is then used as the final classification model for subsequently detecting whether sequence points are abnormal points.
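The training loop above (predict, compute the classification loss, compare it with a preset loss threshold, back-propagate and correct the parameters, repeat) can be sketched as follows. This is a minimal sketch under assumptions: a single-layer logistic classifier stands in for the neural network mentioned in the text, the whole sample library is reused in every round rather than drawing new samples, and the names `train_until_threshold`, `loss_threshold`, `lr` and `max_epochs` and the toy data are hypothetical.

```python
import numpy as np

def train_until_threshold(X, y, loss_threshold=0.1, lr=0.5, max_epochs=5000):
    """Train a logistic classifier until the loss is below loss_threshold."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.zeros(X.shape[1])
    b = 0.0
    loss = float("inf")
    eps = 1e-12  # avoids log(0)
    for _ in range(max_epochs):
        # Forward pass: predicted probability that each sample is abnormal.
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        # Classification loss: binary cross-entropy against the sample labels.
        loss = float(-np.mean(y * np.log(p + eps)
                              + (1.0 - y) * np.log(1.0 - p + eps)))
        if loss < loss_threshold:
            break  # accurate enough, stop training
        # Backward pass: gradient of the loss, then parameter correction.
        grad = p - y
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b, loss

# Hypothetical one-feature samples: label 0 = normal, label 1 = abnormal.
X = [[0.1], [0.2], [0.0], [-0.1], [3.0], [3.2], [2.9], [3.1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
w, b, final_loss = train_until_threshold(X, y)
```

The `max_epochs` cap is an added safeguard not mentioned in the text; it keeps the loop from running forever if the loss never reaches the threshold.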
Optionally, once the classification model has been trained on the abnormal samples and the down-sampled normal samples, it can detect with sufficient accuracy whether a sequence point is an abnormal point. Each sequence point in the time series is then input into the classification model, which detects whether that sequence point is an abnormal point and outputs the corresponding detection result. When a sequence point in the time series is determined to be an abnormal point, it is marked in the time series, so that every abnormal point in the time series is marked according to the detection results, which improves the accuracy of abnormality detection.
In the technical solution provided by this embodiment, the statistical model and the unsupervised learning model constructed in advance each perform initial detection of whether a sequence point in the time series is an abnormal point. This avoids the missed detections and false detections that occur when only a single statistical model or unsupervised learning model is used to detect sequence points in a time series, and improves the accuracy of abnormality detection for the sequence points. Sequence points determined to be normal points by both the statistical model and the unsupervised learning model are taken as normal samples, and sequence points determined to be abnormal points by both models are taken as abnormal samples; the classification model is then trained on the abnormal samples and the down-sampled normal samples, which improves its classification accuracy. Each sequence point in the time series is subsequently detected again by the classification model, so that the abnormal points in the time series are accurately marked according to the detection results. This solves the problems in the prior art that manual detection consumes a large amount of labor and that linear regression models have certain limitations and poor real-time performance, and improves the accuracy and reliability of the abnormality marking results for the time series.
Embodiment two
Since both the statistical decision methods corresponding to the statistical model and the unsupervised learning methods corresponding to the unsupervised learning model may be of multiple kinds, the numbers of statistical models and unsupervised learning models can be set independently in this embodiment, that is, the combination of statistical model and unsupervised learning model can be chosen as appropriate. Fig. 2A, Fig. 2B, Fig. 2C and Fig. 2D are schematic diagrams of the principle of marking sequence points in a time series under the different model frameworks provided by Embodiment two of the present invention. This embodiment is an optimization on the basis of the above embodiment. Specifically, this embodiment explains in detail the abnormality detection process for the sequence points in a time series under different combinations of the statistical model and the unsupervised learning model.
In this embodiment, the combinations of the statistical model and the unsupervised learning model fall into the following four cases: 1) the statistical model and the unsupervised learning model are each a single model; 2) the statistical model comprises two or more statistical sub-models, and the unsupervised learning model is a single model; 3) the statistical model is a single model, and the unsupervised learning model comprises two or more unsupervised learning sub-models; 4) the statistical model comprises two or more statistical sub-models, and the unsupervised learning model comprises two or more unsupervised learning sub-models. This embodiment describes each of these four cases in turn.
Optionally, as shown in Fig. 2A, for the case in which the statistical model and the unsupervised learning model are each a single model, the labeling method of the time series may include the following steps:
S201: acquire a sequence point in the time series.
S202: obtain, through the statistical model, an initial detection result as to whether the sequence point is an abnormal point and take it as the first determination result; and obtain, through the unsupervised learning model, an initial detection result as to whether the sequence point is an abnormal point and take it as the second determination result.
Specifically, when the statistical model is a single model, this embodiment uses only one statistical decision method to detect whether each sequence point in the time series is an abnormal point. That statistical decision method yields a clear initial detection result for each sequence point, and the initial detection result obtained by the statistical model is taken directly as the first determination result. Likewise, when the unsupervised learning model is a single model, this embodiment uses only one unsupervised learning method to detect whether each sequence point in the time series is an abnormal point. That unsupervised learning method also yields a clear initial detection result for each sequence point, and the initial detection result obtained by the unsupervised learning model is taken directly as the second determination result. It should be noted that when the statistical model and the unsupervised learning model are each a single model, whether each sequence point is an abnormal point can be clearly judged by the single statistical decision method and the single unsupervised learning method, so both models always produce a determination result as to whether the sequence point is an abnormal point, and no uncertain result arises.
S203: judge whether the first determination result is consistent with the second determination result; if so, execute S204; if not, execute S205.
In this embodiment, after the first determination result is obtained through the statistical model and the second determination result is obtained through the unsupervised learning model, it is further necessary to judge whether the two determination results are consistent, so as to further improve the detection accuracy for the sequence point. When the first determination result is consistent with the second determination result, the statistical model and the unsupervised learning model agree on whether the sequence point is an abnormal point; a sequence point determined to be a normal point is then taken as a normal sample, and a sequence point determined to be an abnormal point is taken as an abnormal sample.
S204: take the sequence point determined to be a normal point as a normal sample, and take the sequence point determined to be an abnormal point as an abnormal sample.
S205: return to S201 and obtain the first determination result and the second determination result of the next sequence point in the time series, until all sequence points in the time series have been detected.
S206: obtain the detection result of each sequence point in the time series through the classification model, and mark the abnormal points in the time series according to the detection results.
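Steps S201 to S205 for the single-model case can be sketched as follows. The sketch assumes that the two models have already produced one boolean per sequence point (True meaning "abnormal point"); the function name `build_samples` and the example values are hypothetical and not taken from the text.

```python
def build_samples(points, stat_is_abnormal, unsup_is_abnormal):
    """Select training samples on which both detectors agree.

    stat_is_abnormal / unsup_is_abnormal hold the first and second
    determination results for each sequence point.
    """
    normal_samples, abnormal_samples = [], []
    for x, first, second in zip(points, stat_is_abnormal, unsup_is_abnormal):
        if first != second:
            continue  # S205: results disagree, move on to the next point
        # S204: consistent results decide the sample class.
        if first:
            abnormal_samples.append(x)
        else:
            normal_samples.append(x)
    return normal_samples, abnormal_samples

# Hypothetical detector outputs for five sequence points.
points = [10, 11, 50, 9, 30]
stat_flags = [False, False, True, False, True]
unsup_flags = [False, False, True, False, False]
normal_samples, abnormal_samples = build_samples(points, stat_flags, unsup_flags)
```

Here the point with value 30 is left out of both sample sets because the two detectors disagree on it.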
Optionally, as shown in Fig. 2B, for the case in which the statistical model comprises two or more statistical sub-models and the unsupervised learning model is a single model, the labeling method of the time series may include the following steps:
S211: acquire a sequence point in the time series.
S212: obtain, through each statistical sub-model, an initial detection result as to whether the sequence point is an abnormal point, and determine from these initial detection results the first determination result of the statistical model as to whether the sequence point is an abnormal point.
Specifically, when the statistical model comprises two or more statistical sub-models, each statistical sub-model corresponds to a different statistical decision method. Each statistical sub-model detects whether each sequence point in the time series is an abnormal point and produces its own initial detection result. Based on the results contained in the initial detection results of the individual statistical sub-models, the detection result of the statistical model as a whole for each sequence point is judged, so as to determine the first determination result of the statistical model as to whether the sequence point is an abnormal point. Depending on the initial detection results, the cases in which the statistical model yields a first determination result or a first uncertain result are described separately below.
Optionally, in this embodiment the initial detection results obtained by the two or more statistical sub-models as to whether a sequence point is an abnormal point fall into the following three cases:
1) All initial detection results indicate that the sequence point is a normal point; the first determination result is then that the sequence point is a normal point.
Optionally, if the initial detection results obtained by all statistical sub-models indicate that the sequence point is a normal point, the detection results of the statistical sub-models are consistent, and the statistical model yields the first determination result that the sequence point is a normal point.
2) The number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset statistical threshold; the first determination result is then that the sequence point is an abnormal point.
The preset statistical threshold is determined by the number of statistical sub-models. In this embodiment it may be the median of the number of statistical sub-models, or it may be set according to business detection requirements; this is not limited here.
Specifically, if, among the initial detection results obtained by the statistical sub-models, the number of results indicating that the sequence point is an abnormal point is greater than or equal to the preset statistical threshold, the statistical model yields the first determination result that the sequence point is an abnormal point.
3) At least one initial detection result indicates that the sequence point is an abnormal point, but the number of such results is less than the preset statistical threshold; a first uncertain result as to whether the sequence point is an abnormal point is then obtained.
Optionally, if some of the initial detection results obtained by the statistical sub-models indicate that the sequence point is an abnormal point, but the number of such results is less than the preset statistical threshold, the remaining statistical sub-models have determined that the sequence point is not an abnormal point, and the statistical model cannot clearly determine whether the sequence point is an abnormal point. The statistical model therefore yields a first uncertain result as to whether the sequence point is an abnormal point. Once a first uncertain result is obtained for a sequence point, that sequence point is not used as a training sample in the sample training library, whatever detection result the unsupervised learning model gives for it; the sequence point is simply skipped, and the detection result of the next sequence point in the time series is judged.
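The three cases above amount to a thresholded vote over the sub-model results. The following is a minimal sketch, assuming each sub-model reports a boolean (True meaning "abnormal point"). The names `Verdict` and `combine_submodels` are hypothetical, and the default threshold of half the sub-model count, rounded up, is only one possible reading of "the median of the number of statistical sub-models".

```python
from enum import Enum

class Verdict(Enum):
    NORMAL = "normal"        # determination result: normal point
    ABNORMAL = "abnormal"    # determination result: abnormal point
    UNCERTAIN = "uncertain"  # uncertain result: point is skipped

def combine_submodels(flags, threshold=None):
    """Combine per-sub-model initial detection results into one verdict."""
    if not flags:
        raise ValueError("at least one sub-model result is required")
    if threshold is None:
        # Assumption: about half of the sub-models, rounded up.
        threshold = (len(flags) + 1) // 2
    abnormal_votes = sum(1 for f in flags if f)
    if abnormal_votes == 0:
        return Verdict.NORMAL      # case 1: every sub-model says normal
    if abnormal_votes >= threshold:
        return Verdict.ABNORMAL    # case 2: enough sub-models say abnormal
    return Verdict.UNCERTAIN       # case 3: some, but too few, say abnormal

# Hypothetical results from three statistical sub-models.
all_normal = combine_submodels([False, False, False])
majority = combine_submodels([True, True, False])
minority = combine_submodels([True, False, False])
```

The same function could serve the unsupervised learning sub-models with the preset unsupervised threshold passed as `threshold`.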
S213: obtain, through the unsupervised learning model, an initial detection result as to whether the sequence point is an abnormal point and take it as the second determination result.
S214: judge whether the first determination result is consistent with the second determination result; if so, execute S215; if not, execute S216.
S215: take the sequence point determined to be a normal point as a normal sample, and take the sequence point determined to be an abnormal point as an abnormal sample.
S216: return to S211 and obtain the first determination result and the second determination result of the next sequence point in the time series, until all sequence points in the time series have been detected.
S217: obtain the detection result of each sequence point in the time series through the classification model, and mark the abnormal points in the time series according to the detection results.
As shown in Fig. 2C, for the case in which the statistical model is a single model and the unsupervised learning model comprises two or more unsupervised learning sub-models, the labeling method of the time series may include the following steps:
S221: acquire a sequence point in the time series.
S222: obtain, through the statistical model, an initial detection result as to whether the sequence point is an abnormal point and take it as the first determination result.
S223: obtain, through each unsupervised learning sub-model, an initial detection result as to whether the sequence point is an abnormal point, and determine from these initial detection results the second determination result of the unsupervised learning model as to whether the sequence point is an abnormal point.
Specifically, when the unsupervised learning model comprises two or more unsupervised learning sub-models, each unsupervised learning sub-model corresponds to a different unsupervised learning method. Each unsupervised learning sub-model detects whether each sequence point in the time series is an abnormal point and produces its own initial detection result. These initial detection results contain the result for each sequence point, from which the second determination result of the unsupervised learning model as to whether the sequence point is an abnormal point is determined. Depending on the initial detection results, the cases in which the unsupervised learning model yields a second determination result or a second uncertain result are described separately below.
Optionally, in this embodiment the initial detection results obtained by the two or more unsupervised learning sub-models as to whether a sequence point is an abnormal point fall into the following three cases:
1) All initial detection results indicate that the sequence point is a normal point; the second determination result is then that the sequence point is a normal point.
Optionally, if the initial detection results obtained by all unsupervised learning sub-models indicate that the sequence point is a normal point, the detection results of the unsupervised learning sub-models are consistent, and the unsupervised learning model yields the second determination result that the sequence point is a normal point.
2) The number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset unsupervised threshold; the second determination result is then that the sequence point is an abnormal point.
The preset unsupervised threshold is determined by the number of unsupervised learning sub-models. In this embodiment it may be the median of the number of unsupervised learning sub-models, or it may be set according to business detection requirements; this is not limited here.
Specifically, if, among the initial detection results obtained by the unsupervised learning sub-models, the number of results indicating that the sequence point is an abnormal point is greater than or equal to the preset unsupervised threshold, the unsupervised learning model yields the second determination result that the sequence point is an abnormal point.
3) At least one initial detection result indicates that the sequence point is an abnormal point, but the number of such results is less than the preset unsupervised threshold; a second uncertain result as to whether the sequence point is an abnormal point is then obtained.
Optionally, if some of the initial detection results obtained by the unsupervised learning sub-models indicate that the sequence point is an abnormal point, but the number of such results is less than the preset unsupervised threshold, the remaining unsupervised learning sub-models have determined that the sequence point is not an abnormal point, and the unsupervised learning model cannot clearly determine whether the sequence point is an abnormal point. The unsupervised learning model therefore yields a second uncertain result as to whether the sequence point is an abnormal point. Once a second uncertain result is obtained for a sequence point, that sequence point is not used as a training sample in the sample training library, whatever detection result the statistical model gives for it; the sequence point is simply skipped, and the detection result of the next sequence point in the time series is judged.
S224: judge whether the first determination result is consistent with the second determination result; if so, execute S225; if not, execute S226.
S225: take the sequence point determined to be a normal point as a normal sample, and take the sequence point determined to be an abnormal point as an abnormal sample.
S226: return to S221 and obtain the first determination result and the second determination result of the next sequence point in the time series, until all sequence points in the time series have been detected.
S227: obtain the detection result of each sequence point in the time series through the classification model, and mark the abnormal points in the time series according to the detection results.
As shown in Fig. 2D, for the case in which the statistical model comprises two or more statistical sub-models and the unsupervised learning model comprises two or more unsupervised learning sub-models, the labeling method of the time series may include the following steps:
S231: acquire a sequence point in the time series.
S232: obtain, through each statistical sub-model, an initial detection result as to whether the sequence point is an abnormal point, and determine from these initial detection results the first determination result of the statistical model as to whether the sequence point is an abnormal point.
Optionally, in this embodiment the initial detection results obtained by the two or more statistical sub-models as to whether a sequence point is an abnormal point fall into the following three cases:
1) All initial detection results indicate that the sequence point is a normal point; the first determination result is then that the sequence point is a normal point.
2) The number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to the preset statistical threshold; the first determination result is then that the sequence point is an abnormal point. The preset statistical threshold is determined by the number of statistical sub-models.
3) At least one initial detection result indicates that the sequence point is an abnormal point, but the number of such results is less than the preset statistical threshold; a first uncertain result as to whether the sequence point is an abnormal point is then obtained.
S233: obtain, through each unsupervised learning sub-model, an initial detection result as to whether the sequence point is an abnormal point, and determine from these initial detection results the second determination result of the unsupervised learning model as to whether the sequence point is an abnormal point.
Optionally, in this embodiment the initial detection results obtained by the two or more unsupervised learning sub-models as to whether a sequence point is an abnormal point fall into the following three cases:
1) All initial detection results indicate that the sequence point is a normal point; the second determination result is then that the sequence point is a normal point.
2) The number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to the preset unsupervised threshold; the second determination result is then that the sequence point is an abnormal point.
3) At least one initial detection result indicates that the sequence point is an abnormal point, but the number of such results is less than the preset unsupervised threshold; a second uncertain result as to whether the sequence point is an abnormal point is then obtained.
S234: judge whether the first determination result is consistent with the second determination result; if so, execute S235; if not, execute S236.
S235: take the sequence point determined to be a normal point as a normal sample, and take the sequence point determined to be an abnormal point as an abnormal sample.
S236: return to S232 and obtain the first determination result and the second determination result of the next sequence point, until all sequence points in the time series have been detected.
S237: obtain the detection result of each sequence point in the time series through the classification model, and mark the abnormal points in the time series according to the detection results.
In the technical solution provided by this embodiment, different combinations of the statistical model and the unsupervised learning model can be selected independently to perform initial detection of whether each sequence point in the time series is an abnormal point. This avoids the missed detections and false detections that occur when only a single statistical model or unsupervised learning model is used for abnormality detection of sequence points, and improves the detection accuracy for the sequence points. Sequence points determined to be normal points are taken as normal samples and sequence points determined to be abnormal points are taken as abnormal samples; the classification model is then trained on the down-sampled normal samples and the abnormal samples, which improves its classification accuracy. Each sequence point in the time series is subsequently detected again by the classification model, so as to obtain an accurate detection result as to whether each sequence point in the time series is an abnormal point, which improves the accuracy and reliability of the abnormality marking results for the time series.
Embodiment three
Fig. 3A is a flow chart of a labeling method of a time series provided by Embodiment three of the present invention, and Fig. 3B is a schematic diagram of the principle of the detection process for a time series provided by Embodiment three of the present invention. This embodiment is an optimization on the basis of the above embodiments. Specifically, this embodiment explains in detail the training process of the classification model and the process of detecting each sequence point in the time series with the trained classification model.
Optionally, as shown in Fig. 3A, the method may specifically include the following steps:
S310: acquire a sequence point in the time series.
S320: obtain, through the statistical model constructed in advance, a first determination result as to whether the sequence point is an abnormal point, and obtain, through the unsupervised learning model constructed in advance, a second determination result as to whether the sequence point is an abnormal point.
S330: if the first determination result is consistent with the second determination result, take the sequence point determined to be a normal point as a normal sample, and take the sequence point determined to be an abnormal point as an abnormal sample.
S340: input each sequence point in the time series into the classification model to obtain the anomaly probability of the sequence point, the classification model being trained on the abnormal samples and the down-sampled normal samples.
Specifically, once the classification model has been trained on the abnormal samples and the down-sampled normal samples, each sequence point in the time series is input into the classification model. The trained classification model judges whether each sequence point in the time series is an abnormal point and outputs the anomaly probability of each sequence point, which indicates how likely the sequence point is to be an abnormal point.
S350: sort the sequence points by anomaly probability, determine a target sequence point among the sorted sequence points using the Top algorithm, and take the anomaly probability of the target sequence point as the classification threshold of the classification model.
Optionally, after the anomaly probability of each sequence point is obtained through the classification model, the sequence points are sorted in descending order of anomaly probability, the target sequence point ranked N-th is selected from the sorted sequence points using the preset Top algorithm, and the anomaly probability of that target sequence point is taken as the classification threshold of the classification model. The rank N used by the Top algorithm can be set according to specific business detection requirements, so that the classification threshold in this embodiment can be adjusted flexibly according to different business detection standards, which improves the detection accuracy for the time series.
S360: determine, according to the anomaly probability of each sequence point and the classification threshold, the detection result as to whether each sequence point in the time series is an abnormal point.
Specifically, after the classification threshold of the classification model is obtained, the anomaly probability of each sequence point in the time series is compared with the classification threshold. If the anomaly probability of a sequence point is greater than or equal to the classification threshold, the sequence point is determined to be an abnormal point; if the anomaly probability of the sequence point is less than the classification threshold, the sequence point is determined to be a normal point. In this way each sequence point in the time series is judged, the detection result as to whether each sequence point is an abnormal point is obtained, and the corresponding abnormal points are marked in the time series. The abnormality detection status of the time series can thus be displayed intuitively, so that alarms and abnormality displays can be raised and administrators can be prompted to correct the abnormal situation manually.
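Steps S350 and S360 can be sketched as follows, assuming the classification model has already produced one anomaly probability per sequence point. The function names `top_n_threshold` and `label_points` and the example probabilities are hypothetical; the rank `n` plays the role of the rank N chosen for the Top algorithm.

```python
def top_n_threshold(probabilities, n):
    """Return the anomaly probability of the point ranked n-th (1-based)
    after sorting in descending order; this is the classification threshold."""
    if not 1 <= n <= len(probabilities):
        raise ValueError("n must be between 1 and the number of points")
    ranked = sorted(probabilities, reverse=True)
    return ranked[n - 1]

def label_points(probabilities, n):
    """Mark a point as abnormal when its probability reaches the threshold."""
    threshold = top_n_threshold(probabilities, n)
    return [p >= threshold for p in probabilities]

# Hypothetical anomaly probabilities for six sequence points.
probs = [0.02, 0.91, 0.10, 0.75, 0.05, 0.40]
threshold = top_n_threshold(probs, n=2)
flags = label_points(probs, n=2)
```

Because the comparison is "greater than or equal to", ties at the threshold value can cause more than N points to be marked as abnormal.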
In the technical solution provided by this embodiment, the abnormal probability of each sequence point is obtained through a pre-trained classification model, and the classification threshold of the classification model is determined from the abnormal probabilities of the sequence points using the Top algorithm. The classification proportion in the Top algorithm can be selected according to the relevant business detection standard, so that the classification threshold can be adjusted to different business detection standards. Real-time abnormality detection is performed on the time series through the classification model, which improves the accuracy and reliability of abnormality detection for the time series.
Embodiment four
Fig. 4 is a schematic structural diagram of a time series labeling device provided in Embodiment four of the present invention. Specifically, as shown in Fig. 4, the device may include:
A sequence point acquisition module 410, configured to acquire sequence points in a time series;
A determination result acquisition module 420, configured to obtain, through a pre-built statistical model, a first determination result of whether a sequence point is an abnormal point, and to obtain, through a pre-built unsupervised learning model, a second determination result of whether the sequence point is an abnormal point;
A sample determination module 430, configured to, if the first determination result is consistent with the second determination result, take the sequence points determined to be normal points as normal samples and take the sequence points determined to be abnormal points as abnormal samples;
An abnormal point marking module 440, configured to obtain a detection result for each sequence point in the time series through a classification model and to mark the abnormal points in the time series according to the detection results, where the classification model is trained on the abnormal samples and the down-sampled normal samples.
In the technical solution provided by this embodiment, a pre-built statistical model and a pre-built unsupervised learning model each perform initial detection of whether the sequence points in the time series are abnormal points. This avoids the missed detections and false detections that arise when only a single statistical model or a single unsupervised learning model is used to detect the sequence points in a time series, and improves the accuracy of abnormality detection for the sequence points. Sequence points determined to be normal points by both the statistical model and the unsupervised learning model are taken as normal samples, and sequence points determined to be abnormal points by both models are taken as abnormal samples. The classification model is then trained on the abnormal samples and the down-sampled normal samples, which improves its classification accuracy. Each sequence point in the time series is subsequently detected again by the classification model, so that the abnormal points in the time series can be accurately marked according to the detection results. This solves the problems in the prior art that manual detection consumes a large amount of labor and that linear regression models have certain limitations and low real-time performance, and improves the accuracy and reliability of the abnormality labeling results for the time series.
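The sample construction described above can be sketched as follows. Several details are assumptions made for illustration only: points on which the two results disagree are simply left out of the training set, the normal samples are down-sampled by uniform random sampling, and the target size equals the number of abnormal samples. The function names and example data are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def select_samples(
    points: Sequence[float],
    first_results: Sequence[bool],   # statistical model: True = abnormal
    second_results: Sequence[bool],  # unsupervised model: True = abnormal
) -> Tuple[List[float], List[float]]:
    """Keep only points on which both determination results agree."""
    normal, abnormal = [], []
    for point, first, second in zip(points, first_results, second_results):
        if first != second:
            continue  # assumption: disagreeing points are not used as samples
        (abnormal if first else normal).append(point)
    return normal, abnormal


def downsample_normal(normal: Sequence[float], target_size: int, seed: int = 0) -> List[float]:
    """Randomly keep at most target_size normal samples (assumed strategy)."""
    if target_size >= len(normal):
        return list(normal)
    return random.Random(seed).sample(list(normal), target_size)


points = [10, 11, 50, 12, 13, 60, 14]
first = [False, False, True, False, True, True, False]
second = [False, False, True, False, False, True, False]
normal, abnormal = select_samples(points, first, second)
balanced_normal = downsample_normal(normal, target_size=len(abnormal))
```

Here the point 13 is dropped because the two results disagree, leaving four normal samples and two abnormal samples before down-sampling.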
Further, the above determination result acquisition module 420 may include:
A statistical result acquisition unit, configured to, when the number of statistical models is one, obtain through the statistical model an initial detection result of whether a sequence point is an abnormal point and use it as the first determination result.
Further, the above statistical result acquisition unit may be specifically configured to:
When the statistical model includes two or more statistical submodels, obtain through each statistical submodel an initial detection result of whether the sequence point is an abnormal point;
If the initial detection results indicate that the sequence point is a normal point, obtain the first determination result that the sequence point is a normal point;
If the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset statistical threshold, obtain the first determination result that the sequence point is an abnormal point, where the preset statistical threshold is determined by the number of statistical submodels.
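Combining the submodel results can be sketched as a simple vote count. The text only says that the preset threshold is determined by the number of submodels, so the strict-majority rule below is an assumption, as is reading the normal case as "no submodel reports abnormal" and returning `None` when neither stated condition holds. The function names are hypothetical.

```python
from typing import Optional, Sequence


def majority_threshold(num_submodels: int) -> int:
    """Assumed rule: more than half of the submodels must report abnormal."""
    return num_submodels // 2 + 1


def combine_submodel_results(votes: Sequence[bool], threshold: int) -> Optional[bool]:
    """votes[i] is True when submodel i reports an abnormal point.

    Returns False (normal), True (abnormal), or None when the source text
    does not specify an outcome.
    """
    if not votes:
        raise ValueError("at least one submodel result is required")
    abnormal_votes = sum(votes)
    if abnormal_votes == 0:
        return False
    if abnormal_votes >= threshold:
        return True
    return None  # some abnormal votes, but fewer than the preset threshold


threshold = majority_threshold(3)  # 2
result = combine_submodel_results([True, True, False], threshold)  # True
```

The same counting scheme would apply to the unsupervised learning submodels with their own preset threshold.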
Further, the above determination result acquisition module 420 may include:
An unsupervised result acquisition unit, configured to, when the number of unsupervised learning models is one, obtain through the unsupervised learning model an initial detection result of whether a sequence point is an abnormal point and use it as the second determination result.
Further, the above unsupervised result acquisition unit may be specifically configured to:
When the unsupervised learning model includes two or more unsupervised learning submodels, obtain through each unsupervised learning submodel an initial detection result of whether the sequence point is an abnormal point;
If the initial detection results indicate that the sequence point is a normal point, obtain the second determination result that the sequence point is a normal point;
If the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset unsupervised threshold, obtain the second determination result that the sequence point is an abnormal point, where the preset unsupervised threshold is determined by the number of unsupervised learning submodels.
Further, the above abnormal point marking module 440 may include:
An abnormal probability acquisition unit, configured to input each sequence point in the time series into the classification model to obtain the abnormal probability of the sequence point;
A classification threshold determination unit, configured to sort the sequence points according to abnormal probability, determine a target sequence point among the sorted sequence points using the Top algorithm, and use the abnormal probability of the target sequence point as the classification threshold of the classification model;
A detection result determination unit, configured to determine, according to the abnormal probability of each sequence point and the classification threshold, a detection result of whether each sequence point in the time series is an abnormal point.
The time series labeling device provided in this embodiment is applicable to the time series labeling method provided in any of the above embodiments, and has the corresponding functions and beneficial effects.
Embodiment five
Fig. 5 is a schematic structural diagram of a device provided in Embodiment five of the present invention. As shown in Fig. 5, the device includes a processor 50, a storage device 51, and a communication device 52. The number of processors 50 in the device may be one or more; one processor 50 is taken as an example in Fig. 5. The processor 50, the storage device 51, and the communication device 52 in the device may be connected by a bus or in other ways; connection by a bus is taken as an example in Fig. 5.
As a computer-readable storage medium, the storage device 51 can be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the time series labeling method described in any embodiment of the present invention. The processor 50 executes the various functional applications and data processing of the device by running the software programs, instructions, and modules stored in the storage device 51, thereby implementing the above time series labeling method.
The storage device 51 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and the application programs required by at least one function, and the data storage area may store data created according to the use of the terminal, and so on. In addition, the storage device 51 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the storage device 51 may further include memory located remotely from the processor 50, and such remote memory may be connected to the device through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The communication device 52 can be used to implement network connections or mobile data connections between devices.
The device provided in this embodiment can be used to execute the time series labeling method provided in any of the above embodiments, and has the corresponding functions and beneficial effects.
Embodiment six
Embodiment six of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the program can implement the time series labeling method of any of the above embodiments. The method may specifically include:
Acquiring sequence points in a time series;
Obtaining, through a pre-built statistical model, a first determination result of whether a sequence point is an abnormal point, and obtaining, through a pre-built unsupervised learning model, a second determination result of whether the sequence point is an abnormal point;
If the first determination result is consistent with the second determination result, taking the sequence points determined to be normal points as normal samples and taking the sequence points determined to be abnormal points as abnormal samples;
Obtaining a detection result for each sequence point in the time series through a classification model, and marking the abnormal points in the time series according to the detection results, where the classification model is trained on the abnormal samples and the down-sampled normal samples.
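The four steps listed above can be wired together as in the following sketch. It is only an outline under stated assumptions: the detectors, the trainer, and the scorer are injected as callables because the source does not fix particular models, truncating the normal samples is a crude stand-in for down-sampling, and the toy models in the usage part are invented purely to make the example runnable.

```python
from typing import Callable, List, Sequence

Detector = Callable[[Sequence[float]], List[bool]]       # True = abnormal, one flag per point
Scorer = Callable[[float], float]                        # point -> abnormal probability
Trainer = Callable[[List[float], List[float]], Scorer]   # (normal, abnormal) -> scorer


def label_time_series(points: Sequence[float], statistical: Detector,
                      unsupervised: Detector, train: Trainer, top_n: int) -> List[bool]:
    first = statistical(points)    # first determination results
    second = unsupervised(points)  # second determination results
    # Samples are taken only where both results agree.
    normal = [p for p, a, b in zip(points, first, second) if not a and not b]
    abnormal = [p for p, a, b in zip(points, first, second) if a and b]
    normal = normal[: max(len(abnormal), 1)]  # assumption: truncate instead of sampling
    scorer = train(normal, abnormal)
    probs = [scorer(p) for p in points]
    threshold = sorted(probs, reverse=True)[top_n - 1]  # Top-N classification threshold
    return [p >= threshold for p in probs]


# Toy stand-ins (hypothetical, for illustration only).
def toy_train(normal: List[float], abnormal: List[float]) -> Scorer:
    lo = sum(normal) / len(normal)
    hi = sum(abnormal) / len(abnormal)
    return lambda x: min(1.0, max(0.0, (x - lo) / (hi - lo)))


series = [1.0, 1.2, 0.9, 9.0, 1.1, 8.5]
labels = label_time_series(
    series,
    statistical=lambda pts: [p > 5 for p in pts],
    unsupervised=lambda pts: [p > 8.7 for p in pts],
    train=toy_train,
    top_n=2,
)
```

In this toy run the value 8.5 is excluded from training because the two detectors disagree on it, yet the trained scorer still ranks it second, so it is marked as abnormal alongside 9.0.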
Of course, in the storage medium containing computer-executable instructions provided by the embodiment of the present invention, the computer-executable instructions are not limited to the method operations described above, and can also perform related operations in the time series labeling method provided by any embodiment of the present invention.
From the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software plus the necessary general-purpose hardware, and of course can also be implemented by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as a computer floppy disk, read-only memory (ROM), random access memory (RAM), flash memory (FLASH), hard disk, or optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present invention.
It is worth noting that, in the above embodiment of the time series labeling device, the units and modules included are divided only according to functional logic, but the division is not limited to the above as long as the corresponding functions can be realized. In addition, the specific names of the functional units are only for ease of distinguishing them from one another and are not intended to limit the protection scope of the present invention.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (14)

1. A time series labeling method, characterized by comprising:
Acquiring sequence points in a time series;
Obtaining, through a pre-built statistical model, a first determination result of whether a sequence point is an abnormal point, and obtaining, through a pre-built unsupervised learning model, a second determination result of whether the sequence point is an abnormal point;
If the first determination result is consistent with the second determination result, taking the sequence points determined to be normal points as normal samples and taking the sequence points determined to be abnormal points as abnormal samples;
Obtaining a detection result for each sequence point in the time series through a classification model, and marking the abnormal points in the time series according to the detection results, the classification model being trained on the abnormal samples and the down-sampled normal samples.
2. The method according to claim 1, characterized in that the obtaining, through a pre-built statistical model, a first determination result of whether a sequence point is an abnormal point comprises:
When the number of statistical models is one, obtaining through the statistical model an initial detection result of whether the sequence point is an abnormal point and using it as the first determination result.
3. The method according to claim 1, characterized in that the obtaining, through a pre-built statistical model, a first determination result of whether a sequence point is an abnormal point comprises:
When the statistical model includes two or more statistical submodels, obtaining through each statistical submodel an initial detection result of whether the sequence point is an abnormal point;
If the initial detection results indicate that the sequence point is a normal point, obtaining the first determination result that the sequence point is a normal point;
If the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset statistical threshold, obtaining the first determination result that the sequence point is an abnormal point, the preset statistical threshold being determined by the number of statistical submodels.
4. The method according to claim 1, characterized in that the obtaining, through a pre-built unsupervised learning model, a second determination result of whether the sequence point is an abnormal point comprises:
When the number of unsupervised learning models is one, obtaining through the unsupervised learning model an initial detection result of whether the sequence point is an abnormal point and using it as the second determination result.
5. The method according to claim 1, characterized in that the obtaining, through a pre-built unsupervised learning model, a second determination result of whether the sequence point is an abnormal point comprises:
When the unsupervised learning model includes two or more unsupervised learning submodels, obtaining through each unsupervised learning submodel an initial detection result of whether the sequence point is an abnormal point;
If the initial detection results indicate that the sequence point is a normal point, obtaining the second determination result that the sequence point is a normal point;
If the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset unsupervised threshold, obtaining the second determination result that the sequence point is an abnormal point, the preset unsupervised threshold being determined by the number of unsupervised learning submodels.
6. The method according to claim 1, characterized in that the obtaining a detection result for each sequence point in the time series through a classification model comprises:
Inputting each sequence point in the time series into the classification model to obtain the abnormal probability of the sequence point;
Sorting the sequence points according to the abnormal probability, determining a target sequence point among the sorted sequence points using a Top algorithm, and using the abnormal probability of the target sequence point as the classification threshold of the classification model;
Determining, according to the abnormal probability of each sequence point and the classification threshold, a detection result of whether each sequence point in the time series is an abnormal point.
7. A time series labeling device, characterized by comprising:
A sequence point acquisition module, configured to acquire sequence points in a time series;
A determination result acquisition module, configured to obtain, through a pre-built statistical model, a first determination result of whether a sequence point is an abnormal point, and to obtain, through a pre-built unsupervised learning model, a second determination result of whether the sequence point is an abnormal point;
A sample determination module, configured to, if the first determination result is consistent with the second determination result, take the sequence points determined to be normal points as normal samples and take the sequence points determined to be abnormal points as abnormal samples;
An abnormal point marking module, configured to obtain a detection result for each sequence point in the time series through a classification model and to mark the abnormal points in the time series according to the detection results, the classification model being trained on the abnormal samples and the down-sampled normal samples.
8. The device according to claim 7, characterized in that the determination result acquisition module comprises:
A statistical result acquisition unit, configured to, when the number of statistical models is one, obtain through the statistical model an initial detection result of whether the sequence point is an abnormal point and use it as the first determination result.
9. The device according to claim 7, characterized in that the statistical result acquisition unit is specifically configured to:
When the statistical model includes two or more statistical submodels, obtain through each statistical submodel an initial detection result of whether the sequence point is an abnormal point;
If the initial detection results indicate that the sequence point is a normal point, obtain the first determination result that the sequence point is a normal point;
If the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset statistical threshold, obtain the first determination result that the sequence point is an abnormal point, the preset statistical threshold being determined by the number of statistical submodels.
10. The device according to claim 7, characterized in that the determination result acquisition module comprises:
An unsupervised result acquisition unit, configured to, when the number of unsupervised learning models is one, obtain through the unsupervised learning model an initial detection result of whether the sequence point is an abnormal point and use it as the second determination result.
11. The device according to claim 7, characterized in that the unsupervised result acquisition unit is specifically configured to:
When the unsupervised learning model includes two or more unsupervised learning submodels, obtain through each unsupervised learning submodel an initial detection result of whether the sequence point is an abnormal point;
If the initial detection results indicate that the sequence point is a normal point, obtain the second determination result that the sequence point is a normal point;
If the number of initial detection results indicating that the sequence point is an abnormal point is greater than or equal to a preset unsupervised threshold, obtain the second determination result that the sequence point is an abnormal point, the preset unsupervised threshold being determined by the number of unsupervised learning submodels.
12. The device according to claim 7, characterized in that the abnormal point marking module comprises:
An abnormal probability acquisition unit, configured to input each sequence point in the time series into the classification model to obtain the abnormal probability of the sequence point;
A classification threshold determination unit, configured to sort the sequence points according to the abnormal probability, determine a target sequence point among the sorted sequence points using a Top algorithm, and use the abnormal probability of the target sequence point as the classification threshold of the classification model;
A detection result determination unit, configured to determine, according to the abnormal probability of each sequence point and the classification threshold, a detection result of whether each sequence point in the time series is an abnormal point.
13. A device, characterized in that the device comprises:
One or more processors;
A storage device, configured to store one or more programs;
When the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the time series labeling method according to any one of claims 1 to 6.
14. A computer-readable storage medium on which a computer program is stored, characterized in that, when executed by a processor, the program implements the time series labeling method according to any one of claims 1 to 6.
CN201811648187.0A 2018-12-30 2018-12-30 Time sequence marking method, device, equipment and storage medium Active CN109739904B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811648187.0A CN109739904B (en) 2018-12-30 2018-12-30 Time sequence marking method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811648187.0A CN109739904B (en) 2018-12-30 2018-12-30 Time sequence marking method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109739904A true CN109739904A (en) 2019-05-10
CN109739904B CN109739904B (en) 2021-08-10

Family

ID=66362886

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811648187.0A Active CN109739904B (en) 2018-12-30 2018-12-30 Time sequence marking method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109739904B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111178456A (en) * 2020-01-15 2020-05-19 腾讯科技(深圳)有限公司 Abnormal index detection method and device, computer equipment and storage medium
CN111277459A (en) * 2020-01-16 2020-06-12 新华三信息安全技术有限公司 Equipment anomaly detection method and device and machine-readable storage medium
CN111325260A (en) * 2020-02-14 2020-06-23 北京百度网讯科技有限公司 Data processing method and device, electronic equipment and computer readable medium
CN111353890A (en) * 2020-03-30 2020-06-30 中国工商银行股份有限公司 Application log-based application anomaly detection method and device
CN111614578A (en) * 2020-05-09 2020-09-01 北京邮电大学 Network resource allocation method and device based on exponential weighting and inflection point detection
CN112070155A (en) * 2020-09-07 2020-12-11 常州微亿智造科技有限公司 Time series data labeling method and device
CN112328425A (en) * 2020-12-04 2021-02-05 杭州谐云科技有限公司 Anomaly detection method and system based on machine learning
CN113409560A (en) * 2021-07-30 2021-09-17 佛山市墨纳森智能科技有限公司 Monitoring method and device of electronic equipment, storage medium and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104794192A (en) * 2015-04-17 2015-07-22 南京大学 Multi-level anomaly detection method based on exponential smoothing and integrated learning model
CN105678409A (en) * 2015-12-31 2016-06-15 哈尔滨工业大学 Adaptive and distribution-free time series abnormal point detection method
CN106411597A (en) * 2016-10-14 2017-02-15 广东工业大学 Network traffic abnormality detection method and system
CN106872657A (en) * 2017-01-05 2017-06-20 河海大学 A kind of multivariable water quality parameter time series data accident detection method

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111178456A (en) * 2020-01-15 2020-05-19 腾讯科技(深圳)有限公司 Abnormal index detection method and device, computer equipment and storage medium
CN111178456B (en) * 2020-01-15 2022-12-13 腾讯科技(深圳)有限公司 Abnormal index detection method and device, computer equipment and storage medium
CN111277459A (en) * 2020-01-16 2020-06-12 新华三信息安全技术有限公司 Equipment anomaly detection method and device and machine-readable storage medium
CN111325260A (en) * 2020-02-14 2020-06-23 北京百度网讯科技有限公司 Data processing method and device, electronic equipment and computer readable medium
CN111325260B (en) * 2020-02-14 2023-10-27 北京百度网讯科技有限公司 Data processing method and device, electronic equipment and computer readable medium
CN111353890A (en) * 2020-03-30 2020-06-30 中国工商银行股份有限公司 Application log-based application anomaly detection method and device
CN111614578A (en) * 2020-05-09 2020-09-01 北京邮电大学 Network resource allocation method and device based on exponential weighting and inflection point detection
CN112070155A (en) * 2020-09-07 2020-12-11 常州微亿智造科技有限公司 Time series data labeling method and device
CN112328425A (en) * 2020-12-04 2021-02-05 杭州谐云科技有限公司 Anomaly detection method and system based on machine learning
CN113409560A (en) * 2021-07-30 2021-09-17 佛山市墨纳森智能科技有限公司 Monitoring method and device of electronic equipment, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN109739904B (en) 2021-08-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant