CN110275809A - Data fluctuation identification method, device, and storage medium
- Publication number
- CN110275809A (CN201810214976.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- fluctuation
- value
- fluctuation parameters
- parameters
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3065—Monitoring arrangements determined by the means or processing involved in reporting the monitored data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
Abstract
Embodiments of the invention disclose a data fluctuation identification method, device, and storage medium. The method obtains the data value of current data; obtains a first fluctuation parameter between the data value and a historical data value; trains a nonlinear regression model according to a training data sequence; obtains a current first predicted data value according to the trained nonlinear regression model, and obtains a second fluctuation parameter between the data value and the first predicted data value; and determines, according to the first and second fluctuation parameters, whether the fluctuation of the current data is abnormal. Because the scheme obtains fluctuation parameters of the data in multiple dimensions and judges abnormality from these multidimensional parameters, it can improve the accuracy of data fluctuation identification.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a data fluctuation identification method, device, and storage medium.
Background
To guarantee service quality, a data monitoring scheme is needed to monitor each indicator of a service and to detect and report anomalies.
Current data monitoring schemes concentrate mainly on data acquisition. The main approach is cloud-platform-based monitoring: data is collected and aggregated on a cloud platform, which computes the fluctuation of the data to determine whether the data is abnormal.
However, current schemes are concerned only with turning monitoring data into a platform and with the real-time delivery of monitoring data. When monitoring the business data of externally facing services, they can only "perceive" that the data has changed. They cannot accurately identify the occasional, routine fluctuations of business data, for example whether the fluctuation of the current data falls within the normal range. Existing data monitoring schemes therefore identify data fluctuations with low accuracy.
Summary of the invention
Embodiments of the present invention provide a data fluctuation identification method, device, and storage medium that can improve the accuracy of data fluctuation identification.
An embodiment of the present invention provides a data fluctuation identification method, comprising:
obtaining the data value of current data;
obtaining a first fluctuation parameter between the data value and a historical data value;
training a nonlinear regression model according to a training data sequence;
obtaining a current first predicted data value according to the trained nonlinear regression model, and obtaining a second fluctuation parameter between the data value and the first predicted data value;
determining, according to the first fluctuation parameter and the second fluctuation parameter, whether the fluctuation of the current data is abnormal.
Correspondingly, an embodiment of the present invention also provides a data fluctuation identification device, comprising:
a data acquisition unit, for obtaining the data value of current data;
a first parameter acquisition unit, for obtaining a first fluctuation parameter between the data value and a historical data value;
a training unit, for training a nonlinear regression model according to a training data sequence;
a second parameter acquisition unit, for obtaining a current first predicted data value according to the trained nonlinear regression model, and obtaining a second fluctuation parameter between the data value and the first predicted data value;
a determination unit, for determining, according to the first fluctuation parameter and the second fluctuation parameter, whether the fluctuation of the current data is abnormal.
Correspondingly, an embodiment of the present invention also provides a storage medium storing instructions which, when executed by a processor, carry out the steps of any method provided by the embodiments of the present invention.
Embodiments of the present invention obtain the data value of current data; obtain a first fluctuation parameter between the data value and a historical data value; train a nonlinear regression model according to a training data sequence; obtain a current first predicted data value according to the trained nonlinear regression model, and obtain a second fluctuation parameter between the data value and the first predicted data value; and determine, according to the first and second fluctuation parameters, whether the fluctuation of the current data is abnormal. Because the scheme obtains fluctuation parameters of the data in multiple dimensions and judges abnormality from these multidimensional parameters, it can improve the accuracy of data fluctuation identification.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 a is the schematic diagram of a scenario of data monitoring system provided in an embodiment of the present invention;
Fig. 1 b is the flow diagram of data fluctuations recognition methods provided in an embodiment of the present invention;
Fig. 1 c is the solution schematic diagram of nonlinear regression model (NLRM) provided in an embodiment of the present invention;
Fig. 2 is a flow diagram of lag order determination provided in an embodiment of the present invention;
Fig. 3 a is another flow diagram of data fluctuations recognition methods provided in an embodiment of the present invention;
Fig. 3 b is the logical architecture schematic diagram of data fluctuations recognition methods provided in an embodiment of the present invention;
Fig. 4 a is the first structural schematic diagram of data fluctuations identification device provided in an embodiment of the present invention;
Fig. 4 b is second of structural schematic diagram of data fluctuations identification device provided in an embodiment of the present invention;
Fig. 5 is the structural schematic diagram of server provided in an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art from these embodiments without creative effort fall within the protection scope of the present invention.
Embodiments of the invention provide a data fluctuation identification method, device, and storage medium.
An embodiment of the invention provides a data monitoring system, which may include any data fluctuation identification device provided by the embodiments of the present invention. The data fluctuation identification device may be integrated in a server, such as a monitoring server.
In addition, the data monitoring system may include other equipment, such as a terminal, which may be a mobile phone, tablet computer, laptop, or similar device.
For example, with reference to Fig. 1a, a data monitoring system is provided that includes a terminal 10 and a server 20, connected through a network 30. The network 30 includes network entities such as routers and gateways, which are not shown in the figure. The terminal 10 can communicate with the server 20 over a wired or wireless network to request services on the server 20; for example, it can download from the server 20 an application, an application update package, or data or business information related to an application. The terminal 10 may be a mobile phone, tablet computer, laptop, or similar device; Fig. 1a takes a laptop as an example. Various applications needed by the user may also be installed on the terminal 10, for example applications with entertainment functions (such as image processing, audio playback, game, and OCR software applications) and applications with service functions.
The terminal 10 can report data to the server 20, and the server 20 obtains the data value of that data locally. The server 20 then obtains a first fluctuation parameter between the data value and a historical data value; trains a nonlinear regression model according to a training data sequence; obtains a current first predicted data value according to the trained nonlinear regression model, and obtains a second fluctuation parameter between the data value and the first predicted data value; and determines, according to the first and second fluctuation parameters, whether the fluctuation of the current data is abnormal.
In addition, when the server 20 determines that the fluctuation of the data value is abnormal, it can send out an alarm reminder.
Each of these is described in detail below.
This embodiment is described from the perspective of the data fluctuation identification device, which may specifically be integrated in a server or similar equipment.
As shown in Fig. 1b, a data fluctuation identification method is provided. The method can be executed by a processor in a server, and the detailed flow may be as follows:
101. Obtain the data value of the current data.
The current data is the data currently obtained from a data source, such as a monitored data source. For example, it may be the data obtained from the data source today.
The monitored data sources are varied: they may be databases such as mysql or hive, a single file, a distributed file system (hdfs), or even a piece of executable code (a shell script). The type or format of the obtained data therefore differs from source to source.
To improve the efficiency of data fluctuation identification, the data format or type may optionally be normalized. That is, the step "obtaining the data value of the current data" may include:
obtaining data from the data source to get the current data;
converting the current data into data in a uniform data format, and obtaining the data value of the converted data.
Specifically, each monitored data source can be abstracted into a corresponding data generator (generator). The generator obtains the data for the current time from its data source and converts it into the uniform data format.
The data sources can be abstracted in a jdbc-like manner. In practice, the generator abstraction can be implemented in a data abstraction layer. The core function of the data abstraction layer is to analyze the various data sources and dispatch to the corresponding generator for adaptation, so that after passing through the data abstraction layer every data source appears externally in the single data format produced by a generator.
The data abstraction layer thus mainly standardizes data types. Because the monitored data sources vary, these differently formatted sources need to be given a unified structure before fluctuation identification, that is, re-described as generators. A generator obtains data from its data source and provides the monitoring logic layer (the layer that identifies fluctuations) with a unified data reporting format.
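The generator abstraction described above can be sketched as follows. This is only an illustrative sketch: the `Record` shape, the class names `Generator`, `CsvTextGenerator`, and `CallableGenerator`, and the `collect` helper are hypothetical names chosen for the example and do not come from the patent text.

```python
import csv
import io
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Record:
    """Uniform data format handed to the monitoring logic layer (assumed shape)."""
    metric: str
    day: str
    value: float


class Generator(ABC):
    """Abstracts one monitored data source behind a single interface."""

    @abstractmethod
    def fetch(self) -> List[Record]:
        """Read the source and return its data in the uniform format."""


class CsvTextGenerator(Generator):
    """Hypothetical generator for a single file with 'metric,day,value' rows."""

    def __init__(self, text: str) -> None:
        self._text = text

    def fetch(self) -> List[Record]:
        rows = csv.reader(io.StringIO(self._text))
        return [Record(m, d, float(v)) for m, d, v in rows]


class CallableGenerator(Generator):
    """Hypothetical generator wrapping executable code that returns raw tuples."""

    def __init__(self, func: Callable[[], Iterable[Tuple[str, str, str]]]) -> None:
        self._func = func

    def fetch(self) -> List[Record]:
        return [Record(m, d, float(v)) for m, d, v in self._func()]


def collect(generators: Iterable[Generator]) -> List[Record]:
    """The caller sees every source the same way: as a list of Record objects."""
    out: List[Record] = []
    for gen in generators:
        out.extend(gen.fetch())
    return out
```

Under this sketch, adding a new kind of data source means adding one more `Generator` subclass; the fluctuation logic that consumes `Record` objects does not change.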
102. Obtain the first fluctuation parameter between the data value and the historical data value.
The historical data value is the value of data previously obtained from the data source, that is, obtained before the current time. For example, it may be the value of the data most recently obtained from the data source, such as the value obtained yesterday.
The first fluctuation parameter measures the amplitude of change of a data value, for example the change of the data value of the current data relative to the historical data value. The first fluctuation parameter may include a fluctuation rate, denoted confidence, obtained by dividing the difference between the data value of the current data and the historical data value by the historical data value:
confidence = (x - x_{-1}) / x_{-1}, where x is the data value of the current data and x_{-1} is the historical data value.
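As a worked illustration of the fluctuation rate, a minimal sketch follows. The function name `fluctuation_rate` and the zero-value guard are assumptions added for the example; the source only gives the formula.

```python
def fluctuation_rate(current: float, historical: float) -> float:
    """Return (current - historical) / historical, the 'confidence' value.

    The guard against a zero historical value is an added assumption;
    the source does not say how that case is handled.
    """
    if historical == 0:
        raise ValueError("historical data value must be non-zero")
    return (current - historical) / historical
```

For instance, a value of 120 today against 100 yesterday gives a fluctuation rate of 0.2, and a value of 80 gives -0.2.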
103. Train the nonlinear regression model according to the training data sequence.
Linear regression works as follows. Given a series of linear data such as a time series (also called a dynamic series: the values of one statistical indicator arranged in the chronological order in which they occurred), if the time series is linear it can be expressed as Y = WX + b (the linear regression model expression), where W and b are the parameters to be estimated. Linear regression computes the 2-norm between the known samples and the function Y = WX + b and minimizes it, so that the regression equation Y = WX + b comes closest to the existing time series samples.
Nonlinear regression is similar, except that the function to be estimated, Y = f(X), is nonlinear. The regression equation Y = f(X) is likewise brought closest to the existing time series samples by minimizing the 2-norm.
Many nonlinear regression models exist, for example the hyperbolic model, the power function model, and the nonlinear polynomial model.
For example, taking the nonlinear polynomial model as the nonlinear regression model, the model expression is:
Y_T = a_0 + a_1 T + a_2 T^2 + ... + a_p T^p
where p is the highest power. Any curve, surface, or hypersurface can be approximated arbitrarily closely by a polynomial within a certain range, and p represents the degree of approximation. In the embodiments of the present invention p = 4 is preferred, i.e., a fourth-degree polynomial. The a_i are the model parameters to be estimated, which can be solved by the least squares method.
The training data sequence is a time series containing multiple historical data values arranged in chronological order.
In the embodiments of the present invention, training the nonlinear regression model according to the training data sequence means solving the model's parameters to be estimated from the training data sequence. For the nonlinear polynomial model, for example, training the model means solving for the parameters a.
Specifically, the step "training the nonlinear regression model according to the training data sequence" may include:
determining the number of parameters to be estimated in the nonlinear regression model;
solving the parameters to be estimated of the nonlinear regression model from the training data sequence by the least squares method, to obtain the trained nonlinear regression model.
For example, taking the nonlinear polynomial model as the nonlinear regression model, the model expression is:
Y_T = a_0 + a_1 T + a_2 T^2 + ... + a_p T^p
where the a_i are the parameters to be estimated. Here the number of parameters is set directly to 4, i.e., p = 4, because in the embodiments of the present invention a reasonable curve can be fitted with 4 parameters. In theory, more parameters bring the fit closer to the actual values, but the closest fit is not necessarily the most reasonable one, because each individual actual value is not necessarily reliable; this is the phenomenon of over-fitting. The number 4 is therefore chosen in the embodiments to reach the best fitting state.
With reference to Fig. 1c, if the known actual values are y_t (the training data sequence), Python's least squares routine is called to find all a such that the squared error (Y_t - y_t)^2, summed over the training sequence, is minimized.
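The least squares solution for the polynomial model can be sketched as below. The source only says that Python's least squares method is called and does not name a routine, so this sketch solves the normal equations directly with the standard library; the function names `solve_linear`, `fit_polynomial`, and `predict`, and the choice of time indices t = 0, 1, 2, ..., are assumptions made for illustration. Note that a degree-4 polynomial has five coefficients, a_0 through a_4.

```python
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_polynomial(y: Sequence[float], degree: int = 4) -> List[float]:
    """Least-squares fit of y_t to a_0 + a_1 t + ... + a_p t^p, t = 0, 1, ...

    Builds and solves the normal equations (V^T V) a = V^T y, where V is
    the Vandermonde matrix of the time indices.
    """
    if len(y) < degree + 1:
        raise ValueError("need at least degree + 1 training values")
    ts = range(len(y))
    size = degree + 1
    vtv = [[float(sum(t ** (i + j) for t in ts)) for j in range(size)]
           for i in range(size)]
    vty = [float(sum((t ** i) * yt for t, yt in zip(ts, y)))
           for i in range(size)]
    return solve_linear(vtv, vty)


def predict(coeffs: Sequence[float], t: float) -> float:
    """Evaluate G(t) = sum_k a_k t^k using Horner's rule."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * t + a
    return result
```

With a training sequence of n values, `predict(coeffs, n)` gives the predicted value for the next time step. The normal equations become ill-conditioned for long sequences or high degrees, so a library routine based on a more stable factorization would likely be preferable in practice.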
104. Obtain the current first predicted data value according to the trained nonlinear regression model, and obtain the second fluctuation parameter between the data value and the first predicted data value.
The current first predicted data value is the predicted data value for the current time, for example the predicted data value for today.
In the embodiments of the present invention, once the nonlinear regression model has been trained, its estimated parameters are known, which gives the trained nonlinear regression model.
For example, for the nonlinear polynomial model Y_T = a_0 + a_1 T + a_2 T^2 + ... + a_p T^p, once all a have been solved, the trained nonlinear polynomial model (denoted G(X)) is obtained. The trained model then yields the current first predicted data value; that is, it predicts the current data value, for example the first predicted data value for today.
After the current first predicted data value has been obtained, the second fluctuation parameter between the data value of the current data and the first predicted data value can be obtained.
The second fluctuation parameter measures the amplitude of change of a data value, for example the change of the data value of the current data relative to the predicted data value. It may be denoted confidence', and is obtained by dividing the difference between the data value of the current data and the predicted data value by the data value of the current data:
confidence' = (x - G(X)) / x, where x is the data value of the current data and G(X) is the predicted data value of the nonlinear regression model.
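A short sketch of the second fluctuation parameter follows. The function name `prediction_fluctuation` and the zero guard are assumptions for illustration. Unlike the first fluctuation parameter, the denominator here is the current data value, as the formula above states.

```python
def prediction_fluctuation(current: float, predicted: float) -> float:
    """Return (current - predicted) / current, the confidence' value.

    `predicted` stands for G(X), the output of the trained regression
    model. The zero guard is an added assumption.
    """
    if current == 0:
        raise ValueError("current data value must be non-zero")
    return (current - predicted) / current
```

For example, a current value of 110 against a model prediction of 99 gives a second fluctuation parameter of 0.1.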
In the embodiments of the present invention, the first and second fluctuation parameters may be obtained in various orders; for example, they may be obtained simultaneously or one after the other.
105. Determine, according to the first and second fluctuation parameters, whether the fluctuation of the current data is abnormal.
Optionally, a final fluctuation parameter value for the current data can be obtained from the first and second fluctuation parameters, and whether the fluctuation of the current data is abnormal is then determined from that value. For example:
when the final fluctuation parameter value is within a preset range, the fluctuation of the current data is determined to be normal;
when the final fluctuation parameter value is not within the preset range, the fluctuation of the current data is determined to be abnormal.
The final fluctuation parameter value can be generated from the values of the first and second fluctuation parameters in many ways. For example, the two parameter values can be aggregated, and the aggregated value used as the final fluctuation parameter value of the current data.
For example, a weighted average of the values of the first and second fluctuation parameters can be taken as the final fluctuation parameter value.
Suppose the first fluctuation parameter is confidence and the second fluctuation parameter is confidence'. Then the final fluctuation parameter value is confidence_final = q1 * confidence + q2 * confidence', where q1 and q2 are weights that can be set according to actual needs, for example q1 = 0.3 and q2 = 0.7.
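The weighted combination and the range check can be sketched as follows. The weights 0.3 and 0.7 are the example values from the text; the preset range of -0.2 to 0.2 and the function names are hypothetical, since the source does not specify the range.

```python
from typing import Tuple


def final_fluctuation(confidence: float, confidence_pred: float,
                      q1: float = 0.3, q2: float = 0.7) -> float:
    """Weighted combination q1 * confidence + q2 * confidence'."""
    return q1 * confidence + q2 * confidence_pred


def is_abnormal(final_value: float,
                preset_range: Tuple[float, float] = (-0.2, 0.2)) -> bool:
    """Abnormal when the final value falls outside the preset range.

    The default range is a placeholder chosen for this example only.
    """
    low, high = preset_range
    return not (low <= final_value <= high)
```

With these example settings, fluctuation parameters of 0.5 and 0.1 combine to 0.22, which lies outside the assumed range and is flagged, while 0.1 and 0.05 combine to 0.065 and are treated as normal.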
In one embodiment, to further improve the accuracy of data fluctuation identification, an autoregressive moving average model (Auto-Regressive and Moving Average Model, ARMA) may also be introduced to predict the data. A third fluctuation parameter between the data value of the current data and the model's predicted value is obtained, and whether the fluctuation of the current data is abnormal is then determined from the fluctuation parameters in three dimensions.
ARMA prediction is restricted to stationary sequences and cannot be used for non-stationary ones; that is, the sequence fitted by the ARMA model, the training data sequence, must be stationary. Before predicting with the ARMA model, it must therefore be determined whether the model's training data sequence is a stationary sequence.
Optionally, the step "determining, according to the first and second fluctuation parameters, whether the fluctuation of the current data is abnormal" may include:
when the training data sequence is a stationary sequence, training the autoregressive moving average model according to the training data sequence;
obtaining a current second predicted data value according to the trained autoregressive moving average model, and obtaining a third fluctuation parameter between the data value and the second predicted data value;
determining, according to the first, second, and third fluctuation parameters, whether the fluctuation of the current data is abnormal;
when the training data sequence is not a stationary sequence, determining, according to the first and second fluctuation parameters, whether the fluctuation of the current data is abnormal.
A stationary sequence is a time series whose expected value shows no trend, whose variance does not change markedly, whose correlation with the current time point is weak, and whose periodic features are not obvious; the simplest arithmetic progression is given as an example.
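The source does not say how stationarity is tested. As a rough illustration only, the sketch below compares the mean and variance of the two halves of the sequence; this split-half heuristic, the function name `looks_stationary`, and both tolerance values are assumptions, not the patent's method. A formal unit-root test from a statistics library would be a more rigorous choice.

```python
from statistics import mean, pstdev, pvariance
from typing import Sequence


def looks_stationary(series: Sequence[float],
                     mean_tol: float = 0.5,
                     var_ratio_tol: float = 3.0) -> bool:
    """Heuristic check: do both halves share a similar mean and variance?"""
    if len(series) < 4:
        raise ValueError("need at least 4 values")
    half = len(series) // 2
    first, second = series[:half], series[half:]
    spread = pstdev(series)
    if spread == 0.0:
        return True  # a constant sequence has no trend and no variance change
    # Mean shift between halves, measured in units of the overall spread.
    if abs(mean(first) - mean(second)) / spread > mean_tol:
        return False
    v_small, v_large = sorted((pvariance(first), pvariance(second)))
    if v_small == 0.0:
        return v_large == 0.0
    return v_large / v_small <= var_ratio_tol
```

The result would decide which branch of the step above is taken: three fluctuation parameters when the check passes, two when it does not.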
The ARMA model is commonly used in econometrics to analyze the trend of future data. It simulates the variation of a set of data through an AR(p) (Auto-Regressive, autoregressive) model and an MA (Moving-Average, moving average) model. The expression of the ARMA model is:
Y_t = a_0 + a_1 Y_{t-1} + a_2 Y_{t-2} + ... + a_p Y_{t-p} + b_1 e_t + b_2 e_{t-2} + ... + b_q e_{t-q}
where e_t follows a distribution with expected value E(e_t) = 0 and variance D(e_t) = d^2, and e_t and e_{t+n} are mutually independent. The a and b are parameters to be estimated: once the model is given, the values of a and b must be found to obtain the expression F(Y_t) for Y_t. They can be solved by the least squares method.
Autoregressive model (AutoRegression, AR), abbreviated AR(P): a random process of the form Y_T = A_1 Y_{T-1} + A_2 Y_{T-2} + ... + A_P Y_{T-P} + U_T, where A_1, A_2, ..., A_P are the P parameters to be solved and P is the number of lag periods.
Moving average model: taking the autoregressive process above as an example, Y_T is a function of its lagged values up to Y_{T-P}, and a difference operation gives Y_T = U_T - A_1 U_{T-1} - ... - A_P U_{T-P}.
In the embodiments of the present invention, when the training data sequence is a stationary sequence (that is, when the expectation, variance, and autocorrelation function of Y_t above do not depend on t), the sequence can be fitted with the ARMA model. Specifically, the ARMA model is trained on the stationary sequence, i.e., its parameters to be estimated, such as a and b in the ARMA expression above, are solved.
When the training data sequence is not a stationary sequence, it cannot be fitted with the ARMA model, so the embodiments do not use the ARMA model to predict data values in that case. Instead, the training data sequence is fitted with the nonlinear regression model, the data value is predicted from that model, and whether the fluctuation of the data is abnormal is determined from the fluctuation parameter between the current data and the predicted data value together with the fluctuation parameter between the current data and the historical data value.
In the embodiments, the nonlinear regression model on the one hand supplements ARMA: when the data sequence is not stationary, the data value is predicted by this model. On the other hand, when the sequence is stationary, it predicts alongside the ARMA model to give a more reasonable predicted value and to generate fluctuation parameters in multiple dimensions, improving the accuracy of data fluctuation identification.
Training the ARMA model on the training data sequence means solving the ARMA model's parameters to be estimated.
The most important part of solving these parameters is determining the lag order of the ARMA model, also called the lag period, i.e., the parameters p and q in the ARMA expression above.
The embodiments can determine the lag order by autocorrelation function analysis and partial autocorrelation function analysis, and then solve the parameters to be estimated by the least squares method.
Assume the ARMA model is R = X_t + Y_t, where Y_t = b_1 e_t + b_2 e_{t-2} + ... + b_q e_{t-q} and X_t = a_0 + a_1 Y_{t-1} + a_2 Y_{t-2} + ... + a_p Y_{t-p}.
With reference to Fig. 2, the lag order of the ARMA model is determined as follows:
201. Set the lag orders q = 1 and p = 1.
202. Judge whether q is greater than 5; if not, go to step 203; if so, go to step 207.
203. Obtain the autocorrelation coefficient at lag order q.
That is, compute the correlation coefficient between the sequences Y_t and Y_{t+q}:
Cov(Y_t, Y_{t+q}) = E[(Y_t - u_t)(Y_{t+q} - u_t)] / D(Y_t).
204. Judge whether the autocorrelation coefficient is zero; if so, go to step 206; if not, go to step 205.
205. Add 1 to the value of q, and return to step 202.
206. Take the currently set value as q, and go to step 208.
Once the lag order q is determined, Y_t = b_1 e_t + b_2 e_{t-2} + ... + b_q e_{t-q} is determined.
207. Set Y_t to 0, and go to step 208.
When q > 5, the degree of fit is too low; Y_t can then be set to zero, and the ARMA model becomes R = X_t + 0.
208. Judge whether the lag order p is greater than 5; if not, go to step 209; if so, go to step 213.
209. Obtain the partial autocorrelation coefficient at lag order p.
That is, compute the partial correlation coefficient between the sequences X_t and X_{t+p}:
E{[x(t) - E x(t)][x(t-k) - E x(t-k)]} / E{[x(t-k) - E x(t-k)]^2}.
210. Judge whether the partial autocorrelation coefficient is zero; if so, go to step 211; if not, go to step 212.
211. Take the currently set value as p, and end the process.
Once p is determined, X_t = a_0 + a_1 Y_{t-1} + a_2 Y_{t-2} + ... + a_p Y_{t-p} is determined.
212. Add 1 to the value of p, and return to step 208.
213. Set X_t to 0, and end the process.
When p > 5, the degree of fit is too low; X_t can then be set to zero, and the ARMA model becomes R = 0 + Y_t.
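The lag search in steps 201 to 213 can be sketched as follows. Several points are assumptions made for this example rather than details from the source: sample estimates are used for the autocorrelation coefficient, the partial autocorrelation is computed with the Durbin-Levinson recursion, and "is zero" is read as "absolute value at or below a tolerance `tol`", because sample coefficients are rarely exactly zero. The function names are hypothetical.

```python
from typing import Callable, List, Optional, Sequence

MAX_LAG = 5  # the procedure gives up once the lag order exceeds 5


def autocorr(series: Sequence[float], lag: int) -> float:
    """Sample autocorrelation coefficient at the given lag."""
    n = len(series)
    mu = sum(series) / n
    var = sum((v - mu) ** 2 for v in series)
    cov = sum((series[t] - mu) * (series[t + lag] - mu) for t in range(n - lag))
    return cov / var


def partial_autocorr(series: Sequence[float], lag: int) -> float:
    """Sample partial autocorrelation at `lag` (Durbin-Levinson recursion)."""
    rho = [autocorr(series, k) for k in range(lag + 1)]
    phi_prev: List[float] = []
    pacf = 0.0
    for k in range(1, lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        pacf = num / den
        phi_prev = [phi_prev[j] - pacf * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [pacf]
    return pacf


def first_negligible_lag(coef: Callable[[Sequence[float], int], float],
                         series: Sequence[float],
                         tol: float) -> Optional[int]:
    """Return the first lag in 1..5 whose coefficient is (near) zero.

    Returns None when no such lag exists, which corresponds to setting
    the matching term (Y_t for q, X_t for p) to zero.
    """
    for lag in range(1, MAX_LAG + 1):
        if abs(coef(series, lag)) <= tol:
            return lag
    return None
```

Under these assumptions, q would be obtained as `first_negligible_lag(autocorr, series, tol)` and p as `first_negligible_lag(partial_autocorr, series, tol)`.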
The lag orders p and q of the ARMA model can be determined in this way. Once they are determined, the ARMA model can be constructed:
Y_t = a_0 + a_1 Y_{t-1} + a_2 Y_{t-2} + ... + a_p Y_{t-p} + b_1 e_t + b_2 e_{t-2} + ... + b_q e_{t-q} (here R is assigned to Y_t).
After the ARMA model has been constructed, the parameters to be estimated can be solved by the least squares method. For example, given the known actual values y_t (the training data sequence), Python's least squares routine is called to find all a and b such that the squared error (Y_t - y_t)^2, summed over the training sequence, is minimized.
Once the parameters to be estimated, such as a and b, have been solved, the exact model expression of the ARMA model is obtained, such as the expression F(Y_t) for the data Y_t. The current data value is then predicted with the ARMA model, giving the current second predicted data value, for example the predicted data value for today. For instance, the value of Y_{t+1} can be obtained from F(Y_{t+1}).
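To make the least squares step concrete, the sketch below fits only the simplest special case, p = 1 and q = 0, that is Y_t = a_0 + a_1 Y_{t-1}. It is a deliberately reduced illustration and not the full ARMA fit described above; the function names `fit_ar1` and `predict_next` are hypothetical.

```python
from typing import Sequence, Tuple


def fit_ar1(series: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for Y_t = a0 + a1 * Y_{t-1}.

    Regresses each value on its predecessor using the closed-form
    simple-regression solution.
    """
    if len(series) < 3:
        raise ValueError("need at least 3 values")
    x = series[:-1]  # lagged values Y_{t-1}
    y = series[1:]   # targets Y_t
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("series is constant; a1 is not identifiable")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a1 = sxy / sxx
    a0 = my - a1 * mx
    return a0, a1


def predict_next(series: Sequence[float], a0: float, a1: float) -> float:
    """One-step-ahead prediction from the last observed value."""
    return a0 + a1 * series[-1]
```

Higher lag orders and the moving-average terms require a multivariate or iterative solver, for which an established time-series library would normally be used.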
After the second predicted data value has been obtained from the ARMA model, the third fluctuation parameter between the data value of the current data and the second predicted data value can be obtained. Finally, the first, second, and third fluctuation parameters are combined to determine whether the fluctuation of the current data is abnormal.
The third fluctuation parameter measures the amplitude of change of a data value, for example the change of the data value of the current data relative to the predicted data value. It may be denoted confidence'', and is obtained by dividing the difference between the data value of the current data and the predicted data value by the data value of the current data:
confidence'' = (x - ARMA(x)) / x, where x is the data value of the current data and ARMA(x) is the predicted data value of the ARMA model.
There are many ways to determine whether the fluctuation of the current data is abnormal based on the first, second and third fluctuation parameters. For example, a final fluctuation parameter value can be obtained from the parameter values of these three fluctuation parameters, and whether the fluctuation of the data is abnormal is then determined based on the final fluctuation parameter value.
For example, the step "determining, according to the first fluctuation parameter, the second fluctuation parameter and the third fluctuation parameter, whether the fluctuation of the current data is abnormal" may include:
obtaining the final fluctuation parameter value of the current data according to the parameter values of the first fluctuation parameter, the second fluctuation parameter and the third fluctuation parameter;
when the final fluctuation parameter value is within a preset range, determining that the fluctuation of the current data is normal;
when the final fluctuation parameter value is not within the preset range, determining that the fluctuation of the current data is abnormal.
There are many ways to generate the final fluctuation parameter value from the parameter values of the first, second and third fluctuation parameters. For example, the parameter values of the three fluctuation parameters can be aggregated, and the aggregated parameter value is used as the final fluctuation parameter value of the current data.
For example, to improve the accuracy and efficiency of data fluctuation recognition, the parameter values of the first, second and third fluctuation parameters can be weighted and averaged, and the weighted average is used as the final fluctuation parameter value.
Assume that the first fluctuation parameter is confidence, the second fluctuation parameter is confidence' and the third fluctuation parameter is confidence". The final fluctuation parameter value is then confidence_final = q1*confidence + q2*confidence' + q3*confidence", where q1, q2 and q3 are weights that can be set according to actual needs, for example q1 = 0.3, q2 = 0.3, q3 = 0.3, and so on.
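The weighting and the preset-range check can be sketched as follows. The weights are the example values from the text; the parameter values, the preset range [-0.2, 0.2] and the function names `final_fluctuation` and `is_fluctuation_abnormal` are hypothetical and chosen only for illustration.

```python
def final_fluctuation(values, weights):
    """Weighted combination q1*confidence + q2*confidence' + q3*confidence"."""
    if len(values) != len(weights):
        raise ValueError("one weight is required per fluctuation parameter")
    return sum(w * v for w, v in zip(weights, values))


def is_fluctuation_abnormal(final_value, lower, upper):
    """Abnormal when the final value lies outside the preset range."""
    return not (lower <= final_value <= upper)


# Hypothetical parameter values with the example weights q1 = q2 = q3 = 0.3.
final_value = final_fluctuation([0.10, 0.20, -0.10], [0.3, 0.3, 0.3])
print(final_value, is_fluctuation_abnormal(final_value, -0.2, 0.2))
```

Note that the formula in the text is applied as written: the example weights sum to 0.9, so the result is a weighted sum rather than a normalised average.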
As can be seen from the above, the embodiment of the present invention obtains the data value of current data; obtains a first fluctuation parameter between the data value and a historical data value; trains a nonlinear regression model according to a training data sequence; obtains a current first predicted data value according to the trained nonlinear regression model, and obtains a second fluctuation parameter between the data value and the first predicted data value; and determines, according to the first fluctuation parameter and the second fluctuation parameter, whether the fluctuation of the current data is abnormal. The scheme can obtain fluctuation parameters of the data in multiple dimensions (such as the first fluctuation parameter and the second fluctuation parameter) and determine whether the data fluctuation is abnormal based on these multi-dimensional fluctuation parameters; therefore, the recognition accuracy of data fluctuations can be improved.
In addition, the scheme can add ARMA model prediction to increase the number of dimensions of the fluctuation indicators: multi-dimensional prediction is performed on the business data to obtain fluctuation indicators in multiple dimensions (such as the first, second and third fluctuation parameters), and whether the data fluctuation is abnormal is determined based on the indicators of these dimensions, which further improves the recognition accuracy and flexibility of data fluctuations.
The method described in the above embodiment is described in further detail below.
With reference to Fig. 3a and Fig. 3b, the detailed procedure of a data fluctuation recognition method is as follows:
301. Receive the data currently reported by the data generator, and obtain the data value of the current data.
The monitored data sources are varied: for example, a data source may be a database such as mysql or hive, a single file (file) or a distributed file (hdfs), or even a piece of executable code (a shell script). The type or format of the acquired data therefore differs from source to source.
To improve the efficiency of data fluctuation recognition, the data format or type can optionally be standardized. The monitored data source can be abstracted into a corresponding data generator (generator); the data generator obtains the data for the current time from the corresponding data source, converts it into data of a uniform data format, and reports it.
The data source can be abstracted in a manner similar to jdbc. In practical applications, the generator abstraction can be implemented in a data abstraction layer. The core function of the data abstraction layer is to analyse the various data sources and call the corresponding generator for adaptation and generation; after passing through the data abstraction layer, all data sources appear externally in the single data format of the generator.
The generator obtains data from the data source and provides data in the uniform reporting format to the monitoring logic layer (the layer that recognizes fluctuations).
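One way to picture the generator abstraction is the sketch below. The class names `DataGenerator`, `FileGenerator` and `CallableGenerator`, the method names, and the fields of the reported record (`source`, `time`, `value`) are assumptions made for illustration; the text does not specify the uniform data format.

```python
from abc import ABC, abstractmethod
from datetime import datetime, timezone


class DataGenerator(ABC):
    """Adapts one data source to a uniform reporting format."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def fetch_raw(self):
        """Return the current raw value from the underlying source."""

    def report(self):
        """Convert the raw value into the (assumed) uniform record."""
        return {
            "source": self.name,
            "time": datetime.now(timezone.utc).isoformat(),
            "value": float(self.fetch_raw()),
        }


class FileGenerator(DataGenerator):
    """Reads a single number from a text file."""

    def __init__(self, name, path):
        super().__init__(name)
        self.path = path

    def fetch_raw(self):
        with open(self.path, encoding="utf-8") as handle:
            return handle.read().strip()


class CallableGenerator(DataGenerator):
    """Wraps any function that returns the current value."""

    def __init__(self, name, func):
        super().__init__(name)
        self.func = func

    def fetch_raw(self):
        return self.func()


orders = CallableGenerator("orders", lambda: "42.5")
print(orders.report())
```

The monitoring logic only calls `report()`, so a new kind of data source needs only a new subclass that implements `fetch_raw()`.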
302. Obtain the first fluctuation parameter between the data value and the historical data value, then go to step 307.
The historical data value is the value of data previously obtained from the data source, i.e. the value of data obtained from the data source before the current time. For example, it can be the value of the data most recently obtained from the data source.
For example, the historical data value can be the value of the data obtained from the data source yesterday.
The first fluctuation parameter measures the amplitude of change of a data value; for example, it can measure the change of the data value of the current data relative to the historical data value. For example, the first fluctuation parameter may include a fluctuation rate confidence, obtained by dividing the difference between the data value of the current data and the historical data value by the historical data value, as follows:
Fluctuation rate confidence = (x - x_{-1}) / x_{-1}, where x is the data value of the current data and x_{-1} is the historical data value.
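As a small worked example of this formula, the sketch below computes the fluctuation rate for hypothetical values. The function name `fluctuation_rate` and the handling of a zero historical value are assumptions; the text does not say what happens when the historical value is zero.

```python
def fluctuation_rate(current, previous):
    """Fluctuation rate confidence = (x - x_{-1}) / x_{-1}."""
    if previous == 0:
        # Assumption: the rate is treated as undefined for a zero baseline.
        raise ValueError("historical data value must be non-zero")
    return (current - previous) / previous


# Hypothetical values: 100 orders yesterday, 120 orders today.
print(fluctuation_rate(120.0, 100.0))
```

A positive result means the current value is above the historical value, and a negative result means it is below.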
303. Train the nonlinear regression model according to the training data sequence, and obtain the current first predicted data value according to the trained nonlinear regression model.
Linear regression works as follows: given a series of linear data, such as a time series (a time series, or dynamic series, is the sequence formed by arranging the values of the same statistical indicator in the chronological order in which they occur), if the time series has linear characteristics it can be expressed as Y = WX + b (the linear regression model expression), where W and b are the parameters to be estimated. Linear regression computes the 2-norm between the known samples and the function Y = WX + b and minimizes it, so that the regression equation Y = WX + b is as close as possible to the existing time series samples.
Nonlinear regression is similar to linear regression, except that the function Y = f(X) to be estimated is a nonlinear function; the regression equation Y = f(X) is likewise made as close as possible to the existing time series samples by minimizing the 2-norm.
There are many kinds of nonlinear regression models, for example hyperbolic models, power function models and nonlinear polynomial models.
Taking a nonlinear polynomial model as the nonlinear regression model, for example, the model expression is:
Y_T = a_0 + a_1 T^1 + a_2 T^2 + ... + a_p T^p
where p is the degree of the power series. Any curve, surface or hypersurface can be approximated arbitrarily closely by a polynomial within a certain range, and p represents the degree of the approximation. In the embodiment of the present invention, p = 4 is preferred, i.e. a fourth-degree polynomial. The coefficients a are the model parameters to be estimated, which can be solved using the least squares method.
The training data sequence is a time series comprising multiple historical data values, arranged in the chronological order of their corresponding times.
For the training process of the nonlinear regression model, refer to the description of the above embodiment.
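The polynomial model can be fitted with a few lines of pure Python, as in the sketch below. It assumes the time index T runs over 1, 2, ..., n for a training sequence of n values and solves the normal equations directly; the function names `fit_polynomial` and `predict_polynomial` are hypothetical, and the demonstration uses a second-degree polynomial to keep the numbers small.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_polynomial(values, degree=4):
    """Least squares estimate of [a_0, ..., a_p] for Y_T = sum a_k * T^k.

    Assumption: the k-th training value was observed at time T = k (1-based).
    """
    rows = [[float(t) ** k for k in range(degree + 1)]
            for t in range(1, len(values) + 1)]
    size = degree + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(size)]
           for i in range(size)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(size)]
    return solve_linear(xtx, xty)


def predict_polynomial(coeffs, t):
    """Evaluate the fitted polynomial at time t."""
    return sum(a * float(t) ** k for k, a in enumerate(coeffs))


# Training values generated by Y_T = 1 + 2T + 3T^2 for T = 1..6.
training = [1 + 2 * t + 3 * t * t for t in range(1, 7)]
quadratic = fit_polynomial(training, degree=2)
print(predict_polynomial(quadratic, 7))  # first predicted data value for T = 7
```

For a fourth-degree fit over a long sequence, the powers of T grow quickly, so rescaling the time index or using a library least squares routine is advisable in practice.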
304. Obtain the second fluctuation parameter between the data value and the first predicted data value, then go to step 307.
The second fluctuation parameter measures the amplitude of change of a data value; for example, it can measure the change of the data value of the current data relative to the predicted data value. For example, the second fluctuation parameter can be denoted confidence' and obtained by dividing the difference between the data value of the current data and the predicted data value by the data value of the current data, as follows:
confidence' = (x - G(X)) / x, where x is the data value of the current data and G(X) is the predicted data value of the nonlinear regression model.
305. When the training data sequence is a stationary sequence, train the autoregressive moving-average (ARMA) model according to the training data sequence, and obtain the current second predicted data value according to the trained ARMA model; when the training data sequence is not a stationary sequence, do not perform ARMA model training and prediction.
The ARMA model is a model commonly used in econometrics to analyse the trend of future data. It simulates the variation of a set of data by combining AR(p) (auto-regressive, autoregressive model) and MA (moving-average, moving average model). The expression of the ARMA model is:
Y_t = a_0 + a_1 Y_{t-1} + a_2 Y_{t-2} + ... + a_p Y_{t-p} + b_1 e_t + b_2 e_{t-2} + ... + b_q e_{t-q}
where e_t follows a distribution with expectation E(e_t) = 0 and variance D(e_t) = d^2, and e_t and e_{t+n} are mutually independent. a and b are the parameters to be estimated; once the model is given, the values of a and b must be found before the expression F(Y_t) of Y_t can be obtained. They can be solved using the least squares method.
Autoregressive model (AutoRegression, AR), written AR(P), refers to a stochastic process of the following form: Y_T = A_1 Y_{T-1} + A_2 Y_{T-2} + ... + A_P Y_{T-P} + U_T, where A_1, A_2, ..., A_P are the P parameters to be solved and P is the number of lag periods.
Moving average model: taking the above autoregressive process as an example again, Y_T is a function of Y_{T-1}, ..., Y_{T-P}; through a difference operation, Y_T = U_T - A_1 U_{T-1} - ... - A_P U_{T-P} can be calculated.
In the embodiment of the present invention, when the training data sequence is a stationary sequence (i.e. the above Y_t is a stationary sequence, meaning that the expectation, variance and autocorrelation function of Y_t are independent of t), the stationary sequence can be fitted with the ARMA model. Specifically, the ARMA model can be trained on the stationary sequence, i.e. the model parameters to be estimated, such as a and b in the above ARMA model expression, are solved.
When the training data sequence is not a stationary sequence, it cannot be fitted with the ARMA model. The embodiment of the present invention therefore does not use the ARMA model to predict the data value when the training data sequence is non-stationary. In that case, the training data sequence can be fitted with the nonlinear regression model, the data value is predicted based on the nonlinear regression model, and whether the fluctuation of the data is abnormal is then determined from the fluctuation parameter between the current data and the predicted data value and the fluctuation parameter between the current data and the historical data value.
It can be seen that, in the embodiment of the present invention, the nonlinear regression model serves on the one hand as a supplement to ARMA, predicting the data value when the data sequence is not stationary; on the other hand, when the sequence is stationary, it can predict together with the ARMA model to obtain a more reasonable predicted value and generate fluctuation parameters in multiple dimensions, improving the accuracy of data fluctuation recognition.
Training the ARMA model on the training data sequence amounts to solving the model parameters to be estimated of the ARMA model.
The most important part of solving these parameters is determining the lag order of the ARMA model, also called the lag period, i.e. the parameters p and q in the above ARMA model expression.
For details of determining the lag order and solving the model parameters, refer to the description of the above embodiment.
306. Obtain the third fluctuation parameter between the data value and the second predicted data value, then go to step 307.
The third fluctuation parameter measures the amplitude of change of a data value; for example, it can measure the change of the data value of the current data relative to the predicted data value. For example, the third fluctuation parameter can be denoted confidence" and obtained by dividing the difference between the data value of the current data and the predicted data value by the data value of the current data, as follows:
confidence" = (x - ARMA(x)) / x, where x is the data value of the current data and ARMA(x) is the predicted data value of the ARMA model.
307. Determine whether the fluctuation of the current data is abnormal according to the current fluctuation parameters.
In the embodiment of the present invention, if the training data sequence is a stationary sequence, the current fluctuation parameters may include the first fluctuation parameter, the second fluctuation parameter and the third fluctuation parameter;
if the training data sequence is not a stationary sequence, the current fluctuation parameters may include the first fluctuation parameter and the second fluctuation parameter.
After the current fluctuation parameters are obtained, the final fluctuation parameter value of the current data can be obtained from them. For example, the final fluctuation parameter value can be obtained from the parameter values of the first, second and third fluctuation parameters, or from the first fluctuation parameter and the second fluctuation parameter;
when the final fluctuation parameter value is within a preset range, the fluctuation of the current data is determined to be normal;
when the final fluctuation parameter value is not within the preset range, the fluctuation of the current data is determined to be abnormal.
There are many ways to generate the final fluctuation parameter value from the parameter values of the first, second and third fluctuation parameters. For example, the parameter values of the three fluctuation parameters can be aggregated, and the aggregated parameter value is used as the final fluctuation parameter value of the current data.
For example, with reference to Fig. 3b, to improve the accuracy and efficiency of data fluctuation recognition, the parameter values of the first, second and third fluctuation parameters can be weighted and averaged, and the weighted average is used as the final fluctuation parameter value.
Assume that the first fluctuation parameter is confidence, the second fluctuation parameter is confidence' and the third fluctuation parameter is confidence". The final fluctuation parameter value is then confidence_final = q1*confidence + q2*confidence' + q3*confidence", where q1, q2 and q3 are weights that can be set according to actual needs, for example q1 = 0.3, q2 = 0.3, q3 = 0.3, and so on.
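The way step 307 assembles the current fluctuation parameters, depending on whether an ARMA prediction is available, can be sketched as follows. The function name `current_fluctuation_parameters` and the convention of passing `None` when the training data sequence is not stationary are assumptions for illustration; the text does not state which weights apply when only two parameters are present, so the sketch stops at assembling the list.

```python
def current_fluctuation_parameters(x, previous, nonlinear_pred, arma_pred=None):
    """Return [confidence, confidence'] plus confidence" when available.

    x              : data value of the current data
    previous       : historical data value
    nonlinear_pred : first predicted data value (nonlinear regression)
    arma_pred      : second predicted data value, or None when the
                     training data sequence is not stationary (assumption)
    """
    params = [
        (x - previous) / previous,   # first fluctuation parameter
        (x - nonlinear_pred) / x,    # second fluctuation parameter
    ]
    if arma_pred is not None:
        params.append((x - arma_pred) / x)  # third fluctuation parameter
    return params


# Hypothetical values for a stationary and a non-stationary sequence.
print(current_fluctuation_parameters(100.0, 80.0, 90.0, 95.0))
print(current_fluctuation_parameters(100.0, 80.0, 90.0))
```

The resulting list can then be combined with weights of matching length and compared against the preset range.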
In practical applications, the recognition scheme provided by the embodiment of the present invention can be implemented in the monitoring logic layer.
As can be seen from the above, the scheme provided by the embodiment of the present invention can predict data by combining the ARMA prediction model and the nonlinear regression model to obtain fluctuation parameters in multiple dimensions, and then determine whether the data fluctuation is abnormal based on these fluctuation parameters, which can improve the recognition accuracy and authenticity of data fluctuations.
The scheme provided by the embodiment of the present invention can account for the weekend effect in business data. For example, in our online consultation business, the number of consultation orders falls considerably at weekends, and an existing recognition method would raise an alert for this fluctuation of the business data as an anomaly. With the scheme of the embodiment of the present invention, however, the predicted value is itself learned from historical data, so its fluctuation rate relative to the actual value lies within the normal range; combined with the fluctuation parameters in multiple dimensions, the fluctuation of the business data is determined to be normal.
In addition, the scheme abstracts the monitored data source into a data generator and does not restrict the specific data type that the user inputs; the user only needs to provide an interface capable of outputting data and configure it in the monitoring system. Such interfaces include, but are not limited to, mysql and hive data sources; other executable scripts, such as python, shell or perl scripts, can be invoked directly through the shell.
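A script-backed data source of this kind might be read as in the sketch below. It assumes the script prints a single number on standard output; the function name `read_value_from_command` and the timeout are hypothetical, and the demonstration runs the current Python interpreter in place of a real monitoring script.

```python
import subprocess
import sys


def read_value_from_command(command, timeout=30):
    """Run an executable script and parse its standard output as a number.

    Assumption: the script prints exactly one numeric value.
    """
    result = subprocess.run(
        command,
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode the output as text
        check=True,           # raise if the script exits with an error
        timeout=timeout,
    )
    return float(result.stdout.strip())


# Stand-in for a user script: the interpreter prints a fixed value.
value = read_value_from_command([sys.executable, "-c", "print(1234)"])
print(value)
```

Passing the command as a list avoids shell quoting problems; a shell, perl or other script would be supplied in the same way, for example as the interpreter followed by the script path.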
To better implement the above method, the embodiment of the present invention also provides a data fluctuation recognition device. As shown in Fig. 4a, the device may include a data acquisition unit 401, a first parameter acquisition unit 402, a training unit 403, a second parameter acquisition unit 404 and a determination unit 405, as follows:
The data acquisition unit 401 is configured to obtain the data value of current data;
the first parameter acquisition unit 402 is configured to obtain the first fluctuation parameter between the data value and the historical data value;
the training unit 403 is configured to train the nonlinear regression model according to the training data sequence;
the second parameter acquisition unit 404 is configured to obtain the current first predicted data value according to the trained nonlinear regression model, and to obtain the second fluctuation parameter between the data value and the first predicted data value;
the determination unit 405 is configured to determine, according to the first fluctuation parameter and the second fluctuation parameter, whether the fluctuation of the current data is abnormal.
In one embodiment, with reference to Fig. 4b, the determination unit 405 may include:
a training subunit 4051, configured to train the autoregressive moving-average model according to the training data sequence when the training data sequence is a stationary sequence;
a parameter acquisition subunit 4052, configured to obtain the current second predicted data value according to the trained autoregressive moving-average model, and to obtain the third fluctuation parameter between the data value and the second predicted data value;
a first abnormality determination subunit 4053, configured to determine, according to the first fluctuation parameter, the second fluctuation parameter and the third fluctuation parameter, whether the fluctuation of the current data is abnormal;
a second abnormality determination subunit 4054, configured to determine, when the training data sequence is not a stationary sequence, whether the fluctuation of the current data is abnormal according to the first fluctuation parameter and the second fluctuation parameter.
In one embodiment, the first abnormality determination subunit 4053 is configured to:
obtain the final fluctuation parameter value of the current data according to the parameter values of the first fluctuation parameter, the second fluctuation parameter and the third fluctuation parameter;
when the final fluctuation parameter value is within a preset range, determine that the fluctuation of the current data is normal;
when the final fluctuation parameter value is not within the preset range, determine that the fluctuation of the current data is abnormal.
In one embodiment, the first abnormality determination subunit 4053 can be specifically configured to:
perform weighted averaging on the parameter values of the first fluctuation parameter, the second fluctuation parameter and the third fluctuation parameter to obtain a weighted average parameter value;
use the weighted average parameter value as the final fluctuation parameter value of the current data.
In one embodiment, the data acquisition unit 401 can be configured to:
obtain data from a data source to obtain the current data;
convert the current data into data of a uniform data format, and obtain the data value of the converted data.
In one embodiment, the training unit 403 can be configured to:
determine the number of model parameters to be estimated in the nonlinear regression model;
solve the model parameters to be estimated of the nonlinear regression model based on the least squares method and the training data sequence, to obtain the trained nonlinear regression model.
In one embodiment, the training subunit 4051 can be configured to:
determine the lag order of the autoregressive moving-average model based on autocorrelation function analysis and partial autocorrelation function analysis;
solve the model parameters to be estimated of the autoregressive moving-average model based on the least squares method, the lag order and the training data sequence, to obtain the trained autoregressive moving-average model.
For the steps performed by each of the above units, refer to the description of the method embodiment above.
In specific implementations, each of the above units may be implemented as an independent entity, or the units may be combined arbitrarily and implemented as one or several entities. For the specific implementation of each unit, refer to the method embodiments above, which are not repeated here.
The data fluctuation recognition device may be integrated in a server, such as a monitoring server.
As can be seen from the above, in the data fluctuation recognition device of the embodiment of the present invention, the data acquisition unit 401 obtains the data value of current data; the first parameter acquisition unit 402 obtains the first fluctuation parameter between the data value and the historical data value; the training unit 403 trains the nonlinear regression model according to the training data sequence; the second parameter acquisition unit 404 obtains the current first predicted data value according to the trained nonlinear regression model, and obtains the second fluctuation parameter between the data value and the first predicted data value; and the determination unit 405 determines, according to the first fluctuation parameter and the second fluctuation parameter, whether the fluctuation of the current data is abnormal. The scheme can obtain fluctuation parameters of the data in multiple dimensions (such as the first fluctuation parameter and the second fluctuation parameter) and determine whether the data fluctuation is abnormal based on these multi-dimensional fluctuation parameters; therefore, the recognition accuracy of data fluctuations can be improved.
In addition, the scheme can add ARMA model prediction to increase the number of dimensions of the fluctuation indicators: multi-dimensional prediction is performed on the business data to obtain fluctuation indicators in multiple dimensions (such as the first, second and third fluctuation parameters), and whether the data fluctuation is abnormal is determined based on the indicators of these dimensions, which further improves the recognition accuracy and flexibility of data fluctuations.
With reference to Fig. 5, the embodiment of the present invention provides a server 500, which may include a processor 501 with one or more processing cores, a memory 502 with one or more computer-readable storage media, a radio frequency (Radio Frequency, RF) circuit 503, a power supply 504, an input unit 505 and other components. Those skilled in the art will understand that the server structure shown in Fig. 5 does not limit the server, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components. Specifically:
The processor 501 is the control centre of the server. It connects the various parts of the entire server using various interfaces and lines, and performs the various functions of the server and processes data by running or executing the software programs and/or modules stored in the memory 502 and calling the data stored in the memory 502, thereby monitoring the server as a whole. Optionally, the processor 501 may include one or more processing cores; preferably, the processor 501 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 501.
The memory 502 can be used to store software programs and modules; the processor 501 performs various functional applications and data processing by running the software programs and modules stored in the memory 502.
The RF circuit 503 can be used to receive and send signals in the course of sending and receiving information. In particular, after downlink information from a base station is received, it is handed to one or more processors 501 for processing; in addition, uplink data is sent to the base station.
The server further includes a power supply 504 (such as a battery) that supplies power to all components. Preferably, the power supply can be logically connected to the processor 501 through a power management system, so that functions such as charging management, discharging management and power consumption management are implemented through the power management system. The power supply 504 may also include any components such as one or more direct current or alternating current power sources, a recharging system, a power failure detection circuit, a power adapter or inverter, and a power status indicator.
The server may also include an input unit 505, which can be used to receive input numeric or character information.
Specifically, in this embodiment, the processor 501 in the server loads the executable files corresponding to the processes of one or more application programs into the memory 502 according to the following instructions, and runs the application programs stored in the memory 502, thereby implementing various functions, as follows:
obtain the data value of current data; obtain the first fluctuation parameter between the data value and the historical data value; train the nonlinear regression model according to the training data sequence; obtain the current first predicted data value according to the trained nonlinear regression model, and obtain the second fluctuation parameter between the data value and the first predicted data value; determine, according to the first fluctuation parameter and the second fluctuation parameter, whether the fluctuation of the current data is abnormal.
In some embodiments, when determining whether the fluctuation of the current data is abnormal according to the first fluctuation parameter and the second fluctuation parameter, the processor 501 specifically performs the following steps:
when the training data sequence is a stationary sequence, train the autoregressive moving-average model according to the training data sequence;
obtain the current second predicted data value according to the trained autoregressive moving-average model, and obtain the third fluctuation parameter between the data value and the second predicted data value;
determine, according to the first fluctuation parameter, the second fluctuation parameter and the third fluctuation parameter, whether the fluctuation of the current data is abnormal;
when the training data sequence is not a stationary sequence, determine whether the fluctuation of the current data is abnormal according to the first fluctuation parameter and the second fluctuation parameter.
In some embodiments, join when according to the fluctuation of first fluctuation parameters, second fluctuation parameters and third
Number, when determining whether the fluctuation of the current data is abnormal, the processor 501 specifically executes following steps:
According to the acquisition of the parameter value of first fluctuation parameters, second fluctuation parameters and third fluctuation parameters
The final fluctuation parameters value of current data;
When the final fluctuation parameters value within a preset range when, determine that the fluctuation of the current data is normal;
When the final fluctuation parameters value not within a preset range when, determine that the fluctuation of the current data is abnormal.
In some embodiments, when obtaining the data value of current data, the processor 501 specifically executes following step
It is rapid:
Data are obtained from data source, obtain current data;
The current data is converted into the data of Uniform data format, obtains the data value of translated data.
In some embodiments, when training the nonlinear regression model according to the training data sequence, the processor 501 specifically executes the following steps:
Determine the number of model parameters to be estimated in the nonlinear regression model;
Solve for the model parameters to be estimated of the nonlinear regression model based on the least squares method and the training data sequence, to obtain the trained nonlinear regression model.
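The two training steps can be illustrated with a small least-squares fit. The source does not specify the form of the nonlinear regression model in this passage, so the sketch assumes a polynomial in the time index, y_t = b0 + b1*t + b2*t^2 + ..., where the number of parameters to estimate is `n_params`. All function names are hypothetical.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_polynomial(series: Sequence[float], n_params: int = 3) -> List[float]:
    """Least-squares estimate of the coefficients via the normal equations."""
    ts = range(len(series))
    xtx = [[float(sum(t ** (i + j) for t in ts)) for j in range(n_params)]
           for i in range(n_params)]
    xty = [float(sum((t ** i) * y for t, y in zip(ts, series)))
           for i in range(n_params)]
    return solve_linear_system(xtx, xty)


def predict_next(series: Sequence[float], beta: Sequence[float]) -> float:
    """Predicted value at the time index that follows the training sequence."""
    t = len(series)
    return sum(b * t ** k for k, b in enumerate(beta))
```

Solving the normal equations directly is adequate for a short sequence and few parameters; a longer sequence or a higher-degree model would call for a numerically more stable solver.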
In some embodiments, when training the autoregressive moving-average model according to the training data sequence, the processor 501 specifically executes the following steps:
Determine the lag order of the autoregressive moving-average model based on autocorrelation function analysis and partial autocorrelation function analysis;
Solve for the model parameters to be estimated of the autoregressive moving-average model based on the least squares method, the lag order and the training data sequence, to obtain the trained autoregressive moving-average model.
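The order-selection step can be sketched as below. This covers only the autocorrelation and partial autocorrelation analysis, not the least-squares parameter estimation. The cut-off rule (count the leading lags whose magnitude exceeds 1.96/sqrt(n)) is a common heuristic assumed here for illustration; the source does not state which rule is used, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def acf(series: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev)
    if var == 0:
        raise ValueError("constant series has no autocorrelation")
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / var
            for k in range(max_lag + 1)]


def pacf(rho: Sequence[float]) -> List[float]:
    """Partial autocorrelation from autocorrelations (Durbin-Levinson recursion)."""
    result = [1.0]
    prev: List[float] = []  # coefficients of the order k-1 predictor
    for k in range(1, len(rho)):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
    return result


def leading_significant_lags(values: Sequence[float], bound: float) -> int:
    """Number of consecutive lags, starting at lag 1, that exceed the bound."""
    count = 0
    for v in values[1:]:
        if abs(v) <= bound:
            break
        count += 1
    return count


def choose_lag_orders(series: Sequence[float], max_lag: int = 10) -> Tuple[int, int]:
    """Return (p, q): AR order from the PACF cut-off, MA order from the ACF cut-off."""
    rho = acf(series, max_lag)
    bound = 1.96 / math.sqrt(len(series))
    return leading_significant_lags(pacf(rho), bound), leading_significant_lags(rho, bound)
```

With the orders fixed, the remaining coefficients would be estimated by least squares as the step above describes; in practice an established time-series library would normally handle both stages.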
The server of the embodiment of the present invention can obtain the data value of current data; obtain a first fluctuation parameter between the data value and a historical data value; train a nonlinear regression model according to a training data sequence; obtain a current first predicted data value according to the trained nonlinear regression model, and obtain a second fluctuation parameter between the data value and the first predicted data value; and determine, according to the first fluctuation parameter and the second fluctuation parameter, whether the fluctuation of the current data is abnormal. This scheme obtains fluctuation parameters of the data in multiple dimensions (such as the first fluctuation parameter and the second fluctuation parameter) and determines whether the data fluctuation is abnormal based on these multi-dimensional fluctuation parameters; it can therefore improve the accuracy of data fluctuation identification.
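As a rough illustration of the two fluctuation parameters summarized above, the sketch below measures each as a relative change against its reference value. The relative-change definition is an assumption made for this example, since this passage does not define how a fluctuation parameter is computed, and the function names are hypothetical.

```python
from typing import Tuple


def relative_fluctuation(current: float, reference: float) -> float:
    """Relative change of the current value with respect to a reference value."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return (current - reference) / reference


def fluctuation_parameters(
    current: float, historical: float, predicted: float
) -> Tuple[float, float]:
    """First parameter: against the historical value. Second: against the prediction."""
    return (
        relative_fluctuation(current, historical),
        relative_fluctuation(current, predicted),
    )
```

A current value of 120 against a historical value of 100 and a predicted value of 110 gives a first parameter of 0.2 and a second parameter of roughly 0.09: the value looks like a large jump against history but only a modest deviation from the model's expectation.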
Those of ordinary skill in the art will appreciate that all or some of the steps in the methods of the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include: read-only memory (ROM, Read Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, an optical disc, or the like.
The data fluctuation identification method, device and storage medium provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the above embodiments is only intended to help in understanding the method of the invention and its core idea. Meanwhile, those skilled in the art may, following the idea of the invention, make changes to the specific implementation and the scope of application. In conclusion, the content of this specification should not be construed as limiting the present invention.
Claims (14)
1. A data fluctuation identification method, characterized by comprising:
Obtaining a data value of current data;
Obtaining a first fluctuation parameter between the data value and a historical data value;
Training a nonlinear regression model according to a training data sequence;
Obtaining a current first predicted data value according to the trained nonlinear regression model, and obtaining a second fluctuation parameter between the data value and the first predicted data value;
Determining, according to the first fluctuation parameter and the second fluctuation parameter, whether the fluctuation of the current data is abnormal.
2. The data fluctuation identification method as claimed in claim 1, characterized in that determining, according to the first fluctuation parameter and the second fluctuation parameter, whether the fluctuation of the current data is abnormal comprises:
When the training data sequence is a stationary sequence, training an autoregressive moving-average model according to the training data sequence;
Obtaining a current second predicted data value according to the trained autoregressive moving-average model, and obtaining a third fluctuation parameter between the data value and the second predicted data value;
Determining, according to the first fluctuation parameter, the second fluctuation parameter and the third fluctuation parameter, whether the fluctuation of the current data is abnormal;
When the training data sequence is not a stationary sequence, determining, according to the first fluctuation parameter and the second fluctuation parameter, whether the fluctuation of the current data is abnormal.
3. The data fluctuation identification method as claimed in claim 1, characterized in that determining, according to the first fluctuation parameter, the second fluctuation parameter and the third fluctuation parameter, whether the fluctuation of the current data is abnormal comprises:
Obtaining a final fluctuation parameter value of the current data according to the parameter values of the first fluctuation parameter, the second fluctuation parameter and the third fluctuation parameter;
When the final fluctuation parameter value is within a preset range, determining that the fluctuation of the current data is normal;
When the final fluctuation parameter value is not within the preset range, determining that the fluctuation of the current data is abnormal.
4. The data fluctuation identification method as claimed in claim 3, characterized in that obtaining the final fluctuation parameter value of the current data according to the first fluctuation parameter, the second fluctuation parameter and the third fluctuation parameter comprises:
Performing weighted averaging on the parameter values of the first fluctuation parameter, the second fluctuation parameter and the third fluctuation parameter to obtain a weighted average parameter value;
Using the weighted average parameter value as the final fluctuation parameter value of the current data.
5. The data fluctuation identification method as claimed in claim 1, characterized in that obtaining the data value of the current data comprises:
Acquiring data from a data source to obtain the current data;
Converting the current data into data in a unified data format, and obtaining the data value of the converted data.
6. The data fluctuation identification method as claimed in claim 1, characterized in that training the nonlinear regression model according to the training data sequence comprises:
Determining the number of model parameters to be estimated in the nonlinear regression model;
Solving for the model parameters to be estimated of the nonlinear regression model based on the least squares method and the training data sequence, to obtain the trained nonlinear regression model.
7. The data fluctuation identification method as claimed in claim 2, characterized in that training the autoregressive moving-average model according to the training data sequence comprises:
Determining the lag order of the autoregressive moving-average model based on autocorrelation function analysis and partial autocorrelation function analysis;
Solving for the model parameters to be estimated of the autoregressive moving-average model based on the least squares method, the lag order and the training data sequence, to obtain the trained autoregressive moving-average model.
8. A data fluctuation identification device, characterized by comprising:
A data acquisition unit, configured to obtain a data value of current data;
A first parameter acquisition unit, configured to obtain a first fluctuation parameter between the data value and a historical data value;
A training unit, configured to train a nonlinear regression model according to a training data sequence;
A second parameter acquisition unit, configured to obtain a current first predicted data value according to the trained nonlinear regression model, and to obtain a second fluctuation parameter between the data value and the first predicted data value;
A determination unit, configured to determine, according to the first fluctuation parameter and the second fluctuation parameter, whether the fluctuation of the current data is abnormal.
9. The data fluctuation identification device as claimed in claim 8, characterized in that the determination unit comprises:
A training subunit, configured to train an autoregressive moving-average model according to the training data sequence when the training data sequence is a stationary sequence;
A parameter acquisition subunit, configured to obtain a current second predicted data value according to the trained autoregressive moving-average model, and to obtain a third fluctuation parameter between the data value and the second predicted data value;
A first abnormality determination subunit, configured to determine, according to the first fluctuation parameter, the second fluctuation parameter and the third fluctuation parameter, whether the fluctuation of the current data is abnormal;
A second abnormality determination subunit, configured to determine, according to the first fluctuation parameter and the second fluctuation parameter, whether the fluctuation of the current data is abnormal when the training data sequence is not a stationary sequence.
10. The data fluctuation identification device as claimed in claim 9, characterized in that the first abnormality determination subunit is configured to:
Obtain a final fluctuation parameter value of the current data according to the parameter values of the first fluctuation parameter, the second fluctuation parameter and the third fluctuation parameter;
When the final fluctuation parameter value is within a preset range, determine that the fluctuation of the current data is normal;
When the final fluctuation parameter value is not within the preset range, determine that the fluctuation of the current data is abnormal.
11. The data fluctuation identification device as claimed in claim 8, characterized in that the data acquisition unit is configured to:
Acquire data from a data source to obtain the current data;
Convert the current data into data in a unified data format, and obtain the data value of the converted data.
12. The data fluctuation identification device as claimed in claim 8, characterized in that the training unit is configured to:
Determine the number of model parameters to be estimated in the nonlinear regression model;
Solve for the model parameters to be estimated of the nonlinear regression model based on the least squares method and the training data sequence, to obtain the trained nonlinear regression model.
13. The data fluctuation identification device as claimed in claim 9, characterized in that the training subunit is configured to:
Determine the lag order of the autoregressive moving-average model based on autocorrelation function analysis and partial autocorrelation function analysis;
Solve for the model parameters to be estimated of the autoregressive moving-average model based on the least squares method, the lag order and the training data sequence, to obtain the trained autoregressive moving-average model.
14. A storage medium, characterized in that the storage medium stores instructions which, when executed by a processor, implement the steps of the method as claimed in any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810214976.7A CN110275809B (en) | 2018-03-15 | 2018-03-15 | Data fluctuation identification method and device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810214976.7A CN110275809B (en) | 2018-03-15 | 2018-03-15 | Data fluctuation identification method and device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110275809A (en) | 2019-09-24 |
CN110275809B CN110275809B (en) | 2022-07-08 |
Family
ID=67957686
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810214976.7A Active CN110275809B (en) | 2018-03-15 | 2018-03-15 | Data fluctuation identification method and device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110275809B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113419141A (en) * | 2021-08-26 | 2021-09-21 | 中国南方电网有限责任公司超高压输电公司广州局 | Direct-current line fault positioning method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103685347A (en) * | 2012-09-03 | 2014-03-26 | 阿里巴巴集团控股有限公司 | Method and device for allocating network resources |
CN106991285A (en) * | 2017-04-01 | 2017-07-28 | 广东工业大学 | A kind of short-term wind speed multistep forecasting method and device |
CN107506871A (en) * | 2017-09-08 | 2017-12-22 | 广东工业大学 | A kind of method and system of interval prediction |
US20180041527A1 (en) * | 2013-03-15 | 2018-02-08 | Shape Security, Inc. | Using instrumentation code to detect bots or malware |
-
2018
- 2018-03-15 CN CN201810214976.7A patent/CN110275809B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN110275809B (en) | 2022-07-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107704387B (en) | Method, device, electronic equipment and computer readable medium for system early warning | |
EP4020315A1 (en) | Method, apparatus and system for determining label | |
CN114298322B (en) | Federal learning method and apparatus, system, electronic device, and computer readable medium | |
CN110474896A (en) | Data communications method and relevant device based on Modbus consensus standard | |
CN114500339B (en) | Node bandwidth monitoring method and device, electronic equipment and storage medium | |
CN109558248A (en) | A kind of method and system for the determining resource allocation parameters calculated towards ocean model | |
CN110275809A (en) | A kind of data fluctuations recognition methods, device and storage medium | |
CN110688098A (en) | Method and device for generating system framework code, electronic equipment and storage medium | |
CN113886006A (en) | Resource scheduling method, device and equipment and readable storage medium | |
CN114019400A (en) | Lithium battery life cycle monitoring and management method, system and storage medium | |
CN111901405B (en) | Multi-node monitoring method and device, electronic equipment and storage medium | |
CN110389876B (en) | Method, device and equipment for supervising basic resource capacity and storage medium | |
CN110175083A (en) | The monitoring method and device of operating system | |
CN111046082A (en) | Data source determination method, device, server and storage medium | |
CN107590012B (en) | Equipment disconnection reason analysis method and device, storage medium and electronic equipment | |
CN113449008B (en) | Modeling method and device | |
CN114446427A (en) | Electronic equipment and health data attribution identification method | |
CN110220639A (en) | Pressure gauge meter register method, device and terminal device in substation | |
CN109067620A (en) | The monitoring method and device of gateway | |
CN115473343B (en) | Intelligent gateway multi-master-station parallel access test method | |
CN116703248B (en) | Data auditing method, device, electronic equipment and computer readable storage medium | |
CN115081942B (en) | Data processing method and related device | |
CN113656270B (en) | Method, device, medium and computer program product for testing application performance | |
CN109031041B (en) | Distribution network voltage monitoring device point distribution method and system | |
CN117236885A (en) | Emergency treatment method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||