CN102025531A - Filling method and device thereof for performance data - Google Patents

Publication number
CN102025531A
Authority
CN
China
Prior art keywords
model
data
field
performance data
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2010102563686A
Other languages
Chinese (zh)
Other versions
CN102025531B (en
Inventor
于艳华
宋俊德
王海清
解新民
杨金莲
周政红
吴京川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Bright Oceans Inter Telecom Co Ltd
Bright Oceans Corp
Bright Oceans Inter Telecom Software Research Institute Co Ltd
Original Assignee
Beijing University of Posts and Telecommunications
Bright Oceans Inter Telecom Co Ltd
Bright Oceans Inter Telecom Software Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications, Bright Oceans Inter Telecom Co Ltd, Bright Oceans Inter Telecom Software Research Institute Co Ltd
Priority to CN201010256368.6A
Publication of CN102025531A
Application granted
Publication of CN102025531B
Legal status: Active
Anticipated expiration

Abstract

The invention provides a filling method and device for performance data. The method comprises: obtaining a sequence of historical performance data records; detecting, from that sequence, the intrinsic correlation between data items that stand in a specific relationship; establishing regression models in which the correlated data items fit one another; and, when a data item is missing from a performance data record, using the corresponding regression model to calculate an estimate of the missing item from the known values of its associated data items and filling the estimate into the record. The invention applies modeling methods based on mathematical statistics and data mining to the filling of performance data in a network management system for the first time. This makes the filling method systematic, intelligent, and automated, which helps ensure the accuracy of the filled data and greatly improves the efficiency of filling missing data in batches.

Description

A filling method and device for performance data
Technical field
The present invention relates to the field of network management, and in particular to a method and device for filling in missing performance data.
Background
In a network management system, the loss of raw network element data during collection is a common problem. It complicates application-oriented statistics and analysis tasks, biases the statistics, and reduces the accuracy of data statistics and analysis. In the performance management of a network management system, the raw performance data collected from network elements, the EMS, or the OMC is the basis on which many user-facing management functions are implemented. Because of network transmission problems, or problems in the OMC or in the network element data source itself, the collected performance data is often incomplete. Particularly when the completeness of the performance data varies, the results obtained by applying conventional statistical methods to an incomplete data set cannot stand in for the results that would be obtained from the complete data set. Inaccurate performance data leads to inaccurate analysis and statistics and causes serious problems for the statistical work that depends on network management performance data. To guarantee the completeness and accuracy of the collected performance data and to give the related analysis work an accurate data basis, the missing network management performance data needs to be filled in.
At present, missing performance data in network management systems is generally handled either by data re-collection or by manual filling, and both methods have serious drawbacks. Data re-collection gathers the data a second time. Because it uses the same collection mechanism, it inevitably suffers data loss as well, performs poorly in both timeliness and accuracy, and is inefficient. In particular, when the data cannot be reproduced for some reason, re-collection cannot be carried out at all. Manual filling requires a great deal of labor and time for data checking and data entry; it is inefficient and can introduce deviations through human error.
A technical solution for filling performance data is therefore needed, one that meets the demand for timely and accurate filling and overcomes the inability to fill data, the poor accuracy, and the low efficiency of earlier methods.
Summary of the invention
The technical problem to be solved by the present invention is to provide a filling method for performance data that resolves the current inability to fill data and the poor accuracy and low efficiency of earlier filling methods. The present invention also provides a filling device for performance data, to ensure that the method can be applied in practice.
To solve the above problem, the invention provides a filling method for performance data, comprising: obtaining a sequence of historical performance data records; detecting, from that sequence, the intrinsic correlation between data items that have a specific relationship, where data items with a specific relationship are either data items in different fields of the same record or data items in the same field of different records, and data items with an intrinsic correlation are called each other's associated data items; establishing, for the associated data items, regression models that fit one another; and, if a data item is missing from a performance data record, using the corresponding regression model to calculate an estimate of the missing data item from the values of its known associated data items, and filling the estimate into the record.
According to another preferred embodiment of the present invention, a filling device for performance data is also provided, comprising: a historical data acquisition unit, configured to obtain a sequence of historical performance data records; a regression model establishing unit, configured to detect, from the sequence obtained by the historical data acquisition unit, the intrinsic correlation between data items that have a specific relationship, and to establish, for the associated data items, regression models that fit one another, where data items with a specific relationship are either data items in different fields of the same record or data items in the same field of different records, and data items with an intrinsic correlation are called each other's associated data items; and a data filling unit, configured to, when a data item is missing, use the regression model established by the regression model establishing unit to calculate an estimate of the missing data item from the values of its known associated data items, and to fill the estimate into the performance data record.
Compared with the prior art, the preferred embodiment of the present invention resolves the current inability to fill data and the poor accuracy and low efficiency of earlier data filling methods. The invention analyzes performance data from a historical period, detects the intrinsic correlation between data items that have a specific relationship, and establishes regression models in which the associated data items fit one another: a field-association regression model fitted between data items in different fields of the same record, and an autoregressive model fitted between data items in the same field of different records. If a data item is missing from a performance data record, the corresponding regression model is used to calculate an estimate of the missing data item from the values of its known associated data items, and the estimate is filled into the record. The invention applies modeling methods based on mathematical statistics and data mining to performance data filling in a network management system for the first time. It thereby makes the filling method systematic, intelligent, and automated, which helps ensure the accuracy of the filled data and greatly improves the efficiency of filling missing data in batches.
Description of drawings
Fig. 1 is a flow chart of Embodiment 1 of the performance data filling method of the present invention;
Fig. 2-a to Fig. 2-c are autocorrelation function plots of the annual sunspot numbers for 1700 to 1979;
Fig. 3 to Fig. 5 are schematic diagrams of the autocorrelation functions used to detect periodicity in the historical data sequence in Embodiment 3 of the performance data filling method of the present invention;
Fig. 6 is a schematic diagram of the AIC values of the ARMA models in Embodiment 3 of the performance data filling method of the present invention;
Fig. 7 is a structural diagram of an embodiment of the performance data filling device of the present invention.
Detailed description
The invention is further described below with reference to the drawings and specific embodiments.
In a network management system, the collected performance data comprises many data records, each made up of several performance indicator fields. The network management system stores these records one by one in a two-dimensional table in a database, one record per row. Missing performance data therefore takes two main forms. In the first, one or more indicator fields are missing from a row of performance data in the table; this is referred to here as a missing data field. In the second, an entire row of performance data, that is, a whole data record, is missing from the table; this is referred to here as a missing data record. As long as at least one known performance indicator value remains in a row of performance data in the database table, the case counts as a missing data field.
The present invention treats these performance data as random variables, which generally follow intrinsic regularities; data from different industries and for different purposes show different regularities. Using mathematical statistics and regression analysis theory, the invention detects and analyzes the correlation within the known performance data and the trend of the performance data, and on that basis proposes a new filling method for missing performance data.
So that missing values can be filled in promptly when performance data is missing, the performance data filling method proposed by the present invention is as follows:
Obtain a sequence of historical performance data records.
From that sequence, detect the intrinsic correlation between data items that have a specific relationship. Data items with a specific relationship are either data items in different fields of the same record or data items in the same field of different records. Data items with an intrinsic correlation are called each other's associated data items.
Establish, for the associated data items, regression models that fit one another. Two kinds of regression model can be established: a field-association regression model, fitted according to the intrinsic correlation between different fields of the same data record, and an autoregressive model, fitted according to the trend of the same field across different data records.
If a data item is missing from a performance data record, use the corresponding regression model to calculate an estimate of the missing data item from the values of its known associated data items, and fill the estimate into the record.
With the two kinds of model established above, missing data items can be estimated.
To fill data efficiently, the position of the missing data item in the performance data records must first be determined: the sequence number of the affected record within the whole time series, the name of the missing field, and whether the missing field is associated with other fields. If the same data record contains another field that is associated with the missing field, the estimate of the missing field in that record is calculated from the known field value using the field-association regression model, and the estimate is used as the fill value. Estimating a missing field value from the relationship between fields is generally simple and efficient. If the same data record contains no associated data item, associated data items can be sought in the performance data records of the historical time series: an autoregressive model of the historical time series is established, the estimate of the missing field in the affected record is calculated with that model from the known values of the same field in earlier records, and the estimate is used as the fill value.
In a preferred embodiment of the invention, the field-association regression model is used first to fill the value of a missing field. The autoregressive model is used only when the missing field has no association with any known field or when the whole record is missing.
There are many ways to build a fitted model from historical sample data, and the prior art offers many examples. The present invention applies modeling methods based on mathematical statistics and data mining to performance data filling in a network management system for the first time, which greatly improves the accuracy and efficiency of filling.
To fill missing data fields effectively, the intrinsic correlation between performance indicator field values is detected first. When the correlation between performance indicator fields is strong, regression can be used to obtain a field regression function model for the indicators. Thus, when a performance indicator field is missing from the record at some time point, correlation analysis of a number of performance data records from a recent period before that time point yields the corresponding field regression function model. The known indicator field values in the record at that time point are substituted into the model to calculate the missing indicator field value, which is then written into the performance database table. For a fairly simple field regression function model, such as a univariate linear function, this calculation makes the filling fast and efficient.
The field correlation analysis can draw on data mining methods and is chosen according to the characteristics of the data, since data from different industries have different characteristics. Parameters such as the correlation coefficient and support can be used to test correlation strength or to determine association rules, and a suitable regression function model can be found by computation over large amounts of data. The model that captures the correlation between fields is called the field regression function model. Embodiment 2 below describes one method of detecting field correlation and building the model, from which a field regression function model can be obtained.
If an entire performance data record is missing at some time point in the performance database table, the invention uses a number of performance data records from the historical time series before that time point. Applying time-series modeling and prediction methods, it determines the trend of the performance data records and an autoregressive model suited to the time series, and uses the autoregressive model to predict the key performance data fields. The predicted value of each such field is then written into the performance database table.
Embodiment 3 below describes one method of detecting autocorrelation in time-series performance data and building the model, from which an autoregressive model can be obtained.
With the field regression model and/or the time-series autoregressive model, all performance data fields can be predicted and a predicted value obtained for each. In particular, when the fields are largely independent of one another and no fitted field regression model exists, the independent missing fields must be filled one by one with the method above.
If the fields of the same data record are correlated, then, to improve prediction efficiency, the fitted regression models found by the correlation analysis above can be used: the remaining missing field values are filled from the predicted key performance data field value and the field regression function models between it and the other performance data fields.
To apply the method more effectively, the invention also introduces a way of detecting missing data, covering both missing data fields and missing data records. The performance data filling method is then applied flexibly according to the kind of data that is missing.
Missing data is detected by reading each field of each data record in turn and checking whether the field content is empty (NULL). If it is empty, the field is judged to be missing, and the name of the missing field and the sequence number of its data record are noted. If all fields of a data record are missing, the record itself is judged to be missing, and its sequence number is noted.
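The detection pass described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes records are held in memory as Python dictionaries with `None` standing in for a database NULL, and the function name `find_missing` is hypothetical.

```python
from typing import Optional

# Assumption: one record = one table row, field name -> value, None = NULL.
Record = dict[str, Optional[float]]


def find_missing(records: list[Record]) -> tuple[list[tuple[int, str]], list[int]]:
    """Scan records in sequence order and classify what is missing.

    Returns (missing_fields, missing_records):
      missing_fields  - (sequence number, field name) pairs for records that
                        still have at least one known value
      missing_records - sequence numbers of records whose fields are all empty
    """
    missing_fields: list[tuple[int, str]] = []
    missing_records: list[int] = []
    for seq, record in enumerate(records):
        empty = [name for name, value in record.items() if value is None]
        if empty and len(empty) == len(record):
            # Every field is NULL: treat the whole record as missing.
            missing_records.append(seq)
        else:
            missing_fields.extend((seq, name) for name in empty)
    return missing_fields, missing_records
```

Keeping the two result lists separate mirrors the distinction the text draws between a missing data field and a missing data record, so that a later step can pick the model that suits each case.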
When a field is judged to be missing and a fitted field regression function model exists for it, the missing field is filled according to the field regression model. When a whole record is judged to be missing and a fitted autoregressive model exists, the missing fields are filled according to the autoregressive model. Repeating this process fills all the missing fields.
As shown in Fig. 1, Embodiment 1 of the filling method for network management system performance data comprises the following steps:
Step 110: obtain the sequence of historical performance data records;
For example, take the historical performance data records from a period before the time point of the missing performance data (such as the preceding month or the preceding two months).
Existing network management systems all include a data acquisition system that collects the performance data sequence for a given period and stores each performance data record as one row of a two-dimensional table in the performance database. Each record consists of several performance indicator fields, so each column of the table corresponds to one performance indicator field.
This step reads the performance data records and the data of each field from the performance database table.
Step 120: from the sequence of historical performance data records, detect the intrinsic correlation between data items in different fields of the same data record, establish the fitted field-association regression model, and save the names of the associated fields together with the model parameters;
Step 130: from the sequence of historical performance data records, detect the trend of the performance data in the same field across different data records, establish the fitted autoregressive model, and save the field name together with the model parameters;
Step 140: search the performance data record sequence for missing data fields and determine their positions;
That is, determine the sequence number of the data record containing each missing data field and the name of the missing field.
Step 150: determine whether only some of the fields of the record are missing and the missing field is associated with a known field. If so, go to step 160; otherwise, go to step 170;
If a field-association regression model can be established between the missing field and a known field, the missing field is judged to be associated with that known field.
Step 160: using the field-association regression model, calculate the predicted value of the missing field from the known associated field value and fill it into the missing field of the record;
Read the model parameters of the field-association regression model between the missing performance data field and the known performance data field, look up the known field value by the sequence number of the record with the missing data, substitute it into the field-association regression model, and calculate the estimate of the missing field in that record. The estimate is used as the fill value of the missing data field.
Step 170: using the autoregressive model, calculate the estimate of the missing field from the known value of that field in a record of the historical data sequence, and fill it into the record with the missing data;
The known data record and the record with the missing data have the same data structure, that is, the same number of data fields and the same field names. Within the sequence of data records, the same field in different records is the associated field.
Read the parameters of the autoregressive model for the missing data field and the value of that field in a historical data record, substitute them into the autoregressive model, and calculate the estimate of the missing field in the record. The estimate is used as the fill value of the missing data field.
Step 180: check whether any missing data fields remain unfilled. If so, go to step 150; otherwise, end the data filling process.
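The decision made in steps 150 to 180 can be sketched in a few lines. This is an illustrative simplification under stated assumptions: the field models are univariate linear functions stored as `(a, b)` pairs, the autoregressive model is reduced to a first-order form `x_t = c + phi * x_(t-1)`, and the names `fill_record`, `field_models`, and `ar_models` are hypothetical rather than taken from the patent.

```python
from typing import Optional

Record = dict[str, Optional[float]]


def fill_record(
    record: Record,
    history: list[Record],
    field_models: dict[tuple[str, str], tuple[float, float]],
    ar_models: dict[str, tuple[float, float]],
) -> Record:
    """Fill the missing (None) fields of one record.

    field_models maps (missing_field, known_field) -> (a, b), meaning
        missing = a + b * known                      (steps 150/160)
    ar_models maps field -> (c, phi), meaning
        x_t = c + phi * x_{t-1}                      (step 170, simplified AR(1))
    history holds earlier records, oldest first.
    Fields for which no model applies are left as None.
    """
    filled = dict(record)
    for name, value in record.items():
        if value is not None:
            continue
        estimate = None
        # Step 150/160: prefer a field-association model with a known source.
        for (target, source), (a, b) in field_models.items():
            if target == name and record.get(source) is not None:
                estimate = a + b * record[source]
                break
        # Step 170: otherwise fall back to the autoregressive model.
        if estimate is None and name in ar_models and history:
            previous = history[-1].get(name)
            if previous is not None:
                c, phi = ar_models[name]
                estimate = c + phi * previous
        filled[name] = estimate
    return filled
```

The loop over fields plays the role of step 180: every missing field in the record is visited once, and the field-association model is always tried before the autoregressive fallback.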
In step 120, the method of detecting the intrinsic correlation of performance data between different fields from the sequence of historical performance data records and establishing the fitted field-association regression model is as follows:
First, take data from a past period: read a number of consecutive data records from the performance database table, perform correlation analysis between the missing field and the known fields, create the corresponding regression model between the fields, and store the field regression model and its parameters in a database table.
For example, correlation analysis is performed on network management performance data with hourly granularity from the month around the time point of the missing data. Performance indicators related to the same network element are often associated at the service level. Take three performance indicator fields in a switch data record: switch call attempts (call_att), switch call connections (call_setup), and system call attempts (sys_call_att).
Switch call attempts: the total number of times the switch sends a "call proceeding" message and receives an "IAM or IAI" message during the statistics period.
System call attempts: the number of "CM service request" messages and incoming "IAM, IAI" messages during the statistics period.
Switch call connections: the total number of times the switch receives a "call confirmed" message and an "ACM" message during the statistics period.
From the meaning of these three performance indicators, switch call connections and switch call attempts are strongly associated, and system call attempts should be somewhat higher than switch call attempts, though the difference is small.
The analysis above is only qualitative. The present invention brings correlation analysis into the study of relationships between performance data fields by introducing the correlation coefficient, a measure of the degree of correlation between two variables. The correlation coefficient takes values in [-1, 1]. The larger its absolute value, the smaller the error Q and the stronger the linear correlation between the variables; the closer its absolute value is to 0, the larger Q and the weaker the linear correlation. The correlation coefficient $\rho_{XY}$ is defined as follows:
$$\rho_{XY} = \frac{\operatorname{Cov}(X, Y)}{D_X D_Y} = \frac{E\{(X - E(X))(Y - E(Y))\}}{D_X D_Y} \qquad (1)$$
where:
Cov(X, Y) is the covariance of the random variables X and Y, here the covariance of fields X and Y;
E(X) is the mathematical expectation of the random variable X, generally estimated from a finite sample:
$$E(X) = \frac{1}{n} \sum_{i=1}^{n} X_i \qquad (2)$$
E(Y) is the mathematical expectation of the random variable Y, generally estimated from a finite sample:
$$E(Y) = \frac{1}{n} \sum_{i=1}^{n} Y_i$$
$D_X$ is the standard deviation of the random variable X, that is, the square root of its variance, generally estimated from a finite sample:
$$D_X = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (X_i - E(X))^2} \qquad (3)$$
$D_Y$ is the standard deviation of the random variable Y, generally estimated from a finite sample in the same way:
$$D_Y = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (Y_i - E(Y))^2}$$
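As a concrete reading of formulas (1) to (3), the sketch below computes the sample correlation coefficient of two fields using only the Python standard library. It assumes the covariance is estimated with the same 1/(n-1) factor as the variances, so that the factors cancel in the ratio; the function name `pearson` is a hypothetical choice.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Sample correlation coefficient of two equally long value lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n  # estimate of E(X), formula (2)
    mean_y = sum(ys) / n  # estimate of E(Y)
    # Assumption: covariance uses 1/(n-1), matching the variance estimates.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)  # D_X squared
    var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)  # D_Y squared
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant field")
    return cov / math.sqrt(var_x * var_y)  # formula (1)
```

A constant field has zero variance, so the coefficient is undefined there; the sketch raises an error in that case instead of returning a misleading value.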
For two random variables, if the absolute value of the correlation coefficient is close to 1, a regression function can be assumed to exist between them, and a linear regression model gives a good approximation. The univariate linear regression method is as follows:
Given two variables X and Y and a sample of n points $\{(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)\}$, the univariate linear regression function sought is:
$$Y = a + bX + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2) \qquad (4)$$
Using the sample above, least squares or maximum likelihood estimation gives the following estimates of the parameters a and b in formula (4), where D(X) denotes the variance of X and $\bar{x}$, $\bar{y}$ are the sample means:
$$b = \frac{E\{(X - E(X))(Y - E(Y))\}}{D(X)} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \qquad (5)$$
$$a = \bar{y} - b\,\bar{x} \qquad (6)$$
For performance indicator fields that are directly or indirectly related in business terms, the correlation coefficient of formula (1) is used to test the performance data for correlation. Indicators whose correlation coefficient has an absolute value between 0.9 and 1 (that is, close to 1) are considered to satisfy the linear relationship of formula (4). The parameters of the linear function are then calculated by least squares or maximum likelihood estimation, that is, a and b are computed with formulas (5) and (6), which determines the linear relationship. The values of a and b are stored as configuration data in the field regression model database table, so that the correlation model of the performance data fields and its parameters are kept in that table.
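The procedure in this paragraph, testing the correlation against the 0.9 threshold and then fitting a and b by formulas (5) and (6), might look like the following sketch. The function name `build_field_model` and the idea of returning `None` when the fields are not strongly correlated are assumptions made for illustration; the patent stores the parameters in a database table, which is omitted here.

```python
import math
from typing import Optional


def build_field_model(
    xs: list[float], ys: list[float], threshold: float = 0.9
) -> Optional[tuple[float, float]]:
    """Fit Y = a + b*X if |correlation| >= threshold, else return None.

    xs and ys are historical values of two fields from the same records.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return None  # a constant field carries no usable linear relation
    rho = sxy / math.sqrt(sxx * syy)  # formula (1); scale factors cancel
    if abs(rho) < threshold:
        return None
    b = sxy / sxx            # formula (5)
    a = mean_y - b * mean_x  # formula (6)
    return a, b


# Usage with made-up hourly counts: estimate a missing call_setup value.
call_att = [100.0, 200.0, 300.0, 400.0]
call_setup = [95.0, 185.0, 275.0, 365.0]
model = build_field_model(call_att, call_setup)
if model is not None:
    a, b = model
    estimated_call_setup = a + b * 250.0
```

The sample values in the usage lines are invented for the example and are not measurements from the patent.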
Once the functional relationship between two performance indicator fields has been determined, the model for those two fields is complete. When there are several performance indicators, the correlation between them must be analyzed pair by pair: the correlation coefficient between the indicators is calculated, the functional relationship between them is determined, and the corresponding field regression model is established in each case. These field regression models and their parameters are stored in a database table and serve as the baseline configuration data for filling this performance data when it is missing.
Study of performance data collected from a large number of network management systems shows that different performance data are often strongly correlated. The field regression function is most often a univariate linear function, so estimating missing values with such a model is much simpler and faster.
In step 130, the variation trend of the performance data in the same field across different data records is detected from the historical performance data record sequence, and the fitted autoregressive model is established as follows:
The performance data sequence is subjected to autocorrelation analysis of the same field in the manner of a time series model, and a time series autoregressive model is established. That is, for performance data records on a time series: if the trend is stationary and linear, a linear stationary mathematical model, the ARMA model, is established; if the trend is non-stationary with a monotonic rise or fall but can be made stationary by differencing, the ARIMA model is established; if the trend is stationary and linear and also shows very pronounced periodicity, the SARIMA model is established; if the trend is strongly nonlinear and non-stationary, a nonlinear non-stationary mathematical model is established, which may be a neural network (Neural Networks) model, a support vector machine (Support Vector Machines) model, or another suitable nonlinear non-stationary model.
The variation trend of a data sequence can be detected with methods such as autocorrelation function analysis and spectral density analysis.
The autocorrelation function is defined as follows:
Definition: let $\{X_t\}$ be a given time series of length $n$, and let
$$\bar{X} = \frac{1}{n}\sum_{t=1}^{n} X_t$$
be its sample mean. Then:
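1. $C_k = \frac{1}{n}\sum_{t=1}^{n-k}\left(X_t - \bar{X}\right)\left(X_{t+k} - \bar{X}\right)$, where $\bar{X}$ is the sample mean and $n$ the number of samples, is called the sample autocovariance function of the sequence $\{X_t\}$;
2. $r_k = C_k / C_0$ is called the sample autocorrelation function (ACF, AutoCorrelation Function).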
According to this definition, the order-0 autocorrelation coefficient is $r_0 = 1$. The plot with $r_k$ on the vertical axis and k on the horizontal axis is called the autocorrelogram. It can be used to examine the order-k autocorrelation of time series data, and from the autocorrelation the variation trend of the time series data can be judged.
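As a minimal sketch of the definition above, the sample autocorrelation function can be computed as follows. The function name is hypothetical, and the 1/n normalization of the autocovariance is an assumption (it cancels in the ratio $r_k = C_k / C_0$).

```python
def sample_acf(xs, max_lag):
    """Return [r_0, r_1, ..., r_max_lag] for the series xs."""
    n = len(xs)
    mean = sum(xs) / n
    dev = [x - mean for x in xs]
    # C_0: sample autocovariance at lag 0 (assumed 1/n normalization)
    c0 = sum(d * d for d in dev) / n
    acf = []
    for k in range(max_lag + 1):
        # C_k: products of deviations that are k steps apart
        ck = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf
```

Plotting the returned values against the lag k gives the autocorrelogram described above.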
In general, if after first-order or second-order differencing of the data sequence there are no significantly nonzero autocorrelation coefficients at lags greater than 20, the sequence exhibits the characteristics of a stationary time series and is modeled with the ARIMA or SARIMA model.
As another example, take a typical nonlinear non-stationary time series, the annual sunspot number. The annual sunspot numbers from 1700 to 1979 form a time series, shown in Fig. 2-a, from which the autocorrelogram for lags 1 to 40 can be obtained. The autocorrelation coefficients are significantly nonzero at most of the first 40 lags, so the sequence is non-stationary.
If, after first-order and second-order differencing (see the autocorrelation function plot of the first-order differenced annual sunspot numbers in Fig. 2-b and that of the second-order differenced annual sunspot numbers in Fig. 2-c), there are still significantly nonzero autocorrelation coefficients at lags greater than 20 and no periodic feature, the sequence cannot be made stationary. This indicates that the sequence is nonlinear and non-stationary, and it needs to be modeled with a support vector machine or a neural network. Conversely, if the data sequence is stationary after first-order or second-order differencing, it is modeled with the ARIMA model, or with the SARIMA model if it has a periodic feature.
Time series analysis and prediction is one of the major research directions in data mining. The main methods used in time series analysis are of two kinds. The first focuses on the regression relationship between the indicator value and time t, and includes univariate linear regression, exponential regression, logarithmic regression and so on. The second is dynamic system regression, which focuses on the regression relationship between field indicator values at different times, and mainly includes exponential smoothing, neural networks, support vector machines, ARMA models and Kalman filtering. Performance data with different characteristics show different variation trends. For performance data whose trend is stationary, the linear stationary ARMA model is adopted. For data whose trend is non-stationary with a monotonic rise or fall, stationarity is achieved by differencing and the ARIMA model is adopted. For performance data with very pronounced periodicity, the SARIMA model can further be adopted. For data whose trend is strongly nonlinear and non-stationary, a nonlinear non-stationary mathematical model is adopted, for example a neural network (Neural Networks) model, a support vector machine (Support Vector Machines) model, or another suitable nonlinear non-stationary model.
The process of establishing the ARMA model may be as follows:
Step 130a1: determine the order of the ARMA model with an order determination criterion, obtaining a family of candidate models.
For example, the ARMA model can be identified with the AIC order determination method, the BIC order determination method, or the F test.
Step 130a2: estimate and determine the parameters of the candidate models.
Modeling is an iterative training process that is repeated until suitable model parameters are found, such that the predicted data fit the actual data best.
Step 130a3: check the applicability of the determined model parameters and determine the optimal model parameters.
This can be done by checking whether the residual sequence is a white noise sequence; training continues until the residuals are white noise.
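The residual check in step 130a3 can be sketched as below. The text only states that the residuals are tested for white noise; the specific rule used here, that every sample autocorrelation up to a maximum lag lies within ±1.96/√n, is a common large-sample approximation and is an assumption of this sketch, as are the function names and default values.

```python
import math

def autocorr(xs, k):
    """Sample autocorrelation r_k of the series xs at lag k."""
    n = len(xs)
    mean = sum(xs) / n
    dev = [x - mean for x in xs]
    c0 = sum(d * d for d in dev)
    ck = sum(dev[t] * dev[t + k] for t in range(n - k))
    return ck / c0

def looks_like_white_noise(residuals, max_lag=20, z=1.96):
    """True if no autocorrelation at lags 1..max_lag exceeds z / sqrt(n).

    Assumed decision rule: approximate 95% bound for a white noise series.
    """
    bound = z / math.sqrt(len(residuals))
    return all(abs(autocorr(residuals, k)) <= bound
               for k in range(1, max_lag + 1))
```

If the check fails, the order or parameters would be revised and the model refitted, in line with the iterative training described in step 130a2.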
The process of establishing the ARIMA model and calculating the estimate may be as follows:
Step 130b1: apply differencing to the historical performance data sequence that shows a monotonic rise or fall, so that the processed data sequence is stationary.
For example, the data sequence is made stationary by first-order differencing and/or seasonal differencing.
Step 130b2: for the stationary data sequence, determine the order of the ARIMA model with an order determination criterion, obtaining a family of candidate models.
Step 130b3: estimate and determine the parameters of the candidate models.
Step 130b4: check the applicability of the determined model parameters and determine the optimal model parameters.
Step 130b5: according to the values of the known associated data items, calculate a preliminary predicted value of the missing data item with the optimal model parameters, then apply the inverse differencing operation to the preliminary predicted value to recover the estimate of the missing data item in the original data sequence.
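The differencing of step 130b1 and the inverse operation of step 130b5 can be sketched as follows. This is an illustration only: the function names are hypothetical, and the forecast of the differenced series is assumed to come from an already fitted model.

```python
def difference(xs, d=1):
    """Apply ordinary differencing d times."""
    for _ in range(d):
        xs = [xs[i] - xs[i - 1] for i in range(1, len(xs))]
    return xs

def undifference_forecast(history, diff_forecast, d=1):
    """Map a one-step forecast of the d-times differenced series
    back to the scale of the original series."""
    # levels[j] is the history differenced j times, for j = 0 .. d-1
    levels = [list(history)]
    for _ in range(d - 1):
        levels.append(difference(levels[-1], 1))
    value = diff_forecast
    # undo one level of differencing at a time, from the deepest level up
    for level in reversed(levels):
        value = level[-1] + value
    return value
```

For example, for the history 1, 4, 9, 16, 25 the second-order differences are constant at 2, and a forecast of 2 on the differenced scale maps back to 36 on the original scale.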
The traffic performance data records collected by a network management system show very pronounced periodicity. The traffic performance data can therefore be modeled with the time series SARIMA model (that is, the seasonal multiplicative extension of the ARIMA model), and time series prediction is performed on the missing key performance indicator to obtain its predicted value. To simplify the prediction process, when an applicable field regression model exists, its model parameters are retrieved, the predicted value is taken as a known performance data field and substituted into the field regression model to calculate the other performance data fields, and the process ends.
Because the periodicity of the traffic performance data collected by a network management system is very pronounced, the invention further proposes using the SARIMA model, which is based on the ARMA model and suited to strongly periodic time series, to model the regularity of traffic performance data in the network management system. The main modeling idea is as follows: when an entire performance data record is missing, time series prediction is introduced into the filling of performance data records, and the value of an indicator in a performance data record at a future time is predicted from the historical performance data. Specifically, correlation and periodicity detection are performed on the performance data collected in the network management system to find typical correlation and periodic features, and on this basis the SARIMA model is used for modeling.
Accordingly, step 130 may also model the key performance data with the seasonal multiplicative ARIMA model; the following steps can be used for prediction:
Step 130c1: after detecting the periodicity of the historical performance data time series, apply seasonal differencing so that the resulting sequence {z_i} is stationary. If the sequence has multiple seasonalities, apply seasonal differencing multiple times.
Step 130c2: model the stationary time series {z_i} with the ARIMA (p, q) × (P, Q) model, determine the orders, and obtain a family of candidate models.
ARIMA (p, q) alone does not capture the periodicity of the performance data, so ARIMA (p, q) × (P, Q) is used to capture it, where (P, Q) are the parameters that stand in a periodic relationship to (p, q). In general each order is no greater than 2, so all combinations of the 4 parameters can be traversed programmatically in a loop, provided that p, q, P, Q are not all 0 at once. The AIC (Akaike information criterion) can be used to select the most suitable model.
Step 130c3: obtain the parameters of the candidate models by maximum likelihood.
For each of the resulting models, the parameters are obtained by maximum likelihood.
Step 130c4: for the determined model parameters, test the applicability of the model by checking whether the residuals are white noise, and determine the optimal model parameters.
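The order traversal of step 130c2 can be sketched as below, assuming each order ranges over 0, 1 and 2 (as in the 3 × 3 × 3 × 3 enumeration of method example three). The function names are hypothetical, and `aic_of` is a placeholder callback that would fit the model for a given order tuple and return its AIC value.

```python
from itertools import product

def candidate_orders(max_order=2):
    """All (p, q, P, Q) tuples with entries 0..max_order, except all zeros."""
    return [o for o in product(range(max_order + 1), repeat=4) if any(o)]

def select_by_aic(orders, aic_of):
    """Pick the order tuple with the smallest AIC.

    aic_of is a hypothetical callback: it stands for fitting the model
    with the given orders and returning its AIC value.
    """
    return min(orders, key=aic_of)
```

With the default `max_order=2` this enumerates 80 candidate order tuples; the models selected this way would still be subjected to the residual white-noise test of step 130c4.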
After the ARIMA or SARIMA model has been established, in step 170 an initial predicted value is calculated from the optimal model parameters obtained; the inverse of the seasonal differencing operation (the integration step, as it is known in the field) is then applied to the initial predicted value to obtain the predicted value of the key field of the data record in the original performance data sequence.
For data whose trend is strongly nonlinear and non-stationary, a nonlinear non-stationary mathematical model is adopted, for example a support vector machine model. The modeling process may be as follows:
The historical performance data sequence is used for training, and the optimal support vector machine model for this sequence is selected on the criterion that the residuals are white noise. This comprises:
Step 130d1: preprocess the historical performance data sequence and perform phase space reconstruction to obtain the training data sequence.
Step 130d2: set the values of the free parameters of the support vector machine model.
Step 130d3: with the free parameter values set, train on the training data sequence according to structural risk minimization, obtaining a regression equation as the modeling result.
Step 130d4: take the difference between the actual values of the training data sequence and the values calculated from the regression equation to obtain the fitting residual sequence, and calculate the autocorrelation function of the residual sequence.
Step 130d5: from the autocorrelation function of the residual sequence, check whether the residual sequence is a white noise sequence. If it is, the trained support vector machine model is optimal, and the model and the corresponding free parameter values are saved and output. If it is not, return to step 130d2, reset the values of the free parameters and train again by the above process, until the optimal support vector machine model is obtained.
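The phase space reconstruction of step 130d1 is commonly done by time-delay embedding, which turns a scalar series into (input vector, target) pairs for regression training. The sketch below assumes that technique; the function name and the embedding dimension and delay defaults are hypothetical and not taken from the source.

```python
def embed(series, dim=3, delay=1):
    """Build (input vector, target) training pairs by time-delay embedding.

    Each input holds `dim` values spaced `delay` steps apart; the target
    is the value one further `delay` step after the last input value.
    """
    span = dim * delay
    samples = []
    for t in range(len(series) - span):
        x = [series[t + i * delay] for i in range(dim)]
        y = series[t + span]
        samples.append((x, y))
    return samples
```

The resulting pairs would then be passed to the support vector regression training of step 130d3.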
The invention is further described below with reference to specific method examples.
Method example two: the filling method when some fields of the performance data are missing, described in detail using several key performance indicators that an MSC (mobile services switching centre) needs to collect, store and process:
First, data from a period of time, for example hourly-granularity data from the previous month, are used to analyze the correlation of the missing data fields and to establish the field regression models. For example, Table 1 shows some of the 360 historical performance data records of a province of an operator, stored in a relational database table, for the two weeks starting August 1, 2009.
Figure BSA00000233959200141
Table 1: 360 historical performance data records of a province of an operator
Suppose that some performance indicator fields are missing in the data records of August 15, 2009. The field correlation among the key performance indicators is then used for regression modeling. First, the 336 performance data records of each indicator from the preceding 14 days (August 1, 2009 to August 14, 2009) are used for correlation detection. According to statistical principles, when the absolute value of the correlation coefficient between two key indicators is 1, a deterministic linear functional relationship exists between them, which can be expressed as Y = a + bX, where the variables X and Y denote the two different performance indicators.
When the absolute value of the correlation coefficient is close to 1, the correlation between the two indicators is very strong, and a univariate linear function model can be used for regression. The regression function is:
$$Y = a + bX + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2)$$
In this univariate linear regression model, a is the intercept of the line and b is its slope, also called the regression coefficient of Y on X; it represents the average change in Y caused by a one-unit change in X. ε is the residual (also called the regression error or prediction error) and represents the random disturbance from secondary factors other than X. When the sample size is large, positive and negative disturbances cancel out, and the mean of ε can be taken to be 0.
In method example two, the key performance indicators that the MSC (mobile services switching centre) needs to collect, store and process are, in order: system call attempt total, switch call attempt total, system answer total, switch connection total, system paging total, called-party answer total, and called-party response total. Formula (1) above is used to detect the pairwise correlation between these 7 indicator fields, and the resulting correlation matrix is:
$$
\mathrm{matr\_corr} =
\begin{bmatrix}
1.0000 & 1.0000 & 0.9965 & 0.9996 & 0.9717 & 0.9996 & 0.9719 \\
1.0000 & 1.0000 & 0.9968 & 0.9997 & 0.9710 & 0.9997 & 0.9713 \\
0.9965 & 0.9968 & 1.0000 & 0.9982 & 0.9518 & 0.9983 & 0.9522 \\
0.9996 & 0.9997 & 0.9982 & 1.0000 & 0.9668 & 1.0000 & 0.9671 \\
0.9717 & 0.9710 & 0.9518 & 0.9668 & 1.0000 & 0.9667 & 1.0000 \\
0.9996 & 0.9997 & 0.9983 & 1.0000 & 0.9667 & 1.0000 & 0.9670 \\
0.9719 & 0.9713 & 0.9522 & 0.9671 & 1.0000 & 0.9670 & 1.0000
\end{bmatrix}
$$
Because the correlation coefficient between two indicators is symmetric, the correlation matrix above is also symmetric, so it is sufficient to look at its upper triangle. The values in the upper triangle show that:
matr_corr[1,2] = correlation coefficient (system call attempt total, switch call attempt total) = 1 is the largest correlation coefficient;
matr_corr[3,5] = correlation coefficient (system answer total, system paging total) = 0.9518 is the smallest correlation coefficient; the correlation coefficients between the other indicators all fall between these two.
Because the correlation coefficients are all much greater than 0.8, a very strong linear correlation can be considered to exist between these key performance indicators of the same switch, and a deterministic linear functional relationship can even be assumed. Once the linear function models between these indicators have been found, then when one or more indicator values are missing, the values of the other indicators that are present at the same time can be substituted into the function models to obtain the missing values. The linear function model is calculated with the univariate linear regression method of statistics; see formulas (5) and (6). The univariate linear function between any two indicators is calculated by least squares, that is, the slope b and the intercept a of the line are calculated. The regression parameter pairs (b, a) of the system call attempt total against the other 6 key performance indicators are as follows:
(1.02044901004931,8358.96681820601)
(1.38694158090591,349081.263333324)
(1.17911769111168,111979.431708182)
(1.19059921632276,-262418.199922068)
(1.41358719365854,81776.8365516228)
(1.24222987544457,-271216.466533611)
Therefore, from the above list of slope b and intercept a pairs, the univariate linear regression functions between the system call attempt total and the other 6 performance indicators are obtained, where the system call attempt total is the variable Y and the other 6 performance indicators are the variable X.
The univariate linear regression function of the system call attempt total on the switch call attempt total is:
Y=8358.96681820601+1.02044901004931X
The univariate linear regression function of the system call attempt total on the system answer total is:
Y=349081.263333324+1.38694158090591X
The univariate linear regression function of the system call attempt total on the switch connection total is:
Y=111979.431708182+1.17911769111168X
The univariate linear regression function of the system call attempt total on the system paging total is:
Y=-262418.199922068+1.19059921632276X
The univariate linear regression function of the system call attempt total on the called-party answer total is:
Y=81776.8365516228+1.41358719365854X
The univariate linear regression function of the system call attempt total on the called-party response total is:
Y=-271216.466533611+1.24222987544457X
This means that if the system call attempt total is empty, then as long as the other six indicator values are not all empty at the same time, the corresponding univariate linear regression function can always be used to obtain a fill value for the system call attempt total, and the error of this fill value is very small. Conversely, if the system call attempt total is not empty and the other six indicator values are empty, the system call attempt total can be used to calculate the other six indicator values as fill values.
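The use of these regression functions to fill a missing system call attempt total can be sketched as follows. The coefficients are the (a, b) values listed above; the dictionary keys and the function name are hypothetical field identifiers introduced only for this illustration.

```python
# (intercept a, slope b) of Y = a + bX, with Y = system call attempt total.
# Keys are hypothetical field names for the six other indicators.
REGRESSIONS = {
    "switch_call_attempts": (8358.96681820601, 1.02044901004931),
    "system_answers": (349081.263333324, 1.38694158090591),
    "switch_connections": (111979.431708182, 1.17911769111168),
    "system_pagings": (-262418.199922068, 1.19059921632276),
    "called_party_answers": (81776.8365516228, 1.41358719365854),
    "called_party_responses": (-271216.466533611, 1.24222987544457),
}

def estimate_system_call_attempts(record):
    """Return one estimate per indicator that is present in the record.

    record maps field names to values; None marks a missing value.
    """
    estimates = {}
    for field, (a, b) in REGRESSIONS.items():
        x = record.get(field)
        if x is not None:
            estimates[field] = a + b * x
    return estimates
```

Any one of the returned estimates can serve as the fill value; an empty result means that none of the six associated indicators is available for that record.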
The 337th data record in Table 1 contains 7 actually collected performance indicator values. Now suppose that the system call attempt total in the 337th data record is missing (empty). Substituting the other indicators from Table 1 into the 6 functions above in turn gives predicted fill values for the system call attempt total, and the corresponding APE (Absolute Percent Error) is calculated, as shown in Table 2:
Figure BSA00000233959200171
Table 2: Predicted performance data
Here, the APE based on the switch call attempt total = |predicted value obtained from the switch call attempt total − actual value of the system call attempt total| / actual value of the system call attempt total; the other APEs are calculated in the same way. From the predictions in the table above, most APEs are within 10% and a few are within 20%, and the MAPE (Mean Absolute Percent Error) is less than 10%. According to prediction theory, a prediction within 10% is a good prediction and a prediction within 20% is an acceptable prediction.
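The two error measures can be written out as a short sketch. The function names are hypothetical; the values returned are fractions, so 0.1 corresponds to 10%.

```python
def ape(predicted, actual):
    """Absolute percent error of one prediction, as a fraction."""
    return abs(predicted - actual) / abs(actual)

def mape(predictions, actual):
    """Mean absolute percent error of several predictions of one value."""
    return sum(ape(p, actual) for p in predictions) / len(predictions)
```

For instance, predictions of 110 and 80 for an actual value of 100 have APEs of 10% and 20%, and a MAPE of 15%.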
The prediction-based filling method and its results above show that when some field indicator values are missing and the field indicators are intrinsically related, it is feasible to predict the missing performance data from the performance data that are present. Reasonably accurate fill values are obtained, the calculation is simple, and the filling is efficient.
Method example three: the filling method when an entire performance data record is missing:
Take the province-wide switch traffic data records of a province, including the system call attempt total field, from 2009-08-01 00:00:00 to 2009-08-14 23:00:00, 336 data records in all. Periodicity and trend detection is performed on the system call attempt total field as follows:
The historical data sequence of the key performance indicator, the system call attempt total, is selected for periodicity detection. The detection method analyzes the periodicity of the data variation with the autocorrelation function.
Fig. 3 shows the autocorrelation function plot of the historical time series of the system call attempt total. In this plot the horizontal axis is the lag order k of the time series, and the vertical axis is the autocorrelation coefficient obtained by substituting lag k into the definition of the autocorrelation function given above. Fig. 3 shows that the autocorrelation coefficient has extrema at lags 24, 48, 72 and so on, indicating a periodicity with period 24, which agrees with everyday experience: the performance indicator values fluctuate over the 24 hours of each day, and over many days this daily variation is similar. In addition, Fig. 3 shows that up to lag 100 most autocorrelation coefficients are still significantly nonzero, which indicates that the sequence is non-stationary. The result in Fig. 3 thus shows that this performance indicator data sequence is strongly periodic and non-stationary, with a period of 24 hours.
Because the performance indicator data sequence is non-stationary, it is seasonally differenced with s = 24 according to its 24-hour periodicity, giving the differenced time series {y_i}. This sequence is then tested for stationarity, as shown in Fig. 4. Fig. 4 shows that the absolute value of the correlation coefficient is very large at lag 168, which indicates that the 168 hours of a week form another period, s = 7 × 24 = 168; the sequence after the single seasonal differencing with s = 24 is therefore still non-stationary.
The sequence {y_i} is therefore seasonally differenced again with s = 168, giving the twice seasonally differenced sequence {z_i}. This sequence is tested for stationarity; its autocorrelation function plot is shown in Fig. 5. Fig. 5 shows that the autocorrelation coefficient quickly drops to 0, so the sequence {z_i} is stationary.
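The two seasonal differencing passes and their inverse can be sketched as follows. With y_t = x_t − x_{t−24} and z_t = y_t − y_{t−168}, it follows that x_t = z_t + x_{t−24} + x_{t−168} − x_{t−192}, which is what the second function applies to a predicted z value. The function names are hypothetical, and the history is assumed to hold at least s1 + s2 values.

```python
def seasonal_diff(xs, s):
    """Seasonal differencing with period s: x_t - x_{t-s}."""
    return [xs[i] - xs[i - s] for i in range(s, len(xs))]

def restore_next(history, z_next, s1=24, s2=168):
    """Map a forecast of the twice seasonally differenced series back to
    the original scale: x_t = z_t + x_{t-s1} + x_{t-s2} - x_{t-s1-s2}.

    history holds the original values up to time t-1 and must contain
    at least s1 + s2 values.
    """
    n = len(history)  # index of the value being predicted
    return z_next + history[n - s1] + history[n - s2] - history[n - s1 - s2]
```

The same inverse step applies whatever model produces the forecast of the differenced sequence.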
In Fig. 5, at lags 1, 2 and 3 the autocorrelation coefficient is significantly nonzero. "Significantly nonzero" is meant in the sense of statistical hypothesis testing: the coefficient is hypothesized to be 0, and the correctness of this hypothesis is tested. The autocorrelation coefficient is a statistic of a random variable; theory shows that this random variable follows a normal distribution, with distribution
Figure BSA00000233959200181
According to the normal distribution table and its quantiles, at the 95% confidence level the values of the correlation coefficient lie between
Figure BSA00000233959200182
The ordinates of the horizontal lines above and below the horizontal axis in Fig. 5 are
Figure BSA00000233959200183
and
Figure BSA00000233959200184
Therefore at lags 1, 2 and 3 the autocorrelation coefficient is nonzero at the 95% confidence level, whereas at lag 4 the autocorrelation coefficient lies within the confidence interval and is regarded as 0 at the 95% confidence level.
The stationary sequence {z_i} is modeled as ARMA (p, q) × (P, Q)_s. Each order is at most 2, so all combinations of the 4 parameters can be traversed programmatically in a loop, provided that p, q, P, Q are not all 0 at once. The AIC information criterion is used to select the optimal model, and the applicability of the model is tested by checking whether the residuals are white noise. Traversing all values of p, q, P, Q except the all-zero case gives 3 × 3 × 3 × 3 − 1 = 80 combinations. The AIC values of the ARMA models for each combination are shown in Fig. 6.
The AIC method, proposed by the Japanese scholar Akaike, is a widely used criterion for determining the values of p and q in an ARMA (p, q) model. Its principle is as follows. Let $\hat{\sigma}_\varepsilon^2$ be the maximum likelihood estimate of the variance $\sigma_\varepsilon^2$ of the training-set residual sequence, which has white noise characteristics; let r = p + q + 1 be the number of parameters to be estimated in the model; and let n be the number of sample points in the time series. The AIC criterion value is then:
$$\mathrm{AIC}_{p,q} = \log \hat{\sigma}_\varepsilon^2 + \frac{2\,(p+q+1)}{n}$$
This formula shows that the AIC criterion not only seeks a fitting residual on the training set that is as small as possible, but also requires that the order of the resulting model not be too large, since a large order means high model complexity and may lead to overfitting. The AIC criterion is thus a compromise between fitting performance and generalization performance.
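The criterion value can be computed as in the sketch below. The function name is hypothetical, and the variance estimate assumes zero-mean residuals, so that the maximum likelihood estimate is the mean of the squared residuals.

```python
import math

def aic(residuals, p, q):
    """AIC value: log of the residual variance estimate plus 2(p+q+1)/n.

    Assumes zero-mean residuals, so the variance estimate is the mean
    squared residual.
    """
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n
    return math.log(sigma2) + 2 * (p + q + 1) / n
```

For the same residuals, a higher order gives a larger AIC value, which is how the criterion penalizes model complexity.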
The model with the smallest AIC value is chosen. The order $(p, q) \times (P, Q)_s$ of this model is $(2, 0) \times (2, 2)_{24}$, that is, the best-fit model for the original traffic sequence $\{x_i\}$ is $(2, 0, 0) \times (2, 1, 2)_{24} \times (0, 1, 0)_{168}$. With the orders determined, the ARMA model is written as:
$$(1-\varphi_1 B-\varphi_2 B^2)(1-B^{24})(1-B^{168})(1-\Phi_1 B^{24}-\Phi_2 B^{48})\,x_t = (1-\Theta_1 B^{24}-\Theta_2 B^{48})\,\varepsilon_t \qquad (7)$$
Here B is the backward shift operator, that is, $Bx_t = x_{t-1}$, $B^2 x_t = x_{t-2}$, and in general $B^i x_t = x_{t-i}$ for a finite integer i; $\varepsilon_t$ is a residual that follows a white noise distribution. For example, an ARMA model of order (1, 1) is $(1-\varphi_1 B)\,x_t = (1-\theta_1 B)\,\varepsilon_t$. Expanding gives $x_t - \varphi_1 x_{t-1} = \varepsilon_t - \theta_1 \varepsilon_{t-1}$, which in a more readable form is $x_t = \varphi_1 x_{t-1} - \theta_1 \varepsilon_{t-1} + \varepsilon_t$.
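The ARMA(1, 1) example can be turned into a one-step prediction as in the sketch below, which applies $\hat{x}_t = \varphi_1 x_{t-1} - \theta_1 \varepsilon_{t-1}$ and takes each residual as the difference between the observed value and its prediction. The function name is hypothetical, and starting with a zero residual is an assumption of this sketch.

```python
def arma11_one_step(xs, phi, theta):
    """One-step-ahead prediction of the value following the series xs
    for the model x_t = phi * x_{t-1} - theta * eps_{t-1} + eps_t."""
    eps_prev = 0.0  # assumption: the residual before the first value is 0
    pred = None
    for t in range(1, len(xs) + 1):
        # predicted value at time t from the previous value and residual
        pred = phi * xs[t - 1] - theta * eps_prev
        if t < len(xs):
            # residual at time t: observed value minus its prediction
            eps_prev = xs[t] - pred
    return pred
```

The seasonal model of formula (7) follows the same pattern with more lagged terms, applied to the differenced sequence.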
After the orders have been determined, the model parameters are estimated, giving the ARMA model of the sequence {z_i}, that is, the SARIMA model of the sequence {x_i}. The parameters are estimated by MLE (maximum likelihood estimation); the estimates of the parameters are shown in Table 3:
φ1          φ2           Φ1            Φ2            Θ1          Θ2
0.9282554   -0.2283912   -0.11886090   -0.03665631   0.3941738   -0.2952881
Table 3: Estimated parameter values
Substituting the parameters of Table 3 into formula (7) above gives the SARIMA model for the original sequence {x_i}. This model is used for one-step prediction of the system call attempt total, which yields the predicted value of this indicator.
Repeating the above steps gives the predicted fill values of the missing system call attempt total over a period of time; the results are shown in Table 4:
Figure BSA00000233959200201
Table 4: Predicted values of the missing indicator
The data in the table above show that modeling and prediction-based filling of network management traffic performance data with the ARMA model with a seasonal product term gives very accurate fill results with high efficiency. This shows that network management traffic performance data have very strong periodic regularity and that the modeling method adopted is well suited to them.
For other indicator fields of the same data record, if an association can be found with known fields or with fields already predicted, the field association regression model filling method described above can be used; that is, once the key performance data have been filled by prediction, the field association regression model is applied to fill the other related field indicators.
The invention also provides a device implementing the above method for filling performance data of a network management system.
Referring to Fig. 7, which shows the structure of an embodiment of the network management system performance data filling device of the invention, the device comprises a historical data acquisition unit 71, a regression model establishing unit 72 and a data filling unit 73, wherein:
Historical data acquisition unit 71: used to obtain the historical performance data record sequence.
For example, the selection principle for the historical performance data record sequence is to select the sequence from a period of time before the time point of the missing performance data (for example the previous month or the previous two months).
Regression model establishing unit 72: used to detect, from the historical performance data record sequence obtained by the historical data acquisition unit 71, the internal correlation between different data items having a specific relationship, and to establish fitted regression models between the associated data items that have internal correlation.
In this device embodiment, different data items having a specific relationship are data items in different fields of the same record, or data items in the same field of different records; data items with internal correlation are called each other's associated data items.
Data filling unit 73: used, according to which data items are missing and the values of the known associated data items, to apply the correlation regression model established by the regression model establishing unit 72, calculate the estimate of the missing data item, and fill this estimate into the performance data record.
The data filling unit 73 may work as follows: it reads each field of each data record one by one and determines whether the field content is empty (NULL); if so, the field is judged missing. If all fields of a data record are missing, the whole record is judged missing. When a field is judged missing and a corresponding fitted field regression function model exists (that is, the missing field has an associated data item, and that associated data item is not itself missing), the missing field is filled with the field association regression model. When a whole record is judged missing and a corresponding fitted autoregressive model exists, the missing field is filled with the autoregressive model. Repeating this search-and-fill cycle fills all missing fields.
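The filling logic of the data filling unit can be sketched roughly as below. The record layout, the `field_models` mapping from (target field, source field) to (a, b), and the `autoregressive_forecast` callback are all hypothetical names introduced for this illustration.

```python
def fill_record(record, field_models, autoregressive_forecast):
    """Fill missing (None) fields of one record in place and return it.

    field_models: {(target, source): (a, b)} for target = a + b * source.
    autoregressive_forecast: callable returning {field: predicted value}
    for the key field(s); used only when the whole record is missing.
    """
    if all(v is None for v in record.values()):
        # whole record missing: predict the key field(s) from history
        record.update(autoregressive_forecast())
    changed = True
    while changed:  # repeat until no further field can be filled
        changed = False
        for (target, source), (a, b) in field_models.items():
            if record.get(target) is None and record.get(source) is not None:
                record[target] = a + b * record[source]
                changed = True
    return record
```

Fields for which no applicable model exists remain `None` after the call.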
The regression model establishing unit 72 comprises a field-association regression model establishing subunit 721 and an autoregression model establishing subunit 722:
The field-association regression model establishing subunit 721 is used to detect, from the historical performance data record sequence obtained by the historical data acquiring unit 71, the correlation between different fields of the same record, and to establish a fitted field-association regression model. The field-association regression model establishing subunit 721 specifically comprises a correlation analysis module 7211 and a correlation model establishing module 7212:
The correlation analysis module 7211 is used to perform, over the multiple record values of the historical performance data record sequence, a correlation analysis between the data values of a field X and another field Y within each record, and to calculate the correlation coefficient $\rho_{XY}$. If the absolute value of $\rho_{XY}$ lies between 0.8 and 1, field X and field Y are judged to be correlated. The correlation coefficient $\rho_{XY}$ of the different fields X and Y is calculated as follows:
$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{D_X D_Y} = \frac{E\{(X - E(X))(Y - E(Y))\}}{D_X D_Y}$$
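In the above formula:
$\mathrm{Cov}(X, Y)$ is the covariance of fields X and Y;
$E(X)$ is the mathematical expectation of field X:
$$E(X) = \frac{1}{n}\sum_{i=1}^{n} X_i$$
$E(Y)$ is the mathematical expectation of field Y:
$$E(Y) = \frac{1}{n}\sum_{i=1}^{n} Y_i$$
$D_X$ is the standard deviation of field X:
$$D_X = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - E(X)\right)^2}$$
$D_Y$ is the standard deviation of field Y:
$$D_Y = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(Y_i - E(Y)\right)^2}$$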
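The correlation test can be illustrated with a short Python sketch. It assumes the two fields are given as equal-length lists of known values, and it normalizes the sample covariance by n - 1 to match the standard-deviation definitions (an assumption, since the text does not state the covariance normalization). The function names are illustrative only.

```python
import math
from typing import Sequence


def correlation(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Sample correlation coefficient rho_XY = Cov(X, Y) / (D_X * D_Y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    d_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    d_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (d_x * d_y)


def fields_correlated(xs: Sequence[float], ys: Sequence[float],
                      threshold: float = 0.8) -> bool:
    """Fields count as correlated when |rho_XY| lies between 0.8 and 1."""
    return abs(correlation(xs, ys)) >= threshold
```

Only records in which both fields are present should be passed in; a constant field would make the denominator zero and needs separate handling.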
The correlation model establishing module 7212 is used to receive the judgment result of the correlation analysis module 7211; if the different fields X and Y are correlated, it establishes the following univariate linear regression function model:
$$Y = a + bX + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2)$$
where the estimates of the parameters a and b are as follows:
$$b = \frac{E\{(X - E(X))(Y - E(Y))\}}{D(X)} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
$$a = \bar{y} - b\bar{x}$$
Here $D(X)$ denotes the variance of field X, and $\bar{x}$, $\bar{y}$ are the sample means of fields X and Y.
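The estimates of a and b can be computed directly from the paired observations. The following Python sketch implements the least-squares formulas above; `fit_line` and `estimate_missing` are illustrative names, not terms from the patent.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares estimates (a, b) for the model Y = a + b*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def estimate_missing(a: float, b: float, x: float) -> float:
    """Estimate a missing Y value from the known associated X value."""
    return a + b * x
```

For example, fitting the pairs (1, 3), (2, 5), (3, 7) gives a = 1 and b = 2, so a record with X = 4 and a missing Y would be filled with 9.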
The autoregression model establishing subunit 722 is used to detect, from the historical performance data record sequence obtained by the historical data acquiring unit 71, the variation trend of the same-field data item across different data records, and to establish a fitted autoregression model. The autoregression model establishing subunit 722 specifically comprises a trend detection and analysis module 7221 and a model selection and establishing module 7222.
The trend detection and analysis module 7221 is used to detect and analyze the variation trend of the performance data item over the time series of the historical performance data sequence, and to output the detection result;
The model selection and establishing module 7222 acts on the detection result of the trend detection and analysis module: if the variation trend is stationary and linear, an ARMA model is established; if the variation trend is non-stationary, with a monotonic rise or fall, and can be made stationary by differencing, an ARIMA model is established; if the trend shows an obvious periodic feature, a SARIMA model is established; if the variation trend is strongly nonlinear and non-stationary, a neural network model or a support vector machine model is established.
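The selection rule can be written as a small dispatch function. In this Python sketch the `TrendFeatures` flags stand in for whatever the trend detection step actually outputs, and the order in which the conditions are checked is an assumption, since the text does not say which rule wins when several features are present.

```python
from dataclasses import dataclass


@dataclass
class TrendFeatures:
    """Hypothetical summary of the trend detection result."""
    stationary_linear: bool
    stationary_after_differencing: bool
    periodic: bool
    strongly_nonlinear: bool


def select_model(features: TrendFeatures) -> str:
    """Map the detected trend features to a model family name."""
    if features.strongly_nonlinear:
        return "neural network or SVM"
    if features.periodic:
        return "SARIMA"
    if features.stationary_linear:
        return "ARMA"
    if features.stationary_after_differencing:
        return "ARIMA"
    # Nothing simpler fits: treat as the nonlinear, non-stationary case.
    return "neural network or SVM"
```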
The ARMA modeling process in the model selection and establishing module specifically comprises:
A model order determination module, which uses an order-determination criterion to determine the order of the ARMA model, yielding a model family;
For example, the ARMA model may be identified using the AIC order-determination method, the BIC order-determination method, or the F-test.
A parameter estimation module, used to estimate and determine the parameters of the model family determined by the model order determination module;
Modeling is an iterative training process that repeats until suitable model parameters are found, such that the fit between the predicted data and the actual data is best.
An applicability checking module, used to check the applicability of the model determined by the parameters estimated by the parameter estimation module, and to determine the optimal model parameters.
The check may be performed by testing whether the residual sequence is a white noise sequence; training continues until the residual is white noise.
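One simple way to test whether a residual sequence looks like white noise is to compare its sample autocorrelations with the approximate 95% bound of 1.96/sqrt(n). The Python sketch below uses that rule of thumb; it is an assumption chosen for illustration, and the text does not specify which white-noise test is used (a portmanteau test such as Ljung-Box is a common alternative).

```python
import math
from typing import Sequence


def autocorrelation(series: Sequence[float], lag: int) -> float:
    """Sample autocorrelation of the series at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return cov / var


def looks_like_white_noise(residuals: Sequence[float], max_lag: int = 10) -> bool:
    """True when every autocorrelation up to max_lag is within 1.96/sqrt(n)."""
    bound = 1.96 / math.sqrt(len(residuals))
    return all(abs(autocorrelation(residuals, k)) <= bound
               for k in range(1, max_lag + 1))
```

Because each lag is checked separately, genuinely random residuals will occasionally fail this check by chance, so it is best read as a coarse screen inside the training loop.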
The ARIMA modeling process in the model selection and establishing module specifically comprises:
A stationarization preprocessing module, which performs differencing on the historical performance data sequence that presents a periodic feature, so that the processed data sequence is stationary;
For example, the data sequence may be made stationary by first-order differencing and/or seasonal differencing.
A model order determination module, which uses an order-determination criterion to determine the order of the ARIMA model for the stationary data sequence obtained from the stationarization preprocessing module;
A parameter estimation module, used to estimate and determine the parameters of the model determined by the model order determination module;
An applicability checking module, used to check the applicability of the model determined by the parameters estimated by the parameter estimation module, and to determine the optimal model parameters.
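The order-determination step used in the ARMA and ARIMA processes can be illustrated with a simplified Python sketch. It is deliberately restricted to pure autoregressive models AR(p), which is a simplification of the ARMA case: the Levinson-Durbin recursion gives the innovation variance for each order, and the order minimizing AIC(p) = n ln(variance) + 2p (one common form of the criterion) is selected. The function names are illustrative.

```python
import math
from typing import Dict, List, Sequence, Tuple


def autocovariances(series: Sequence[float], max_lag: int) -> List[float]:
    """Biased sample autocovariances for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    return [sum(centered[t] * centered[t - k] for t in range(k, n)) / n
            for k in range(max_lag + 1)]


def select_ar_order(series: Sequence[float],
                    max_order: int) -> Tuple[int, Dict[int, float]]:
    """Return (best order, AIC per order) for AR(p), p = 0..max_order."""
    n = len(series)
    gamma = autocovariances(series, max_order)
    variance = gamma[0]          # innovation variance of AR(0)
    coeffs: List[float] = []     # AR coefficients of the previous order
    aic = {0: n * math.log(variance)}
    for p in range(1, max_order + 1):
        # Levinson-Durbin step: reflection coefficient for order p.
        acc = gamma[p] - sum(coeffs[j] * gamma[p - 1 - j] for j in range(p - 1))
        reflection = acc / variance
        coeffs = [coeffs[j] - reflection * coeffs[p - 2 - j]
                  for j in range(p - 1)] + [reflection]
        variance *= 1.0 - reflection ** 2
        aic[p] = n * math.log(variance) + 2 * p
    best = min(aic, key=aic.get)
    return best, aic
```

A BIC variant would replace the penalty 2p with p ln(n); a full ARMA or ARIMA fit would normally rely on an established time-series library rather than this sketch.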
The SARIMA modeling process in the model selection and establishing module specifically comprises:
A stationarization preprocessing module, which performs seasonal differencing on the historical performance data sequence that presents a periodic feature, so that the processed data sequence is stationary; if the historical performance data sequence has multiple seasonalities, seasonal differencing is performed multiple times;
A model order determination module, which uses an order-determination criterion to determine the order of the SARIMA model for the stationary data sequence obtained from the stationarization preprocessing module;
A parameter estimation module, used to estimate and determine the parameters of the model determined by the model order determination module;
An applicability checking module, used to check the applicability of the model determined by the parameters estimated by the parameter estimation module, and to determine the optimal model parameters.
When an ARIMA model or a SARIMA model has been established, the data filling unit, after calculating an initial estimate of the missing data item with the regression model established by the regression model establishing unit, additionally performs inverse differencing to recover the estimate on the scale of the original data sequence, and then fills that estimate into the original performance data record.
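Differencing and the inverse step can be sketched as follows. This Python example assumes a single differencing pass with a fixed lag (lag 1 for first-order differencing, the season length for seasonal differencing) and that the value being estimated immediately follows the known history; the function names are illustrative.

```python
from typing import List, Sequence


def difference(series: Sequence[float], lag: int = 1) -> List[float]:
    """Differenced series d[t] = x[t] - x[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def inverse_difference(predicted_diff: float, history: Sequence[float],
                       lag: int = 1) -> float:
    """Map a value predicted on the differenced scale back to the original scale.

    The next original value is x[n] = x[n - lag] + d[n], where history holds
    x[0..n-1], so x[n - lag] is history[-lag].
    """
    return history[-lag] + predicted_diff
```

For the series 10, 12, 14, 16 the first-order differences are all 2, so a model predicting 2 on the differenced scale yields an estimate of 18 on the original scale. When several differencing passes are applied, the inverse steps are applied in reverse order.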
Method embodiment three above described using the autocorrelation function to analyze whether the data variation is periodic, and the process of making the data sequence stationary. After order determination, the model parameters are estimated, yielding an ARMA model for the sequence $\{z_i\}$ and a SARIMA model for the sequence $\{x_i\}$. The optimal parameter estimates are then determined by maximum likelihood estimation (MLE). The autoregression model establishing subunit 722 is the device that implements this process.
The support vector machine modeling process in the model selection and establishing module may specifically comprise:
A training data acquisition module, used to preprocess the normal sample data and obtain a training data sequence by phase-space reconstruction;
A parameter setting module, used to preset or adjust the free parameter values of the support vector machine model;
A training and modeling module, used to train on the training data sequence according to the free parameter values set by the parameter setting module and the structural risk minimization principle, obtaining a regression equation as the modeling result;
A residual calculation module, which uses the regression equation obtained by the training and modeling module to compute the values of the training data under that equation, subtracts them from the actual values of the sample data obtained by the training data acquisition module to obtain the fitting residual sequence, and calculates the autocorrelation function of the residual sequence;
A white noise check and model determination module, used to check whether the residual sequence calculated by the residual calculation module is a white noise sequence; if so, the support vector machine model is judged to be optimal, and the free parameter values that were set and the optimal support vector machine model are output; otherwise control returns to the parameter setting module, which adjusts the free parameter values of the support vector machine model for retraining.
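The phase-space reconstruction performed before training can be illustrated with a time-delay embedding. In this Python sketch the embedding dimension and delay are free choices (assumptions, since the text gives no values), and each delay vector is paired with the next value of the series as the regression target.

```python
from typing import List, Sequence, Tuple


def phase_space_reconstruct(series: Sequence[float], dimension: int,
                            delay: int = 1) -> Tuple[List[List[float]], List[float]]:
    """Build (inputs, targets) pairs by time-delay embedding.

    Each input is [x[t-(m-1)*tau], ..., x[t-tau], x[t]] with m = dimension and
    tau = delay; the matching target is the next value x[t+1].
    """
    span = (dimension - 1) * delay
    inputs: List[List[float]] = []
    targets: List[float] = []
    for t in range(span, len(series) - 1):
        inputs.append([series[t - k * delay] for k in range(dimension - 1, -1, -1)])
        targets.append(series[t + 1])
    return inputs, targets
```

The resulting pairs could then be passed to a support vector regression routine, for example `sklearn.svm.SVR` from scikit-learn, with the free parameters adjusted until the fitting residuals pass the white noise check.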
The embodiments in this specification are described progressively: each embodiment emphasizes its differences from the others, and for identical or similar parts the embodiments may be referred to one another. Because the device embodiment of the present invention is substantially similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment.
The above is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. The protection scope of the present invention shall therefore be determined by the protection scope of the claims.

Claims (18)

1. A method for filling performance data, characterized by comprising:
Obtaining a historical performance data record sequence;
Detecting, according to the historical performance data record sequence, the intrinsic correlation between different data items having a specific relationship; the different data items having a specific relationship are specifically: data items belonging to different fields of the same record, or data items belonging to the same field of different records; data items having an intrinsic correlation are referred to as each other's associated data items;
Establishing a fitted regression model between the associated data items having the intrinsic correlation;
If a data item is missing in a performance data record, calculating an estimate of the missing data item from the values of the known associated data items using the corresponding regression model, and filling the estimate into the performance data record in which the data item is missing.
2. The method of claim 1, characterized in that, when the different data items having a specific relationship are data items of different fields of the same record, the intrinsic correlation between them is detected according to the historical performance data record sequence specifically as follows:
According to the multiple record values of the historical performance data record sequence, a correlation analysis is performed between the data values of a field X and another field Y within each record, and the correlation coefficient $\rho_{XY}$ is calculated; if the absolute value of $\rho_{XY}$ lies between 0.8 and 1, field X and field Y are judged to be correlated, and a field-association regression model is established;
The correlation coefficient $\rho_{XY}$ between field X and field Y is calculated as follows:
$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{D_X D_Y} = \frac{E\{(X - E(X))(Y - E(Y))\}}{D_X D_Y}$$
Wherein:
$\mathrm{Cov}(X, Y)$ is the covariance of fields X and Y;
$E(X)$ is the mathematical expectation of field X:
$$E(X) = \frac{1}{n}\sum_{i=1}^{n} X_i$$
$E(Y)$ is the mathematical expectation of field Y:
$$E(Y) = \frac{1}{n}\sum_{i=1}^{n} Y_i$$
$D_X$ is the standard deviation of field X:
$$D_X = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - E(X)\right)^2}$$
$D_Y$ is the standard deviation of field Y:
$$D_Y = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(Y_i - E(Y)\right)^2}$$
3. The method of claim 2, characterized in that the field-association regression model is established using a univariate linear regression fitting function, specifically:
$$Y = a + bX + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2),$$
Using the least squares method or the maximum likelihood estimation method, the estimates of the parameters a and b are obtained as follows:
$$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
$$a = \bar{y} - b\bar{x}$$
4. The method of claim 1, characterized in that, when the different data items having a specific relationship are data items of the same field of different records, the fitted regression model is established as follows: the variation trend of the same-field data item across different data records is detected according to the historical performance data record sequence, and a fitted autoregression model is established.
5. The method of claim 4, characterized in that detecting the variation trend of the same-field data item across different data records according to the historical performance data record sequence and establishing the fitted autoregression model is specifically:
Detecting and analyzing the variation trend of the performance data records in the historical performance data sequence and, according to the result: if the variation trend is stationary and linear, establishing an ARMA model of the performance data sequence; if the variation trend is non-stationary, with a monotonic rise or fall, and can be made stationary by differencing, establishing an ARIMA model; if the trend shows an obvious periodic feature, establishing a SARIMA model; if the variation trend is strongly nonlinear and non-stationary, establishing a neural network model or a support vector machine model.
6. The method of claim 5, characterized in that the process of establishing the ARMA model is specifically:
Using an order-determination criterion to determine the order of the ARMA model, yielding a model family;
Estimating and determining the model parameters of the model family;
Checking applicability according to the determined model parameters, and determining the optimal model parameters.
7. The method of claim 5, characterized in that the process of establishing the ARIMA model and calculating the estimate is specifically:
Performing differencing on the historical performance data sequence that presents a monotonically rising or falling feature, so that the processed data sequence is stationary;
For the stationary data sequence, using an order-determination criterion to determine the order of the ARIMA model, yielding a model family;
Estimating and determining the model parameters of the model family;
Checking applicability according to the determined model parameters, and determining the optimal model parameters;
According to the values of the known associated data items, calculating an initial predicted value of the missing data item from the optimal model parameters obtained, and then applying inverse differencing to the initial predicted value, thereby obtaining the estimate of the missing data item in the original data sequence.
8. The method of claim 5, characterized in that the process of establishing the SARIMA model and calculating the estimate is specifically:
Performing seasonal differencing on the historical performance data sequence that presents a periodic feature, so that the processed data sequence is stationary; if the historical performance data sequence has multiple seasonalities, performing seasonal differencing multiple times;
For the stationary data sequence, using an order-determination criterion to determine the order of the SARIMA model, yielding a model family;
Estimating and determining the model parameters of the model family by the maximum likelihood method;
Checking the applicability of the model, according to the determined model parameters, by whether the residual is white noise, and obtaining the optimal model parameters;
According to the values of the known associated data items, obtaining an initial predicted value of the missing data item from the optimal model parameters, and then applying inverse seasonal differencing to the initial predicted value to obtain the estimate of the missing data item in the original data sequence.
9. The method of claim 5, characterized in that the process of establishing the support vector machine model is specifically:
Training on the historical performance data sequence and selecting the optimal support vector machine model for that sequence on the basis of whether the residual is white noise, comprising:
A) preprocessing the historical performance data sequence and obtaining a training data sequence by phase-space reconstruction;
B) setting the free parameter values of the support vector machine model;
C) training on the training data sequence according to the set free parameter values and the structural risk minimization principle, obtaining a regression equation as the modeling result;
D) subtracting the values calculated under the obtained regression equation from the actual values of the training data sequence to obtain the fitting residual sequence, and calculating the autocorrelation function of the residual sequence;
E) checking, according to the calculated autocorrelation function of the residual sequence, whether the residual sequence is a white noise sequence; if so, the trained support vector machine model is optimal, and the model and the corresponding free parameter values are saved and output; otherwise, returning to step B), resetting the free parameter values, and retraining by the above process until the optimal support vector machine model is obtained.
10. A device for filling performance data, characterized by comprising:
A historical data acquiring unit, used to obtain a historical performance data record sequence;
A regression model establishing unit, used to detect, from the historical performance data record sequence obtained by the historical data acquiring unit, the intrinsic correlation between different data items having a specific relationship, and to establish a fitted regression model between the associated data items having the intrinsic correlation; the different data items having a specific relationship are specifically: data items belonging to different fields of the same record, or data items belonging to the same field of different records; data items having an intrinsic correlation are referred to as each other's associated data items;
A data filling unit, used to calculate, according to which data items are missing and the values of the known associated data items of the missing data item, an estimate of the missing data item using the corresponding regression model established by the regression model establishing unit, and to fill the estimate into the performance data record in which the data item is missing.
11. The device of claim 10, characterized in that the regression model establishing unit specifically comprises a field-association regression model establishing subunit and/or an autoregression model establishing subunit; wherein:
The field-association regression model establishing subunit is used to detect, from the historical performance data record sequence obtained by the historical data acquiring unit, the correlation between data items of different fields of the same record, and to establish a fitted field-association regression model;
The autoregression model establishing subunit is used to detect, from the historical performance data record sequence obtained by the historical data acquiring unit, the variation trend of the same-field data item across different data records, and to establish a fitted autoregression model.
12. The device of claim 11, characterized in that the field-association regression model establishing subunit specifically comprises a correlation analysis module and a correlation model establishing module, wherein:
The correlation analysis module is used to perform, over the multiple record values of the historical performance data record sequence, a correlation analysis between the data values of a field X and another field Y within each record, and to calculate the correlation coefficient $\rho_{XY}$; if the absolute value of $\rho_{XY}$ lies between 0.8 and 1, field X and field Y are judged to be correlated; the correlation coefficient $\rho_{XY}$ of fields X and Y is calculated as follows:
$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{D_X D_Y} = \frac{E\{(X - E(X))(Y - E(Y))\}}{D_X D_Y}$$
Wherein:
$\mathrm{Cov}(X, Y)$ is the covariance of fields X and Y;
$E(X)$ is the mathematical expectation of field X:
$$E(X) = \frac{1}{n}\sum_{i=1}^{n} X_i$$
$E(Y)$ is the mathematical expectation of field Y:
$$E(Y) = \frac{1}{n}\sum_{i=1}^{n} Y_i$$
$D_X$ is the standard deviation of field X:
$$D_X = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - E(X)\right)^2}$$
$D_Y$ is the standard deviation of field Y:
$$D_Y = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(Y_i - E(Y)\right)^2}$$
The correlation model establishing module receives the judgment result of the correlation analysis module; if the result is that field X and field Y are correlated, it establishes the following univariate linear regression function model:
$$Y = a + bX + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2),$$
where the estimates of the parameters a and b are as follows:
$$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
$$a = \bar{y} - b\bar{x}$$
13. The device of claim 11, characterized in that the autoregression model establishing subunit specifically comprises:
A trend detection and analysis module, used to detect and analyze the variation trend of the performance data item over the time series of the historical performance data sequence, and to output the detection result;
A model selection and establishing module, which acts on the detection result of the trend detection and analysis module: if the variation trend is stationary and linear, an ARMA model is established; if the variation trend is non-stationary, with a monotonic rise or fall, and can be made stationary by differencing, an ARIMA model is established; if the trend shows an obvious periodic feature, a SARIMA model is established; if the variation trend is strongly nonlinear and non-stationary, a neural network model or a support vector machine model is established.
14. The device of claim 13, characterized in that the ARMA modeling process in the model selection and establishing module specifically comprises:
A model order determination module, which uses an order-determination criterion to determine the order of the ARMA model, yielding a model family;
A parameter estimation module, used to estimate and determine the parameters of the model family determined by the model order determination module;
An applicability checking module, used to check the applicability of the model determined by the parameters estimated by the parameter estimation module, and to determine the optimal model parameters.
15. The device of claim 13, characterized in that the ARIMA modeling process in the model selection and establishing module specifically comprises:
A stationarization preprocessing module, which performs differencing on the historical performance data sequence that presents a periodic feature, so that the processed data sequence is stationary;
A model order determination module, which uses an order-determination criterion to determine the order of the ARIMA model for the stationary data sequence obtained from the stationarization preprocessing module;
A parameter estimation module, used to estimate and determine the parameters of the model determined by the model order determination module;
An applicability checking module, used to check the applicability of the model determined by the parameters estimated by the parameter estimation module, and to determine the optimal model parameters.
16. The device of claim 13, characterized in that the SARIMA modeling process in the model selection and establishing module specifically comprises:
A stationarization preprocessing module, which performs seasonal differencing on the historical performance data sequence that presents a periodic feature, so that the processed data sequence is stationary; if the historical performance data sequence has multiple seasonalities, seasonal differencing is performed multiple times;
A model order determination module, which uses an order-determination criterion to determine the order of the SARIMA model for the stationary data sequence obtained from the stationarization preprocessing module;
A parameter estimation module, used to estimate and determine the parameters of the model determined by the model order determination module;
An applicability checking module, used to check the applicability of the model determined by the parameters estimated by the parameter estimation module, and to determine the optimal model parameters.
17. The device of claim 13, characterized in that the support vector machine modeling process in the model selection and establishing module specifically comprises:
A training data acquisition module, used to preprocess the normal sample data and obtain a training data sequence by phase-space reconstruction;
A parameter setting module, used to preset or adjust the free parameter values of the support vector machine model;
A training and modeling module, used to train on the training data sequence according to the free parameter values set by the parameter setting module and the structural risk minimization principle, obtaining a regression equation as the modeling result;
A residual calculation module, which uses the regression equation obtained by the training and modeling module to compute the values of the training data under that equation, subtracts them from the actual values of the sample data obtained by the training data acquisition module to obtain the fitting residual sequence, and calculates the autocorrelation function of the residual sequence;
A white noise check and model determination module, used to check whether the residual sequence calculated by the residual calculation module is a white noise sequence; if so, the support vector machine model is judged to be optimal, and the free parameter values that were set and the optimal support vector machine model are output; otherwise control returns to the parameter setting module, which adjusts the free parameter values of the support vector machine model for retraining.
18. The device of claim 15 or 16, characterized in that the data filling unit, after calculating an initial estimate of the missing data item with the regression model established by the regression model establishing unit, additionally performs inverse differencing to recover the estimate on the scale of the original data sequence, and then fills that estimate into the original performance data record.
CN201010256368.6A 2010-08-16 2010-08-18 Filling method and device thereof for performance data Active CN102025531B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010256368.6A CN102025531B (en) 2010-08-16 2010-08-18 Filling method and device thereof for performance data

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201010254188.4 2010-08-16
CN201010254188 2010-08-16
CN201010256368.6A CN102025531B (en) 2010-08-16 2010-08-18 Filling method and device thereof for performance data

Publications (2)

Publication Number Publication Date
CN102025531A true CN102025531A (en) 2011-04-20
CN102025531B CN102025531B (en) 2014-03-05

Family

ID=43866424

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010256368.6A Active CN102025531B (en) 2010-08-16 2010-08-18 Filling method and device thereof for performance data

Country Status (1)

Country Link
CN (1) CN102025531B (en)

Cited By (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102195814A (en) * 2011-05-04 2011-09-21 成都勤智数码科技有限公司 Method and device for forecasting and predicting by using relevant IT (Information Technology) operation and maintenance indexes
CN102411268A (en) * 2011-11-30 2012-04-11 上海华力微电子有限公司 Photoetching apparatus and method for improving photoetching machine overlay accuracy
CN103020079A (en) * 2011-09-24 2013-04-03 国家电网公司 Industrial data supplementation method
CN103036714A (en) * 2012-12-10 2013-04-10 上海斐讯数据通信技术有限公司 Method and device of performance index obtaining for irrelevant device and corresponding network management system
CN103246702A (en) * 2013-04-02 2013-08-14 大连理工大学 Industrial sequential data missing filling method based on sectional state displaying
CN103377298A (en) * 2012-04-24 2013-10-30 富士通株式会社 Parameter selecting method and device
CN103678721A (en) * 2014-01-02 2014-03-26 中国联合网络通信集团有限公司 Method and device for processing missing data
CN104123312A (en) * 2013-04-28 2014-10-29 国际商业机器公司 Data mining method and device
CN104133866A (en) * 2014-07-18 2014-11-05 国家电网公司 Intelligent-power-grid-oriented missing data filling method
CN104133992A (en) * 2014-07-21 2014-11-05 快威科技集团有限公司 Assessment reference building method and assessment reference building device based on information security assessment correlation
CN104143128A (en) * 2014-07-21 2014-11-12 快威科技集团有限公司 Information system security evaluation index development method and device
CN104216916A (en) * 2013-06-04 2014-12-17 腾讯科技(深圳)有限公司 Data reduction method and device
CN104268658A (en) * 2014-09-29 2015-01-07 招商局重庆交通科研设计院有限公司 Bridge structure safety monitoring data prediction method
CN104516879A (en) * 2013-09-26 2015-04-15 Sap欧洲公司 Method and system for managing database containing record with missing value
CN104520846A (en) * 2012-05-09 2015-04-15 摩福公司 Method for checking data of database relating to persons
CN105183785A (en) * 2015-08-17 2015-12-23 上海斐讯数据通信技术有限公司 Data mining method and system for protecting association rule of original transaction data set
CN105335592A (en) * 2014-06-25 2016-02-17 国际商业机器公司 Method and equipment for generating data in missing section of time data sequence
CN105760952A (en) * 2016-02-15 2016-07-13 国网山东省电力公司电力科学研究院 Load prediction method based on Kalman filtering and self-adaptive fuzzy neural network
CN106156260A (en) * 2015-04-28 2016-11-23 阿里巴巴集团控股有限公司 The method and apparatus that a kind of shortage of data is repaired
CN106408141A (en) * 2015-07-28 2017-02-15 平安科技(深圳)有限公司 Abnormal expense automatic extraction system and method
CN106778048A (en) * 2017-03-10 2017-05-31 广州视源电子科技股份有限公司 Method and device for data processing
CN106844290A (en) * 2015-12-03 2017-06-13 南京南瑞继保电气有限公司 Time series data processing method based on curve matching
CN107038460A (en) * 2017-04-10 2017-08-11 南京航空航天大学 Method for filling missing values in ship monitoring data based on improved KNN
CN107294795A (en) * 2017-08-02 2017-10-24 上海上讯信息技术股份有限公司 Network security situation prediction method and equipment
CN107590022A (en) * 2016-07-08 2018-01-16 上海东方延华节能技术服务股份有限公司 Instrument collected data restoration method for building energy consumption subentry measurement
CN107766877A (en) * 2017-09-27 2018-03-06 华南理工大学 Method for dynamically identifying overweight vehicle in bridge monitoring system
CN108169621A (en) * 2017-12-05 2018-06-15 国电南瑞科技股份有限公司 Transformer-area power-off event completion method based on support vector machines
CN108829641A (en) * 2018-01-02 2018-11-16 西安优势物联网科技有限公司 Measurement process checking method based on statistical technology
CN109297491A (en) * 2018-09-06 2019-02-01 西安云景智维科技有限公司 Indoor positioning and navigation method and system
CN109376478A (en) * 2018-11-28 2019-02-22 中铁大桥(南京)桥隧诊治有限公司 Bridge health monitoring fault data repair method and system
CN110147367A (en) * 2019-05-14 2019-08-20 中国科学院深圳先进技术研究院 Temperature missing data completion method, system and electronic equipment
CN110162576A (en) * 2019-04-22 2019-08-23 广东电网有限责任公司信息中心 Data prediction method, system and electronic equipment based on system index data
CN110826718A (en) * 2019-09-20 2020-02-21 广东工业大学 Naive Bayes-based large-segment unequal-length missing data filling method
CN110836649A (en) * 2019-11-11 2020-02-25 汕头市超声仪器研究所有限公司 Self-adaptive spatial composite ultrasonic imaging method
CN111046027A (en) * 2019-11-25 2020-04-21 北京百度网讯科技有限公司 Missing value filling method and device for time series data
CN111177135A (en) * 2019-12-27 2020-05-19 清华大学 Landmark-based data filling method and device
CN111191193A (en) * 2020-01-17 2020-05-22 南京工业大学 Long-term soil temperature and humidity high-precision prediction method based on autoregressive moving average model
WO2020140662A1 (en) * 2019-01-02 2020-07-09 深圳壹账通智能科技有限公司 Data table filling method, apparatus, computer device, and storage medium
CN111401553A (en) * 2020-03-12 2020-07-10 南京航空航天大学 Missing data filling method and system based on neural network
CN111443163A (en) * 2020-03-10 2020-07-24 中国科学院深圳先进技术研究院 Interpolation method and device for ozone missing data and interpolation equipment
CN112287562A (en) * 2020-11-18 2021-01-29 国网新疆电力有限公司经济技术研究院 Power equipment retired data completion method and system
WO2021017363A1 (en) * 2019-07-31 2021-02-04 烽火通信科技股份有限公司 Updating method and system for optical performance degradation trend prediction
CN112540407A (en) * 2020-12-01 2021-03-23 中国煤炭地质总局地球物理勘探研究院 Method for establishing prestack depth migration anisotropic field
CN112559502A (en) * 2020-12-01 2021-03-26 国能日新科技股份有限公司 Wind power plant data management system based on time sequence database platform
CN112765553A (en) * 2021-01-14 2021-05-07 深圳市伟峰科技有限公司 Engineering project management system based on big data
CN113554106A (en) * 2021-07-28 2021-10-26 桂林电子科技大学 Collaborative completion method for power missing data
CN113742296A (en) * 2021-09-09 2021-12-03 诺优信息技术(上海)有限公司 Method and device for slicing drive test data and electronic equipment
CN113782038A (en) * 2021-09-13 2021-12-10 北京声智科技有限公司 Voice recognition method and device, electronic equipment and storage medium
CN114550945A (en) * 2022-02-21 2022-05-27 湖北省疾病预防控制中心(湖北省预防医学科学院) Method for repairing missing data in pulmonary function detection
CN113742296B (en) * 2021-09-09 2024-04-30 诺优信息技术(上海)有限公司 Drive test data slicing processing method and device and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101114946A (en) * 2007-09-12 2008-01-30 中兴通讯股份有限公司 Method for collecting performance object data in telecommunication network management system
CN101136781A (en) * 2007-09-30 2008-03-05 亿阳信通股份有限公司 Performance data acquisition occasion control method and device in network management system
CN101183993A (en) * 2007-12-21 2008-05-21 亿阳信通股份有限公司 Network management system and performance data processing method

Cited By (73)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102195814A (en) * 2011-05-04 2011-09-21 成都勤智数码科技有限公司 Method and device for forecasting and predicting by using relevant IT (Information Technology) operation and maintenance indexes
CN102195814B (en) * 2011-05-04 2013-11-20 成都勤智数码科技有限公司 Method and device for forecasting and predicting by using relevant IT (Information Technology) operation and maintenance indexes
CN103020079A (en) * 2011-09-24 2013-04-03 国家电网公司 Industrial data supplementation method
CN103020079B (en) * 2011-09-24 2017-03-08 国家电网公司 Industrial data supplementation method
CN102411268A (en) * 2011-11-30 2012-04-11 上海华力微电子有限公司 Photoetching apparatus and method for improving photoetching machine overlay accuracy
CN103377298A (en) * 2012-04-24 2013-10-30 富士通株式会社 Parameter selecting method and device
CN103377298B (en) * 2012-04-24 2016-08-03 富士通株式会社 Parameter selecting method and device
CN104520846A (en) * 2012-05-09 2015-04-15 摩福公司 Method for checking data of database relating to persons
CN104520846B (en) * 2012-05-09 2019-03-19 摩福公司 Method for checking data of database relating to persons
CN103036714B (en) * 2012-12-10 2016-01-20 上海斐讯数据通信技术有限公司 Device-independent acquiring performance index method, device and corresponding network management system
CN103036714A (en) * 2012-12-10 2013-04-10 上海斐讯数据通信技术有限公司 Method and device of performance index obtaining for irrelevant device and corresponding network management system
CN103246702B (en) * 2013-04-02 2016-01-06 大连理工大学 Method for filling missing industrial sequence data based on segmented shape representation
CN103246702A (en) * 2013-04-02 2013-08-14 大连理工大学 Industrial sequential data missing filling method based on sectional state displaying
CN104123312B (en) * 2013-04-28 2018-02-16 国际商业机器公司 Data mining method and device
CN104123312A (en) * 2013-04-28 2014-10-29 国际商业机器公司 Data mining method and device
CN104216916A (en) * 2013-06-04 2014-12-17 腾讯科技(深圳)有限公司 Data reduction method and device
CN104216916B (en) * 2013-06-04 2018-07-03 腾讯科技(深圳)有限公司 Data restoration method and device
CN104516879A (en) * 2013-09-26 2015-04-15 Sap欧洲公司 Method and system for managing database containing record with missing value
CN104516879B (en) * 2013-09-26 2019-09-13 Sap欧洲公司 Method and system for managing database containing record with missing value
CN103678721A (en) * 2014-01-02 2014-03-26 中国联合网络通信集团有限公司 Method and device for processing missing data
CN105335592A (en) * 2014-06-25 2016-02-17 国际商业机器公司 Method and equipment for generating data in missing section of time data sequence
CN104133866A (en) * 2014-07-18 2014-11-05 国家电网公司 Intelligent-power-grid-oriented missing data filling method
CN104133992A (en) * 2014-07-21 2014-11-05 快威科技集团有限公司 Assessment reference building method and assessment reference building device based on information security assessment correlation
CN104143128A (en) * 2014-07-21 2014-11-12 快威科技集团有限公司 Information system security evaluation index development method and device
CN104268658B (en) * 2014-09-29 2017-10-10 招商局重庆交通科研设计院有限公司 Bridge structure safety monitoring data prediction method
CN104268658A (en) * 2014-09-29 2015-01-07 招商局重庆交通科研设计院有限公司 Bridge structure safety monitoring data prediction method
CN106156260A (en) * 2015-04-28 2016-11-23 阿里巴巴集团控股有限公司 Method and device for repairing missing data
CN106156260B (en) * 2015-04-28 2020-01-21 阿里巴巴集团控股有限公司 Method and device for repairing missing data
CN106408141A (en) * 2015-07-28 2017-02-15 平安科技(深圳)有限公司 Abnormal expense automatic extraction system and method
CN105183785A (en) * 2015-08-17 2015-12-23 上海斐讯数据通信技术有限公司 Data mining method and system for protecting association rule of original transaction data set
CN105183785B (en) * 2015-08-17 2019-08-16 上海斐讯数据通信技术有限公司 Data mining method and system for protecting association rule of original transaction data set
CN106844290A (en) * 2015-12-03 2017-06-13 南京南瑞继保电气有限公司 Time series data processing method based on curve matching
CN106844290B (en) * 2015-12-03 2019-05-21 南京南瑞继保电气有限公司 Time series data processing method based on curve matching
CN105760952A (en) * 2016-02-15 2016-07-13 国网山东省电力公司电力科学研究院 Load prediction method based on Kalman filtering and self-adaptive fuzzy neural network
CN107590022A (en) * 2016-07-08 2018-01-16 上海东方延华节能技术服务股份有限公司 Instrument collected data restoration method for building energy consumption subentry measurement
CN107590022B (en) * 2016-07-08 2021-06-25 上海东方延华节能技术服务股份有限公司 Instrument collected data restoration method for building energy consumption subentry measurement
CN106778048A (en) * 2017-03-10 2017-05-31 广州视源电子科技股份有限公司 Method and device for data processing
CN106778048B (en) * 2017-03-10 2019-07-16 广州视源电子科技股份有限公司 Method and device for data processing
CN107038460A (en) * 2017-04-10 2017-08-11 南京航空航天大学 Method for filling missing values in ship monitoring data based on improved KNN
CN107294795A (en) * 2017-08-02 2017-10-24 上海上讯信息技术股份有限公司 Network security situation prediction method and equipment
CN107766877A (en) * 2017-09-27 2018-03-06 华南理工大学 Method for dynamically identifying overweight vehicle in bridge monitoring system
CN107766877B (en) * 2017-09-27 2020-05-22 华南理工大学 Method for dynamically identifying overweight vehicle in bridge monitoring system
CN108169621A (en) * 2017-12-05 2018-06-15 国电南瑞科技股份有限公司 Transformer-area power-off event completion method based on support vector machines
CN108829641B (en) * 2018-01-02 2021-12-28 西安优势物联网科技有限公司 Measurement process checking method based on statistical technology
CN108829641A (en) * 2018-01-02 2018-11-16 西安优势物联网科技有限公司 Measurement process checking method based on statistical technology
CN109297491A (en) * 2018-09-06 2019-02-01 西安云景智维科技有限公司 Indoor positioning and navigation method and system
CN109376478A (en) * 2018-11-28 2019-02-22 中铁大桥(南京)桥隧诊治有限公司 Bridge health monitoring fault data repair method and system
WO2020140662A1 (en) * 2019-01-02 2020-07-09 深圳壹账通智能科技有限公司 Data table filling method, apparatus, computer device, and storage medium
CN110162576A (en) * 2019-04-22 2019-08-23 广东电网有限责任公司信息中心 Data prediction method, system and electronic equipment based on system index data
CN110147367A (en) * 2019-05-14 2019-08-20 中国科学院深圳先进技术研究院 Temperature missing data completion method, system and electronic equipment
WO2021017363A1 (en) * 2019-07-31 2021-02-04 烽火通信科技股份有限公司 Updating method and system for optical performance degradation trend prediction
CN110826718A (en) * 2019-09-20 2020-02-21 广东工业大学 Naive Bayes-based large-segment unequal-length missing data filling method
CN110826718B (en) * 2019-09-20 2022-05-13 广东工业大学 Method for filling large-section unequal-length missing data based on naive Bayes
CN110836649A (en) * 2019-11-11 2020-02-25 汕头市超声仪器研究所有限公司 Self-adaptive spatial composite ultrasonic imaging method
CN110836649B (en) * 2019-11-11 2021-05-18 汕头市超声仪器研究所股份有限公司 Self-adaptive spatial composite ultrasonic imaging method
CN111046027A (en) * 2019-11-25 2020-04-21 北京百度网讯科技有限公司 Missing value filling method and device for time series data
CN111177135A (en) * 2019-12-27 2020-05-19 清华大学 Landmark-based data filling method and device
CN111177135B (en) * 2019-12-27 2020-11-10 清华大学 Landmark-based data filling method and device
CN111191193A (en) * 2020-01-17 2020-05-22 南京工业大学 Long-term soil temperature and humidity high-precision prediction method based on autoregressive moving average model
CN111443163A (en) * 2020-03-10 2020-07-24 中国科学院深圳先进技术研究院 Interpolation method and device for ozone missing data and interpolation equipment
CN111401553A (en) * 2020-03-12 2020-07-10 南京航空航天大学 Missing data filling method and system based on neural network
CN112287562A (en) * 2020-11-18 2021-01-29 国网新疆电力有限公司经济技术研究院 Power equipment retired data completion method and system
CN112540407A (en) * 2020-12-01 2021-03-23 中国煤炭地质总局地球物理勘探研究院 Method for establishing prestack depth migration anisotropic field
CN112559502A (en) * 2020-12-01 2021-03-26 国能日新科技股份有限公司 Wind power plant data management system based on time sequence database platform
CN112540407B (en) * 2020-12-01 2023-04-25 中国煤炭地质总局地球物理勘探研究院 Pre-stack depth migration anisotropic field establishment method
CN112765553B (en) * 2021-01-14 2021-08-24 深圳市伟峰科技有限公司 Engineering project management system based on big data
CN112765553A (en) * 2021-01-14 2021-05-07 深圳市伟峰科技有限公司 Engineering project management system based on big data
CN113554106A (en) * 2021-07-28 2021-10-26 桂林电子科技大学 Collaborative completion method for power missing data
CN113554106B (en) * 2021-07-28 2022-03-18 桂林电子科技大学 Collaborative completion method for power missing data
CN113742296A (en) * 2021-09-09 2021-12-03 诺优信息技术(上海)有限公司 Method and device for slicing drive test data and electronic equipment
CN113742296B (en) * 2021-09-09 2024-04-30 诺优信息技术(上海)有限公司 Drive test data slicing processing method and device and electronic equipment
CN113782038A (en) * 2021-09-13 2021-12-10 北京声智科技有限公司 Voice recognition method and device, electronic equipment and storage medium
CN114550945A (en) * 2022-02-21 2022-05-27 湖北省疾病预防控制中心(湖北省预防医学科学院) Method for repairing missing data in pulmonary function detection

Also Published As

Publication number Publication date
CN102025531B (en) 2014-03-05

Similar Documents

Publication Publication Date Title
CN102025531B (en) Filling method and device thereof for performance data
Chawla et al. Modeling spatial dependencies for mining geospatial data
Field et al. Optimizing allocation of monitoring effort under economic and observational constraints
Costantini et al. A hierarchical procedure for the combination of forecasts
Sohn et al. Decision tree based on data envelopment analysis for effective technology commercialization
Claveria et al. A new approach for the quantification of qualitative measures of economic expectations
CN109376924A (en) Method, apparatus, equipment and readable storage medium for material requirements prediction
CN107610021A (en) Comprehensive analysis method for spatial and temporal distributions of environmental variables
CN106649832B (en) Estimation method and device based on missing data
CN101771758A (en) Method and device for dynamically determining normal fluctuation range of performance index value
D’Agostino et al. Nowcasting business cycles: A Bayesian approach to dynamic heterogeneous factor models
CN103024762A (en) Service feature based communication service forecasting method
KR101919076B1 (en) Time-series data predicting system
WO2020164740A1 (en) Methods and systems for automatically selecting a model for time series prediction of a data stream
CN110414715B (en) Community detection-based passenger flow volume early warning method
CN112288197B (en) Intelligent scheduling method and device for station vehicles
CN104735710A (en) Mobile network performance early warning pre-judging method based on trend extrapolation clustering
Ravishanker et al. Hierarchical dynamic models for multivariate times series of counts
CN104794112B (en) Time Series Processing method and device
CN101739614A (en) Hierarchy-combined prediction method for communication service
CN111935741A (en) Method, device and system for detecting poor quality cell of communication network
Almeida et al. The impact of uncertainty in the measurement of progress in earned value analysis
CN115456306A (en) Bus load prediction method, system, equipment and storage medium
Han et al. gcKrig: An R package for the analysis of geostatistical count data using gaussian copulas
Darudi et al. Partial mutual information based algorithm for input variable selection for time series forecasting

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant