CN110135012A - Method for determining the regression coefficients of a system linear regression prediction model
- Publication number
- CN110135012A (application CN201910334251.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- model coefficient
- model
- coefficient
- systematic sampling
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Evolutionary Computation (AREA)
- Geometry (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Complex Calculations (AREA)
Abstract
The present invention discloses a method and a system for determining the regression coefficients of a system linear regression prediction model. The method comprises: computing and recording the initial model coefficients of the system linear regression prediction model from an initial sequence of system sample data; each time a system sample point is added, obtaining the new system sample data sequence and determining the added sample point data and the removed sample point data; updating the model coefficients from the added (or removed) sample point data and the current model coefficients; and then updating the model coefficients again from the removed (or added) sample point data and the updated model coefficients, recording the result as the model coefficients corresponding to that added sample point. By computing the model coefficients iteratively, the invention avoids heavy use of server resources during computation, improves the efficiency of computing the coefficients of the system linear regression model, and keeps the coefficient computation real-time.
Description
Technical field
The present invention relates to the technical field of computer data analysis and processing, and in particular to an iteration-based method for determining the regression coefficients of a system linear regression prediction model, suitable for real-time computation on large data volumes.
Background
While an actual industrial system is running, a condition-monitoring subsystem records the operating parameters of key equipment, typically temperature, pressure, flow, current and voltage. When the system is analysed and optimised, however, the quantities of interest are usually component performance degradation, expected revenue and operation-and-maintenance (O&M) planning. These cannot be read directly from the monitoring subsystem; they are reflected in the long-term trends of the recorded data.
Based on an analysis of the system's operating principle, the system can be linearised or locally linearised to obtain a simple linear expression relating the system parameters, i.e. a linear model of the system. By the system principle, the coefficients of this model are directly related to component performance, conversion efficiency, maintenance thresholds and so on. The long-term trend of the model coefficients can therefore be used for tasks such as assessing system performance degradation, forecasting long-term revenue and optimising O&M scheduling.
The linearised system model can be written as $Y^T = W^T X$, where $X = [x_1\ x_2\ \dots\ x_n]$ holds the system inputs at several sample points, $Y^T = [y_1\ y_2\ \dots]$ holds the model-predicted system outputs at the corresponding sample points, and $W$ is the model coefficient. In a real system, different sample points give different measured input and output values; substituting them into the model yields an overdetermined system of equations, from which $W$ can be estimated by least squares. The model coefficients at a given sample point can therefore be estimated from several sets of measured system parameters preceding that point, and system performance can then be analysed from the trend of the model coefficients.
When the model dimension is high or the number of data sets entering the equations is large, least squares involves a large amount of matrix computation, and inverting the intermediate matrix consumes considerable computing resources. Detailed analysis of the system model on a server therefore occupies a large share of its computing resources. This increases the load on the server and also reduces the feasibility of solving the system model in detail.
To adequately suppress the influence of random error and burst interference on the coefficient estimate, the computation usually has to include measured system data covering a long period. For example, obtaining the model coefficients for a single sample point may in practice require one month or even one year of continuous operating data preceding that point. If every computation has to fetch this much data, it places enormous pressure on server storage I/O bandwidth, memory size and so on.
Summary of the invention
The object of the present invention is to provide a method for determining the regression coefficients of a system linear regression prediction model that computes the model coefficients iteratively, thereby avoiding heavy use of server resources during computation, improving the efficiency of computing the coefficients of the system linear regression model, and keeping the coefficient computation real-time.
The technical solution adopted by the invention is a method for determining the regression coefficients of a system linear regression prediction model, comprising:
S1: based on a preset data window length, obtain the initial system sample data sequence, use it to compute the model coefficients of the system linear regression prediction model, and record them;
S2: obtain the newly added system sample point data and, based on the preset data window length, form a new system sample data sequence; determine the sample point data that the new sequence has gained and lost relative to the initial sample data sequence;
S3: update the model coefficients from the added (or removed) sample point data and the current model coefficients; then update the model coefficients again from the removed (or added) sample point data and the updated model coefficients, and record them;
S4: as system sample points are added, repeat steps S2 and S3 to obtain and record the model coefficients after each sample point is added and the sample data sequence changes.
The invention also supports analysis of changes in system performance; that is, it further comprises step S5: analyse the change in system performance based on the recorded model coefficients. The analysis of system performance from the change in model coefficients can use the prior art.
As a specific embodiment, in step S1, for the system model $Y^T = W^T X$, where $Y$ is the system output corresponding to the sampled input $X$ and $W$ is the model coefficient, the intermediate matrix is $C_N = (XX^T)^{-1}$ and the model coefficient is $W_N^T = Y^T X^T C_N$.
Because ridge regression copes well with highly collinear sample point data, preferably, in step S1, for the system model $Y^T = W^T X$, where $Y$ is the system output corresponding to the sampled input $X$ and $W$ is the model coefficient, the intermediate matrix is $C_N = (XX^T + \lambda I)^{-1}$, where $\lambda$ is the ridge regression coefficient and $I$ is the identity matrix, and the current model coefficient matrix is $W_N^T = Y^T X^T C_N$. Making this change at initialisation turns the whole method into ridge regression.
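The initialisation of step S1 can be sketched as follows. This is a minimal pure-Python illustration, not the patent's implementation: it assumes one scalar output per sample point, and the names `invert` and `initial_coefficients` are hypothetical. With `lam=0.0` it evaluates $C_N = (XX^T)^{-1}$; with a positive `lam` it evaluates the ridge form $C_N = (XX^T + \lambda I)^{-1}$.

```python
def invert(A):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    # Augment A with the identity matrix: [A | I].
    M = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def initial_coefficients(samples, outputs, lam=0.0):
    """Return (C, W) for a window of samples.

    samples: list of m input vectors (the columns of X), each of length k.
    outputs: list of m scalar outputs (assumption: scalar output).
    lam:     ridge coefficient; 0.0 gives the plain least-squares form.
    """
    k = len(samples[0])
    # G = X X^T + lam * I
    G = [[sum(x[i] * x[j] for x in samples) + (lam if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    C = invert(G)
    # b = X Y, so that W = C b is the transpose of W^T = Y^T X^T C.
    b = [sum(x[i] * y for x, y in zip(samples, outputs)) for i in range(k)]
    W = [sum(C[i][j] * b[j] for j in range(k)) for i in range(k)]
    return C, W


C, W = initial_coefficients([[1, 0], [0, 1], [1, 1]], [1, 2, 4])
```

This is the only place in the sketch where a full matrix inversion is needed; the later add and remove steps reuse `C` and `W` as state.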
In step S2, each time a sample point is added it is written into the sampling data window; the newly written sample point data is the "added system sample point data". Because the sampling data window has a preset length, the sample point with the earliest sampling time in the existing data sequence is pushed out of the window; the pushed-out sample point data is the "removed sample point data". The length of the sampling data window can be set as required, which is prior art.
In step S3, the model coefficients can first be updated from the added system sample point data and the current model coefficients, and then updated from the removed system sample point data and the current model coefficients; the order can also be reversed.
Preferably, updating the model coefficients from the added system sample point data and the current model coefficients comprises:
defining the input sample data sequence matrix as $X = [x_{N-m+1}\ \dots\ x_N]$
and the output sample data matrix as $Y^T = [y_{N-m+1}\ \dots\ y_N]$;
the added input/output system sample point data being $x_{N+1}$/$y_{N+1}$, forming the matrices $x_+ = x_{N+1}$ and $y_+ = y_{N+1}$;
the intermediate matrix after the sample point is added then being
$$C_{N+} = C_N - C_N x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N,$$
and the model coefficient matrix after the sample point is added being
$$W_{N+}^T = W_N^T - W_N^T x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N + y_+^T x_+^T C_{N+};$$
and taking the intermediate matrix $C_{N+}$ and the model coefficient matrix $W_{N+}$ after the addition as the current intermediate matrix $C_N$ and the current model coefficient matrix $W_N$.
Preferably, updating the model coefficients from the removed system sample point data and the current model coefficients comprises:
defining the removed input/output system sample point data as $x_{N-m+1}$/$y_{N-m+1}$, forming the matrices $x_- = x_{N-m+1}$ and $y_- = y_{N-m+1}$;
the intermediate matrix after the sample point is removed then being
$$C_{N-} = C_N + C_N x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N,$$
and the model coefficient matrix after the sample point is removed being
$$W_{N-}^T = W_N^T + W_N^T x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N - y_-^T x_-^T C_{N-};$$
and taking the intermediate matrix $C_{N-}$ and the model coefficient matrix $W_{N-}^T$ after the removal as the current intermediate matrix $C_N$ and the current model coefficient matrix $W_N$.
In step S1, the window length of the initial data sequence is adjusted according to need and the computing power of the computer. If the initial data sequence actually obtained is long, say of length $M$, a smaller $m$ can be set as the data window length: step S1 gives the model coefficients of the $m$-sample data sequence as the initial model coefficients, and steps S2 and S3 are then used to add the sample points after point $m$ one at a time. Each time the sequence changes, the corresponding model coefficients are obtained, recorded, and used as the current model coefficients in the next iteration, until the model coefficients corresponding to sample point $M$ are obtained.
Preferably, in steps S1, S3 and S4, the computed model coefficients are recorded as the model coefficients of the most recent sample point in the corresponding sample data sequence.
The invention further discloses a system for determining the regression coefficients of a system linear regression prediction model, comprising:
an initial model coefficient determination module, for obtaining the initial system sample data sequence based on a preset data window length, using it to compute the model coefficients of the system linear regression prediction model, and recording them;
a sample data change acquisition module, for obtaining the newly added system sample point data, forming a new system sample data sequence based on the preset data window length, and determining the sample point data that the new sequence has gained and lost relative to the initial sample data sequence;
a model coefficient update module, for updating the model coefficients from the added (or removed) system sample point data and the current model coefficients, then updating the model coefficients again from the removed (or added) system sample point data and the updated model coefficients, and recording them;
wherein, each time a system sample point is added, the sample data change acquisition module obtains the new system sample data sequence, and the model coefficient update module computes and records the model coefficients after the sample point is added and the sample data sequence changes.
Beneficial effects
Compared with the prior art, the invention has the following advantages and improvements:
1. The model coefficients are computed iteratively. Each iteration only needs to fetch, from the monitoring system or an existing database, the newly added sample values and a fixed number of historical sample values, regardless of the sampling window length. Even if the sampling window is one year or longer, each iteration fetches only a small amount of data, which reduces the demands on server storage I/O bandwidth, memory size and so on.
2. Each step of the iteration depends only on the model coefficients and the intermediate matrix obtained in the previous step; these are the state carried between adjacent iterations. If the model coefficient dimension is $k$, the state has only $k(k+1)$ values, so very little storage is needed to support the iteration.
3. The iteration uses only matrix addition, subtraction and multiplication and needs no complicated matrix inversion or decomposition, so the computational load is small and the data can be kept real-time.
4. Each iteration is computationally light and fetches little data, so with limited server resources many models can be processed in parallel and the model coefficients can be updated in real time as monitoring samples arrive, giving the continuous trend of the coefficients in real time.
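As an illustration of item 2, the state count can be worked out explicitly. The value $k = 9$ below is an assumption chosen to match the photovoltaic application example later in this document (eight inputs plus a constant term) with a scalar output; it is not stated in this list.

```latex
\underbrace{k}_{\text{entries of } W_N} + \underbrace{k^2}_{\text{entries of } C_N} = k(k+1),
\qquad k = 9:\quad 9 + 81 = 90 .
```

Under that assumption, about 90 numbers are carried between iterations, however long the sampling window is.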
Brief description of the drawings
Fig. 1 is a flow diagram of the method of the invention;
Fig. 2 is a flow diagram of a specific embodiment of the method of the invention;
Fig. 3 shows the results of an application example of the invention.
Specific embodiments
The invention is further described below with reference to the drawings and specific embodiments.
Symbols:
X: system input; Y: system output; W: model coefficients; C: intermediate matrix; I: identity matrix. Unless otherwise stated, vectors are column vectors.
Superscripts:
T: transpose; −1: inverse; H: conjugate transpose
Subscripts:
+: added data; −: removed data
Embodiment 1
Referring to Fig. 1, the method of this embodiment for determining the regression coefficients of a system linear regression prediction model comprises:
S1: based on a preset data window length, obtain the initial system sample data sequence, use it to compute the model coefficients of the system linear regression prediction model, and record them;
S2: obtain the newly added system sample point data and, based on the preset data window length, form a new system sample data sequence; determine the sample point data that the new sequence has gained and lost relative to the initial sample data sequence;
S3: update the model coefficients from the added (or removed) sample point data and the current model coefficients; then update the model coefficients again from the removed (or added) sample point data and the updated model coefficients, and record them;
S4: as system sample points are added, repeat steps S2 and S3 to obtain and record the model coefficients after each sample point is added and the sample data sequence changes.
The invention is also used to analyse changes in system performance; that is, it further comprises step S5: analyse the change in system performance based on the recorded model coefficients. The analysis of system performance from the change in model coefficients can use the prior art.
In step S1, for the system model $Y^T = W^T X$, where $Y$ is the system output corresponding to the sampled input $X$ and $W$ is the model coefficient, the intermediate matrix is $C_N = (XX^T)^{-1}$ and the model coefficient is $W_N^T = Y^T X^T C_N$.
Because ridge regression copes well with highly collinear sample point data, in step S1, for the system model $Y^T = W^T X$, the intermediate matrix can instead be $C_N = (XX^T + \lambda I)^{-1}$, where $\lambda$ is the ridge regression coefficient and $I$ is the identity matrix, with the current model coefficient matrix $W_N^T = Y^T X^T C_N$. Making this change at initialisation turns the whole method into ridge regression.
In step S2, each time a sample point is added it is written into the sampling data window; the newly written sample point data is the "added system sample point data". Because the sampling data window has a preset length, the sample point with the earliest sampling time in the existing data sequence is pushed out of the window; the pushed-out sample point data is the "removed sample point data". The length of the sampling data window can be set as required, which is prior art; the iteration itself is independent of the window length.
In step S3, the model coefficients can first be updated from the added system sample point data and the current model coefficients, and then updated from the removed system sample point data and the current model coefficients; the order can also be reversed.
Updating the model coefficients from the added system sample point data and the current model coefficients proceeds as follows:
Define the input sample data sequence matrix as $X = [x_{N-m+1}\ \dots\ x_N]$
and the output sample data matrix as $Y^T = [y_{N-m+1}\ \dots\ y_N]$.
The added input/output system sample point data is $x_{N+1}$/$y_{N+1}$, forming the matrices $x_+ = x_{N+1}$ and $y_+ = y_{N+1}$.
The intermediate matrix after the sample point is added is then
$$C_{N+} = C_N - C_N x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N,$$
and the model coefficient matrix after the sample point is added is
$$W_{N+}^T = W_N^T - W_N^T x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N + y_+^T x_+^T C_{N+}.$$
The intermediate matrix $C_{N+}$ and the model coefficient matrix $W_{N+}$ after the addition become the current intermediate matrix $C_N$ and the current model coefficient matrix $W_N$.
Updating the model coefficients from the removed system sample point data and the current model coefficients proceeds as follows:
Define the removed input/output system sample point data as $x_{N-m+1}$/$y_{N-m+1}$, forming the matrices $x_- = x_{N-m+1}$ and $y_- = y_{N-m+1}$.
The intermediate matrix after the sample point is removed is then
$$C_{N-} = C_N + C_N x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N,$$
and the model coefficient matrix after the sample point is removed is
$$W_{N-}^T = W_N^T + W_N^T x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N - y_-^T x_-^T C_{N-}.$$
The intermediate matrix $C_{N-}$ and the model coefficient matrix $W_{N-}^T$ after the removal become the current intermediate matrix $C_N$ and the current model coefficient matrix $W_N$.
In step S1, the window length of the initial data sequence is adjusted according to need and the computing power of the computer. If the initial data sequence actually obtained is long, say of length $M$, a smaller $m$ can be set as the data window length: step S1 gives the model coefficients of the $m$-sample data sequence as the initial model coefficients, and steps S2 and S3 are then used to add the sample points after point $m$ one at a time. Each time the sequence changes, the corresponding model coefficients are obtained, recorded, and used as the current model coefficients in the next iteration, until the model coefficients corresponding to sample point $M$ are obtained.
In steps S1, S3 and S4, the computed model coefficients are recorded as the model coefficients of the most recent sample point in the corresponding sample data sequence.
The principle of the method is as follows. In the overdetermined equation $Y^T = W^T X$, $W$ is the unknown coefficient and each column of the input sample matrix $X$ is one sample. The formula for $W$ is
$$W^T = Y^T X^T (XX^T)^{-1}.$$
The matrix inversion lemma states
$$(A + XY^H)^{-1} = A^{-1} - A^{-1} X (I + Y^H A^{-1} X)^{-1} Y^H A^{-1}.$$
Based on this formula, the cases of adding a new sample point to, and removing an old sample point from, the original sample point data set $X$ are discussed below, where $C = (XX^T)^{-1}$ denotes the intermediate matrix.
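For a single real-valued sample point $x$, the lemma specialises to the rank-one (Sherman-Morrison) form that the add and remove iterations rely on. The following worked specialisation assumes $A = X_N X_N^T$ is invertible and takes $X = x$, $Y^H = \pm x^T$ in the lemma:

```latex
(A + x x^T)^{-1} = A^{-1} - \frac{A^{-1} x \, x^T A^{-1}}{1 + x^T A^{-1} x},
\qquad
(A - x x^T)^{-1} = A^{-1} + \frac{A^{-1} x \, x^T A^{-1}}{1 - x^T A^{-1} x}.
```

In this case the bracketed term of the lemma is a scalar, so its inverse is an ordinary division. The second identity requires $1 - x^T A^{-1} x \neq 0$, i.e. the matrix must remain invertible after the sample is removed.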
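Iteratively adding data to obtain the coefficients:
Suppose $W_N$, $C_N$ have been obtained for the sample window $X_N$, $Y_N$. After the new sample point $x_+$, $y_+$ is added, the new $X$ is $X_{N+1} = [X_N\ x_+]$ and the new $Y$ is $Y_{N+1}^T = [Y_N^T\ y_+]$.
The iterative formula for $C_{N+1}$ is therefore
$$\begin{aligned} C_{N+1} &= (X_{N+1}X_{N+1}^T)^{-1} = (X_N X_N^T + x_+ x_+^T)^{-1} \\ &= (X_N X_N^T)^{-1} - (X_N X_N^T)^{-1} x_+ \left(I + x_+^T (X_N X_N^T)^{-1} x_+\right)^{-1} x_+^T (X_N X_N^T)^{-1} \\ &= C_N - C_N x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N. \end{aligned}$$
Since $W_N^T = Y_N^T X_N^T C_N$, the iterative formula for $W_{N+1}$ is (for a scalar output, $y_+^T = y_+$)
$$\begin{aligned} W_{N+1}^T &= Y_{N+1}^T X_{N+1}^T (X_{N+1}X_{N+1}^T)^{-1} = (Y_N^T X_N^T + y_+^T x_+^T) C_{N+1} \\ &= Y_N^T X_N^T C_N - Y_N^T X_N^T C_N x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N + y_+^T x_+^T C_{N+1} \\ &= W_N^T - W_N^T x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N + y_+^T x_+^T C_{N+1}. \end{aligned}$$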
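The add-data iteration can be sketched in pure Python as follows. This is an illustrative sketch under stated assumptions, not the patent's code: it assumes a scalar output, so the term $(I + x_+^T C_N x_+)$ reduces to the scalar $1 + x_+^T C_N x_+$, and it assumes $C_N$ is symmetric. The names `mat_vec` and `add_sample` are hypothetical.

```python
def mat_vec(C, x):
    """Return the matrix-vector product C x for a list-of-lists matrix C."""
    return [sum(c * v for c, v in zip(row, x)) for row in C]


def add_sample(C, W, x, y):
    """Update (C, W) after the sample (x, y) is appended to the window.

    C: k x k intermediate matrix (X X^T)^-1, assumed symmetric.
    W: list of k model coefficients.
    x: new input vector of length k; y: new scalar output (assumption).
    """
    k = len(x)
    Cx = mat_vec(C, x)                                   # C x
    denom = 1.0 + sum(a * b for a, b in zip(x, Cx))      # 1 + x^T C x
    # C+ = C - C x (1 + x^T C x)^-1 x^T C
    C_new = [[C[i][j] - Cx[i] * Cx[j] / denom for j in range(k)]
             for i in range(k)]
    xw = sum(a * b for a, b in zip(x, W))                # W^T x
    Cnx = mat_vec(C_new, x)                              # C+ x
    # W+ = W - C x (W^T x) / denom + C+ x y   (transpose of the W^T formula)
    W_new = [W[i] - Cx[i] * xw / denom + Cnx[i] * y for i in range(k)]
    return C_new, W_new


# Window {(1,0)->1, (0,1)->2}: C is the identity and W = [1, 2].
C, W = add_sample([[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0], [1.0, 1.0], 4.0)
```

For this small window the result agrees with a direct least-squares solve over the three samples, $W = [4/3,\ 7/3]$, up to floating-point rounding.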
Iteratively removing data to obtain the coefficients:
Suppose $W_N$, $C_N$ have been obtained for the sample window $X_N$, $Y_N$. Write $X_N = [x_-\ X_{N+1}]$ and $Y_N^T = [y_-\ Y_{N+1}^T]$; after the old sample point $x_-$, $y_-$ is removed, the new sample window data are $X_{N+1}$, $Y_{N+1}$.
Then
$$X_N X_N^T = [x_-\ X_{N+1}][x_-\ X_{N+1}]^T = X_{N+1} X_{N+1}^T + x_- x_-^T,$$
so the iterative formula for $C_{N+1}$ is
$$\begin{aligned} C_{N+1} &= (X_{N+1} X_{N+1}^T)^{-1} = (X_N X_N^T - x_- x_-^T)^{-1} \\ &= (X_N X_N^T)^{-1} + (X_N X_N^T)^{-1} x_- \left(I - x_-^T (X_N X_N^T)^{-1} x_-\right)^{-1} x_-^T (X_N X_N^T)^{-1} \\ &= C_N + C_N x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N. \end{aligned}$$
Since $W_N^T = Y_N^T X_N^T C_N$, the iterative formula for $W_{N+1}$ is
$$\begin{aligned} W_{N+1}^T &= Y_{N+1}^T X_{N+1}^T (X_{N+1} X_{N+1}^T)^{-1} = (Y_N^T X_N^T - y_-^T x_-^T) C_{N+1} \\ &= Y_N^T X_N^T C_N + Y_N^T X_N^T C_N x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N - y_-^T x_-^T C_{N+1} \\ &= W_N^T + W_N^T x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N - y_-^T x_-^T C_{N+1}. \end{aligned}$$
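The remove-data iteration can be sketched in the same way. Again this is a pure-Python illustration that assumes a scalar output and a symmetric $C_N$; the name `remove_sample` is hypothetical. The denominator is $1 - x_-^T C_N x_-$, which is what the matrix inversion lemma gives when applied to $X_N X_N^T - x_- x_-^T$.

```python
def remove_sample(C, W, x, y):
    """Update (C, W) after the sample (x, y) is dropped from the window.

    C: k x k intermediate matrix (X X^T)^-1, assumed symmetric.
    W: list of k model coefficients.
    x: removed input vector of length k; y: removed scalar output (assumption).
    """
    k = len(x)
    Cx = [sum(C[i][j] * x[j] for j in range(k)) for i in range(k)]   # C x
    denom = 1.0 - sum(x[i] * Cx[i] for i in range(k))                # 1 - x^T C x
    if abs(denom) < 1e-12:
        # Removing this sample would leave X X^T singular.
        raise ValueError("cannot remove sample: remaining window is rank deficient")
    # C- = C + C x (1 - x^T C x)^-1 x^T C
    C_new = [[C[i][j] + Cx[i] * Cx[j] / denom for j in range(k)]
             for i in range(k)]
    xw = sum(x[i] * W[i] for i in range(k))                          # W^T x
    Cnx = [sum(C_new[i][j] * x[j] for j in range(k)) for i in range(k)]
    # W- = W + C x (W^T x) / denom - C- x y   (transpose of the W^T formula)
    W_new = [W[i] + Cx[i] * xw / denom - Cnx[i] * y for i in range(k)]
    return C_new, W_new


# State for the window {(1,0)->1, (0,1)->2, (1,1)->4}; drop the oldest sample.
C = [[2 / 3, -1 / 3], [-1 / 3, 2 / 3]]
W = [4 / 3, 7 / 3]
C, W = remove_sample(C, W, [1.0, 0.0], 1.0)
```

The remaining two samples determine the coefficients exactly, and the update reproduces $W = [2,\ 2]$ up to floating-point rounding.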
Embodiment 1-1
The method of the invention comprises two independent procedures: iteratively adding data and iteratively removing data. Given the regression coefficients of a set of sample points, iteratively adding data yields the regression coefficients of that set together with the added sample points. Given the regression coefficients of a set of sample points, iteratively removing data yields the regression coefficients of that set after some of its sample points have been removed.
Referring to Fig. 2, to ensure numerical stability, this embodiment adds data first and removes data afterwards in each iteration. The specific steps are as follows.
1. Initialisation:
1.1 Assume the current sample point N points to sample point i;
1.2 Obtain the original input data $x_{N-m+1}, \dots, x_N$, forming the matrix $X = [x_{N-m+1}\ \dots\ x_N]$;
1.3 Obtain the original output data $y_{N-m+1}, \dots, y_N$, forming the matrix $Y^T = [y_{N-m+1}\ \dots\ y_N]$;
1.4 Compute the intermediate matrix $C_N$ from $C_N = (XX^T)^{-1}$;
1.5 Compute the model coefficients $W_N$ from $W_N^T = Y^T X^T C_N$;
1.6 Record the state $W_N$, $C_N$ as the current model coefficients and current intermediate matrix.
At this point $W_N$, $C_N$ have been computed from the m data points of sample points i−m+1 to i. That is, $W_N$, $C_N$ are the model coefficients and intermediate matrix corresponding to sample point i.
2. Adding a data point:
2.1 Obtain the added original input data $x_{N+1}$, forming the matrix $x_+ = x_{N+1}$; the new input sample matrix is $X_{N+1} = [X_N\ x_+]$;
2.2 Obtain the added original output data $y_{N+1}$, forming the matrix $y_+ = y_{N+1}$; the new output sample matrix is $Y_{N+1}^T = [Y_N^T\ y_+]$;
2.3 Compute $C_{N+}$ iteratively from the current intermediate matrix $C_N$ using $C_{N+} = C_N - C_N x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N$;
2.4 Compute $W_{N+}$ iteratively from the current $W_N$, $C_N$ and the result $C_{N+}$ of step 2.3 using $W_{N+}^T = W_N^T - W_N^T x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N + y_+^T x_+^T C_{N+}$;
2.5 Record the state $W_N = W_{N+}$, $C_N = C_{N+}$ as the current model coefficients and current intermediate matrix.
At this point $W_N$, $C_N$ have been computed from the m+1 data points of sample points N−m+1 to N+1.
3. Removing a data point:
3.1 Obtain the removed original input data $x_{N-m+1}$, forming the matrix $x_- = x_{N-m+1}$;
3.2 Obtain the removed original output data $y_{N-m+1}$, forming the matrix $y_- = y_{N-m+1}$;
3.3 Compute $C_{N-}$ iteratively from the current intermediate matrix $C_N$ using $C_{N-} = C_N + C_N x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N$;
3.4 Compute $W_{N-}$ iteratively from the current model coefficients $W_N$, the intermediate matrix $C_N$ and the result $C_{N-}$ of step 3.3 using $W_{N-}^T = W_N^T + W_N^T x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N - y_-^T x_-^T C_{N-}$;
3.5 Record the state $W_N = W_{N-}$, $C_N = C_{N-}$ as the current model coefficients and current intermediate matrix.
At this point $W_N$, $C_N$ have been computed from the m data points of sample points N−m+2 to N+1. That is, $W_N$, $C_N$ are the model coefficients and intermediate matrix of sample point i+1.
3.6 Save the state $W_N$ to the database as the model coefficients of sample point i+1 for subsequent analysis.
4. Point the current sample point N to i+1 and repeat steps 2, 3 and 4 to obtain the model coefficients of the subsequent sample points in turn.
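Steps 2 to 4 above can be sketched as one sliding-window loop. This is a minimal pure-Python illustration under the assumptions of a scalar output and a symmetric intermediate matrix; `rank_one_update` and `slide_window` are hypothetical names. A single helper handles both directions through a `sign` argument (+1 adds a sample, −1 removes one), and the initial state from step 1 is passed in by the caller. The window is held in memory here for brevity; in the source, the sample leaving the window is fetched from the monitoring system or database.

```python
from collections import deque


def rank_one_update(C, W, x, y, sign):
    """Add (sign=+1) or remove (sign=-1) one sample (x, y); scalar output assumed."""
    k = len(x)
    Cx = [sum(C[i][j] * x[j] for j in range(k)) for i in range(k)]    # C x
    denom = 1.0 + sign * sum(x[i] * Cx[i] for i in range(k))          # 1 +/- x^T C x
    if abs(denom) < 1e-12:
        raise ValueError("update would leave X X^T singular")
    C_new = [[C[i][j] - sign * Cx[i] * Cx[j] / denom for j in range(k)]
             for i in range(k)]
    xw = sum(x[i] * W[i] for i in range(k))                           # W^T x
    Cnx = [sum(C_new[i][j] * x[j] for j in range(k)) for i in range(k)]
    W_new = [W[i] - sign * Cx[i] * xw / denom + sign * Cnx[i] * y
             for i in range(k)]
    return C_new, W_new


def slide_window(C, W, window, stream):
    """Slide a fixed-length window over `stream`, one sample at a time.

    C, W:   state from the initialisation step for the samples in `window`.
    window: list of (x, y) pairs currently in the window, oldest first.
    stream: iterable of new (x, y) pairs.
    Returns the final C, the final W, and the coefficients after each shift.
    """
    win = deque(window)
    history = []
    for x_new, y_new in stream:
        C, W = rank_one_update(C, W, x_new, y_new, +1)   # step 2: add at the tail
        win.append((x_new, y_new))
        x_old, y_old = win.popleft()                     # step 3: remove at the head
        C, W = rank_one_update(C, W, x_old, y_old, -1)
        history.append(W)                                # step 3.6: record coefficients
    return C, W, history


C0 = [[1.0, 0.0], [0.0, 1.0]]
W0 = [1.0, 2.0]
window = [([1.0, 0.0], 1.0), ([0.0, 1.0], 2.0)]
stream = [([1.0, 1.0], 4.0), ([1.0, 2.0], 5.0)]
C, W, history = slide_window(C0, W0, window, stream)
```

Adding before removing, as the embodiment prescribes, also keeps the window from dropping below $k$ samples in the middle of an iteration.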
The above process is equivalent to opening a window of length m on the sample point sequence. In each iteration a new sample point is first added at the tail of the window and a sample point is then removed at the head, which is equivalent to sliding the whole window forward by one. From the values of $W_N$, $C_N$ before the iteration and the sample point data entering and leaving the window, $W_{N+1}$, $C_{N+1}$ at the next sample point can be computed, independent of the window length m. Therefore, when the model coefficients at all sample points have to be computed in turn, the iterative algorithm greatly reduces the computational complexity.
In the above process, the number of sample points k added at the tail of the window in each iteration can be greater than 1. The number of sample points removed at the head of the window must then match the number added, i.e. also k. From $W_N$, $C_N$ before the iteration and the sample point data entering and leaving the window, $W_{N+k}$, $C_{N+k}$ after k sample points can be computed.
In the above process, the initialisation step need not load all m sample points in the window at once; instead, step 2 can be called repeatedly to add sample points one at a time until all m have been added. This reduces the server memory used during initialisation.
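The incremental initialisation described above needs a starting state before any sample has been added, which the text does not specify. The sketch below assumes the ridge form, where an empty window gives $C_0 = (\lambda I)^{-1} = I/\lambda$ and $W_0 = 0$; that starting state, the scalar output, and the names `add_sample` and `init_recursive` are all assumptions made for illustration.

```python
def add_sample(C, W, x, y):
    """Rank-one update of (C, W) for one added sample (x, y); scalar output assumed."""
    k = len(x)
    Cx = [sum(C[i][j] * x[j] for j in range(k)) for i in range(k)]    # C x
    denom = 1.0 + sum(x[i] * Cx[i] for i in range(k))                 # 1 + x^T C x
    C_new = [[C[i][j] - Cx[i] * Cx[j] / denom for j in range(k)]
             for i in range(k)]
    xw = sum(x[i] * W[i] for i in range(k))                           # W^T x
    Cnx = [sum(C_new[i][j] * x[j] for j in range(k)) for i in range(k)]
    W_new = [W[i] - Cx[i] * xw / denom + Cnx[i] * y for i in range(k)]
    return C_new, W_new


def init_recursive(samples, outputs, lam):
    """Build (C, W) for the first window by adding its samples one at a time.

    Assumption: ridge start C0 = I / lam, W0 = 0, so the result equals
    C = (X X^T + lam I)^-1 and W = C X Y without any matrix inversion.
    """
    if lam <= 0:
        raise ValueError("lam must be positive for the ridge starting state")
    k = len(samples[0])
    C = [[1.0 / lam if i == j else 0.0 for j in range(k)] for i in range(k)]
    W = [0.0] * k
    for x, y in zip(samples, outputs):
        C, W = add_sample(C, W, x, y)   # only one sample is held at a time
    return C, W


C, W = init_recursive([[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0], lam=1.0)
```

Because `samples` can be any iterable of vectors consumed once, the window never has to be materialised in memory, which is the saving the paragraph above refers to.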
Embodiment 2
The invention further discloses a system for determining the regression coefficients of a system linear regression prediction model, comprising:
an initial model coefficient determination module, for obtaining the initial system sample data sequence based on a preset data window length, using it to compute the model coefficients of the system linear regression prediction model, and recording them;
a sample data change acquisition module, for obtaining the newly added system sample point data, forming a new system sample data sequence based on the preset data window length, and determining the sample point data that the new sequence has gained and lost relative to the initial sample data sequence;
a model coefficient update module, for updating the model coefficients from the added (or removed) system sample point data and the current model coefficients, then updating the model coefficients again from the removed (or added) system sample point data and the updated model coefficients, and recording them;
wherein, each time a system sample point is added, the sample data change acquisition module obtains the new system sample data sequence, and the model coefficient update module computes and records the model coefficients after the sample point is added and the sample data sequence changes.
As an application example, the invention is used in a photovoltaic power station operation and maintenance scenario, where the interval between logged historical data records is usually 15 minutes or even shorter. On this basis, roughly 1,000,000 data records will be logged over a 25-year power station life cycle, of which about 40,000 fall within one year. To estimate the influence of component aging on power generation, an estimation model is built with 8 monitored parameters, such as weather conditions, as inputs and the actual power generation as output. In the model, the component performance degradation rate increases gradually over time, which shows up as a continuous change over time in the coefficients of the 8 input parameters and in the model constant term. Meanwhile, to reduce the influence of random events such as shading, cleaning and maintenance on the aging estimate, one year of consecutive data is selected as the time window for the model calculation. Therefore, once the power station has operated for a full year, the system calculates the model coefficients from the data of the preceding year at each monitoring sampling time point. Finally, the model coefficients at every sampling time point over the remaining 24 years of the power station's life cycle are obtained. From the changes in the model coefficients, the performance degradation at different stages of the power station's whole life cycle can be assessed.
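The record counts quoted above can be checked with a short calculation. The sketch below assumes exactly one record every 15 minutes and 365-day years; the variable names are illustrative. The results (about 35,000 records per year and about 876,000 over 25 years) are of the same order as the rounded figures of 40,000 and 1,000,000 in the text, which also allow for shorter logging intervals.

```python
# Assumption: one record every 15 minutes, 365-day years (leap days ignored).
SAMPLES_PER_DAY = 24 * 60 // 15            # 96 records per day
records_per_year = 365 * SAMPLES_PER_DAY   # size of the one-year analysis window
records_lifetime = 25 * records_per_year   # records over the 25-year life cycle

# After the first full year, every new record yields one set of coefficients,
# so the remaining 24 years give this many coefficient sets.
coefficient_sets = 24 * records_per_year

print(SAMPLES_PER_DAY, records_per_year, records_lifetime, coefficient_sets)
```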
To evaluate the error of the iterative calculation, a simulation was carried out for the above application scenario. The simulation created 1,000,000 noisy power station operation data records and, using 40,000 records as the analysis window size, iteratively calculated the absolute error of the model coefficients at every data point. In the figure, the abscissa is the sampling point index, and the ordinate is the absolute error between the regression results (the model coefficients of the 8 inputs and the constant term) and their preset values. The figure shows that none of the error terms diverge. data9, which corresponds to the absolute error of the constant term, is the largest of all the errors but still does not exceed the order of 1e-10.
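A much smaller version of such an error check can be sketched as follows. It is not the simulation described above: it assumes noise-free outputs (so that the exact least-squares solution equals the preset coefficients), 2 inputs plus a constant term, a window of 50 points and 2,000 samples. The names `invert`, `rank_one` and the preset values are hypothetical.

```python
import random


def invert(A):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    n = len(A)
    M = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def rank_one(C, w, x, y, sign):
    """sign=+1 appends sample (x, y) to the window; sign=-1 removes it."""
    d = len(x)
    Cx = [sum(C[i][j] * x[j] for j in range(d)) for i in range(d)]
    denom = 1.0 + sign * sum(x[i] * Cx[i] for i in range(d))
    gain = [v / denom for v in Cx]
    C = [[C[i][j] - sign * gain[i] * Cx[j] for j in range(d)] for i in range(d)]
    err = y - sum(w[i] * x[i] for i in range(d))
    w = [w[i] + sign * gain[i] * err for i in range(d)]
    return C, w


random.seed(0)
true_w = [1.5, -0.7, 0.3]   # preset coefficients; the last one is the constant term
d, m, total = 3, 50, 2000   # inputs (incl. constant), window length, sample count
xs = [[random.uniform(-1, 1), random.uniform(-1, 1), 1.0] for _ in range(total)]
ys = [sum(c * v for c, v in zip(true_w, x)) for x in xs]   # noise-free outputs

# Batch initialization on the first window: C = (X X^T)^-1, w = C X Y.
gram = [[sum(x[i] * x[j] for x in xs[:m]) for j in range(d)] for i in range(d)]
C = invert(gram)
rhs = [sum(x[i] * y for x, y in zip(xs[:m], ys[:m])) for i in range(d)]
w = [sum(C[i][j] * rhs[j] for j in range(d)) for i in range(d)]

max_err = 0.0
for n in range(m, total):
    C, w = rank_one(C, w, xs[n], ys[n], +1)           # add the newest point
    C, w = rank_one(C, w, xs[n - m], ys[n - m], -1)   # drop the oldest point
    max_err = max(max_err, max(abs(c - t) for c, t in zip(w, true_w)))

print("largest absolute coefficient error:", max_err)
```

Because the outputs are noise-free, any deviation of `w` from `true_w` is accumulated round-off from the add/remove recursion, which is the quantity the figure tracks.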
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical memory, etc.) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the invention is not limited to the specific embodiments described, which are merely illustrative rather than restrictive. Inspired by the present invention, those of ordinary skill in the art can devise many further forms without departing from the purpose of the invention and the scope protected by the claims, all of which fall within the protection of the present invention.
Claims (8)
1. A method for determining the regression coefficients of a system linear regression prediction model, characterized by comprising:
S1: based on a preset data window length, obtaining an initial system sampling data sequence, calculating the model coefficients of the system linear regression prediction model from the initial sampling data sequence, and recording them;
S2: obtaining newly added system sampling point data and, based on the preset data window length, forming a new system sampling data sequence; determining the system sampling point data added to, and removed from, the new system sampling data sequence relative to the initial sampling data sequence;
S3: updating the model coefficients based on the added/removed system sampling point data and the current model coefficients; then updating the model coefficients again based on the removed/added system sampling point data and the updated model coefficients, and recording them;
S4: as system sampling points are added, repeating steps S2 to S4 to obtain, and record, the model coefficients after each addition of sampling point data and the resulting change of the sampling data sequence.
2. The method according to claim 1, characterized by further comprising step S5: analyzing the change in system performance based on the recorded model coefficients.
3. The method according to claim 1, characterized in that, in step S1, for the system model $Y^T = W^T X$, where Y is the system output corresponding to the sampled input X and W is the model coefficient, the intermediate matrix is $C_N = (XX^T)^{-1}$ and the model coefficient is $W_N^T = Y^T X^T C_N$.
4. The method according to claim 1, characterized in that, in step S1, for the system model $Y^T = W^T X$, where Y is the system output corresponding to the sampled input X and W is the model coefficient, the intermediate matrix is $C_N = (XX^T + \lambda I)^{-1}$, where $\lambda$ is the ridge regression coefficient and I is the identity matrix, and the current model coefficient matrix is $W_N^T = Y^T X^T C_N$.
5. The method according to claim 3 or 4, characterized in that updating the model coefficients based on the added system sampling point data and the current model coefficients comprises:
defining the input initial sampling data sequence matrix as $X = [x_{N-m+1} \dots x_N]$,
the output sampling data matrix as $Y^T = [y_{N-m+1} \dots y_N]$,
and the added input/output system sampling point data as $x_{N+1}$ / $y_{N+1}$, forming the matrices $x_+ = x_{N+1}$ / $y_+ = y_{N+1}$;
then the intermediate matrix after the sampling point data is added is: $C_{N+} = C_N - C_N x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N$,
and the model coefficient matrix after the sampling point data is added is:
$W_{N+}^T = W_N^T - W_N^T x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N + y_+^T x_+^T C_{N+}$;
the intermediate matrix $C_{N+}$ and the model coefficient matrix $W_{N+}$ after the sampling point data is added are then taken as the current intermediate matrix $C_N$ and the current model coefficient matrix $W_N$.
6. The method according to claim 5, characterized in that updating the model coefficients based on the removed system sampling point data and the current model coefficients comprises:
defining the removed input/output system sampling point data as $x_{N-m+1}$ / $y_{N-m+1}$, forming the matrices $x_- = x_{N-m+1}$ / $y_- = y_{N-m+1}$;
then the intermediate matrix after the sampling point data is removed is: $C_{N-} = C_N + C_N x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N$,
and the model coefficient matrix after the sampling point data is removed is:
$W_{N-}^T = W_N^T + W_N^T x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N - y_-^T x_-^T C_{N-}$;
the intermediate matrix $C_{N-}$ and the model coefficient matrix $W_{N-}$ after the sampling point data is removed are then taken as the current intermediate matrix $C_N$ and the current model coefficient matrix $W_N$.
7. The method according to claim 1, characterized in that, in steps S1, S3 and S4, each calculated model coefficient result is recorded as the model coefficient corresponding to the latest sampling point in the respective sampling data sequence.
8. A system for determining the regression coefficients of a system linear regression prediction model, characterized by comprising:
an initial model coefficient determination module, configured to obtain an initial system sampling data sequence based on a preset data window length, calculate the model coefficients of the system linear regression prediction model from the initial sampling data sequence, and record them;
a sampled-data change acquisition module, configured to obtain newly added system sampling point data, form a new system sampling data sequence based on the preset data window length, and determine the system sampling point data added to, and removed from, the new system sampling data sequence relative to the initial sampling data sequence;
a model coefficient update calculation module, configured to update the model coefficients based on the added/removed system sampling point data and the current model coefficients, then update the model coefficients again based on the removed/added system sampling point data and the updated model coefficients, and record them;
wherein, each time a system sampling point is added, the sampled-data change acquisition module obtains a new system sampling data sequence, and the model coefficient update calculation module calculates and records the model coefficients after the sampling point data has been added and the sampling data sequence has changed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910334251.6A CN110135012B (en) | 2019-04-24 | 2019-04-24 | Regression coefficient determination method of system linear regression prediction model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110135012A true CN110135012A (en) | 2019-08-16 |
CN110135012B CN110135012B (en) | 2023-12-22 |
Family
ID=67571023
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910334251.6A Active CN110135012B (en) | 2019-04-24 | 2019-04-24 | Regression coefficient determination method of system linear regression prediction model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110135012B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105893541A (en) * | 2016-03-31 | 2016-08-24 | 中国科学院软件研究所 | Streaming data self-adaption persistence method and system based on mixed storage |
CN106487358A (en) * | 2016-09-30 | 2017-03-08 | 西南大学 | A kind of maximal correlation entropy volume kalman filter method based on statistical linear regression |
US9760539B1 (en) * | 2015-02-28 | 2017-09-12 | Cloud & Stream Gears Llc | Incremental simple linear regression coefficient calculation for big data or streamed data using components |
CN107368649A (en) * | 2017-07-19 | 2017-11-21 | 许昌学院 | A kind of sequence optimisation test design method based on increment Kriging |
US9928215B1 (en) * | 2015-02-28 | 2018-03-27 | Cloud & Stream Gears Llc | Iterative simple linear regression coefficient calculation for streamed data using components |
Non-Patent Citations (2)
Title |
---|
刘维嘉: "高峰期网络流量高精准度预测模型研究", 《网络新媒体技术》 * |
洪港 等: "《概率论与数理统计》", 31 January 2019 * |
Also Published As
Publication number | Publication date |
---|---|
CN110135012B (en) | 2023-12-22 |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information | Address after: Room 301-1, Block B, Building 1, No. 1, Zhongguancun East Road, Haidian District, Beijing 100080. Applicant after: Beijing Qidi Qingyun Intelligent Energy Co.,Ltd. Address before: Room 301-1, Block B, Building 1, No. 1, Zhongguancun East Road, Haidian District, Beijing 100080. Applicant before: BEIJING TSING YUN ENERGY TECHNOLOGY Co.,Ltd. |
GR01 | Patent grant | |