CN110135012A - Method for determining the regression coefficients of a system linear regression prediction model - Google Patents

Method for determining the regression coefficients of a system linear regression prediction model

Info

Publication number
CN110135012A
Authority
CN
China
Prior art keywords
data
model coefficient
model
coefficient
systematic sampling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910334251.6A
Other languages
Chinese (zh)
Other versions
CN110135012B (en
Inventor
王新攀
靳洋
牛道恒
陈灵奎
王彦兵
王俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Being Tsingyun Solar Energy Tech Co Ltd
Original Assignee
Being Tsingyun Solar Energy Tech Co Ltd
Application filed by Being Tsingyun Solar Energy Tech Co Ltd
Priority to CN201910334251.6A
Publication of CN110135012A
Application granted
Publication of CN110135012B
Legal status: Active (current)
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Complex Calculations (AREA)

Abstract

The present invention discloses a method and a system for determining the regression coefficients of a system linear regression prediction model. The method comprises: computing the initial model coefficients of the system linear regression prediction model from an initial system sampling data sequence, and recording them; each time a system sampling point is added, obtaining a new system sampling data sequence and determining the sampling point data that was added and the sampling point data that was removed; updating the model coefficients based on the added (or removed) sampling point data and the current model coefficients; then updating the model coefficients again based on the removed (or added) sampling point data and the updated model coefficients, and recording the result as the model coefficients corresponding to that added sampling point. By computing the model coefficients iteratively, the invention avoids heavy use of server resources during computation, improves the efficiency of computing the coefficients of the system linear regression model, and ensures that the coefficients can be computed in real time.

Description

Method for determining the regression coefficients of a system linear regression prediction model
Technical field
The present invention relates to the technical field of computer data analysis and processing, and in particular to an iteration-based method for determining the regression coefficients of a system linear regression prediction model, suitable for real-time computation on large data volumes.
Background
While an actual industrial system is running, its condition monitoring subsystem records the operating parameters of key equipment in the system; these parameters are typically data such as temperature, pressure, flow, current, and voltage. When the system is analyzed and optimized, however, the quantities of greater interest are usually component performance degradation, expected revenue, operation and maintenance (O&M) plans, and the like. These cannot be read directly from the monitoring subsystem; instead, they are reflected in the long-term trends of the acquired data.
Based on an analysis of the system's operating principle, the system can be linearized or locally linearized, yielding a simple linear relation among the system parameter data, i.e. a linear model of the system. By the system's principle, the coefficients in the model are directly related to component performance, conversion efficiency, maintenance thresholds, and so on. The long-term trend of the model coefficients can therefore be used for tasks such as assessing system performance degradation, forecasting long-term revenue, and optimizing O&M scheduling.
The linearized system model can be written as $Y^T = W^T X$, where $X = [x_1\ x_2\ \dots\ x_n]$ is the system input at several sampling points, $Y^T = [y_1\ y_2\ \dots\ y_n]$ is the system output predicted by the model at the corresponding sampling points, and $W$ is the model coefficient. In a real system, different sampling points yield different actual input and output values; substituting them into the model gives an overdetermined system of equations, from which the value of $W$ can be estimated by the least squares method. The model coefficient at a prediction sampling point can thus be estimated from several groups of actual system parameters preceding that point, and system performance can then be analyzed from the trend of the model coefficient.
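For reference, the least squares estimate mentioned above can be written out as follows. This is the standard normal-equation result rather than anything specific to this patent; the cost function $J$ is notation introduced here for illustration, and $X X^T$ is assumed to be invertible.

```latex
J(W) = \lVert Y^T - W^T X \rVert_F^2, \qquad
\frac{\partial J}{\partial W} = 0
\;\Longrightarrow\; W^T X X^T = Y^T X^T
\;\Longrightarrow\; W^T = Y^T X^T (X X^T)^{-1}
```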
When the model dimension is high or the number of data groups entering the equation system is large, the least squares method involves large matrix operations, and inverting the intermediate matrix consumes substantial computing resources. Performing detailed analysis of the system model on a server therefore occupies a large share of its computing resources. This increases the load on the server and also reduces the feasibility of solving detailed system model analyses.
To adequately suppress the influence of random errors and bursty disturbances on the coefficient estimate, measured system data covering a long period usually has to be included in the computation. For example, obtaining the model coefficient for one sampling point may in practice require one month or even one year of continuous system operating data preceding that point. If such a volume of data has to be fetched for every computation, it places enormous pressure on the server's storage I/O bandwidth, memory capacity, and so on.
Summary of the invention
The object of the present invention is to provide a method for determining the regression coefficients of a system linear regression prediction model that computes the model coefficients iteratively, thereby avoiding heavy use of server resources during computation, improving the efficiency of computing the coefficients of the system linear regression model, and ensuring that the coefficients can be computed in real time.
The technical solution adopted by the invention is a method for determining the regression coefficients of a system linear regression prediction model, comprising:
S1: based on a preset data window length, obtain the initial system sampling data sequence, compute the model coefficients of the system linear regression prediction model from it, and record them;
S2: obtain the added system sampling point data and, based on the preset data window length, form a new system sampling data sequence; determine which sampling point data the new sequence has gained and which it has lost relative to the initial sampling data sequence;
S3: update the model coefficients based on the added (or removed) sampling point data and the current model coefficients; then update the model coefficients again based on the removed (or added) sampling point data and the updated model coefficients, and record the result;
S4: as system sampling points are added, repeat steps S2 and S3 to obtain and record the model coefficients after each added sampling point changes the sampling data sequence.
The invention also supports analysis of changes in system performance; that is, it further comprises step S5: analyze the change in system performance based on the recorded model coefficients. Existing techniques can be used to analyze system performance from the change in the model coefficients.
As a specific embodiment, in step S1, for the system model $Y^T = W^T X$, where $Y$ is the system output corresponding to the sampled input $X$ and $W$ is the model coefficient, the intermediate matrix is $C_N = (X X^T)^{-1}$ and the model coefficient is $W_N^T = Y^T X^T C_N$.
Ridge regression is considered effective for scenarios in which the sampling point data are highly collinear. Preferably, therefore, in step S1, for the system model $Y^T = W^T X$, where $Y$ is the system output corresponding to the sampled input $X$ and $W$ is the model coefficient, the intermediate matrix is $C_N = (X X^T + \lambda I)^{-1}$, where $\lambda$ is the ridge regression coefficient and $I$ is the identity matrix, and the current model coefficient matrix is $W_N^T = Y^T X^T C_N$. Making this change at initialization turns the method as a whole into a ridge regression method.
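One way to see why changing only the initialization suffices: adding or removing a sampling point changes $X X^T$ by a rank-one term and leaves the $\lambda I$ term untouched, so the intermediate matrix keeps its ridge form after every update. The lines below are an illustrative sketch using the window notation $X_N$, $x_+$, $x_-$ of the update steps, each starting from a current $C_N$.

```latex
C_N = (X_N X_N^T + \lambda I)^{-1}
\;\Longrightarrow\;
C_{N+} = (X_N X_N^T + \lambda I + x_+ x_+^T)^{-1}, \qquad
C_{N-} = (X_N X_N^T + \lambda I - x_- x_-^T)^{-1}
```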
In step S2, each time a sampling point is added, its data is written into the sampling data window; this newly written sampling point data is the "added system sampling point data". Because the sampling data window has a preset length, the sampling point data with the earliest sampling time in the sequence is pushed out of the window; this pushed-out sampling point data is the "removed sampling point data". The length of the sampling data window can be set as required, which is prior art.
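The window bookkeeping of step S2 can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the helper name `push_sample` and the integer samples are hypothetical, and a `collections.deque` stands in for the sampling data window.

```python
from collections import deque

def push_sample(window, sample, m):
    """Write `sample` into the window; return the sample pushed out, if any.

    `m` is the preset window length. While the window is still filling,
    nothing is pushed out and None is returned.
    """
    window.append(sample)  # the "added" sampling point
    return window.popleft() if len(window) > m else None  # the "removed" one

window = deque()
# Each event pairs the added sample with the sample it pushed out.
events = [(s, push_sample(window, s, 3)) for s in range(1, 6)]
print(events)  # [(1, None), (2, None), (3, None), (4, 1), (5, 2)]
```

The window briefly holds m+1 points between the append and the pop, which matches the add-then-remove order used in the embodiment.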
In step S3, the model coefficients can first be updated from the added sampling point data and the current model coefficients and then updated from the removed sampling point data and the current model coefficients; the order can also be reversed.
Preferably, updating the model coefficients based on the added system sampling point data and the current model coefficients comprises:
Define the input sampling data sequence matrix as $X = [x_{N-m+1}\ \dots\ x_N]$,
and the output sampling data matrix as $Y^T = [y_{N-m+1}\ \dots\ y_N]$.
The added input/output sampling point data $x_{N+1}$ / $y_{N+1}$ form the matrices $x_+ = x_{N+1}$ and $y_+ = y_{N+1}$.
The intermediate matrix after the sampling point is added is then
$$C_{N+} = C_N - C_N x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N,$$
and the model coefficient matrix after the sampling point is added is
$$W_{N+}^T = W_N^T - W_N^T x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N + y_+^T x_+^T C_{N+}.$$
For a single added sampling point, $I + x_+^T C_N x_+$ is a scalar, so its inverse is a simple reciprocal.
The intermediate matrix $C_{N+}$ and model coefficient matrix $W_{N+}$ obtained after adding the sampling point then replace the current intermediate matrix $C_N$ and the current model coefficient matrix $W_N$.
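The add-point update can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the output is scalar, $C_N$ is symmetric, one point is added at a time (so the inverse is a reciprocal), and the names `mat_vec`, `dot`, `add_point` and the sample values are hypothetical.

```python
def mat_vec(C, x):
    """Product of a square matrix (list of rows) and a vector."""
    return [sum(c * v for c, v in zip(row, x)) for row in C]

def dot(a, b):
    """Inner product of two vectors."""
    return sum(p * q for p, q in zip(a, b))

def add_point(C, W, x, y):
    """Return (C_{N+}, W_{N+}) after adding one sample (x_+, y_+).

    Assumes scalar y and symmetric C, so x^T C equals (C x)^T.
    """
    k = len(x)
    Cx = mat_vec(C, x)                 # C_N x_+
    denom = 1.0 + dot(x, Cx)           # scalar I + x_+^T C_N x_+
    C_new = [[C[i][j] - Cx[i] * Cx[j] / denom for j in range(k)]
             for i in range(k)]        # C_{N+}
    gain = dot(W, x) / denom           # W_N^T x_+ (I + x_+^T C_N x_+)^{-1}
    C_new_x = mat_vec(C_new, x)        # C_{N+} x_+
    W_new = [W[i] - gain * Cx[i] + y * C_new_x[i] for i in range(k)]
    return C_new, W_new

# Hypothetical window: x = [1, 0] with y = 1 and x = [0, 1] with y = 2,
# so X X^T = I, C = I and W = [1, 2].
C, W = [[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0]
C_plus, W_plus = add_point(C, W, x=[1.0, 1.0], y=4.0)
```

For this example the batch formulas give $C_{N+} = \tfrac{1}{3}\begin{bmatrix}2 & -1\\ -1 & 2\end{bmatrix}$ and $W_{N+} = [4/3,\ 7/3]$, which the update reproduces.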
Preferably, updating the model coefficients based on the removed system sampling point data and the current model coefficients comprises:
The removed input/output sampling point data $x_{N-m+1}$ / $y_{N-m+1}$ form the matrices $x_- = x_{N-m+1}$ and $y_- = y_{N-m+1}$.
The intermediate matrix after the sampling point is removed is then
$$C_{N-} = C_N + C_N x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N,$$
and the model coefficient matrix after the sampling point is removed is
$$W_{N-}^T = W_N^T + W_N^T x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N - y_-^T x_-^T C_{N-}.$$
The sign inside the bracket is negative because the term $x_- x_-^T$ is subtracted from $X X^T$ when the matrix inversion lemma is applied.
The intermediate matrix $C_{N-}$ and model coefficient matrix $W_{N-}$ obtained after removing the sampling point then replace the current intermediate matrix $C_N$ and the current model coefficient matrix $W_N$.
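The remove-point update can be sketched in the same way. Again this is an illustration under stated assumptions rather than the patent's implementation: scalar output, symmetric $C_N$, one point removed at a time, and hypothetical names and sample values. The denominator is $1 - x_-^T C_N x_-$, as obtained by applying the matrix inversion lemma to $(X X^T - x_- x_-^T)^{-1}$.

```python
def mat_vec(C, x):
    """Product of a square matrix (list of rows) and a vector."""
    return [sum(c * v for c, v in zip(row, x)) for row in C]

def dot(a, b):
    """Inner product of two vectors."""
    return sum(p * q for p, q in zip(a, b))

def remove_point(C, W, x, y):
    """Return (C_{N-}, W_{N-}) after removing one sample (x_-, y_-).

    Assumes scalar y and symmetric C, so x^T C equals (C x)^T.
    """
    k = len(x)
    Cx = mat_vec(C, x)                 # C_N x_-
    denom = 1.0 - dot(x, Cx)           # scalar I - x_-^T C_N x_-
    C_new = [[C[i][j] + Cx[i] * Cx[j] / denom for j in range(k)]
             for i in range(k)]        # C_{N-}
    gain = dot(W, x) / denom           # W_N^T x_- (I - x_-^T C_N x_-)^{-1}
    C_new_x = mat_vec(C_new, x)        # C_{N-} x_-
    W_new = [W[i] + gain * Cx[i] - y * C_new_x[i] for i in range(k)]
    return C_new, W_new

# Hypothetical window of three samples: ([1, 0], 1), ([0, 1], 2), ([1, 1], 4),
# for which X X^T = [[2, 1], [1, 2]] and Y^T X^T = [5, 6].
C = [[2.0 / 3.0, -1.0 / 3.0], [-1.0 / 3.0, 2.0 / 3.0]]
W = [4.0 / 3.0, 7.0 / 3.0]
C_minus, W_minus = remove_point(C, W, x=[1.0, 0.0], y=1.0)
```

After the oldest sample is removed, the two remaining samples are fitted exactly by $W = [2,\ 2]$, with $C = \begin{bmatrix}2 & -1\\ -1 & 1\end{bmatrix}$.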
In step S1, the window length of the initial data sequence is adjusted according to need and the available computing power. If the initial data sequence actually obtained is long, say of length M, a smaller m can be set as the data window length: the model coefficients computed in S1 from the sequence of m samples serve as the initial model coefficients, and steps S2 and S3 are then applied to add, one at a time, each sampling point after point m. After each change of the sequence, the corresponding model coefficients are obtained and recorded as the current model coefficients for use in the next iteration, until the model coefficients corresponding to sampling point M are obtained.
Preferably, in steps S1, S3, and S4, each computed model coefficient result is recorded as the model coefficients of the latest sampling point in the corresponding sampling data sequence.
The invention additionally discloses a system for determining the regression coefficients of a system linear regression prediction model, comprising:
an initial model coefficient determination module, configured to obtain the initial system sampling data sequence based on a preset data window length, compute the model coefficients of the system linear regression prediction model from it, and record them;
a sampling data change acquisition module, configured to obtain the added system sampling point data, form a new system sampling data sequence based on the preset data window length, and determine which sampling point data the new sequence has gained and which it has lost relative to the initial sampling data sequence;
a model coefficient update module, configured to update the model coefficients based on the added (or removed) sampling point data and the current model coefficients, then update the model coefficients again based on the removed (or added) sampling point data and the updated model coefficients, and record the result.
Each time a system sampling point is added, the sampling data change acquisition module obtains the new system sampling data sequence, and the model coefficient update module computes and records the model coefficients after the added sampling point has changed the sampling data sequence.
Beneficial effects
Compared with the prior art, the present invention has the following advantages and improvements:
1. The model coefficients are computed iteratively. Each iteration only needs to obtain, from the monitoring system or an existing database, the newly added sample values and a constant number of historical input samples, regardless of the sampling window length. Even if the sampling window covers a year or more, each iteration step fetches only a small amount of data, which reduces the demands on the server's storage I/O bandwidth, memory capacity, and so on.
2. Each iteration step depends only on the model coefficients and intermediate matrix obtained in the previous step; these are the state carried between adjacent iterations. If the model coefficient dimension is k, the state comprises only k*(k+1) values, so very little storage is needed to support the iteration.
3. The iteration uses only matrix addition, subtraction, and multiplication and requires no costly matrix inversion or decomposition, so the computational load is small and the data can be processed in real time.
4. Each iteration is computationally light and fetches little data, so a large number of models can be processed in parallel with limited server resources, and the model coefficients can be updated in real time as monitoring samples arrive, yielding the continuous trend of the coefficients in real time.
Brief description of the drawings
Fig. 1 is a flow diagram of the method of the present invention;
Fig. 2 is a flow diagram of a specific embodiment of the method of the present invention;
Fig. 3 shows the results of an application example of the present invention.
Detailed description of embodiments
The invention is further described below with reference to the drawings and specific embodiments.
Symbols:
X: system input; Y: system output; W: model coefficient; C: intermediate matrix; I: identity matrix; A: a matrix or vector; unless otherwise stated, vectors are column vectors.
Superscripts:
T: transpose; -1: inverse; H: conjugate transpose
Subscripts:
+: added data; -: removed data
Embodiment 1
As shown in Fig. 1, the method of this embodiment for determining the regression coefficients of a system linear regression prediction model comprises:
S1: based on a preset data window length, obtain the initial system sampling data sequence, compute the model coefficients of the system linear regression prediction model from it, and record them;
S2: obtain the added system sampling point data and, based on the preset data window length, form a new system sampling data sequence; determine which sampling point data the new sequence has gained and which it has lost relative to the initial sampling data sequence;
S3: update the model coefficients based on the added (or removed) sampling point data and the current model coefficients; then update the model coefficients again based on the removed (or added) sampling point data and the updated model coefficients, and record the result;
S4: as system sampling points are added, repeat steps S2 and S3 to obtain and record the model coefficients after each added sampling point changes the sampling data sequence.
The invention is also used to analyze changes in system performance; that is, it further comprises step S5: analyze the change in system performance based on the recorded model coefficients. Existing techniques can be used to analyze system performance from the change in the model coefficients.
In step S1, for the system model $Y^T = W^T X$, where $Y$ is the system output corresponding to the sampled input $X$ and $W$ is the model coefficient, the intermediate matrix is $C_N = (X X^T)^{-1}$ and the model coefficient is $W_N^T = Y^T X^T C_N$.
Ridge regression is considered effective for scenarios in which the sampling point data are highly collinear. In step S1, therefore, for the system model $Y^T = W^T X$, the intermediate matrix can be taken as $C_N = (X X^T + \lambda I)^{-1}$, where $\lambda$ is the ridge regression coefficient and $I$ is the identity matrix, with the current model coefficient matrix $W_N^T = Y^T X^T C_N$. Making this change at initialization turns the method as a whole into a ridge regression method.
In step S2, each time a sampling point is added, its data is written into the sampling data window; this newly written sampling point data is the "added system sampling point data". Because the sampling data window has a preset length, the sampling point data with the earliest sampling time in the sequence is pushed out of the window; this pushed-out sampling point data is the "removed sampling point data". The length of the sampling data window can be set as required, which is prior art; the iteration itself is independent of the window length.
In step S3, the model coefficients can first be updated from the added sampling point data and the current model coefficients and then updated from the removed sampling point data and the current model coefficients; the order can also be reversed.
Updating the model coefficients based on the added system sampling point data and the current model coefficients proceeds as follows:
Define the input sampling data sequence matrix as $X = [x_{N-m+1}\ \dots\ x_N]$,
and the output sampling data matrix as $Y^T = [y_{N-m+1}\ \dots\ y_N]$.
The added input/output sampling point data $x_{N+1}$ / $y_{N+1}$ form the matrices $x_+ = x_{N+1}$ and $y_+ = y_{N+1}$.
The intermediate matrix after the sampling point is added is then
$$C_{N+} = C_N - C_N x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N,$$
and the model coefficient matrix after the sampling point is added is
$$W_{N+}^T = W_N^T - W_N^T x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N + y_+^T x_+^T C_{N+}.$$
The intermediate matrix $C_{N+}$ and model coefficient matrix $W_{N+}$ obtained after adding the sampling point then replace the current intermediate matrix $C_N$ and the current model coefficient matrix $W_N$.
Updating the model coefficients based on the removed system sampling point data and the current model coefficients proceeds as follows:
The removed input/output sampling point data $x_{N-m+1}$ / $y_{N-m+1}$ form the matrices $x_- = x_{N-m+1}$ and $y_- = y_{N-m+1}$.
The intermediate matrix after the sampling point is removed is then
$$C_{N-} = C_N + C_N x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N,$$
and the model coefficient matrix after the sampling point is removed is
$$W_{N-}^T = W_N^T + W_N^T x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N - y_-^T x_-^T C_{N-}.$$
The intermediate matrix $C_{N-}$ and model coefficient matrix $W_{N-}$ obtained after removing the sampling point then replace the current intermediate matrix $C_N$ and the current model coefficient matrix $W_N$.
In step S1, the window length of the initial data sequence is adjusted according to need and the available computing power. If the initial data sequence actually obtained is long, say of length M, a smaller m can be set as the data window length: the model coefficients computed in S1 from the sequence of m samples serve as the initial model coefficients, and steps S2 and S3 are then applied to add, one at a time, each sampling point after point m. After each change of the sequence, the corresponding model coefficients are obtained and recorded as the current model coefficients for use in the next iteration, until the model coefficients corresponding to sampling point M are obtained.
In steps S1, S3, and S4, each computed model coefficient result is recorded as the model coefficients of the latest sampling point in the corresponding sampling data sequence.
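The principle of the method is as follows. In the overdetermined system $Y^T = W^T X$, $W$ is the coefficient to be determined, and each column of the input sampling matrix $X$ is one sample. The least squares solution for $W$ is
$$W^T = Y^T X^T (X X^T)^{-1}.$$
The matrix inversion lemma states:
$$(A + X Y^H)^{-1} = A^{-1} - A^{-1} X (I + Y^H A^{-1} X)^{-1} Y^H A^{-1}.$$
Based on this formula, the cases of adding a new sampling point to the original sampling data set $X$ and of removing an old sampling point are discussed below, with $C = (X X^T)^{-1}$ denoting the intermediate matrix.
Iterative coefficient update when adding data:
Suppose $W_N$ and $C_N$ have been obtained for the sampling window $X_N$, $Y_N$. After a new sampling point $x_+$, $y_+$ is added, the new $X$ is $X_{N+1} = [X_N\ x_+]$ and the new $Y$ satisfies $Y_{N+1}^T = [Y_N^T\ y_+^T]$.
The iterative formula for $C_{N+1}$ is therefore:
$$\begin{aligned}
C_{N+1} &= (X_{N+1} X_{N+1}^T)^{-1} = (X_N X_N^T + x_+ x_+^T)^{-1} \\
&= (X_N X_N^T)^{-1} - (X_N X_N^T)^{-1} x_+ \left(I + x_+^T (X_N X_N^T)^{-1} x_+\right)^{-1} x_+^T (X_N X_N^T)^{-1} \\
&= C_N - C_N x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N
\end{aligned}$$
Because $W_N^T = Y_N^T X_N^T C_N$, the iterative formula for $W_{N+1}$ is:
$$\begin{aligned}
W_{N+1}^T &= Y_{N+1}^T X_{N+1}^T (X_{N+1} X_{N+1}^T)^{-1} = (Y_N^T X_N^T + y_+^T x_+^T) C_{N+1} \\
&= Y_N^T X_N^T C_N - Y_N^T X_N^T C_N x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N + y_+^T x_+^T C_{N+1} \\
&= W_N^T - W_N^T x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N + y_+^T x_+^T C_{N+1}
\end{aligned}$$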
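Iterative coefficient update when removing data:
Suppose $W_N$ and $C_N$ have been obtained for the sampling window $X_N$, $Y_N$. Write $X_N = [x_-\ X_{N+1}]$ and $Y_N^T = [y_-^T\ Y_{N+1}^T]$, so that after the old sampling point $x_-$, $y_-$ is removed, the new sampling window data is $X_{N+1}$, $Y_{N+1}$.
Then:
$$X_N X_N^T = [x_-\ X_{N+1}][x_-\ X_{N+1}]^T = X_{N+1} X_{N+1}^T + x_- x_-^T$$
Applying the matrix inversion lemma with $X = -x_-$ and $Y = x_-$ gives the iterative formula for $C_{N+1}$:
$$\begin{aligned}
C_{N+1} &= (X_{N+1} X_{N+1}^T)^{-1} = (X_N X_N^T - x_- x_-^T)^{-1} \\
&= (X_N X_N^T)^{-1} + (X_N X_N^T)^{-1} x_- \left(I - x_-^T (X_N X_N^T)^{-1} x_-\right)^{-1} x_-^T (X_N X_N^T)^{-1} \\
&= C_N + C_N x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N
\end{aligned}$$
Because $W_N^T = Y_N^T X_N^T C_N$, the iterative formula for $W_{N+1}$ is:
$$\begin{aligned}
W_{N+1}^T &= Y_{N+1}^T X_{N+1}^T (X_{N+1} X_{N+1}^T)^{-1} = (Y_N^T X_N^T - y_-^T x_-^T) C_{N+1} \\
&= Y_N^T X_N^T C_N + Y_N^T X_N^T C_N x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N - y_-^T x_-^T C_{N+1} \\
&= W_N^T + W_N^T x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N - y_-^T x_-^T C_{N+1}
\end{aligned}$$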
Embodiment 1-1
The method comprises two independent procedures: iteratively adding data and iteratively removing data. Given the regression coefficients of a group of sampling points, iteratively adding data yields the regression coefficients of that group together with the added sampling points. Given the regression coefficients of a group of sampling points, iteratively removing data yields the regression coefficients of the group after some of its sampling points have been removed.
To ensure numerical stability, in each iteration this embodiment first adds data and then removes data. With reference to Fig. 2, the specific steps are as follows.
1. Initialization:
1.1 Let the current sampling point N point to sampling point i;
1.2 Obtain the original input data $x_{N-m+1}, \dots, x_N$ and form the matrix $X = [x_{N-m+1}\ \dots\ x_N]$;
1.3 Obtain the original output data $y_{N-m+1}, \dots, y_N$ and form the matrix $Y^T = [y_{N-m+1}\ \dots\ y_N]$;
1.4 Compute the intermediate matrix $C_N$ from $C_N = (X X^T)^{-1}$;
1.5 Compute the model coefficients $W_N$ from $W_N^T = Y^T X^T C_N$;
1.6 Record the state $W_N$, $C_N$ as the current model coefficients and the current intermediate matrix.
At this point $W_N$ and $C_N$ have been computed from the m data points of sampling points i-m+1 to i; that is, they are the model coefficients and intermediate matrix corresponding to sampling point i.
2. Add a data point:
2.1 Obtain the added original input data $x_{N+1}$ and form the matrix $x_+ = x_{N+1}$; the new input sampling matrix is $X_{N+1} = [X_N\ x_+]$;
2.2 Obtain the added original output data $y_{N+1}$ and form the matrix $y_+ = y_{N+1}$; the new output sampling matrix satisfies $Y_{N+1}^T = [Y_N^T\ y_+^T]$;
2.3 Compute $C_{N+}$ iteratively from the current intermediate matrix $C_N$ using $C_{N+} = C_N - C_N x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N$;
2.4 Compute $W_{N+}$ iteratively from the current model coefficients $W_N$, the intermediate matrix $C_N$, and the result $C_{N+}$ of step 2.3 using $W_{N+}^T = W_N^T - W_N^T x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N + y_+^T x_+^T C_{N+}$;
2.5 Record the state $W_N = W_{N+}$, $C_N = C_{N+}$ as the current model coefficients and the current intermediate matrix.
At this point $W_N$ and $C_N$ have been computed from the m+1 data points of sampling points N-m+1 to N+1.
3. Remove a data point:
3.1 Obtain the removed original input data $x_{N-m+1}$ and form the matrix $x_- = x_{N-m+1}$;
3.2 Obtain the removed original output data $y_{N-m+1}$ and form the matrix $y_- = y_{N-m+1}$;
3.3 Compute $C_{N-}$ iteratively from the current intermediate matrix $C_N$ using $C_{N-} = C_N + C_N x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N$;
3.4 Compute $W_{N-}$ iteratively from the current model coefficients $W_N$, the intermediate matrix $C_N$, and the result $C_{N-}$ of step 3.3 using $W_{N-}^T = W_N^T + W_N^T x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N - y_-^T x_-^T C_{N-}$;
3.5 Record the state $W_N = W_{N-}$, $C_N = C_{N-}$ as the current model coefficients and the current intermediate matrix.
At this point $W_N$ and $C_N$ have been computed from the m data points of sampling points N-m+2 to N+1; that is, they are the model coefficients and intermediate matrix of sampling point i+1.
3.6 Save the state $W_N$ to the database as the model coefficients of sampling point i+1 for subsequent analysis.
4. Point the current sampling point N to i+1 and repeat steps 2, 3, and 4 to obtain the model coefficients of the subsequent sampling points in turn.
In the above process, a window of length m is in effect opened on the sequence of sampling points. In each iteration, a new sampling point is first added at the tail of the window and a sampling point is then removed at its head, which is equivalent to sliding the whole window forward by one position. From the values of $W_N$ and $C_N$ before the iteration and the sampling point data added to and removed from the window, $W_{N+1}$ and $C_{N+1}$ at the next sampling point can be computed, independently of the window length m. Therefore, when the model coefficients at all sampling points need to be computed in turn, the iterative algorithm greatly reduces the computational complexity.
In the above process, the number k of sampling points added at the tail of the window in each iteration may be greater than 1. The number of sampling points removed at the head of the window should then match the number added, i.e. also k. From the values of $W_N$ and $C_N$ before the iteration and the sampling point data added to and removed from the window, $W_{N+k}$ and $C_{N+k}$ after k sampling points can be computed.
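For k points handled at once, the matrix inversion lemma gives a block form of the intermediate matrix update. The lines below are an illustrative sketch, assuming the k added inputs are collected as the columns of a matrix $X_+$ and the k removed inputs as the columns of $X_-$ (both names are introduced here, not taken from the source); the inner inverse is then of size k×k rather than a scalar reciprocal.

```latex
C_{N+} = C_N - C_N X_+ (I_k + X_+^T C_N X_+)^{-1} X_+^T C_N, \qquad
C_{N-} = C_N + C_N X_- (I_k - X_-^T C_N X_-)^{-1} X_-^T C_N
```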
In the above process, the initialization step need not load all m sampling points of the window at once; instead, step 2 can be called repeatedly to add sampling points one by one until all m have been added. This reduces the server memory used during initialization.
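The whole sliding-window procedure, including initialization by repeated point addition, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: it uses the ridge variant so that the empty-window state $C = (\lambda I)^{-1}$, $W = 0$ is well defined, a scalar output, one point per step, and hypothetical names and noiseless test data.

```python
def mat_vec(C, x):
    """Product of a square matrix (list of rows) and a vector."""
    return [sum(c * v for c, v in zip(row, x)) for row in C]

def dot(a, b):
    """Inner product of two vectors."""
    return sum(p * q for p, q in zip(a, b))

def rank_one_step(C, W, x, y, sign):
    """Add (sign=+1) or remove (sign=-1) one sample (x, y); scalar y assumed."""
    k = len(x)
    Cx = mat_vec(C, x)
    denom = 1.0 + sign * dot(x, Cx)    # I + x^T C x when adding, I - x^T C x when removing
    C_new = [[C[i][j] - sign * Cx[i] * Cx[j] / denom for j in range(k)]
             for i in range(k)]
    gain = sign * dot(W, x) / denom
    C_new_x = mat_vec(C_new, x)
    W_new = [W[i] - gain * Cx[i] + sign * y * C_new_x[i] for i in range(k)]
    return C_new, W_new

def sliding_coefficients(samples, m, lam=1e-3):
    """Return [(index, W)] for every sample index at which the window is full."""
    k = len(samples[0][0])
    # Empty-window ridge state: C = (lam * I)^-1, W = 0.
    C = [[(1.0 / lam if i == j else 0.0) for j in range(k)] for i in range(k)]
    W = [0.0] * k
    history = []
    for n, (x, y) in enumerate(samples):
        C, W = rank_one_step(C, W, x, y, +1)              # step 2: add newest point
        if n >= m:
            x_old, y_old = samples[n - m]
            C, W = rank_one_step(C, W, x_old, y_old, -1)  # step 3: remove oldest point
        if n >= m - 1:
            history.append((n, W))                        # coefficients of sampling point n
    return history

# Hypothetical noiseless data: y = 2 + 3 * t with inputs x = [1, t].
samples = [([1.0, float(i % 5)], 2.0 + 3.0 * (i % 5)) for i in range(10)]
history = sliding_coefficients(samples, m=4)
fresh = sliding_coefficients(samples[-4:], m=4)  # rebuilt from the last window only
```

Comparing `history[-1]` with `fresh[-1]` checks that sliding the window gives the same coefficients as rebuilding from the final window alone. With the small `lam` used here, both should be close to the generating coefficients `[2, 3]`.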
Embodiment 2
This embodiment discloses a system for determining the regression coefficients of a system linear regression prediction model, comprising:
an initial model coefficient determination module, configured to obtain the initial system sampling data sequence based on a preset data window length, compute the model coefficients of the system linear regression prediction model from it, and record them;
a sampling data change acquisition module, configured to obtain the added system sampling point data, form a new system sampling data sequence based on the preset data window length, and determine which sampling point data the new sequence has gained and which it has lost relative to the initial sampling data sequence;
a model coefficient update module, configured to update the model coefficients based on the added (or removed) sampling point data and the current model coefficients, then update the model coefficients again based on the removed (or added) sampling point data and the updated model coefficients, and record the result.
Each time a system sampling point is added, the sampling data change acquisition module obtains the new system sampling data sequence, and the model coefficient update module computes and records the model coefficients after the added sampling point has changed the sampling data sequence.
As an application example, the invention is used in a photovoltaic power station O&M scenario, where historical data is typically logged at intervals of 15 minutes or less. On this basis, over a 25-year station life cycle about 1,000,000 records will be logged in total, about 40,000 of them per year. To estimate the effect of component aging on power generation, the estimation model takes 8 monitored parameters, such as weather conditions, as inputs and the actual power generation as output. In the model, the component performance degradation rate increases gradually over time, which shows up as a continuous change over time in the coefficients of the 8 input parameters and in the constant term of the model. To reduce the influence of chance events such as shading, cleaning, and maintenance on the aging estimate, one continuous year of data is chosen as the time window for the model calculation. Therefore, once the station has been operating for a full year, the system computes the model coefficients at every monitoring sampling time point from the data of the preceding year. In the end, the model coefficients at every sampling time point in the remaining 24 years of the station's life cycle are obtained. From the change in the model coefficients, the performance degradation at different stages of the station's full life cycle can be assessed.
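The round figures quoted above can be checked with a quick calculation. The sketch assumes an interval of exactly 15 minutes and 365-day years, and takes the coefficient dimension as k = 9 (8 inputs plus the constant term) in the k*(k+1) state count stated under the beneficial effects; the variable names are illustrative.

```python
interval_minutes = 15
samples_per_day = 24 * 60 // interval_minutes   # records logged per day
samples_per_year = samples_per_day * 365        # size of a one-year window
samples_lifetime = samples_per_year * 25        # records over a 25-year life cycle

k = 8 + 1                                       # 8 input coefficients plus one constant term
state_values = k * (k + 1)                      # values carried between iterations

print(samples_per_year, samples_lifetime, state_values)  # 35040 876000 90
```

So each iteration carries 90 numbers between steps, however many of the roughly 35,000 window samples the coefficients summarize.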
To evaluate the error behavior of the iterative calculation, a simulation was carried out for the above application scenario. The simulation created 1,000,000 noisy power station operating records and, using 40,000 records as the analysis window size, iteratively calculated the absolute error of the model coefficients at every data point. In the resulting figure, the abscissa is the sampling point index and the ordinate is the absolute error between the regressed values and the preset values of the 8 input coefficients and the constant term. As the figure shows, none of the error terms diverges; data9, the absolute error of the constant term, is the largest of all the errors but still does not exceed the order of 1e-10.
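A scaled-down version of such an experiment can be sketched as follows. This is not the simulation from the text: the sizes (2,000 samples, window of 200), the random inputs, the noise-free linear output and all variable names are assumptions chosen so that the regressed coefficients can be compared directly with the preset ones. Each step adds the newest sample and removes the oldest one using rank-one Sherman-Morrison updates of the intermediate matrix, whose denominators are 1 + x^T C x for an addition and 1 - x^T C x for a removal.

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, window, n_samples = 8, 200, 2000   # scaled-down, hypothetical sizes

# Preset coefficients: 8 input weights plus a constant term (9 values).
w_true = rng.normal(size=n_inputs + 1)

# Columns are samples; the last row of ones carries the constant term.
X = np.vstack([rng.normal(size=(n_inputs, n_samples)), np.ones(n_samples)])
y = w_true @ X                               # exactly linear output

# Initial window: direct solution.
Xw, yw = X[:, :window], y[:window]
C = np.linalg.inv(Xw @ Xw.T)
w = yw @ Xw.T @ C

max_abs_err = np.abs(w - w_true)
for k in range(window, n_samples):
    # Add the newest sample.
    x, t = X[:, k], y[k]
    Cx = C @ x
    d = 1.0 + x @ Cx
    C_new = C - np.outer(Cx, Cx) / d
    w = w - (w @ x) * Cx / d + t * (C_new @ x)
    C = C_new

    # Remove the oldest sample, using the just-updated C and w.
    x, t = X[:, k - window], y[k - window]
    Cx = C @ x
    d = 1.0 - x @ Cx
    C_new = C + np.outer(Cx, Cx) / d
    w = w + (w @ x) * Cx / d - t * (C_new @ x)
    C = C_new

    max_abs_err = np.maximum(max_abs_err, np.abs(w - w_true))

print(max_abs_err)   # largest absolute error per coefficient over the run
```

The matrix inverse is computed only once, for the initial window; every later step costs a handful of matrix-vector products.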
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the specific embodiments described above, which are merely illustrative rather than restrictive. Under the teaching of the present invention, those of ordinary skill in the art may devise many other forms without departing from the purpose of the invention and the scope protected by the claims, all of which fall within the protection of the present invention.

Claims (8)

1. A method for determining the regression coefficients of a system linear regression prediction model, characterized by comprising:
S1: based on a preset data window length, obtaining an initial system sampling data sequence, calculating the model coefficients of the system linear regression prediction model from the initial sampling data sequence, and recording them;
S2: obtaining added system sampling point data and, based on the preset data window length, forming a new system sampling data sequence; determining the system sampling point data added to and removed from the new system sampling data sequence relative to the initial sampling data sequence;
S3: updating the model coefficients based on the added (or removed) system sampling point data and the current model coefficients; then updating the model coefficients again based on the removed (or added) system sampling point data and the updated model coefficients, and recording them;
S4: as system sampling points are added, repeating steps S2 to S4 to obtain and record the model coefficients after each addition of sampling point data and the resulting change in the sampling data sequence.
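The window bookkeeping in step S2 can be illustrated with a short sketch. It assumes samples arrive one at a time and that the window has a fixed length; the function name `slide_window` and the shape of the yielded tuples are hypothetical and not taken from the claims.

```python
from collections import deque

def slide_window(samples, window_length):
    """Yield (window, added, removed) each time a full window is available.

    For the first full window nothing has been removed yet, so `removed`
    is None; that window corresponds to the initial sequence of step S1.
    """
    window = deque(maxlen=window_length)
    for sample in samples:
        # The oldest sample leaves only when the window is already full.
        removed = window[0] if len(window) == window_length else None
        window.append(sample)   # a full deque drops its oldest item here
        if len(window) == window_length:
            yield tuple(window), sample, removed

for window, added, removed in slide_window(range(5), 3):
    print(window, added, removed)
# (0, 1, 2) 2 None
# (1, 2, 3) 3 0
# (2, 3, 4) 4 1
```

Each yielded pair of added and removed samples is exactly the input that step S3 needs for its two successive coefficient updates.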
2. The method according to claim 1, characterized by further comprising step S5: analyzing changes in system performance based on the recorded model coefficients.
3. The method according to claim 1, characterized in that, in step S1, for the system model $Y^T = W^T X$, where $Y$ is the system output corresponding to the sampled input $X$ and $W$ is the model coefficient, the intermediate matrix is $C_N = (XX^T)^{-1}$ and the model coefficient is $W_N^T = Y^T X^T C_N$.
4. The method according to claim 1, characterized in that, in step S1, for the system model $Y^T = W^T X$, where $Y$ is the system output corresponding to the sampled input $X$ and $W$ is the model coefficient, the intermediate matrix is $C_N = (XX^T + \lambda I)^{-1}$, where $\lambda$ is the ridge regression coefficient and $I$ is the identity matrix, and the current model coefficient matrix is $W_N^T = Y^T X^T C_N$.
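The two initial-window formulas differ only in the ridge term, so a single function can cover both. The sketch below is a minimal illustration under stated assumptions: samples are stored as columns of `X` (shape n by m), `Y` has one row per sample (shape m by p), and the function name, the random test data and the value `ridge=10.0` are hypothetical.

```python
import numpy as np

def initial_coefficients(X, Y, ridge=0.0):
    """Return (C, W^T) with C = (X X^T + ridge * I)^-1 and W^T = Y^T X^T C."""
    n = X.shape[0]
    C = np.linalg.inv(X @ X.T + ridge * np.eye(n))
    W_T = Y.T @ X.T @ C
    return C, W_T

rng = np.random.default_rng(0)
W_true = np.array([[2.0], [-1.0], [0.5]])   # hypothetical coefficients, n = 3
X = rng.normal(size=(3, 50))                # 50 samples stored as columns
Y = X.T @ W_true                            # noise-free outputs, shape (50, 1)

_, W_ols_T = initial_coefficients(X, Y)               # plain least squares
_, W_ridge_T = initial_coefficients(X, Y, ridge=10.0)  # ridge variant
print(W_ols_T)     # recovers W_true up to rounding
print(W_ridge_T)   # shrunk toward zero by the ridge term
```

The ridge term keeps the matrix invertible when the window contains nearly collinear inputs, at the price of a small bias in the coefficients.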
5. The method according to claim 3 or 4, characterized in that updating the model coefficients based on the added system sampling point data and the current model coefficients comprises:
defining the initial sampling data sequence matrix of the input as $X = [x_{N-m+1} \dots x_N]$,
and the sampling data matrix of the output as $Y^T = [y_{N-m+1} \dots y_N]$;
the added input/output system sampling point data being $x_{N+1}$ / $y_{N+1}$, forming the matrices $x_+ = x_{N+1}$ and $y_+ = y_{N+1}$;
the intermediate matrix after the sampling point data are added then being
$$C_{N+} = C_N - C_N x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N,$$
and the model coefficient matrix after the sampling point data are added being
$$W_{N+}^T = W_N^T - W_N^T x_+ (I + x_+^T C_N x_+)^{-1} x_+^T C_N + y_+^T x_+^T C_{N+};$$
the intermediate matrix $C_{N+}$ and the model coefficient matrix $W_{N+}$ after the addition then being taken as the current intermediate matrix $C_N$ and the current model coefficient matrix $W_N$.
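The addition step can be checked numerically against a full recomputation. The sketch below applies the Woodbury identity, in which the small inner factor is the inverse of $I + x_+^T C_N x_+$. It assumes the new inputs are stored as columns of `x_add` (shape n by k) and the matching outputs as rows of `y_add` (shape k by p); the function name and the random test data are hypothetical.

```python
import numpy as np

def add_samples(C, W_T, x_add, y_add):
    """Fold new samples into (C, W^T) without re-inverting X X^T.

    C     : (n, n) current intermediate matrix
    W_T   : (p, n) current model coefficients, transposed
    x_add : (n, k) new input columns
    y_add : (k, p) matching outputs, one row per new sample
    """
    k = x_add.shape[1]
    # Small k-by-k inverse; for a single sample this is a scalar reciprocal.
    G = np.linalg.inv(np.eye(k) + x_add.T @ C @ x_add)
    C_new = C - C @ x_add @ G @ x_add.T @ C
    W_T_new = W_T - W_T @ x_add @ G @ x_add.T @ C + y_add.T @ x_add.T @ C_new
    return C_new, W_T_new

# Compare with a direct solution on hypothetical random data.
rng = np.random.default_rng(1)
X = rng.normal(size=(3, 11))            # 10 existing samples plus 1 new one
Y = rng.normal(size=(11, 1))
C0 = np.linalg.inv(X[:, :10] @ X[:, :10].T)
W0_T = Y[:10].T @ X[:, :10].T @ C0

C1, W1_T = add_samples(C0, W0_T, X[:, 10:], Y[10:])
C_ref = np.linalg.inv(X @ X.T)
W_ref_T = Y.T @ X.T @ C_ref
print(np.allclose(C1, C_ref), np.allclose(W1_T, W_ref_T))
```

Only a k-by-k matrix is inverted per update, so the cost is independent of the window length.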
6. The method according to claim 5, characterized in that updating the model coefficients based on the removed system sampling point data and the current model coefficients comprises:
defining the removed input/output system sampling point data as $x_{N-m+1}$ / $y_{N-m+1}$, forming the matrices $x_- = x_{N-m+1}$ and $y_- = y_{N-m+1}$;
the intermediate matrix after the sampling point data are removed then being
$$C_{N-} = C_N + C_N x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N,$$
and the model coefficient matrix after the sampling point data are removed being
$$W_{N-}^T = W_N^T + W_N^T x_- (I - x_-^T C_N x_-)^{-1} x_-^T C_N - y_-^T x_-^T C_{N-};$$
the intermediate matrix $C_{N-}$ and the model coefficient matrix $W_{N-}^T$ after the removal then being taken as the current intermediate matrix $C_N$ and the current model coefficient matrix $W_N$.
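The removal step can be verified in the same way. The sketch below uses the Woodbury identity for a downdate, in which the inner factor is the inverse of $I - x_-^T C_N x_-$. As before, removed inputs are assumed to be columns of `x_rem` (shape n by k) with outputs as rows of `y_rem` (shape k by p); the function name and the random test data are hypothetical.

```python
import numpy as np

def remove_samples(C, W_T, x_rem, y_rem):
    """Drop old samples from (C, W^T) without re-inverting X X^T.

    C     : (n, n) current intermediate matrix
    W_T   : (p, n) current model coefficients, transposed
    x_rem : (n, k) input columns leaving the window
    y_rem : (k, p) matching outputs, one row per removed sample
    """
    k = x_rem.shape[1]
    # Note the minus sign: removing data subtracts x x^T from X X^T.
    G = np.linalg.inv(np.eye(k) - x_rem.T @ C @ x_rem)
    C_new = C + C @ x_rem @ G @ x_rem.T @ C
    W_T_new = W_T + W_T @ x_rem @ G @ x_rem.T @ C - y_rem.T @ x_rem.T @ C_new
    return C_new, W_T_new

# Compare with a direct solution on hypothetical random data.
rng = np.random.default_rng(2)
X = rng.normal(size=(3, 11))            # window before the oldest sample leaves
Y = rng.normal(size=(11, 1))
C0 = np.linalg.inv(X @ X.T)
W0_T = Y.T @ X.T @ C0

C1, W1_T = remove_samples(C0, W0_T, X[:, :1], Y[:1])
C_ref = np.linalg.inv(X[:, 1:] @ X[:, 1:].T)
W_ref_T = Y[1:].T @ X[:, 1:].T @ C_ref
print(np.allclose(C1, C_ref), np.allclose(W1_T, W_ref_T))
```

The inner factor stays well defined as long as the remaining samples still span the input space, which holds when the window is much longer than the number of inputs.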
7. The method according to claim 1, characterized in that, in steps S1, S3 and S4, each calculated model coefficient result is recorded as the model coefficient corresponding to the most recent sampling point in the corresponding sampling data sequence.
8. A system for determining the regression coefficients of a system linear regression prediction model, characterized by comprising:
an initial model coefficient determination module, configured to obtain an initial system sampling data sequence based on a preset data window length, calculate the model coefficients of the system linear regression prediction model from the initial sampling data sequence, and record them;
a sampling data change acquisition module, configured to obtain added system sampling point data and, based on the preset data window length, form a new system sampling data sequence, and to determine the system sampling point data added to and removed from the new system sampling data sequence relative to the initial sampling data sequence;
a model coefficient update calculation module, configured to update the model coefficients based on the added (or removed) system sampling point data and the current model coefficients, then to update the model coefficients again based on the removed (or added) system sampling point data and the updated model coefficients, and to record the result;
wherein, each time a system sampling point is added, the sampling data change acquisition module obtains the new system sampling data sequence, and the model coefficient update calculation module calculates and records the model coefficients after each addition of sampling point data and the resulting change in the sampling data sequence.
CN201910334251.6A 2019-04-24 2019-04-24 Regression coefficient determination method of system linear regression prediction model Active CN110135012B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910334251.6A CN110135012B (en) 2019-04-24 2019-04-24 Regression coefficient determination method of system linear regression prediction model

Publications (2)

Publication Number Publication Date
CN110135012A true CN110135012A (en) 2019-08-16
CN110135012B CN110135012B (en) 2023-12-22

Family

ID=67571023

Country Status (1)

Country Link
CN (1) CN110135012B (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 301-1, Block B, Building 1, No. 1, Zhongguancun East Road, Haidian District, Beijing 100080

Applicant after: Beijing Qidi Qingyun Intelligent Energy Co.,Ltd.

Address before: Room 301-1, Block B, Building 1, No. 1, Zhongguancun East Road, Haidian District, Beijing 100080

Applicant before: BEIJING TSING YUN ENERGY TECHNOLOGY Co.,Ltd.

GR01 Patent grant