Specific embodiment
The structure of a specific embodiment of the invention is shown in Figure 1. The embodiment comprises three layers. The first layer, the on-site measurement and control layer, comprises the on-site sensing and controlled-equipment modules, IP-mode intelligent measurement and control devices, IP fuzzy control devices, and the group of on-site controlled equipment. The second layer, the enterprise-level supervisory layer, comprises an Ethernet switch or hub, on-site monitoring computers, a real-time database server, a measurement and control strategy server, and a web server. The third layer, the Internet-based remote measurement and control layer, mainly comprises remote workstations and terminal computers. The working principle and implementation process of this embodiment are as follows:
In this embodiment, multiple IP-mode intelligent measurement and control devices of the IP-mode collaborative measurement and control (TT&C) system are first deployed in the monitored areas. Each device has 8 channels and is connected to the parameters to be acquired according to the requirements of the application. The same channel on every device is connected to the same sensing parameter (the same number and type of parameters), and each device uploads its acquired values through the switch to the IP-mode real-time database server in the enterprise supervisory layer (see Figure 2). Within a limited time, the real-time database applies two-layer real-time compression to the multi-parameter data gathered from all measurement and control devices, exploiting the coupling between parameters according to the PLSR and SBR principles.
The technical core of the invention is a two-layer compression in the IP-mode real-time database: the first layer compresses the coupling between the main parameter and the auxiliary parameters based on the PLSR principle, and the second layer compresses the multiple correlations among the remaining auxiliary parameters based on the SBR method (see Figure 3). Specifically:
1. First-layer compression method based on PLSR:
Definition 1. Among the parameters of the IP-mode collaborative TT&C system, some are closely related to others. A parameter that is determined by other parameters is called a main parameter, and the parameters that influence it are called auxiliary parameters. Suppose the IP-mode collaborative monitoring platform has $m$ sensing parameters, of which $p$ are auxiliary parameters (also called independent variables) $\{x_1, x_2, \ldots, x_p\}$. After $n$ sample points are observed, they form the independent-variable data table $X = [x_1, x_2, \ldots, x_p]_{n \times p}$. Because a single main parameter (also called the dependent variable) is the most common case among the sensing parameters of the IP-mode collaborative TT&C system, the PLSR_SBR two-layer real-time compression method is mainly studied for a single dependent variable, whose data are written $Y = [y]_{n \times 1}$. If $q$ parameters are each related to other parameters, the dependent-variable data are written $Y = [y_1, y_2, \ldots, y_q]_{n \times q}$.
When building the PLSR model, the choice of the number of components is very important. Among the computed components the first is the most important, and importance decreases as further components are added, so that the later components eventually reflect only noise. The dimension of the model can therefore be determined by a compression validity check, defined as follows.
Definition 2. Suppose $h$ components are to be extracted. First, remove one sample point $i$ and fit a regression equation with $h$ components to the remaining $n-1$ sample points; then substitute the excluded sample point $i$ into the fitted equation to obtain the fitted value $\hat{y}_{h(-i)}$ of the dependent variable at sample point $i$. Repeating these steps for each sample $i = 1, 2, \ldots, n$ gives the prediction error sum of squares of the dependent variable $y$:

$$PRESS_h = \sum_{i=1}^{n} \left( y_i - \hat{y}_{h(-i)} \right)^2$$

Next, fit a regression equation with $h$ components to all sample points. Writing the predicted value at the $i$-th sample point as $\hat{y}_{hi}$, the error sum of squares of $y$ is

$$SS_h = \sum_{i=1}^{n} \left( y_i - \hat{y}_{hi} \right)^2$$

The compression validity of the $h$-th component is then

$$Q_h^2 = 1 - \frac{PRESS_h}{SS_{h-1}}$$

The compression validity measures the marginal contribution of component $t_h$ to the accuracy of the fitted model on the following scale: when $Q_h^2 > 1 - 0.95^2 = 0.0975$, the marginal contribution of $t_h$ is significant. Clearly, $Q_h^2 > 0.0975$ and $PRESS_h / SS_{h-1} < 0.95^2$ are fully equivalent decision rules. In that case adding component $t_h$ improves the model significantly, so adding it is clearly worthwhile.
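The compression validity check can be sketched in a few lines of Python. This is a minimal sketch, assuming the conventional definition $Q_h^2 = 1 - PRESS_h / SS_{h-1}$; the names `loo_press`, `cross_validity`, `component_is_significant` and the `fit` callback are hypothetical and not part of the described system. Any regression routine that returns a predictor can be passed as `fit`.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]
Predictor = Callable[[Row], float]
Fitter = Callable[[List[Row], List[float]], Predictor]


def loo_press(rows: List[Row], ys: List[float], fit: Fitter) -> float:
    """Leave-one-out prediction error sum of squares (PRESS)."""
    press = 0.0
    for i in range(len(rows)):
        # Fit on every sample except i, then predict the excluded sample.
        predict = fit(rows[:i] + rows[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - predict(rows[i])) ** 2
    return press


def cross_validity(press_h: float, ss_prev: float) -> float:
    """Q_h^2 = 1 - PRESS_h / SS_{h-1} (assumed conventional definition)."""
    if ss_prev <= 0.0:
        raise ValueError("SS_{h-1} must be positive")
    return 1.0 - press_h / ss_prev


def component_is_significant(press_h: float, ss_prev: float,
                             limit: float = 0.95) -> bool:
    """True when Q_h^2 > 1 - limit^2, i.e. PRESS_h / SS_{h-1} < limit^2."""
    return cross_validity(press_h, ss_prev) > 1.0 - limit ** 2
```

In use, components would be added one at a time and extraction would stop at the first $h$ for which `component_is_significant` returns `False`.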
The first layer of the PLSR_SBR two-layer real-time compression method extracts components $t_j$ and $u$ from $X$ and $Y$ respectively, where $t_j$ is a linear combination of $x_1, x_2, \ldots, x_p$ and $u$ is a linear combination of $y$. The extracted components $t_j$ and $u$ must satisfy the following two conditions:
1. $t_j$ and $u$ should each carry as much of the variation information in their own data table as possible;
2. the correlation between $t_j$ and $u$ should reach its maximum.
The first-layer compression method comprises the following five steps:
(1) Data standardization
The dependent variable and the independent variables are standardized separately. A translation transformation centers the data, and scaling removes the effect of dimension (units). After standardization, the centroid of the data coincides with the origin.

$$F_{0j} = \frac{y_j - E(y_j)}{S_{y_j}}, \qquad E_{0i} = \frac{x_i - E(x_i)}{S_{x_i}}$$

In the formula, $F_0$ and $E_0$ are the standardized matrices of $Y$ and $X$; $E(y_j)$ and $E(x_i)$ are the means of $Y$ and $X$; $S_{y_j}$ and $S_{x_i}$ are the standard deviations of $Y$ and $X$; and $n$ is the sample size.
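The standardization step can be illustrated as follows. This is a minimal sketch: the function name `standardize_columns` is hypothetical, and the use of the sample standard deviation (denominator $n - 1$) is an assumption, since the text does not say which denominator is used.

```python
import math
from typing import List


def standardize_columns(table: List[List[float]]) -> List[List[float]]:
    """Center each column on its mean and scale it by its standard deviation."""
    n = len(table)
    cols = len(table[0])
    means = [sum(row[j] for row in table) / n for j in range(cols)]
    # Assumption: sample standard deviation with an (n - 1) denominator.
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in table) / (n - 1))
        for j in range(cols)
    ]
    if any(s == 0.0 for s in stds):
        raise ValueError("a constant column cannot be standardized")
    return [[(row[j] - means[j]) / stds[j] for j in range(cols)] for row in table]
```

Applied to the data table $X$ this yields $E_0$; applied to $Y$ (as a table with one column per dependent variable) it yields $F_0$.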
(2) Extraction of the first component $t_1$
Given $F_0$ and $E_0$, extract the first component $t_1$ from $E_0$: $t_1 = E_0 W_1$, where $W_1$ is the first axis of $E_0$ and $\|W_1\| = 1$. $t_1$ is a linear combination of the standardized variables $x_1^*, x_2^*, \ldots, x_p^*$, that is, a re-integration of the original information, with $W_1$ as the combination coefficients. Extract the first component $u_1$ from $F_0$: $u_1 = F_0 C_1$, where $C_1$ is the first axis of $F_0$ and $\|C_1\| = 1$. $t_1$ and $u_1$ are required to represent the variation in $X$ and $Y$ well, and $t_1$ must have the greatest explanatory power for $u_1$. Following the ideas of principal component analysis (PCA) and canonical correlation analysis, this amounts to maximizing the covariance of $t_1$ and $u_1$, which is an optimization problem. That is,

$$\operatorname{Cov}(t_1, u_1) = \sqrt{\operatorname{Var}(t_1)\,\operatorname{Var}(u_1)}\; r(t_1, u_1)$$

is to be maximized, where $r(t_1, u_1)$ is the correlation between $t_1$ and $u_1$. The problem is therefore to maximize $W_1^T E_0^T F_0 C_1$ subject to the constraints $\|W_1\| = 1$ and $\|C_1\| = 1$. Applying the Lagrange multiplier method gives, after derivation:

$$E_0^T F_0 F_0^T E_0 W_1 = \theta_1^2 W_1, \qquad F_0^T E_0 E_0^T F_0 C_1 = \theta_1^2 C_1$$

$\theta_1$ is exactly the value of the objective function of the optimization problem, $W_1$ is an eigenvector of $E_0^T F_0 F_0^T E_0$, and $\theta_1^2$ is the corresponding eigenvalue. For $\theta_1$ to be maximal, $W_1$ must be the unit eigenvector of $E_0^T F_0 F_0^T E_0$ for its largest eigenvalue, and $C_1$ the unit eigenvector of $F_0^T E_0 E_0^T F_0$ for its largest eigenvalue $\theta_1^2$.
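Because the dependent variable $F_0$ in the multidimensional data of the IP-mode real-time database is a single variable, $C_1$ is a constant ($C_1 = 1$), which gives

$$u_1 = F_0$$

According to the iterative formula of partial least squares regression,

$$W_1 = \frac{E_0^T F_0}{\|E_0^T F_0\|}$$

Because $E_0$ and $F_0$ are standardized, it follows that

$$W_1 = \frac{1}{\sqrt{\sum_{j=1}^{p} r^2(x_j, y)}} \begin{bmatrix} r(x_1, y) \\ \vdots \\ r(x_p, y) \end{bmatrix}$$

Once $W_1$ has been obtained, the component is

$$t_1 = E_0 W_1 = \frac{1}{\sqrt{\sum_{j=1}^{p} r^2(x_j, y)}} \left[ r(x_1, y) E_{01} + \cdots + r(x_p, y) E_{0p} \right]$$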
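For a single standardized dependent variable, $E_0^T F_0$ is proportional to the vector of correlations $r(x_j, y)$, so the first axis $W_1$ can be computed directly from those correlations. The sketch below illustrates this; the names `pearson` and `first_axis` are hypothetical, and the input is passed column by column for simplicity.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient r(x, y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def first_axis(columns: List[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Unit vector W_1 proportional to [r(x_1, y), ..., r(x_p, y)]."""
    r = [pearson(col, y) for col in columns]
    norm = math.sqrt(sum(v * v for v in r))
    return [v / norm for v in r]
```

An auxiliary parameter that correlates more strongly with the main parameter therefore receives a larger weight in $t_1$.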
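Regress $E_0$ and $F_0$ on $t_1$:

$$E_0 = t_1 P_1^T + E_1$$

$$F_0 = t_1 r_1 + F_1 \qquad (13)$$

In the formulas,

$$P_1 = \frac{E_0^T t_1}{\|t_1\|^2}, \qquad r_1 = \frac{F_0^T t_1}{\|t_1\|^2}$$

are the corresponding regression coefficient vector and scalar. The residual matrices are

$$E_1 = E_0 - t_1 P_1^T$$

$$F_1 = F_0 - t_1 r_1 \qquad (15)$$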
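(3) Extraction of the second component $t_2$
Replace $E_0$ with $E_1$ and $F_0$ with $F_1$, and find the second axis $W_2$ and the second component $t_2$ by the same method:

$$t_2 = E_1 W_2 \qquad (17)$$

$W_2$ is the unit eigenvector of the matrix $E_1^T F_1 F_1^T E_1$ for its largest eigenvalue $\theta_2^2$:

$$W_2 = \frac{E_1^T F_1}{\|E_1^T F_1\|}$$

Regressing $E_1$ and $F_1$ on $t_2$ gives

$$E_1 = t_2 P_2^T + E_2$$

$$F_1 = t_2 r_2 + F_2 \qquad (19)$$

In the formulas:

$$P_2 = \frac{E_1^T t_2}{\|t_2\|^2}, \qquad r_2 = \frac{F_1^T t_2}{\|t_2\|^2}$$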
(4) Extraction of the $h$-th component $t_h$
The $h$-th component $t_h$ is obtained in the same way. The number of components $h$ can be determined by the compression validity principle, and $h$ is less than the rank of $X$.
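(5) Derivation of the PLSR regression model equation
The least squares regression equation of $F_0$ on $t_1, t_2, \ldots, t_h$ is:

$$F_0 = t_1 r_1 + t_2 r_2 + \cdots + t_h r_h \qquad (21)$$

Because $t_1, t_2, \ldots, t_h$ are all linear combinations of $E_{01}, E_{02}, \ldots, E_{0p}$, the properties of partial least squares regression give

$$t_h = E_{h-1} W_h = E_0 W_h^*$$

Write

$$W_h^* = \prod_{j=1}^{h-1} \left( I - W_j P_j^T \right) W_h$$

where $P_j$ is the regression coefficient vector of $E_{j-1}$ on $t_j$. Then

$$\hat{F}_0 = r_1 E_0 W_1^* + r_2 E_0 W_2^* + \cdots + r_h E_0 W_h^*$$

Writing $x_i^* = E_{0i}$, $y^* = F_0$ and $a_i = \sum_{k=1}^{h} r_k W_{ki}^*$, the regression equation is:

$$\hat{y}^* = a_1 x_1^* + a_2 x_2^* + \cdots + a_p x_p^*$$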
As the above procedure shows, the first layer of the PLSR_SBR two-layer real-time compression method adopts a strategy of information decomposition and extraction. From the multivariable system $x_1, x_2, \ldots, x_p$ it extracts composite variables $t_1, t_2, \ldots, t_h$ ($h < p$) one by one. This is equivalent to recombining and extracting the information in $x_1, x_2, \ldots, x_p$, so as to obtain composite variables that explain $Y$ best while also summarizing the information in the independent-variable set $X$. At the same time, information that does not help explain $Y$ is naturally excluded, which yields good information compression. If the multidimensional parameters contain several dependent variables, each represented by several other parameters, the first-layer compression process is run again for each of them.
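The five steps of the first-layer compression correspond to the classical partial least squares loop for a single dependent variable. The following is a minimal pure-Python sketch under that assumption, not the implementation used in the described system; the function name `pls1_coefficients` and the helper matrix `M` are hypothetical. It takes already standardized (or at least centered) data and returns the coefficients $a_1, \ldots, a_p$ of the final regression equation.

```python
import math
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def pls1_coefficients(E0: Matrix, F0: Vector, h: int) -> Vector:
    """Coefficients a_1..a_p with F0 ~ sum_j a_j * E0[:, j], using h components."""
    n, p = len(E0), len(E0[0])
    E = [row[:] for row in E0]
    F = F0[:]
    coeffs = [0.0] * p
    # M accumulates prod_{j<k} (I - w_j P_j^T), so that w_k^* = M w_k.
    M = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    for _ in range(h):
        raw = [sum(E[i][j] * F[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in raw))
        if norm == 0.0:
            break  # nothing left in F that E can explain
        w = [v / norm for v in raw]                      # axis W_k
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        r = sum(F[i] * t[i] for i in range(n)) / tt      # scalar r_k
        w_star = [sum(M[i][j] * w[j] for j in range(p)) for i in range(p)]
        coeffs = [coeffs[j] + r * w_star[j] for j in range(p)]
        # Deflate E and F, then extend the product M <- M (I - w load^T).
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        F = [F[i] - t[i] * r for i in range(n)]
        M = [[M[i][j] - w_star[i] * load[j] for j in range(p)] for i in range(p)]
    return coeffs
```

When $h$ equals the rank of $E_0$ and the dependent variable is an exact linear combination of the columns, the result coincides with ordinary least squares; with fewer components the coefficients are a compressed approximation.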
From the above analysis, in the PLSR_SBR two-layer real-time compression method the result I of the first-layer compression stores each dependent-variable data series as a $(1+p)$-tuple:
1. row: the index of the main parameter among the multidimensional parameters;
2. $a_{1j}, a_{2j}, \ldots, a_{pj}$: the regression model parameters, where $1, 2, \ldots, p$ are the indices of the auxiliary parameters among the multidimensional parameters.
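The stored $(1+p)$-tuple is enough to rebuild a main parameter from the auxiliary parameters at decompression time. The sketch below illustrates this; the names `FirstLayerRecord` and `reconstruct_main` are hypothetical, and the values are assumed to be in standardized form, as in the regression equation.

```python
from typing import NamedTuple, Sequence, Tuple


class FirstLayerRecord(NamedTuple):
    row: int                    # index of the main parameter
    coeffs: Tuple[float, ...]   # regression parameters a_1j .. a_pj


def reconstruct_main(record: FirstLayerRecord,
                     aux_values: Sequence[float]) -> float:
    """Rebuild one (standardized) main-parameter value from p auxiliary values."""
    if len(aux_values) != len(record.coeffs):
        raise ValueError("expected one auxiliary value per coefficient")
    return sum(a * x for a, x in zip(record.coeffs, aux_values))
```

Only the record is stored for the main parameter; its $n$ sample values are dropped, which is where the first-layer saving comes from.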
2. Second-layer compression method based on improved SBR:
After the first-layer compression of the PLSR_SBR two-layer real-time compression method, the remaining independent variables, which are strongly correlated with one another (such as temperature, humidity and pressure), go straight on to the second-layer compression. Let $x_i\{x_{i1}, \ldots, x_{in}\}$ and $x_j\{x_{j1}, \ldots, x_{jn}\}$ be two independent-variable data sequences with $n$ sample points each; a distance metric expresses the distance between the $i$-th and the $j$-th sequence. Because the data sequences in the IP-mode real-time database are subject to tight time limits, the amount of data is small, and in particular the measured values of many parameters change little and show a pronounced linear correlation, the distance between sequences is defined as follows.
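Definition 3. A least squares linear fit of $x_i\{x_{i1}, \ldots, x_{in}\}$ is performed on the basis of $x_j\{x_{j1}, \ldots, x_{jn}\}$, with fitting result $\hat{x}_{ik} = a\,x_{jk} + b$ ($k = 1, \ldots, n$). Based on the Euclidean distance, the fitted model is evaluated as

$$d(x_i, x_j) = \sqrt{\sum_{k=1}^{n} \left( x_{ik} - a\,x_{jk} - b \right)^2}$$

with the requirement $d(x_i, x_j) \le \delta$, where $\delta$ is the preset total error of the data sequence fit.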
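The fit in Definition 3 can be sketched as a simple linear regression of one sequence on another, scored by the Euclidean norm of the residuals. The form of the error measure is an assumption read from the phrase "based on Euclidean distance"; the function name `fit_linear` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_linear(base: Sequence[float],
               target: Sequence[float]) -> Tuple[float, float, float]:
    """Least squares fit target ~ a * base + b; returns (a, b, error)."""
    n = len(base)
    mean_b = sum(base) / n
    mean_t = sum(target) / n
    sbb = sum((u - mean_b) ** 2 for u in base)
    sbt = sum((u - mean_b) * (v - mean_t) for u, v in zip(base, target))
    a = sbt / sbb if sbb else 0.0   # a constant base carries no slope
    b = mean_t - a * mean_b
    # Assumption: the error is the Euclidean norm of the residual vector.
    error = math.sqrt(sum((v - (a * u + b)) ** 2 for u, v in zip(base, target)))
    return a, b, error
```

A fit is accepted when the returned `error` does not exceed the preset bound $\delta$.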
Definition 4. If a least squares linear fit of $x_i\{x_{i1}, \ldots, x_{ik}\}$, where $k < n$, is performed on the basis of $x_j\{x_{j1}, \ldots, x_{jn}\}$, then the usage amount of $x_j\{x_{j1}, \ldots, x_{jn}\}$ is said to be $k$.
In the second layer of the PLSR_SBR two-layer real-time compression method, all independent-variable data sequences are divided into subsequences that are finally mapped onto the base signal. The mapping result II is expressed as a record of 5 elements:
1. row: the row number of the data sequence. When the data sequence is split during variable-length decomposition, a 1 or 2 is appended to the original row number to distinguish the left and right sequences after the split (two sequences from one), for example 11 and 12. If the decomposition continues, further digits are appended in the same way so that the sequences can be identified during decompression.
2. shift: the offset of the subsequence within the base signal;
3. a, b: the parameters of the simple linear regression;
4. error: the total error of the simple linear regression.
Each fitted data sequence can thus be represented as (row, shift, a, b, error).
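The five-element record and its use at decompression time can be sketched as below. The names `Mapping`, `split_rows` and `decompress` are hypothetical. The record does not store the subsequence length, so the sketch takes it as an argument; in practice it would have to be derived from the row number, which is an assumption here.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Mapping:
    row: int      # row number, with 1/2 appended per split (e.g. 11, 12)
    shift: int    # offset of the subsequence within the base signal
    a: float      # slope of the simple linear regression
    b: float      # intercept of the simple linear regression
    error: float  # total error of the fit


def split_rows(row: int) -> Tuple[int, int]:
    """Row numbers of the left and right halves after one split."""
    return row * 10 + 1, row * 10 + 2


def decompress(m: Mapping, base_signal: Sequence[float], length: int) -> List[float]:
    """Rebuild an approximate subsequence from the base signal and the record."""
    window = base_signal[m.shift:m.shift + length]
    return [m.a * v + m.b for v in window]
```

For example, a record with `shift=2`, `a=2.0`, `b=1.0` and a length of 3 reads three values of the base signal starting at offset 2 and applies the linear map to each.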
The second-layer compression mainly comprises the following three steps:
(1) Build the base signal from base sequences.
Building the base signal is the most important step in the second layer of the PLSR_SBR two-layer real-time compression method. The base signal is the reference against which the remaining data sequences are fitted and compressed, and it determines both the compression ratio of the IP-mode real-time database and the overall error of the sequences after compression. Base sequences are subsequences selected from the remaining sequences. Because the base signal is composed of base sequences, choosing how many subsequences make up the base signal is the key issue; the procedure is shown in Figure 4.
The sequences of the remaining $m-q$ auxiliary parameters (each with $n$ data points) are concatenated into one long sequence, which is then divided into $K$ groups of $W$ data points each, where [the relation between $K$ and $W$ is missing in the source text]; each run of $W$ data points is called a subsequence. Each subsequence $i$ in turn is then used to fit every existing subsequence $j$ (including itself), giving the error $Error_j$ of each fit. Subject to the constraint $linearErr(Cand_j) \ge Error_j$, once all subsequences have been fitted, the benefit value of the current sequence $i$ is obtained by the following formula:
[benefit formula missing in the source text]
where $linearErr(Cand_j)$ is the self-fitting error of the fitted sequence.
All subsequences are sorted in descending order of benefit value and stored in the variable base_list. A binary search (see the right-hand part of Figure 4) then selects the first Ins subsequences in base_list (called base sequences) to form the base signal, such that the overall error obtained by fitting all the remaining original data sequences with the first Ins base sequences is smaller than that obtained with the first Ins-1 or Ins+1 base sequences. The overall error obtained by fitting all the remaining measurement data sequences with Ins base sequences is computed by the following step.
(2) Apply variable-length decomposition to the remaining original data sequences to obtain the final total fitting error ε.
The whole process of variable-length decomposition of the remaining original data sequences is shown in Figure 5: each of the remaining data sequences is mapped onto the base signal. Mapping takes the base signal as the reference. The original data sequence is first fitted starting at the leftmost end of the base sequences, and one fit yields a five-element record (row, shift, $a_1$, $b_1$, $error_1$). The original data sequence is then moved one position to the right along the base signal, that is, shift = shift + 1, which yields another five-element record (row, shift, $a_2$, $b_2$, $error_2$). The sequence is shifted in this way until shift = s × w − w, giving (s − 1) × w + 1 five-element records in total. Comparing the error in each record, the record with the smallest error represents this original data sequence. All sequences in the original data are then sorted in descending order of their final fitting error. The first original data sequence (the one with the largest error) is split into two equal subsequences, each of which is mapped onto the base signal, and the sequences are again sorted in descending order of fitting error. To limit the number of final subsequences and so improve the compression ratio, this procedure is repeated as long as the total error is not below the system total error $\varepsilon_{total}$, until the length of the decomposed subsequences falls below 6. The final total fitting error ε is computed with the formula in Definition 3.
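The shift search and the variable-length decomposition described above can be sketched together. This is a minimal sketch under stated assumptions: the total error is taken as the sum of the per-sequence errors, the fit error is the Euclidean norm of the residuals, and the names `fit_linear`, `best_shift` and `decompose` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Record = Tuple[int, int, float, float, float]  # (row, shift, a, b, error)


def fit_linear(base: Sequence[float], target: Sequence[float]) -> Tuple[float, float, float]:
    """Least squares fit target ~ a * base + b; returns (a, b, error)."""
    n = len(base)
    mb, mt = sum(base) / n, sum(target) / n
    sbb = sum((u - mb) ** 2 for u in base)
    sbt = sum((u - mb) * (v - mt) for u, v in zip(base, target))
    a = sbt / sbb if sbb else 0.0
    b = mt - a * mb
    err = math.sqrt(sum((v - (a * u + b)) ** 2 for u, v in zip(base, target)))
    return a, b, err


def best_shift(base_signal: Sequence[float], seq: Sequence[float]) -> Tuple[int, float, float, float]:
    """Slide seq along the base signal and keep the offset with the smallest error."""
    if len(seq) > len(base_signal):
        raise ValueError("sequence is longer than the base signal")
    best = None
    for shift in range(len(base_signal) - len(seq) + 1):
        a, b, err = fit_linear(base_signal[shift:shift + len(seq)], seq)
        if best is None or err < best[3]:
            best = (shift, a, b, err)
    return best


def decompose(base_signal: Sequence[float],
              sequences: List[Tuple[int, List[float]]],
              eps_total: float, min_len: int = 6) -> List[Record]:
    """Halve the worst-fitting sequence until the summed error is within eps_total."""
    pieces = [(row, seq, best_shift(base_signal, seq)) for row, seq in sequences]
    while sum(p[2][3] for p in pieces) > eps_total:
        # Only sequences whose halves stay at least min_len long may be split.
        candidates = [i for i, p in enumerate(pieces) if len(p[1]) // 2 >= min_len]
        if not candidates:
            break
        worst = max(candidates, key=lambda i: pieces[i][2][3])
        row, seq, _ = pieces.pop(worst)
        half = len(seq) // 2
        for tag, part in ((1, seq[:half]), (2, seq[half:])):
            pieces.append((row * 10 + tag, part, best_shift(base_signal, part)))
    return [(row, *mapping) for row, _, mapping in pieces]
```

If the loop ends because no sequence can be split further while the error is still too large, the base signal itself has to be improved, which is the subject of the next step.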
(3) Update the base signal.
If the final overall error ε turns out to be greater than the user's set value $\varepsilon_{set}$, the decomposition result does not meet the requirement, and a way of reducing the overall error ε is needed. The approach taken here is to update the base signal while keeping the variable-length decomposition procedure for the remaining original data sequences unchanged. A combined usage-and-cumulative-error criterion is proposed: among the data sequences of the base signal, the usage amount $k$ of Definition 4 is applied to select the sequences used least often; when several sequences have the same usage, their cumulative errors are compared to find the data sequence with the largest cumulative error. That sequence is replaced by the subsequence with the largest benefit value in the base signal of the preceding compressed data sequences from the same time period, so that the final overall error ε again meets the user's set value $\varepsilon_{set}$.
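The usage-and-cumulative-error criterion amounts to a two-key selection. The sketch below illustrates it; the function name `pick_base_to_replace` and the two input lists (one usage count and one cumulative error per base sequence) are hypothetical bookkeeping structures, not taken from the source.

```python
from typing import Sequence


def pick_base_to_replace(usage: Sequence[int],
                         cumulative_error: Sequence[float]) -> int:
    """Index of the base sequence to replace.

    The least-used base sequence is chosen; among equally used ones,
    the one with the largest cumulative error wins.
    """
    if len(usage) != len(cumulative_error) or not usage:
        raise ValueError("need one usage count and one error per base sequence")
    return min(range(len(usage)), key=lambda i: (usage[i], -cumulative_error[i]))
```

The selected slot would then be overwritten with the replacement subsequence and the decomposition rerun to check the overall error against the set value.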
Combining the above methods, the general formula for the compression ratio of the PLSR_SBR two-layer real-time compression method in the IP-mode real-time database of the IP-mode collaborative TT&C system is:
[formula (26) missing in the source text]
In formula (26), $t$ is the total number of sequences obtained after variable-length decomposition of the remaining data sequences.
The invention solves a problem common to similar fields and can be widely applied in the chemical industry, grain depots and military ammunition warehouses, cigarette, pharmaceutical and textile manufacturing, and can be extended directly to modern greenhouses. It has the following two main characteristics:
(1) The PLSR_SBR two-layer real-time compression method is best suited to the following conditions: main and auxiliary parameters exist, and the data of different parameters, or of the same parameter over different time periods, are strongly correlated. The stronger the correlation between data sequences, the better the compression; the larger the volume of distributed data, the higher the compression ratio. This accommodates the arbitrary expansion of measurement and control points in the IP-mode collaborative monitoring platform and reduces both the storage space of the real-time database and the data traffic of the platform;
(2) When the data volume is very large, compression can be applied to the same parameter across multiple nodes, which greatly improves the compression ratio.
The above is only a preferred embodiment of the invention, but the scope of protection of the invention is not limited to it. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be defined by the scope of protection of the claims.