CN101807046A - Online modeling method based on extreme learning machine with adjustable structure

Online modeling method based on extreme learning machine with adjustable structure

Info

Publication number
CN101807046A
CN101807046A (application number CN 201010119408 / CN201010119408A)
Authority
CN
China
Prior art date
Legal status
Granted
Application number
CN 201010119408
Other languages
Chinese (zh)
Other versions
CN101807046B (en)
Inventor
刘民
李国虎
董明宇
吴澄
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Priority date
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN2010101194082A
Publication of CN101807046A
Application granted
Publication of CN101807046B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an online modeling method based on an extreme learning machine with an adjustable structure. It belongs to the fields of automatic control, information technology and advanced manufacturing, and in particular provides a method for adjusting the structure and parameters of an extreme learning machine during its online learning process so as to accommodate newly acquired data. The method comprises the following steps: define the concept of a category ball; in each learning pass, judge whether the newly acquired data falls outside the category ball and reduces modeling accuracy; if so, add a new hidden node; if not, only adjust the center and radius of the category ball; finally, update the output-layer weights of the extreme learning machine. The method first introduces the category ball to enclose the data already used in earlier training. When determining the parameters of a newly added hidden node, the node's output at the point of the category ball nearest to the new sample is made small enough that its output on the used data can be regarded as zero, and an update formula for the output-layer weights is given. By adding hidden nodes, the method improves online modeling accuracy.

Description

Online modeling method based on structure-adjustable extreme learning machine
Technical Field
The invention belongs to the fields of automatic control, information technology and advanced manufacturing, and particularly relates to a method for adjusting the structure and parameters of an extreme learning machine in the online learning process of the extreme learning machine to accommodate newly acquired data.
Background
In many modeling environments oriented to actual industrial process detection, control and optimization, the data required for modeling often arrive sequentially. Accordingly, academia and industry have proposed online modeling methods (online learning methods) such as RAN, RANEKF, MRAN, GAP-RBF and GGAP-RBF, which adjust the model structure and parameters online according to newly generated data, accommodating new information without re-modeling on all acquired data. However, most of these methods suffer from drawbacks such as many parameters to be tuned and slow training, which seriously limit their practical effectiveness. The recently emerged OS-ELM method reduces the number of parameters to be tuned to one, but it lacks structural adjustability, so its ability to accommodate new information is relatively limited and the model accuracy cannot be further improved.
Disclosure of Invention
To solve the online modeling problem, the invention provides an online modeling method based on a structure-adjustable extreme learning machine (SAO-ELM for short). In SAO-ELM, the basic network structure is the same as that of an ELM (Extreme Learning Machine) network, but the number of hidden nodes can be adjusted during online modeling. The main difficulty in adding hidden nodes during modeling is that the training goal of SAO-ELM is to minimize the error sum of the adjusted model over all training data, yet in each online learning pass the previously used training data must be discarded, which makes the outputs of newly added hidden nodes on those discarded data unknown. To this end, the invention defines a category ball that encloses all the used training data, and records and updates the ball's center and radius as new data arrive. When a hidden node is added, its excitation function is chosen as a Gaussian function, and a suitable center and width are selected so that the node's output at the point of the category ball closest to the new sample is small enough; the output of the newly added hidden node on the discarded data can then be regarded as 0. Under these conditions, an iterative update formula for the output-layer weights when hidden nodes are added can be derived, thereby achieving online modeling based on a structure-adjustable extreme learning machine.
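To make the category-ball mechanism concrete, the following minimal Python sketch (illustrative only; the variable names and test values are not from the patent) computes the point on the ball closest to a new sample and the width bound that keeps the new Gaussian node's output negligible on the enclosed, already discarded data:

```python
import numpy as np

def nearest_point_on_ball(x_o, R, x_a):
    """Point C on the category ball (center x_o, radius R) closest to the new
    sample x_a: x_c = x_o + lambda_1 (x_a - x_o), lambda_1 = R / ||x_a - x_o||."""
    lam1 = R / np.linalg.norm(x_a - x_o)
    return x_o + lam1 * (x_a - x_o)

def width_bound(x_c, a, eps):
    """Largest width b with exp(-||x_c - a|| / b) <= eps, i.e.
    b <= -||x_c - a|| / ln(eps) for 0 < eps < 1."""
    return -np.linalg.norm(x_c - a) / np.log(eps)

# Illustrative values: a unit ball around the used data, new sample outside it.
x_o, R = np.zeros(3), 1.0
x_a = np.array([3.0, 0.0, 0.0])          # new sample; also the new node's center a
x_c = nearest_point_on_ball(x_o, R, x_a)
b = width_bound(x_c, x_a, eps=1e-6)
print(x_c, b)                            # output at x_c equals eps when b hits the bound
```

With b at or below this bound, the node's output everywhere inside the ball is below ε, which is what justifies treating its output on the discarded data as 0.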
An online modeling method based on a structure-adjustable extreme learning machine is characterized by being realized according to the following steps:
step (1): model selection and parameter initialization
Set the number M of hidden-layer nodes of the single-hidden-layer extreme learning machine; the number of input-layer nodes equals the dimension n of the training samples, and the number of output nodes equals the dimension m of the training target;
The excitation function \(G(a_i, b_i, x)\) of the hidden nodes adopts a Gaussian function; the center \(a_i\) and width \(b_i\) of each hidden node, \(i = 1, 2, \ldots, M\), are determined randomly;
Based on the first N samples \(X_0 = \{(x_i, t_i)\}_{i=1}^{N}\), train the extreme learning machine to obtain the initial hidden-layer output matrix \(H_0\) and the output-layer connection matrix \(\beta_0\), where

\[\beta_0 = (H_0^T H_0)^{-1} H_0^T T_0\]

\[H_0 = \begin{bmatrix} G(a_1, b_1, x_1) & \cdots & G(a_M, b_M, x_1) \\ \vdots & & \vdots \\ G(a_1, b_1, x_N) & \cdots & G(a_M, b_M, x_N) \end{bmatrix}_{N \times M}\]

\[T_0 = \begin{bmatrix} t_{11} & \cdots & t_{1m} \\ \vdots & & \vdots \\ t_{N1} & \cdots & t_{Nm} \end{bmatrix} = \begin{bmatrix} t_1^T \\ \vdots \\ t_N^T \end{bmatrix}_{N \times m}\]
Initialize the matrix K so that \(K_0 = H_0^T H_0\); compute and save \(\beta_0\), \(K_0\), and \(P_0 = K_0^{-1}\);
Enclose the initial training sample set \(X_0\) with a category ball O, so that the ball just encloses all the sample points of \(X_0\), and determine the ball's center \(C_0\) and radius \(R_0\);
Step (2): online learning process
When newly added training data \(x_1 = (x_{N+1}, t_{N+1})\) arrive, train the ELM as follows, so that the knowledge stored from \(X_0\) is retained while the new knowledge contained in \(x_1\) is accommodated:
Step (2.1): keep the network structure unchanged and adjust the output-layer connection matrix \(\beta_0\) according to \(x_1\) only; the updated output-layer weight matrix is \(\beta_1\); simultaneously update the matrices \(K_0\) and \(P_0\) to \(K_1\) and \(P_1\):
\[\beta_1 = \beta_0 + P_1 H_1^T (T_1 - H_1 \beta_0)\]

\[P_1 = K_1^{-1} = P_0 - P_0 H_1^T (I + H_1 P_0 H_1^T)^{-1} H_1 P_0\]

\[K_1 = K_0 + H_1^T H_1\]
where \(H_1\) and \(T_1\) are, respectively, the hidden-layer output matrix and the training-target matrix of the ELM for the new sample \(x_1\), i.e.

\[H_1 = \begin{bmatrix} G(a_1, b_1, x_{N+1}) & \cdots & G(a_M, b_M, x_{N+1}) \end{bmatrix}_{1 \times M}\]

\[T_1 = \begin{bmatrix} t_{(N+1)1} & \cdots & t_{(N+1)m} \end{bmatrix} = \begin{bmatrix} t_{N+1}^T \end{bmatrix}_{1 \times m}\]
Step (2.2): use the ELM with the adjusted parameters to compute the training error e on the newly added sample \(x_1\), and judge whether the new sample \(x_1\) lies outside the category ball O; if \(x_1\) is outside ball O and e is larger than the set threshold, abandon all the adjustments of step (2.1) and go to step (2.3); otherwise, go to step (3);
Step (2.3): add one hidden node, setting its center a to \(x_1\); the width b is determined by

\[b \le -\frac{\|x_c - a\|}{\ln \varepsilon}\]
where \(\varepsilon\) is a preset threshold and \(x_c\) is the coordinate of the point on the category ball O closest to the new sample point \(x_1\), which can be determined by

\[x_c = x_o + \lambda_1 (x_a - x_o)\]

where \(x_o\) is the center coordinate of ball O, and \(\lambda_1\) can be determined by

\[\lambda_1 = \frac{\|x_c - x_o\|}{\|x_a - x_o\|} = \frac{R}{\|x_a - x_o\|}\]
where \(x_a\) is the coordinate of the new sample point \(x_1\); readjust the output-layer connection matrix \(\beta_0\) to \(\beta_1\), and accordingly update the matrices \(K_0\) and \(P_0\) to \(K_1\) and \(P_1\), so that
\[\beta_1 = P_1 \begin{bmatrix} K_0 \beta_0 + H_1^T T_1 \\ H_{11}^T T_1 \end{bmatrix}, \qquad K_1 = \begin{bmatrix} K_0 + H_1^T H_1 & H_1^T H_{11} \\ H_{11}^T H_1 & H_{11}^T H_{11} \end{bmatrix}, \qquad P_1 = K_1^{-1} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\]

\[A_{11} = P_1' + P_1' (H_1^T H_{11}) R^{-1} (H_{11}^T H_1) P_1', \qquad A_{12} = -P_1' (H_1^T H_{11}) R^{-1}, \qquad A_{21} = A_{12}^T, \qquad A_{22} = R^{-1}\]

\[R = H_{11}^T H_{11} - (H_{11}^T H_1) P_1' (H_1^T H_{11})\]

\[P_1' = (K_0 + H_1^T H_1)^{-1} = P_0 - P_0 H_1^T (I + H_1 P_0 H_1^T)^{-1} H_1 P_0, \qquad P_0 = K_0^{-1}\]
where \(H_{01}\) and \(H_{11}\) are the hidden-layer output matrices of the newly added hidden nodes on the original sample set \(X_0\) and on the new sample point \(x_1\), respectively, i.e.

\[H_{01} = \big[ G(a_{M+j}, b_{M+j}, x_i) \big]_{N \times L}, \quad i = 1, \ldots, N, \qquad H_{11} = \big[ G(a_{M+j}, b_{M+j}, x_{N+k}) \big]_{N_1 \times L}, \quad k = 1, \ldots, N_1, \quad j = 1, \ldots, L\]

\(N_1\) and L are, respectively, the number of sample points in \(x_1\) and the number of newly added hidden nodes; since the new sample points arrive one by one, \(N_1 = L = 1\);
Step (3): update the parameters of the category ball O
Update the parameters of the category ball O, i.e., its center coordinate and radius, so that the new ball \(O_1\) just encloses all the sample points of \(X_0\) and \(x_1\); the update formulas are as follows:
\[R_{\mathrm{new}} = \frac{\|x_a - x_b\|}{2}\]

\[x_{o\_\mathrm{new}} = \frac{x_a + x_b}{2}\]
where \(x_a\) and \(x_b\) are, respectively, the coordinate of the new sample point \(x_1\) and the coordinate of the point on ball O farthest from \(x_1\); \(x_b\) can be calculated from

\[x_b = x_o + \lambda_2 (x_o - x_a)\]

where \(x_o\) is the center coordinate of ball O, and \(\lambda_2\) can be calculated from

\[\lambda_2 = \frac{\|x_b - x_o\|}{\|x_o - x_a\|} = \frac{R}{\|x_o - x_a\|}\]
A large number of simulation tests have been carried out with this online modeling method; the results show that, compared with other online modeling methods, the proposed method achieves higher learning accuracy, and the model it builds also has better generalization performance.
Drawings
FIG. 1: algorithm flow chart provided by the invention, showing the specific implementation steps of the online modeling method based on the structure-adjustable extreme learning machine.
FIG. 2: schematic diagram of a category ball enclosing all data used in the training process, where the small ball O is the category ball enclosing all used data except the newly added training data, and the large ball \(O_1\) is the category ball enclosing all used data together with the new data.
FIG. 3: diagram of a Gaussian function; the central peak is the output at the center of the Gaussian function, and the smaller values at the edges are the outputs farther from the center.
FIG. 4: and in the simulation experiment, the relationship between the training precision and the verification precision along with the change of the number of hidden nodes is shown, wherein a red curve is the relationship between the training precision and the change of the number of hidden nodes, and a green curve is the relationship between the verification precision and the change of the number of hidden nodes.
FIG. 5: an online modeling process schematic diagram in a simulation experiment, wherein fig. 5.1 is a variation relation of verification precision along with the increase of training data, and fig. 5.2 is a variation relation of hidden node number along with the increase of training data.
Detailed Description
The main advantage of the online modeling method based on the structure-adjustable extreme learning machine is that the network structure can be adjusted as needed during online modeling. In keeping with the characteristics of online modeling, learning is carried out whenever new training data arrive; otherwise the existing model is used for prediction, and prediction accuracy gradually improves as the training data accumulate.
The steps involved in the online modeling method based on the structure-adjustable extreme learning machine provided by the invention are explained in detail as follows:
first step, model selection
For the proposed method, model selection only involves determining the number M of initial ELM hidden nodes. The invention uses cross-validation to determine this number: divide the initial training data into two parts, one for training and one for validation; starting from a small number of hidden nodes, train the ELM on the training data and compute the validation error on the validation data; then gradually increase the number of hidden nodes, repeating the training and validation steps; finally, select the number of hidden nodes that minimizes the validation error as the initial number.
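A minimal sketch of this cross-validation procedure, under stated assumptions (toy regression data, random Gaussian centers and widths, and pseudo-inverse training as in the third step below; none of the numeric choices come from the patent):

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_hidden(X, centers, widths):
    # H[i, j] = G(a_j, b_j, x_i) = exp(-||x_i - a_j|| / b_j)
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.exp(-d / widths)

# Toy data standing in for the initial training data.
X = rng.uniform(-1, 1, (200, 2))
T = np.sin(X[:, :1]) + 0.1 * rng.standard_normal((200, 1))
X_tr, T_tr, X_va, T_va = X[:150], T[:150], X[150:], T[150:]

best_M, best_err = None, np.inf
for M in range(5, 51, 5):                     # gradually increase the node count
    centers = X_tr[rng.choice(len(X_tr), M)]  # random centers
    widths = rng.uniform(0.5, 2.0, M)         # random widths
    beta = np.linalg.pinv(gaussian_hidden(X_tr, centers, widths)) @ T_tr
    err = np.linalg.norm(gaussian_hidden(X_va, centers, widths) @ beta - T_va)
    if err < best_err:
        best_M, best_err = M, err
print("initial hidden node count:", best_M)
```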
Second step, model initialization
Model initialization is the initialization of the model parameters. The proposed method adopts the ELM network structure with a Gaussian excitation function for the hidden nodes, so the parameters to be initialized first are the centers \(a_i\) and widths \(b_i\) of the Gaussian functions, \(i = 1, 2, \ldots, M\); \(a_i\) and \(b_i\) are drawn as random numbers from a given distribution. Next, the number of samples initially participating in training is determined: in the invention, the initial number of training samples is chosen as M + 100 for classification models and M + 50 for regression models; the guiding principle is that \(H_0\) must have full column rank. Finally, a category ball \(S_0\) with the smallest radius is used to enclose the initial training data, and the ball's center and radius are recorded as \(C_0\) and \(R_0\).
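A sketch of this initialization under the stated rules (regression case, hence M + 50 initial samples; the uniform sampling ranges are assumptions, and the smallest enclosing ball is approximated here by the smallest ball centered at the data mean):

```python
import numpy as np

rng = np.random.default_rng(1)

n, m, M = 4, 1, 20                     # input dim, output dim, hidden node count
N0 = M + 50                            # initial samples for regression
                                       # (M + 100 would be used for classification)
centers = rng.uniform(-1, 1, (M, n))   # a_i, drawn randomly (assumed distribution)
widths = rng.uniform(0.5, 2.0, M)      # b_i, drawn randomly (assumed distribution)

X0 = rng.uniform(-1, 1, (N0, n))       # placeholder for the initial training inputs
C0 = X0.mean(axis=0)                   # approximate center of the category ball S0
R0 = np.linalg.norm(X0 - C0, axis=1).max()   # radius just enclosing all samples
```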
Thirdly, determining the initial value of the training process data
The initial training-process data include the hidden-layer output matrix H, the output-layer connection matrix β, and the intermediate matrix K together with its inverse \(K^{-1}\), which are used to compute the output-layer connection matrix.
If the initial training data are \(X_0 = \{(x_i, t_i) \mid x_i \in \mathbb{R}^n,\ t_i \in \mathbb{R}^m,\ i = 1, \ldots, N\}\), the corresponding hidden-layer output matrix is

\[H_0 = \begin{bmatrix} G(a_1, b_1, x_1) & \cdots & G(a_M, b_M, x_1) \\ \vdots & & \vdots \\ G(a_1, b_1, x_N) & \cdots & G(a_M, b_M, x_N) \end{bmatrix}_{N \times M}\]

According to the Empirical Risk Minimization (ERM) principle, the objective function for solving the output-layer connection matrix β is \(\min \|F - T_0\| = \min \|H_{0(N \times M)}\, \beta_{(M \times m)} - T_{0(N \times m)}\|\), where \(T_0\) is the training target, i.e.

\[T_0 = \begin{bmatrix} t_{11} & \cdots & t_{1m} \\ \vdots & & \vdots \\ t_{N1} & \cdots & t_{Nm} \end{bmatrix} = \begin{bmatrix} t_1^T \\ \vdots \\ t_N^T \end{bmatrix}_{N \times m}\]
By matrix theory, the solution of the above optimization problem is

\[\beta_0 = H_0^{\dagger} T_0\]

where \(H_0^{\dagger}\) is the pseudo-inverse of the matrix \(H_0\). When \(H_0\) has full column rank, i.e., \(\mathrm{rank}(H_0) = M\),

\[H_0^{\dagger} = (H_0^T H_0)^{-1} H_0^T\]

For convenience of derivation, an intermediate matrix K is introduced such that \(K = H^T H\); then

\[K_0 = H_0^T H_0, \qquad \beta_0 = K_0^{-1} H_0^T T_0\]

Let \(P = K^{-1}\); then

\[P_0 = K_0^{-1} = (H_0^T H_0)^{-1}\]
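These initial quantities map directly into code; a minimal sketch (assuming \(H_0\) has full column rank so that \(K_0\) is invertible; `gaussian_hidden` repeats the helper from the model-selection sketch above):

```python
import numpy as np

def gaussian_hidden(X, centers, widths):
    # H[i, j] = G(a_j, b_j, x_i) = exp(-||x_i - a_j|| / b_j)
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.exp(-d / widths)

def init_training_state(X0, T0, centers, widths):
    """Initial H0, beta0, K0, P0 of the third step."""
    H0 = gaussian_hidden(X0, centers, widths)
    K0 = H0.T @ H0                     # K = H^T H
    P0 = np.linalg.inv(K0)             # P = K^{-1}, valid for full column rank
    beta0 = P0 @ H0.T @ T0             # beta0 = (H0^T H0)^{-1} H0^T T0
    return H0, beta0, K0, P0
```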
Fourthly, fitting the newly added data by only adjusting the weight of the ELM output layer without adjusting the ELM structure
If the newly added data are \(X_1 = \{(x_i, t_i)\}_{i=N+1}^{N+N_1}\) (in fact, \(N_1\) is chosen as 1 in this method, and the case \(N_1 > 1\) can be reduced to the case \(N_1 = 1\)), the corresponding hidden-layer output matrix and training target are, respectively:

\[H_1 = \begin{bmatrix} G(a_1, b_1, x_{N+1}) & \cdots & G(a_M, b_M, x_{N+1}) \\ \vdots & & \vdots \\ G(a_1, b_1, x_{N+N_1}) & \cdots & G(a_M, b_M, x_{N+N_1}) \end{bmatrix}_{N_1 \times M}\]

\[T_1 = \begin{bmatrix} t_{(N+1)1} & \cdots & t_{(N+1)m} \\ \vdots & & \vdots \\ t_{(N+N_1)1} & \cdots & t_{(N+N_1)m} \end{bmatrix} = \begin{bmatrix} t_{N+1}^T \\ \vdots \\ t_{N+N_1}^T \end{bmatrix}_{N_1 \times m}\]
The objective function for the new output-layer connection matrix β then becomes

\[\min \left\| \begin{bmatrix} H_0 \\ H_1 \end{bmatrix} \beta - \begin{bmatrix} T_0 \\ T_1 \end{bmatrix} \right\|\]

whose solution is

\[\beta_1 = K_1^{-1} \begin{bmatrix} H_0 \\ H_1 \end{bmatrix}^T \begin{bmatrix} T_0 \\ T_1 \end{bmatrix}, \qquad K_1 = \begin{bmatrix} H_0 \\ H_1 \end{bmatrix}^T \begin{bmatrix} H_0 \\ H_1 \end{bmatrix}\]
To achieve online modeling, \(\beta_1\) must not depend on \(H_0\) or \(T_0\); it may be a function only of \(P_0\), \(K_0\), and \(\beta_0\). A simple mathematical derivation gives

\[\beta_1 = \beta_0 + K_1^{-1} H_1^T (T_1 - H_1 \beta_0)\]

\[K_1 = K_0 + H_1^T H_1\]
These two equations constitute the ELM training algorithm for the case where the ELM structure is kept unchanged and only the parameters are adjusted.
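A minimal sketch of this parameter-only update, using the recursive form of \(P_1 = K_1^{-1}\) from step (2.1) so that nothing about \(H_0\) or \(T_0\) is needed (the shape conventions are assumptions for the single-sample case):

```python
import numpy as np

def update_weights_only(beta0, K0, P0, H1, T1):
    """Fourth-step update: adjust the output weights, structure unchanged.
    H1: (1, M) hidden-layer output row for the new sample; T1: (1, m) target."""
    K1 = K0 + H1.T @ H1
    S = np.linalg.inv(np.eye(H1.shape[0]) + H1 @ P0 @ H1.T)
    P1 = P0 - P0 @ H1.T @ S @ H1 @ P0          # P1 = (K0 + H1^T H1)^{-1}
    beta1 = beta0 + P1 @ H1.T @ (T1 - H1 @ beta0)
    return beta1, K1, P1
```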
Fifthly, judge whether the training error on the new data \(X_1\) of the ELM obtained by the fourth-step training method meets the requirement, and judge whether the data \(X_1\) lie outside the category ball \(S_0\). If the training error does not meet the requirement and \(X_1\) is outside \(S_0\), go to the sixth step; otherwise, go to the seventh step.
Sixthly, adding a hidden layer node, and then adjusting the output layer connection weight
When a hidden node is added, the hidden-layer output matrix becomes

\[\begin{bmatrix} H_0 & H_{01} \\ H_1 & H_{11} \end{bmatrix}\]

where \(H_{01}\) is the output of the newly added hidden node on the data set \(X_0\) used during training, and \(H_{11}\) is its output on the newly added, not yet used data set \(X_1\). Accordingly, the objective function for the new output-layer connection matrix β becomes

\[\min \left\| \begin{bmatrix} H_0 & H_{01} \\ H_1 & H_{11} \end{bmatrix} \beta - \begin{bmatrix} T_0 \\ T_1 \end{bmatrix} \right\|\]
However, because the used data set \(X_0\) has been discarded before the hidden node is added, \(H_{01}\) is unknown; to solve for β, this unknown-\(H_{01}\) problem must be handled first.
As the Gaussian function diagram (FIG. 3) shows, the output of a hidden node far from its center can be regarded as 0. The data sets \(X_0\) and \(X_1\) are illustrated in FIG. 2, where \(S_0\) encloses the data used during training and point A is the position of the newly added data. Therefore, if the center of the newly added hidden node is chosen as point A, it suffices to make its output at point C small enough, i.e.

\[e^{-\frac{\|x_c - a\|}{b}} \le \varepsilon \;\Rightarrow\; b \le -\frac{\|x_c - a\|}{\ln \varepsilon}\]
where ε is a preselected threshold and \(x_c\) is the coordinate of point C, which can be determined by the following two formulas:

\[x_c = x_o + \lambda_1 (x_a - x_o)\]

\[\lambda_1 = \frac{\|x_c - x_o\|}{\|x_a - x_o\|} = \frac{R}{\|x_a - x_o\|}\]

where \(x_o\) is the center coordinate of the category ball \(S_0\), \(x_a\) is the coordinate of point A, and R is the radius of \(S_0\).
With the center a and width b of the newly added hidden node selected in this way, the node's output at point C is smaller than the very small real number ε, and its output inside the category ball \(S_0\) is smaller still, so it can be regarded as 0; hence \(H_{01}\) can be treated as a zero matrix. The optimization problem after adding the hidden node then becomes

\[\min \left\| \begin{bmatrix} H_0 & 0 \\ H_1 & H_{11} \end{bmatrix} \beta - \begin{bmatrix} T_0 \\ T_1 \end{bmatrix} \right\|\]
Solving this optimization problem gives

\[\beta_1 = \left( \begin{bmatrix} H_0 & 0 \\ H_1 & H_{11} \end{bmatrix}^T \begin{bmatrix} H_0 & 0 \\ H_1 & H_{11} \end{bmatrix} \right)^{-1} \begin{bmatrix} H_0 & 0 \\ H_1 & H_{11} \end{bmatrix}^T \begin{bmatrix} T_0 \\ T_1 \end{bmatrix}\]

Let

\[H = \begin{bmatrix} H_0 \\ H_1 \end{bmatrix}, \qquad \delta H = \begin{bmatrix} 0 \\ H_{11} \end{bmatrix}\]

Then

\[\beta_1 = \left( \begin{bmatrix} H & \delta H \end{bmatrix}^T \begin{bmatrix} H & \delta H \end{bmatrix} \right)^{-1} \begin{bmatrix} H & \delta H \end{bmatrix}^T \begin{bmatrix} T_0 \\ T_1 \end{bmatrix}\]

At this time,

\[K_1 = \begin{bmatrix} H & \delta H \end{bmatrix}^T \begin{bmatrix} H & \delta H \end{bmatrix} = \begin{bmatrix} H^T \\ \delta H^T \end{bmatrix} \begin{bmatrix} H & \delta H \end{bmatrix}\]

Let

\[K_1^{-1} = A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} = \left( \begin{bmatrix} H^T \\ \delta H^T \end{bmatrix} \begin{bmatrix} H & \delta H \end{bmatrix} \right)^{-1}\]
where the blocks of the matrix A are

\[A_{11} = (H^T H)^{-1} + (H^T H)^{-1} (H^T \delta H) R^{-1} (\delta H^T H) (H^T H)^{-1}\]

\[A_{12} = -(H^T H)^{-1} (H^T \delta H) R^{-1}, \qquad A_{21} = A_{12}^T, \qquad A_{22} = R^{-1}\]

with \(R = \delta H^T \delta H - (\delta H^T H)(H^T H)^{-1}(H^T \delta H)\). Substituting the expressions for H and δH gives:
\[A_{11} = (K_0 + H_1^T H_1)^{-1} + (K_0 + H_1^T H_1)^{-1} (H_1^T H_{11}) R^{-1} (H_{11}^T H_1) (K_0 + H_1^T H_1)^{-1}\]

\[A_{12} = -(K_0 + H_1^T H_1)^{-1} (H_1^T H_{11}) R^{-1}, \qquad A_{21} = A_{12}^T, \qquad A_{22} = R^{-1}\]

\[R = H_{11}^T H_{11} - (H_{11}^T H_1)(K_0 + H_1^T H_1)^{-1} (H_1^T H_{11})\]
Combining the above formulas, the update formulas for K, P, and β when hidden nodes are added are:

\[\beta_1 = P_1 \begin{bmatrix} K_0 \beta_0 + H_1^T T_1 \\ H_{11}^T T_1 \end{bmatrix}, \qquad K_1 = \begin{bmatrix} K_0 + H_1^T H_1 & H_1^T H_{11} \\ H_{11}^T H_1 & H_{11}^T H_{11} \end{bmatrix}, \qquad P_1 = K_1^{-1} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\]

\[A_{11} = P_1' + P_1' (H_1^T H_{11}) R^{-1} (H_{11}^T H_1) P_1', \qquad A_{12} = -P_1' (H_1^T H_{11}) R^{-1}, \qquad A_{21} = A_{12}^T, \qquad A_{22} = R^{-1}\]

\[R = H_{11}^T H_{11} - (H_{11}^T H_1) P_1' (H_1^T H_{11})\]

\[P_1' = (K_0 + H_1^T H_1)^{-1} = P_0 - P_0 H_1^T (I + H_1 P_0 H_1^T)^{-1} H_1 P_0, \qquad P_0 = K_0^{-1}\]
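A minimal sketch of this sixth-step block update for one added node (\(N_1 = L = 1\)): `H1` is the 1×M output row of the existing nodes on the new sample, `H11` the 1×1 output of the new node on the new sample, and \(H_{01}\) is taken as the zero matrix as derived above:

```python
import numpy as np

def add_node_update(beta0, K0, P0, H1, H11, T1):
    """Block update of beta, K, P when one hidden node is added (H01 = 0)."""
    I1 = np.eye(H1.shape[0])
    P1p = P0 - P0 @ H1.T @ np.linalg.inv(I1 + H1 @ P0 @ H1.T) @ H1 @ P0  # P1'
    R = H11.T @ H11 - (H11.T @ H1) @ P1p @ (H1.T @ H11)
    Rinv = np.linalg.inv(R)
    A11 = P1p + P1p @ (H1.T @ H11) @ Rinv @ (H11.T @ H1) @ P1p
    A12 = -P1p @ (H1.T @ H11) @ Rinv
    P1 = np.block([[A11, A12], [A12.T, Rinv]])           # P1 = K1^{-1}
    K1 = np.block([[K0 + H1.T @ H1, H1.T @ H11],
                   [H11.T @ H1,     H11.T @ H11]])
    beta1 = P1 @ np.vstack([K0 @ beta0 + H1.T @ T1, H11.T @ T1])
    return beta1, K1, P1
```

Note that the block form only inverts the 1×1 matrices R and \(I + H_1 P_0 H_1^T\) rather than the full (M+1)×(M+1) matrix \(K_1\), which is what makes node addition cheap.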
Seventhly, update the category ball \(S_0\) to \(S_1\), so that \(S_1\) contains both the data \(X_0\) in \(S_0\) and the newly added data \(X_1\).
As FIG. 2 shows, the center of \(S_1\) should be the midpoint of A and B, and its radius should be half the length of segment AB, where point A is the position of the new data and point B is the point on ball \(S_0\) farthest from point A. The new category ball therefore has radius and center
\[R_{\mathrm{new}} = \frac{\|x_a - x_b\|}{2}\]

\[x_{o\_\mathrm{new}} = \frac{x_a + x_b}{2}\]
where \(x_b\) is the coordinate of point B, which can be determined by the following two formulas:

\[x_b = x_o + \lambda_2 (x_o - x_a)\]

\[\lambda_2 = \frac{\|x_b - x_o\|}{\|x_o - x_a\|} = \frac{R}{\|x_o - x_a\|}\]
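A minimal sketch of this seventh-step ball update (it presumes, as FIG. 2 does, that the new point \(x_a\) lies outside the old ball):

```python
import numpy as np

def update_ball(x_o, R, x_a):
    """Grow the category ball so it just encloses the old ball and x_a.
    B is the point on the old ball (center x_o, radius R) farthest from x_a."""
    lam2 = R / np.linalg.norm(x_o - x_a)
    x_b = x_o + lam2 * (x_o - x_a)       # x_b = x_o + lambda_2 (x_o - x_a)
    return (x_a + x_b) / 2, np.linalg.norm(x_a - x_b) / 2   # new center, radius
```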
The flow chart of the method provided by the invention is shown in figure 1.
A large number of simulation experiments have been performed with the proposed online modeling method based on the structure-adjustable extreme learning machine; owing to space limitations, only its application effect on actual steelmaking continuous-casting quality prediction data is given here. The data set originates from an industrial site; the input dimension is 84 and the output dimension is 1; the number of training samples is 1056 and the number of test samples is 508.
The invention compares the application effect of the SAO-ELM method with the batch-type BP neural network algorithm and with the OS-ELM method. The BP neural network algorithm is a classical neural network training method but not an online learning algorithm; the OS-ELM method is an online learning algorithm that differs from the algorithm proposed by the invention in that it has no structural adjustability, i.e., it lacks the fifth, sixth, and seventh steps of the invention. The comparison results are shown in Table 1:
TABLE 1 comparison of the Performance of SAO-ELM and other algorithms
As Table 1 shows, the training accuracy and test accuracy of SAO-ELM are the best among the three modeling methods, and its training time is nearly an order of magnitude less than that of the BP algorithm. FIG. 5 shows the online modeling process: as modeling proceeds, the number of hidden nodes keeps increasing and the learning accuracy keeps improving, and the two trends are consistent, which further demonstrates the effectiveness of the proposed SAO-ELM.

Claims (1)

1. An online modeling method based on an Extreme Learning Machine (ELM) with an adjustable structure is characterized in that the method is realized on a computer sequentially according to the following steps:
step (1): model selection and parameter initialization
Set the number M of hidden-layer nodes of the single-hidden-layer extreme learning machine; the number of input-layer nodes equals the dimension n of the training samples, and the number of output nodes equals the dimension m of the training target;
The excitation function \(G(a_i, b_i, x)\) of the hidden nodes adopts a Gaussian function; the center \(a_i\) and width \(b_i\) of each hidden node, \(i = 1, 2, \ldots, M\), are determined randomly;
Based on the first N samples \(X_0 = \{(x_i, t_i)\}_{i=1}^{N}\), train the extreme learning machine to obtain the initial hidden-layer output matrix \(H_0\) and the output-layer connection matrix \(\beta_0\), where

\[\beta_0 = (H_0^T H_0)^{-1} H_0^T T_0\]

\[H_0 = \begin{bmatrix} G(a_1, b_1, x_1) & \cdots & G(a_M, b_M, x_1) \\ \vdots & & \vdots \\ G(a_1, b_1, x_N) & \cdots & G(a_M, b_M, x_N) \end{bmatrix}_{N \times M}\]

\[T_0 = \begin{bmatrix} t_{11} & \cdots & t_{1m} \\ \vdots & & \vdots \\ t_{N1} & \cdots & t_{Nm} \end{bmatrix} = \begin{bmatrix} t_1^T \\ \vdots \\ t_N^T \end{bmatrix}_{N \times m}\]

Initialize the matrix K so that \(K_0 = H_0^T H_0\); compute and save \(\beta_0\), \(K_0\), and \(P_0 = K_0^{-1}\);
Enclose the initial training sample set \(X_0\) with a category ball O, so that the ball just encloses all the sample points of \(X_0\), and determine the ball's center \(C_0\) and radius \(R_0\);
Step (2): online learning process
When newly added training data \(x_1 = (x_{N+1}, t_{N+1})\) arrive, train the ELM as follows, so that the knowledge stored from \(X_0\) is retained while the new knowledge contained in \(x_1\) is accommodated:
Step (2.1): keep the network structure unchanged and adjust the output-layer connection matrix \(\beta_0\) according to \(x_1\) only; the updated output-layer weight matrix is \(\beta_1\); simultaneously update the matrices \(K_0\) and \(P_0\) to \(K_1\) and \(P_1\):
\[\beta_1 = \beta_0 + P_1 H_1^T (T_1 - H_1 \beta_0)\]

\[P_1 = K_1^{-1} = P_0 - P_0 H_1^T (I + H_1 P_0 H_1^T)^{-1} H_1 P_0\]

\[K_1 = K_0 + H_1^T H_1\]
where \(H_1\) and \(T_1\) are, respectively, the hidden-layer output matrix and the training-target matrix of the ELM for the new sample \(x_1\), i.e.

\[H_1 = \begin{bmatrix} G(a_1, b_1, x_{N+1}) & \cdots & G(a_M, b_M, x_{N+1}) \end{bmatrix}_{1 \times M}\]

\[T_1 = \begin{bmatrix} t_{(N+1)1} & \cdots & t_{(N+1)m} \end{bmatrix} = \begin{bmatrix} t_{N+1}^T \end{bmatrix}_{1 \times m}\]
Step (2.2): use the ELM with the adjusted parameters to compute the training error e on the newly added sample \(x_1\), and judge whether the new sample \(x_1\) lies outside the category ball O; if \(x_1\) is outside ball O and e is larger than the set threshold, abandon all the adjustments of step (2.1) and go to step (2.3); otherwise, go to step (3);
Step (2.3): add one hidden node, setting its center a to \(x_1\); the width b is determined by

\[b \le -\frac{\|x_c - a\|}{\ln \varepsilon}\]
where \(\varepsilon\) is a preset threshold and \(x_c\) is the coordinate of the point on the category ball O closest to the new sample point \(x_1\), which can be determined by

\[x_c = x_o + \lambda_1 (x_a - x_o)\]

where \(x_o\) is the center coordinate of ball O, and \(\lambda_1\) can be determined by

\[\lambda_1 = \frac{\|x_c - x_o\|}{\|x_a - x_o\|} = \frac{R}{\|x_a - x_o\|}\]
where \(x_a\) is the coordinate of the new sample point \(x_1\); readjust the output-layer connection matrix \(\beta_0\) to \(\beta_1\), and accordingly update the matrices \(K_0\) and \(P_0\) to \(K_1\) and \(P_1\), so that
\[\beta_1 = P_1 \begin{bmatrix} K_0 \beta_0 + H_1^T T_1 \\ H_{11}^T T_1 \end{bmatrix}, \qquad K_1 = \begin{bmatrix} K_0 + H_1^T H_1 & H_1^T H_{11} \\ H_{11}^T H_1 & H_{11}^T H_{11} \end{bmatrix}, \qquad P_1 = K_1^{-1} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\]

\[A_{11} = P_1' + P_1' (H_1^T H_{11}) R^{-1} (H_{11}^T H_1) P_1', \qquad A_{12} = -P_1' (H_1^T H_{11}) R^{-1}, \qquad A_{21} = A_{12}^T, \qquad A_{22} = R^{-1}\]

\[R = H_{11}^T H_{11} - (H_{11}^T H_1) P_1' (H_1^T H_{11})\]

\[P_1' = (K_0 + H_1^T H_1)^{-1} = P_0 - P_0 H_1^T (I + H_1 P_0 H_1^T)^{-1} H_1 P_0, \qquad P_0 = K_0^{-1}\]
where \(H_{01}\) and \(H_{11}\) are the hidden-layer output matrices of the newly added hidden nodes on the original sample set \(X_0\) and on the new sample point \(x_1\), respectively, i.e.

\[H_{01} = \big[ G(a_{M+j}, b_{M+j}, x_i) \big]_{N \times L}, \quad i = 1, \ldots, N, \qquad H_{11} = \big[ G(a_{M+j}, b_{M+j}, x_{N+k}) \big]_{N_1 \times L}, \quad k = 1, \ldots, N_1, \quad j = 1, \ldots, L\]

\(N_1\) and L are, respectively, the number of sample points in \(x_1\) and the number of newly added hidden nodes; since the new sample points arrive one by one, \(N_1 = L = 1\);
Step (3): update the parameters of the category ball O
Update the parameters of the category ball O, i.e., its center coordinate and radius, so that the new ball \(O_1\) just encloses all the sample points of \(X_0\) and \(x_1\); the update formulas are as follows:
\[R_{\mathrm{new}} = \frac{\|x_a - x_b\|}{2}\]

\[x_{o\_\mathrm{new}} = \frac{x_a + x_b}{2}\]
where \(x_a\) and \(x_b\) are, respectively, the coordinate of the new sample point \(x_1\) and the coordinate of the point on ball O farthest from \(x_1\); \(x_b\) can be calculated from

\[x_b = x_o + \lambda_2 (x_o - x_a)\]

where \(x_o\) is the center coordinate of ball O, and \(\lambda_2\) can be calculated from

\[\lambda_2 = \frac{\|x_b - x_o\|}{\|x_o - x_a\|} = \frac{R}{\|x_o - x_a\|}.\]
CN2010101194082A 2010-03-08 2010-03-08 Online modeling method based on extreme learning machine with adjustable structure Expired - Fee Related CN101807046B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010101194082A CN101807046B (en) 2010-03-08 2010-03-08 Online modeling method based on extreme learning machine with adjustable structure


Publications (2)

Publication Number Publication Date
CN101807046A true CN101807046A (en) 2010-08-18
CN101807046B CN101807046B (en) 2011-08-17

Family

ID=42608870

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010101194082A Expired - Fee Related CN101807046B (en) 2010-03-08 2010-03-08 Online modeling method based on extreme learning machine with adjustable structure

Country Status (1)

Country Link
CN (1) CN101807046B (en)



Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5621648A (en) * 1994-08-02 1997-04-15 Crump; Craig D. Apparatus and method for creating three-dimensional modeling data from an object
CN101504736A (en) * 2009-02-27 2009-08-12 江汉大学 Method for implementing neural network algorithm based on Delphi software
CN101576734A (en) * 2009-06-12 2009-11-11 北京工业大学 Dissolved oxygen control method based on dynamic radial basis function neural network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Journal of Shandong University (Natural Science), Vol. 45, No. 5, 2010-05-31, Li Bin et al., "Intelligent optimization strategy of ELM-RBF neural networks", pp. 48-51 *
Power System Technology, Vol. 26, No. 4, 2002-04-30, Ding Jianyong et al., "Online identification of dynamic parameters of synchronous machines based on the ELMAN neural network", pp. 22-25 *
Journal of System Simulation, Vol. 19, No. 23, 2007-12-31, Chang Yuqing et al., "Soft-sensor modeling of biochemical processes based on the extreme learning machine", pp. 5587-5590 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102708381A (en) * 2012-05-09 2012-10-03 江南大学 Improved extreme learning machine combining learning thought of least square vector machine
CN102708381B (en) * 2012-05-09 2014-02-19 江南大学 Improved extreme learning machine combining learning thought of least square vector machine
WO2013182176A1 (en) * 2012-06-06 2013-12-12 Kisters Ag Method for training an artificial neural network, and computer program products
CN103106331A (en) * 2012-12-17 2013-05-15 清华大学 Photo-etching line width intelligence forecasting method based on dimension-reduction and quantity-increment-type extreme learning machine
CN103106331B (en) * 2012-12-17 2015-08-05 清华大学 Based on the lithographic line width Intelligent Forecasting of dimensionality reduction and increment type extreme learning machine
CN104537167A (en) * 2014-12-23 2015-04-22 清华大学 Interval type index forecasting method based on robust interval extreme learning machine
CN104537167B (en) * 2014-12-23 2017-12-15 清华大学 Interval type indices prediction method based on Robust Interval extreme learning machine
CN108229026A (en) * 2018-01-04 2018-06-29 电子科技大学 A kind of electromagnetic field modeling and simulating method based on dynamic core extreme learning machine
CN108229026B (en) * 2018-01-04 2021-07-06 电子科技大学 Electromagnetic field modeling simulation method based on dynamic kernel extreme learning machine
CN111125760A (en) * 2019-12-20 2020-05-08 支付宝(杭州)信息技术有限公司 Model training and predicting method and system for protecting data privacy
CN111125760B (en) * 2019-12-20 2022-02-15 支付宝(杭州)信息技术有限公司 Model training and predicting method and system for protecting data privacy
CN113569038A (en) * 2021-07-28 2021-10-29 北京明略昭辉科技有限公司 Method and device for sorting recalled documents, electronic equipment and readable storage medium

Also Published As

Publication number Publication date
CN101807046B (en) 2011-08-17

Similar Documents

Publication Publication Date Title
CN101807046A (en) Online modeling method based on extreme learning machine with adjustable structure
WO2016101182A1 (en) Interval type indicator forecasting method based on bayesian network and extreme learning machine
CN109255160B (en) Neural network-based unit delay prediction method and unit delay sensitivity calculation method
Chen et al. Kernel least mean square with adaptive kernel size
CN107561503A (en) A kind of adaptive target tracking filtering method based on the Multiple fading factor
JP6404909B2 (en) How to calculate the output model of a technical system
CN106372278A (en) Sensitivity analysis method jointly considering input parameter uncertainty and proxy model uncertainty
CN110008404B (en) Latent semantic model optimization method based on NAG momentum optimization
JP6950756B2 (en) Neural network rank optimizer and optimization method
CN108181812A (en) BP neural network-based valve positioner PI parameter setting method
CN108490115B (en) Air quality abnormity detection method based on distributed online principal component analysis
CN110286595B (en) Fractional order system self-adaptive control method influenced by saturated nonlinear input
CN109447272A (en) A kind of extreme learning machine method based on center of maximum cross-correlation entropy criterion
CN113176022B (en) Segmented neural network pressure sensor pressure detection method and system
CN111310348A (en) Material constitutive model prediction method based on PSO-LSSVM
CN107181474A (en) A kind of kernel adaptive algorithm filter based on functional expansion
CN106296434A (en) A kind of Grain Crop Yield Prediction method based on PSO LSSVM algorithm
Mi et al. Prediction of accumulated temperature in vegetation period using artificial neural network
Lukić et al. Neural networks-based real-time determination of the laser beam spatial profile and vibrational-to-translational relaxation time within pulsed photoacoustics
CN109597006B (en) Optimal design method for magnetic nanoparticle measurement position
CN105092509B (en) A kind of sample component assay method of PCR-based ELM algorithms
Cai et al. Influence of partially known parameter on flaw characterization in Eddy Current Testing by using a random walk MCMC method based on metamodeling
CN111210877A (en) Method and device for deducing physical property parameters
Faqih et al. Multi-Step Ahead Prediction of Lorenz's Chaotic System Using SOM ELM-RBFNN
Zirkohi et al. Design of Radial Basis Function Network Using Adaptive Particle Swarm Optimization and Orthogonal Least Squares.

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110817

Termination date: 20140308