CN109635245A - A kind of robust width learning system - Google Patents

A kind of robust width learning system

Publication number: CN109635245A
Application number: CN201811362948.6A
Authority: CN (China)
Other languages: Chinese (zh)
Legal status: Pending (an assumption, not a legal conclusion)
Inventors: 褚菲, 梁涛, 王雪松, 程玉虎
Current Assignee: China University of Mining and Technology CUMT
Original Assignee: China University of Mining and Technology CUMT
Application filed by China University of Mining and Technology CUMT
Publication of CN109635245A


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/18: Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis


Abstract

A robust width learning system: training data are collected and linearly transformed; the extended input matrix is formed from the input data matrix and the enhancement node matrix, the initial connection weight matrix of the iteration is solved by a ridge regression algorithm, and the residual matrix is then obtained from the residual equation; the residual probability density function is estimated with a kernel density estimation algorithm, from which the weight matrix of all training data is computed; the connection weight matrix of the k-th iteration is then solved. If the maximum absolute difference between the output weights of two adjacent steps does not exceed a set threshold, or the number of iterations reaches the preset maximum, the iteration ends, and the robust width learning system stops training and establishes the robust width learning system model. The system improves the robustness of the width learning system, effectively suppresses the adverse effect of outliers on modeling accuracy, and facilitates the establishment of a robust width system model suitable for predicting relevant indices of complex industrial processes.

Description

Robust width learning system
Technical Field
The invention belongs to the technical field of industrial process modeling, and particularly relates to a robust width learning system.
Background
The control and optimization of complex industrial processes has long been a popular research direction, and establishing accurate models of such processes is the premise and basis for control and optimization. Mechanism modeling is based on physical and chemical analysis of the process, deriving functional relations among the operating variables, state variables and output variables. Mechanism modeling can accurately express the relations between variables, effectively explain objective phenomena and avoid violating first principles, but it is difficult, the modeling period is long, and researchers must master all the relevant theoretical knowledge. In recent years, data-driven modeling techniques have received increasing attention from researchers; their simplicity and convenience have significantly improved modeling efficiency. With the rapid development of artificial neural networks, some researchers have applied them to data-driven modeling; benefiting from their strong learning ability, data modeling methods based on artificial neural networks have been rapidly popularized and applied.
As artificial intelligence attracts more and more attention, the deep learning neural networks behind it are widely applied in advanced fields such as pattern recognition, face recognition and speech recognition. Relying on a large number of feature layers, deep learning neural networks are well suited to processing high-dimensional big data. Although a deep learning neural network has strong learning ability, its complex structure entails numerous tuning parameters, so it must go through a lengthy training process. Moreover, to obtain a good learning effect, deep learning neural networks require the support of many high-performance computers. However, actual industrial sites are cramped and their environments harsh, making it difficult to install large numbers of high-performance computers; furthermore, such computers are expensive, and deploying many of them raises production costs without improving the comprehensive benefits of the enterprise. Recently, some scholars have proposed novel neural networks that simplify the training process and can effectively reduce the dependence on computing resources.
The width learning system (broad learning system) is an effective and efficient novel artificial neural network. It fully exploits the advantages of the random vector functional-link network: a direct connection between the input layer and the output layer flattens the network structure. Compared with a deep learning neural network, the width learning system has a simple structure and few adjustable parameters. Whereas a deep learning neural network solves for its connection weights through a complex iterative process, the width learning system computes them with a ridge regression algorithm. The simple network structure makes the width learning system easy to extend, and, combined with a corresponding incremental learning algorithm, rapid model updates can be realized. The width learning system has been applied to image recognition, time-series prediction and complex industrial process modeling, with good results.
However, in actual industrial production, factors such as sensor faults and environmental noise mean that a certain number of outliers may exist in the acquired sample data. If samples containing outliers are used as the training set, the generalized approximation capability of the width learning system is impaired, and the prediction accuracy of the resulting model cannot meet industrial requirements. The existing width learning system therefore has poor robustness and cannot suppress the adverse effect of outliers on modeling accuracy.
Disclosure of the Invention
Aiming at the problems in the prior art, the invention provides a robust width learning system that improves the robustness of the width learning system, effectively suppresses the adverse effect of outliers on modeling accuracy, and facilitates the establishment of a robust width system model suitable for predicting relevant indices of complex industrial processes.
In order to achieve the above object, the present invention provides a robust width learning system, comprising the steps of:
step 1: data preprocessing, which comprises the following steps:
step 1.1: collect the training data, and let the input data matrix of the training data be X ∈ R^(N×M) and the output data matrix be Y ∈ R^(N×C);
wherein N is the number of training samples;
M and C are the numbers of input and output variables, respectively;
R denotes the field of real numbers;
step 1.2: linearly transform the training data according to the transformation function in formula (1), mapping the values into [-1, 1]:
X̄ = 2(X - X_min)/(X_max - X_min) - 1,  Ȳ = 2(Y - Y_min)/(Y_max - Y_min) - 1  (1)
wherein X̄ and Ȳ represent the transformed data;
X and Y represent the data to be converted;
X_min, Y_min represent the minimum values in the data to be converted;
X_max, Y_max represent the maximum values in the data to be converted;
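The transformation of step 1.2 is ordinary column-wise min-max scaling into [-1, 1]; a minimal sketch (function name is illustrative):

```python
import numpy as np

def scale_to_unit_interval(X, X_min=None, X_max=None):
    """Map each column of X linearly into [-1, 1], as in formula (1)."""
    X = np.asarray(X, dtype=float)
    X_min = X.min(axis=0) if X_min is None else X_min
    X_max = X.max(axis=0) if X_max is None else X_max
    return 2.0 * (X - X_min) / (X_max - X_min) - 1.0
```

Passing the training minima/maxima explicitly lets the same transform be reused on new data later.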
step 2: solving a residual matrix R, which comprises the following steps:
step 2.1: assume the robust width learning system has m groups of enhancement nodes, with q enhancement nodes per group. The extended input matrix A_m is formed by combining the input data matrix with the enhancement node matrix according to formula (2):
A_m = [X_0 | H_m]  (2);
wherein X_0 denotes the input data matrix formed by the transformed data X̄;
H_m = [H_1, ..., H_m] denotes the enhancement node matrix, with H_j = ξ(X_0 W_j + β_j), j = 1, ..., m;
W_j and β_j respectively denote the weight matrix and bias matrix of the j-th enhancement node group, randomly generated by the system;
ξ(·) denotes the activation function, a nonlinear function on the enhancement nodes responsible for mapping their input to their output; the commonly used sigmoid function is adopted as the activation function, shown in formula (3):
ξ(x) = 1/(1 + e^(-x))  (3)
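Step 2.1 can be sketched as follows; the group count m, node count q and the [-1, 1] range for the random weights are illustrative choices (the embodiment below draws them from [-10, 10]):

```python
import numpy as np

def sigmoid(x):
    """Formula (3): the activation function on the enhancement nodes."""
    return 1.0 / (1.0 + np.exp(-x))

def build_extended_input(X0, m=3, q=5, seed=0):
    """Form A_m = [X0 | H_1 ... H_m] as in formula (2).

    Each group H_j = sigmoid(X0 @ W_j + beta_j) uses randomly generated
    weights and biases, as in step 2.1."""
    rng = np.random.default_rng(seed)
    N, M = X0.shape
    groups = []
    for _ in range(m):
        W = rng.uniform(-1, 1, size=(M, q))      # enhancement weight matrix
        beta = rng.uniform(-1, 1, size=(1, q))   # bias matrix
        groups.append(sigmoid(X0 @ W + beta))
    return np.hstack([X0] + groups)
```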
step 2.2: solve the iteration's initial connection weight matrix W_m^(0) by the ridge regression algorithm according to formula (4):
W_m^(0) = (λI + A_m^T A_m)^(-1) A_m^T Y_0  (4)
wherein I denotes the identity matrix;
Y_0 denotes the output data matrix formed by the transformed data Ȳ;
λ denotes the regularization parameter;
the superscript 'T' denotes the matrix transpose;
step 2.3: solve the residual matrix R using the residual equation in formula (5):
R = Y_0 - A_m W_m^(0)  (5)
wherein R = [r_1, r_2, ..., r_N]^T;
r_i denotes the residual of the i-th training sample;
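Formulas (4) and (5) are ordinary ridge regression and its residual; a minimal sketch:

```python
import numpy as np

def ridge_weights(A, Y, lam=2**-8):
    """Formula (4): W = (lam*I + A^T A)^(-1) A^T Y, solved without an explicit inverse."""
    k = A.shape[1]
    return np.linalg.solve(lam * np.eye(k) + A.T @ A, A.T @ Y)

def residual_matrix(A, Y, W):
    """Formula (5): R = Y - A W."""
    return Y - A @ W
```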
step 3: calculate the weight matrix of the training data, comprising the following steps:
step 3.1: estimate the residual probability density function with a kernel density estimation algorithm according to formula (6):
f(r) = (1/(N σ̂)) Σ_{i=1}^{N} k((r - r_i)/σ̂)  (6)
wherein σ̂ is the standard deviation of the residuals;
k(·) is the kernel density function, expressed as formula (7):
k(u) = (1/√(2π)) e^(-u²/2)  (7)
step 3.2: calculate the weight matrix Θ formed by all training data: the weight of the i-th training sample is θ_i = f(r_i), computed with formula (6), and all θ_i form the weight matrix Θ = [θ_1, ..., θ_N]^T;
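Steps 3.1-3.2 weight each sample by the estimated density of its own residual, so that residuals far from the bulk of the data (suspected outliers) get small weights. A sketch under the assumption of a Gaussian kernel with the residual standard deviation as bandwidth, consistent with the text around formulas (6)-(7):

```python
import numpy as np

def kde_weights(R):
    """Weight theta_i = f(r_i), where f is the kernel density estimate of the
    residuals (formula (6)) with a Gaussian kernel (formula (7), assumed)."""
    r = np.asarray(R, dtype=float).ravel()
    N = r.size
    sigma = r.std()          # bandwidth: residual standard deviation

    def f(x):
        u = (x - r) / sigma
        return np.mean(np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)) / sigma

    return np.array([f(ri) for ri in r])
```

A sample whose residual sits alone far from the others receives a lower density, hence a lower weight.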
step 4: establish the robust width learning system model, which comprises the following specific steps:
step 4.1: solve the connection weight matrix W_m^(k) of the k-th iteration with the weighted ridge regression algorithm according to formula (8):
W_m^(k) = (λI + A_m^T Θ A_m)^(-1) A_m^T Θ Y_0  (8)
wherein k = 1, 2, 3, ..., and Θ is here used as the diagonal matrix with θ_1, ..., θ_N on its diagonal;
step 4.2: if the maximum absolute difference between the output weights of two adjacent steps exceeds the set threshold ε, i.e. max|W_m^(k) - W_m^(k-1)| > ε, compute a new residual matrix by R = Y_0 - A_m W_m^(k) and return to step 3, until max|W_m^(k) - W_m^(k-1)| ≤ ε or the number of iterations reaches the preset maximum; the iteration then ends, the robust width learning system stops training the model, and the robust width learning system model is established, with the model's output prediction expressed by formula (9):
y_prediction = [x_new | H_m] W_m  (9)
wherein H_m = [H_1, ..., H_m];
x_new represents new input data;
y_prediction represents the predicted output data.
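Steps 2-4 together form a fixed-point iteration: fit, compute residuals, re-weight, re-fit with weighted ridge regression. A sketch, where the weighted ridge form W = (λI + AᵀΘA)⁻¹AᵀΘY is an assumption consistent with step 4.1, and `weight_fn` stands in for the kernel-density weighting of step 3:

```python
import numpy as np

def weighted_ridge(A, Y, theta, lam=2**-8):
    """One weighted ridge step, a sketch of formula (8) with Theta = diag(theta)."""
    k = A.shape[1]
    At = A.T * theta                     # equivalent to A.T @ np.diag(theta)
    return np.linalg.solve(lam * np.eye(k) + At @ A, At @ Y)

def train_robust(A, Y, weight_fn, lam=2**-8, eps=0.1, max_iter=30):
    """Iterate steps 3-4 until the weight change falls below eps (step 4.2)."""
    k = A.shape[1]
    W = np.linalg.solve(lam * np.eye(k) + A.T @ A, A.T @ Y)   # formula (4)
    for _ in range(max_iter):
        R = Y - A @ W                    # formula (5)
        theta = weight_fn(R)             # step 3: one weight per sample
        W_new = weighted_ridge(A, Y, theta, lam)
        if np.max(np.abs(W_new - W)) <= eps:                  # step 4.2
            return W_new
        W = W_new
    return W
```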
Further, when new training data are input, the system is updated through the following steps:
S1: let the input data matrix of the new training data be X_a ∈ R^(a×M) and the output data matrix be Y_a ∈ R^(a×C), wherein a represents the number of newly added training samples;
S2: linearly transform the new training data with formula (1) of step 1.2, obtaining the transformed input data matrix X_a0 and output data matrix Y_a0;
S3: calculate the residual matrix R_a corresponding to the newly added training data:
S3.1: solve the extended input matrix A_a corresponding to the newly added training data according to formula (10):
A_a = [X_a0 | ξ(X_a0 W_1 + β_1), ..., ξ(X_a0 W_m + β_m)]  (10)
S3.2: stack A_m and A_a to obtain the system's new extended input matrix ᵃA_m = [A_m; A_a];
S3.3: solve the new initial connection weight matrix ᵃW_m^(0) of the iteration according to formula (11):
ᵃW_m^(0) = W_m + B (Y_a0 - A_a W_m)  (11)
wherein B is derived from formula (12):
B = C^+ if C ≠ 0; otherwise B = A_m^+ D (I + D^T D)^(-1)  (12)
wherein C = A_a - D^T A_m, and D^T = A_a A_m^+;
A_m^+ is the pseudo-inverse matrix of A_m, calculated according to formula (13):
A_m^+ = (λI + A_m^T A_m)^(-1) A_m^T  (13)
S3.4: solve the residual matrix R_a corresponding to the newly added training data using the residual formula according to formula (14):
R_a = Y_a0 - A_a ᵃW_m^(0)  (14)
wherein R_a = [r_1, r_2, ..., r_a]^T;
S4: calculate the weights of the newly added training data:
S4.1: obtain the residual probability density function f_a of the newly added training data with the kernel density estimation algorithm, of the same form as formula (6); this is formula (15);
S4.2: calculate all the weights of the newly added training data to form the weight matrix Θ_a: the weight of the i-th new sample is θ_ia = f_a(r_i), computed with formula (15), and all θ_ia form Θ_a = [θ_1a, ..., θ_aa]^T;
S5: update the robust width learning system: calculate the pseudo-inverse matrix (ᵃA_m)^+ corresponding to ᵃA_m according to formula (16), wherein B' is derived from formula (17) with C' = Θ_a A_a - D' Θ_a A_m; then solve the new connection weight matrix ᵃW_m^(k_a) of the k_a-th iteration according to formula (18), wherein k_a = 1, 2, 3, ...;
S6: if the maximum absolute difference between the new connection weight matrices of two adjacent steps exceeds the set threshold ε, i.e. max|ᵃW_m^(k_a) - ᵃW_m^(k_a-1)| > ε, compute a new residual matrix R_a by formula (14), return to S4, and recompute ᵃW_m and (ᵃA_m)^+, until max|ᵃW_m^(k_a) - ᵃW_m^(k_a-1)| ≤ ε or the number of iterations reaches the preset maximum; the iteration then ends, the new connection weight matrix ᵃW_m is obtained, and the model is updated; the output prediction of the new model is expressed by formula (19):
y_prediction = [x_new | H_m] ᵃW_m  (19)
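The data-increment update of S3-S5 rests on a block pseudo-inverse update for appended rows, in which C = A_a - DᵀA_m plays the central role (formula (12)). A sketch of the unweighted version of this update, checked against a direct pseudo-inverse; the robust weights Θ_a of formulas (16)-(18) are omitted, and the C = 0 branch is an assumption:

```python
import numpy as np

def pinv_append_rows(A, A_pinv, A_a):
    """Update A^+ when the rows A_a are appended below A (Greville-style).

    D^T = A_a A^+ and C = A_a - D^T A as in the text; when C = 0 the new
    rows lie in the row space of A (e.g. A has full column rank)."""
    D_T = A_a @ A_pinv                    # a x n
    C = A_a - D_T @ A                     # a x k
    if np.allclose(C, 0):
        a = A_a.shape[0]
        B = A_pinv @ D_T.T @ np.linalg.inv(np.eye(a) + D_T @ D_T.T)
    else:
        B = np.linalg.pinv(C)             # k x a
    return np.hstack([A_pinv - B @ D_T, B])
```

Only the small matrices C and B are (pseudo-)inverted, which is why the update is cheaper than recomputing the pseudo-inverse of the stacked matrix from scratch.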
Further, when a new group of enhancement nodes is added, the system is updated through the following steps:
step a: if the newly added group contains q enhancement nodes, the new enhancement nodes are expressed by formula (20):
H_{m+1} = ξ(X_0 W_{m+1} + β_{m+1})  (20)
wherein the weight matrix W_{m+1} and the bias matrix β_{m+1} are randomly generated by the system;
step b: after adding the enhancement nodes, the new extended input matrix is A_{m+1} = [A_m | H_{m+1}]; its pseudo-inverse matrix (A_{m+1})^+ is derived from formula (21),
wherein B' is derived from formula (22):
wherein C' = Θ H_{m+1} - Θ A_m D';
step c: solve the new connection weight matrix W_{m+1} after the enhancement nodes are added according to formula (23);
step d: the output prediction of the updated model is expressed by formula (24):
y_prediction = [x_new | H_{m+1}] W_{m+1}  (24)
wherein H_{m+1} = [H_1, ..., H_{m+1}];
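The node-increment update of steps a-d appends columns to the extended input matrix. A sketch of the unweighted column update and the corresponding weight update behind formulas (21)-(23), assuming W is the least-squares solution A⁺Y and omitting the robust weighting Θ of formula (22); the C = 0 branch is an assumption:

```python
import numpy as np

def pinv_append_cols(A, A_pinv, H, Y, W):
    """Update A^+ and the output weights when columns H are appended to A.

    D = A^+ H and C = H - A D; when C = 0 the new columns are linear
    combinations of the old ones.  Returns ([A|H]^+, new weights)."""
    D = A_pinv @ H                        # k x q
    C = H - A @ D                         # n x q
    if np.allclose(C, 0):
        q = D.shape[1]
        B_T = np.linalg.inv(np.eye(q) + D.T @ D) @ D.T @ A_pinv
    else:
        B_T = np.linalg.pinv(C)           # q x n
    A_new_pinv = np.vstack([A_pinv - D @ B_T, B_T])
    W_new = np.vstack([W - D @ (B_T @ Y), B_T @ Y])   # [A|H]^+ Y, incrementally
    return A_new_pinv, W_new
```

Because the old weights W are reused, adding enhancement nodes does not require retraining the whole network.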
The invention first obtains predicted values with the width learning system; then, using kernel density estimation, it computes a weight for each datum from the residual between the predicted and actual values, assigning larger weights to normal data and smaller weights to suspected outliers, so that the robust width learning system automatically learns the correct information in the training set; finally, it computes the system's connection weights with a weighted ridge regression algorithm, thereby establishing the robust width learning system model. The invention also provides two robust incremental learning algorithms, for adding training data and for adding enhancement nodes respectively, realizing rapid updates of the established robust width learning system model. Compared with the original width learning system, the method has clear advantages in prediction accuracy, robustness and generalization when outliers exist in the training data set: it improves the robustness of the width learning system and suppresses the adverse effect of outliers.
Drawings
FIG. 1 is a comparison graph of the fitting effect of the robust width learning system model and the width learning system model on the test data set;
FIG. 2 is a graph of the trend of the test root mean square error of the robust width learning system model and the width learning system model as the outlier content increases;
FIG. 3 is a graph of the trend of the root mean square error of the robust width learning system model and the width learning system model as training data are added;
FIG. 4 is a graph of the trend of the root mean square error of the robust width learning system model and the width learning system model as training data and enhancement nodes are added;
FIG. 5 is a comparison graph of the learning effects of the robust incremental learning methods for adding training data and adding enhancement nodes.
Detailed Description
The invention is further illustrated by the following examples and figures.
Example:
The embodiment is a modeling method for a large industrial multistage centrifugal compressor: a robust width learning system model is established to predict the compressor's output pressure ratio. The specific steps are as follows:
510 sets of operating data for a large industrial multistage centrifugal compressor were collected (the data come from a unit actually operating at a steel mill). The input data variables are inlet pressure, inlet temperature and inlet flow; the output data variable is the output pressure ratio. 400 of the samples are selected as the training set and 110 as the test set. To ensure the truth and validity of the test results, a certain number of outliers are added to the training set, as follows: the percentage of outliers in the total training data is set to γ. Of the outliers, 50% are high-leverage outliers (points whose input data are not within the normal range but whose corresponding output values are reasonable), generated by randomly adding or subtracting noise, not exceeding the maximum of the input data in the training set, on the input data of normal training samples; the remaining 50% are high-residual outliers (points whose output residuals are larger than those of other points), generated by randomly adding or subtracting noise, not exceeding the maximum of the output data in the training set, on the output data of normal training samples.
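The outlier-injection procedure described above can be sketched as follows; the noise scale and the clipping to the training-data range are illustrative readings of the description:

```python
import numpy as np

def inject_outliers(X, Y, gamma, seed=0):
    """Corrupt gamma percent of the samples: half become high-leverage points
    (noise on the inputs), half high-residual points (noise on the outputs).
    Noise stays within the training-set range, as in the description."""
    rng = np.random.default_rng(seed)
    X, Y = np.array(X, dtype=float), np.array(Y, dtype=float)
    lo_x, hi_x = X.min(axis=0), X.max(axis=0)
    lo_y, hi_y = Y.min(axis=0), Y.max(axis=0)
    n_out = int(round(len(X) * gamma / 100.0))
    idx = rng.choice(len(X), size=n_out, replace=False)
    half = n_out // 2
    for i in idx[:half]:      # high-leverage: perturb the inputs
        X[i] = np.clip(X[i] + rng.uniform(-1, 1, X.shape[1]) * (hi_x - lo_x), lo_x, hi_x)
    for i in idx[half:]:      # high-residual: perturb the outputs
        Y[i] = np.clip(Y[i] + rng.uniform(-1, 1, Y.shape[1]) * (hi_y - lo_y), lo_y, hi_y)
    return X, Y
```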
1. Establishing a robust width learning system model
Step 1.1: data preprocessing. Let the input data matrix of the training set containing γ outliers be X ∈ R^(N×M) and the output data matrix be Y ∈ R^(N×C), where N = 400 is the number of training samples, M = 3 and C = 1 are the numbers of input and output variables, respectively, and R denotes the field of real numbers. The training data are linearly transformed by formula (1), mapping the values into [-1, 1]. The input data matrix after linear transformation is denoted X_0 and the output data matrix Y_0.
Step 1.2: calculate the residual matrix R. Assume the robust width learning system has m = 11 groups of enhancement nodes with q = 20 nodes per group. First solve the system's extended input matrix A_m, formed by combining the input data matrix with the enhancement node matrix:
A_m = [X_0 | H_m]  (2)
wherein H_m = [H_1, ..., H_m] denotes the enhancement node matrix, with H_j = ξ(X_0 W_j + β_j); W_j and β_j respectively denote the weight matrix and bias matrix of the enhancement node group, randomly generated by the system from [-10, 10]; ξ(·) denotes the activation function, a nonlinear function on the enhancement nodes responsible for mapping their input to their output; this patent adopts the commonly used sigmoid function as the activation function, given by formula (3). Then solve the iteration's initial connection weight matrix W_m^(0) with the ridge regression algorithm of formula (4), where I denotes the identity matrix, λ = 2^(-8) is the regularization parameter, and the superscript 'T' denotes the matrix transpose. Finally solve the residual matrix R with the residual formula, wherein R = [r_1, r_2, ..., r_N]^T and r_i represents the residual of the i-th training sample.
Step 1.3: calculate the weight matrix of the training data. First obtain the residual probability density function with the kernel density estimation algorithm, formula (6), wherein σ̂ is the standard deviation of the residuals and k(·) is the kernel density function of formula (7). The weight of each training sample is then θ_i = f(r_i) from formula (6), i = 1, ..., N. The weights of all training samples form the weight matrix Θ = [θ_1, ..., θ_N]^T.
Step 1.4: establish the robust width learning system model. Using the extended input matrix A_m calculated in step 1.2 and the weight matrix Θ calculated in step 1.3, solve the connection weight matrix W_m^(k) of the k-th iteration through the weighted ridge regression algorithm, wherein k = 1, 2, 3, .... If the maximum absolute difference between the output weights of two adjacent steps exceeds the set threshold ε = 0.1, i.e. max|W_m^(k) - W_m^(k-1)| > ε (the function max(·) takes the maximum over the sequence), compute a new residual matrix R and return to step 1.3, until max|W_m^(k) - W_m^(k-1)| ≤ ε or the number of iterations reaches 30; the iteration then ends, the robust width learning system stops training, and the robust width learning system model is established, with the model's output prediction
y_prediction = [x_new | H_m] W_m
wherein H_m = [H_1, ..., H_m], x_new represents new input data and y_prediction represents the predicted output data.
The established model is verified with the test set, with the following results. First, with the outlier percentage γ = 20, the robust width learning system model is used to predict the test data; for comparison, the original width learning system model predicts the same test data. The predicted values and true values of the two models are shown in FIG. 1, from which it can be seen that the predicted values of the robust width learning system model are closer to the true values. Next, the prediction abilities of the two models are compared under different outlier contents, γ = 0, 5, 10, 15, 20, 25, 30, using the root mean square error (RMSE) as the criterion:
RMSE = sqrt((1/N) Σ_{i=1}^{N} (y_i - ȳ_i)²)
wherein y_i represents the predicted value of the output of the i-th test sample, ȳ_i represents the actual value of the output of the i-th test sample, and N represents the number of test data. FIG. 2 plots the root mean square error of the robust width learning system model and the width learning system model as the proportion of outliers in the total training data rises; the specific values are given in Table 1. It is clear from FIG. 2 and Table 1 that the prediction accuracy of the model established by the robust width learning system of the invention is significantly higher than that of the model established by the width learning system, indicating that the invention effectively improves the robustness and generalization of the width learning system.
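The RMSE criterion above as code:

```python
import numpy as np

def rmse(y_pred, y_true):
    """Root mean square error: sqrt of the mean squared prediction error."""
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    y_true = np.asarray(y_true, dtype=float).ravel()
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))
```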
Table 1 shows the root mean square errors of the robust width learning system model and of the width learning system model under different outlier contents.
2. Robust incremental learning algorithm for increasing training data
Step 2.1: data preprocessing. Let the input data matrix of the training set containing 20 outliers be X ∈ R^(N×M) and the output data matrix be Y ∈ R^(N×C) (N = 100, M = 3, C = 1). The training data are linearly transformed using formula (1), mapping the result values to [-1, 1]. The transformed input data matrix is denoted X_0 and the output data matrix Y_0.
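The preprocessing of step 2.1 can be sketched as follows. The exact form of formula (1) is not legible in this text, so the standard min-max map to [-1, 1] is assumed; the training minima and maxima are returned so that later data can be transformed consistently, as steps 2.5 and 3.5 require:

```python
import numpy as np

def linear_transform(data, d_min=None, d_max=None):
    """Map data column-wise to [-1, 1] (assumed form of formula (1)).
    Pass d_min/d_max from the training set to transform new data consistently."""
    data = np.asarray(data, dtype=float)
    d_min = data.min(axis=0) if d_min is None else d_min
    d_max = data.max(axis=0) if d_max is None else d_max
    return 2.0 * (data - d_min) / (d_max - d_min) - 1.0, d_min, d_max

X = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
X0, X_min, X_max = linear_transform(X)
print(X0.tolist())  # [[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]]
```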
Step 2.2: solve the residual matrix R. Assume the number of enhancement node groups of the robust width learning system is m (m = 3), each group containing q (q = 20) enhancement nodes. First, the extended input matrix A_m of the system is solved; it is formed by combining the input data matrix and the enhancement node matrix, as expressed in formula (2). Then the iterative initial connection weight matrix W_m^(0) is solved using formula (4). Finally, the residual matrix R is solved using formula (5), where R = [r_1, r_2, ..., r_N]^T.
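Step 2.2 can be sketched as follows (Python/NumPy). The sigmoid enhancement nodes and the regularization λ = 2^-8 follow the text; the random-weight range, shapes and names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def extended_input_matrix(X0, m=3, q=20):
    """A_m = [X0 | H_1 | ... | H_m] (formula (2)): m groups of q sigmoid
    enhancement nodes with randomly generated weights and biases.
    The (weight, bias) pairs are kept so new data can be extended identically."""
    blocks, params = [X0], []
    for _ in range(m):
        Wh = rng.uniform(-1, 1, size=(X0.shape[1], q))   # random weight matrix
        bh = rng.uniform(-1, 1, size=(1, q))             # random bias matrix
        blocks.append(sigmoid(X0 @ Wh + bh))
        params.append((Wh, bh))
    return np.hstack(blocks), params

def initial_weights(Am, Y0, lam=2**-8):
    """Ridge regression (formula (4)): W = (lam*I + A^T A)^-1 A^T Y."""
    n = Am.shape[1]
    return np.linalg.solve(lam * np.eye(n) + Am.T @ Am, Am.T @ Y0)

X0 = rng.uniform(-1, 1, size=(100, 3))   # N = 100 samples, M = 3 inputs
Y0 = rng.uniform(-1, 1, size=(100, 1))   # C = 1 output
Am, params = extended_input_matrix(X0)
W0 = initial_weights(Am, Y0)
R = Y0 - Am @ W0                         # residual matrix (formula (5))
print(Am.shape, W0.shape, R.shape)       # (100, 63) (63, 1) (100, 1)
```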
Step 2.3: calculate the weights of the training data. First, the residual probability density function is obtained using the kernel density estimation algorithm, as shown in formula (6). The weight θ_l of the lth training datum is then obtained from formula (6) as θ_l = f(r_l), l = 1, ..., N. The weights of all the training data form the weight matrix θ = [θ_1, ..., θ_N]^T.
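The kernel density weighting of step 2.3 can be sketched as follows. The Gaussian kernel and Silverman bandwidth are assumptions, since the text only indicates a kernel density estimate built on the residuals and their standard deviation, but the effect is the one described: samples with large residuals (likely outliers) fall in low-density regions and receive small weights:

```python
import numpy as np

def kde_weights(residuals, h=None):
    """theta_l = f(r_l), with f the kernel density estimate of the residuals
    (formula (6)); Gaussian kernel and Silverman bandwidth are assumptions."""
    r = np.asarray(residuals, dtype=float).ravel()
    N = r.size
    if h is None:
        h = 1.06 * r.std() * N ** (-1 / 5)      # Silverman's rule (assumption)
    u = (r[:, None] - r[None, :]) / h           # pairwise (r_l - r_i) / h
    K = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return K.sum(axis=1) / (N * h)

rng = np.random.default_rng(1)
r = np.concatenate([rng.normal(0.0, 0.1, 95),      # well-fit samples
                    [3.0, -3.0, 4.0, -4.0, 5.0]])  # 5 gross outliers
theta = kde_weights(r)
print(theta[:95].mean() > 10 * theta[95:].mean())  # outliers weighted far less: True
```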
Step 2.4: establish the robust width learning system model. Using the extended input matrix A_m calculated in step 2.2 and the weight matrix θ obtained in step 2.3, the connection weight matrix W_m^(k) of the kth iteration is solved through formula (8), where k = 1, 2, 3, ..., 30, and the pseudo-inverse matrix A_m^+ of A_m can be determined by the corresponding formula, where λ = 2^-8.
If the maximum of the absolute values of the differences between the connection weight matrices of two adjacent iterations is greater than the set threshold ε (ε = 0.1), i.e. max(|W_m^(k) - W_m^(k-1)|) > ε, then the new residual matrix R = Y_0 - A_m W_m^(k) is calculated, the procedure returns to step 2.3, and a new connection weight matrix W_m is calculated, until max(|W_m^(k) - W_m^(k-1)|) ≤ ε or the number of iterations reaches 30. The iteration then ends, the robust width learning system stops training the model, and the robust width learning system model is established; the output prediction expression of the model is as follows
wherein H_m = [H_1, ..., H_m].
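Steps 2.2 to 2.4 together form an iteratively reweighted ridge regression. The sketch below assumes the weighted solve in formula (8) takes the standard form W = (λI + A^T Θ A)^-1 A^T Θ Y with Θ = diag(θ), which is consistent with, but not explicitly legible in, this text:

```python
import numpy as np

def kde_weights(residuals):
    """Gaussian KDE weights for the residuals (see step 2.3; bandwidth assumed)."""
    r = np.asarray(residuals, dtype=float).ravel()
    N = r.size
    h = 1.06 * r.std() * N ** (-1 / 5) + 1e-12
    u = (r[:, None] - r[None, :]) / h
    return (np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)).sum(axis=1) / (N * h)

def robust_train(Am, Y0, lam=2**-8, eps=0.1, max_iter=30):
    """Alternate (residuals -> KDE weights -> weighted ridge solve) until the
    connection weights change by at most eps, or 30 iterations (steps 2.2-2.4).
    The diagonal weighting Theta = diag(theta) is an assumption."""
    n = Am.shape[1]
    W = np.linalg.solve(lam * np.eye(n) + Am.T @ Am, Am.T @ Y0)   # formula (4)
    for _ in range(max_iter):
        theta = kde_weights(Y0 - Am @ W)            # weights from residuals
        AtTheta = Am.T * theta                      # A^T Theta, Theta diagonal
        W_new = np.linalg.solve(lam * np.eye(n) + AtTheta @ Am, AtTheta @ Y0)
        if np.max(np.abs(W_new - W)) <= eps:        # convergence criterion
            return W_new
        W = W_new
    return W

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(100, 3))
Y = X @ np.array([[1.0], [-2.0], [0.5]]) + 0.01 * rng.normal(size=(100, 1))
Y[:20] += 5.0                                       # 20 outliers in the outputs
W = robust_train(X, Y)
print(W.shape)  # (3, 1)
```

Here the raw input matrix stands in for the extended matrix A_m to keep the sketch short; in the full algorithm the matrix from step 2.2 would be passed in.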
Step 2.5: preprocess the new training data. Let the input data matrix of the newly added training data be X_a and the output data matrix be Y_a, where a = 30. The new training data are linearly transformed using formula (1), where X_max, X_min, Y_max and Y_min are the maximum and minimum values determined in step 2.1. For simplicity, the linearly transformed input data matrix of the newly added training data is denoted X_a0 and the output data matrix Y_a0.
Step 2.6: calculate the residual matrix R_a corresponding to the newly added training data. First, solve the extended input matrix A_a corresponding to the newly added training data:
wherein the enhancement-node weight and bias matrices are the values generated in step 2.2. The new extended input matrix of the system is ᵃA_m, and the iterative initial new connection weight matrix ᵃW_m^(0) can be obtained from the following formula, with C = A_a - D^T A_m. Finally, the residual matrix R_a corresponding to the newly added training data is solved using the residual formula, R_a = [r_1, r_2, ..., r_a]^T.
Step 2.7: solve the weights of the newly added training data. The residual probability density function of the newly added training data is obtained using the kernel density estimation algorithm, with the following function formula:
wherein the weight θ_l of the lth newly added training datum can be obtained from formula (16) as θ_l = f_a(r_l), l = 1, ..., a; the weights of all the newly added training data form the weight matrix θ_a = [θ_1, ..., θ_a]^T.
Step 2.8: update the robust width learning system model. Using the results obtained in step 2.4, the matrix A_a obtained in step 2.6 and the weight matrix θ_a determined in step 2.7, the pseudo-inverse matrix (ᵃA_m)^+ corresponding to ᵃA_m is calculated by the following formula
Wherein
wherein C' = θ_a A_a - D' θ_a A_m. Then the new connection weight matrix ᵃW_m^(k_a) of the k_a-th iteration (k_a ≤ 30) can be calculated by
If the maximum of the absolute values of the differences between the new connection weight matrices of two adjacent iterations is greater than the set threshold ε (ε = 0.1), i.e. max(|ᵃW_m^(k_a) - ᵃW_m^(k_a-1)|) > ε, then the new residual matrix R_a is calculated by the residual formula and the procedure returns to step 2.7, calculating ᵃW_m and (ᵃA_m)^+ in turn, until max(|ᵃW_m^(k_a) - ᵃW_m^(k_a-1)|) ≤ ε or the number of iterations reaches 30. The iteration then ends, the new connection weight matrix ᵃW_m is obtained, and the model is updated. The expression of the new model's output prediction is as follows
Step 2.9: verify the model after each update using the test set and calculate the root mean square error of the model's predicted values. Return to step 2.5 and continue adding training data, 30 groups at a time, updating the model in the above order of steps until all 400 groups of training data have been added; then stop updating the model.
The results are as follows: fig. 3 compares the root mean square errors of the predicted values of the two models after the robust width learning system model and the width learning system model are updated with their respective incremental learning algorithms. It can be clearly seen from the figure that the prediction accuracy of the robust width learning system model of the present invention after each update is far higher than that of the width learning system model, which shows that the proposed robust incremental learning algorithm for added data is effective.
3. Robust incremental learning algorithm for adding enhanced nodes
After newly added training data enter the model, the model's ability to learn from the new data can be improved by adding an appropriate number of enhancement nodes. In the verification experiment, the robust incremental learning algorithms for adding training data and for adding enhancement nodes are combined: enhancement nodes are added at the same time as the training data to improve the prediction accuracy of the model. The specific implementation steps are as follows:
Step 3.1: data preprocessing. Let the input data matrix of the training set containing 20 outliers be X ∈ R^(N×M) and the output data matrix be Y ∈ R^(N×C) (N = 100, M = 3, C = 1). The training data are linearly transformed using formula (1), mapping the result values to [-1, 1]. The transformed input data matrix is denoted X_0 and the output data matrix Y_0.
Step 3.2: calculate the residual matrix R. Assume the number of enhancement node groups of the robust width learning system is m (m = 1), each group containing q (q = 20) enhancement nodes. First, the extended input matrix A_m of the system is solved; it is formed by combining the input data matrix and the enhancement node matrix, as expressed in formula (2). Then the iterative initial connection weight matrix W_m^(0) is solved using formula (4). Finally, the residual matrix R is solved using formula (5), where R = [r_1, r_2, ..., r_N]^T.
Step 3.3: calculate the weight matrix of the training data. First, the residual probability density function is obtained using the kernel density estimation algorithm, as shown in formula (6); the weight θ_l of the lth training datum is obtained from formula (6) as θ_l = f(r_l), l = 1, ..., N. The weights of all the training data form the weight matrix θ = [θ_1, ..., θ_N]^T.
Step 3.4: establish the robust width learning system model. Using the extended input matrix A_m calculated in step 3.2 and the weight matrix θ obtained in step 3.3, the connection weight matrix W_m^(k) of the kth iteration is solved through formula (8), where k = 1, 2, 3, ..., 30, and the pseudo-inverse matrix A_m^+ of A_m can be determined by the corresponding formula, where λ = 2^-8.
If the maximum of the absolute values of the differences between the connection weight matrices of two adjacent iterations is greater than the set threshold ε (ε = 0.1), i.e. max(|W_m^(k) - W_m^(k-1)|) > ε, then the new residual matrix R = Y_0 - A_m W_m^(k) is calculated, the procedure returns to step 3.3, and a new connection weight matrix W_m is calculated, until max(|W_m^(k) - W_m^(k-1)|) ≤ ε or the number of iterations reaches 30. The iteration then ends, the robust width learning system stops training the model, and the robust width learning system model is established; the output prediction expression of the model is as follows
wherein H_m = [H_1, ..., H_m].
Step 3.5: preprocess the new training data. Let the input data matrix of the newly added training data be X_a and the output data matrix be Y_a, where a = 30. The new training data are linearly transformed using formula (1), where X_max, X_min, Y_max and Y_min are the maximum and minimum values determined in step 3.1. For simplicity, the linearly transformed input data matrix of the newly added training data is denoted X_a0 and the output data matrix Y_a0.
Step 3.6: calculate the residual matrix R_a corresponding to the newly added training data. First, solve the extended input matrix A_a corresponding to the newly added training data:
wherein the enhancement-node weight and bias matrices are the values generated in step 3.2. The new extended input matrix of the system is ᵃA_m, and the iterative initial new connection weight matrix ᵃW_m^(0) can be obtained from the following formula, with C = A_a - D^T A_m. Finally, the residual matrix R_a corresponding to the newly added training data is solved using the residual formula, R_a = [r_1, r_2, ..., r_a]^T.
Step 3.7: solve the weights of the newly added training data. The residual probability density function of the newly added training data is obtained using the kernel density estimation algorithm, with the following function formula:
wherein the weight θ_l of the lth newly added training datum can be obtained from formula (16) as θ_l = f_a(r_l), l = 1, ..., a; the weights of all the newly added training data form the weight matrix θ_a = [θ_1, ..., θ_a]^T.
Step 3.8: update the robust width learning system model. Using the results obtained in step 3.4, the matrix A_a obtained in step 3.6 and the weight matrix θ_a determined in step 3.7, the pseudo-inverse matrix (ᵃA_m)^+ corresponding to ᵃA_m is calculated by the following formula
Wherein
wherein C' = θ_a A_a - D' θ_a A_m. Then the new connection weight matrix ᵃW_m^(k_a) of the k_a-th iteration (k_a ≤ 30) can be calculated by
If the maximum of the absolute values of the differences between the new connection weight matrices of two adjacent iterations is greater than the set threshold ε (ε = 0.1), i.e. max(|ᵃW_m^(k_a) - ᵃW_m^(k_a-1)|) > ε, then the new residual matrix R_a is calculated by the residual formula and the procedure returns to step 3.7, calculating ᵃW_m and (ᵃA_m)^+ in turn, until max(|ᵃW_m^(k_a) - ᵃW_m^(k_a-1)|) ≤ ε or the number of iterations reaches 30. The iteration then ends and the new connection weight matrix ᵃW_m is obtained.
Step 3.9: update the robust width learning system model after adding the enhancement nodes. Assuming the number of enhancement nodes in the newly added group is q (q = 20), the newly added enhancement nodes can be expressed as
wherein the corresponding weight and bias matrices are randomly generated from the interval [-10, 10]. Then, after adding a group of enhancement nodes, the new extended input matrix is ˣA_{m+1} = [ˣA_m | H_{m+1}], and the corresponding pseudo-inverse matrix (ˣA_{m+1})^+ can be derived from the following formula
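Step 3.9 updates the pseudo-inverse by appending columns rather than rows. Again the patent's formulas for this step are not legible here; the sketch below uses the standard block update for added columns (D = A^+ H_new, C = H_new - A D, B = C^+ when C has full column rank) and checks it against a direct recomputation. Random columns stand in for the new enhancement-node outputs:

```python
import numpy as np

def pinv_add_cols(A, A_pinv, H_new):
    """Pseudo-inverse of [A | H_new] from A^+ alone:
      D = A^+ H_new, C = H_new - A D (component of H_new outside col(A)),
      B = C^+,       ([A | H_new])^+ = [[A^+ - D B], [B]]
    Valid when C has full column rank, as for randomly generated nodes."""
    D = A_pinv @ H_new
    C = H_new - A @ D
    B = np.linalg.pinv(C)
    return np.vstack([A_pinv - D @ B, B])

rng = np.random.default_rng(4)
A = rng.normal(size=(100, 23))       # current extended input matrix
H_new = rng.normal(size=(100, 20))   # stand-in for a new enhancement-node group
incremental = pinv_add_cols(A, np.linalg.pinv(A), H_new)
direct = np.linalg.pinv(np.hstack([A, H_new]))
print(np.allclose(incremental, direct, atol=1e-8))  # True
```

In the unregularized case the corresponding block weight update is W_{m+1} = [W_m - D B Y_0 ; B Y_0], which is assumed here to match the structure of the patent's node-addition weight formula.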
Wherein
and C' = θ_c H_{m+1} - θ ˣA_m D'. Then the new connection weight matrix after adding the enhancement nodes is
Wherein
The prediction output expression of the robust width learning system model after adding the enhancement nodes is updated to
Wherein
Step 3.10: verify the model using the test set and calculate the root mean square error of the predicted values. Return to step 3.5, continue adding 30 new groups of training data and 20 enhancement nodes at a time, and update the model in turn until all 400 groups of training data have been added.
The results are as follows: fig. 4 compares the root mean square errors of the predicted values of the two models after each update, where the robust width learning system model and the width learning system model are updated by adding training data and enhancement nodes with their respective incremental learning algorithms. It can be clearly seen from the figure that the prediction accuracy of the robust width learning system model of the present invention after each update is much higher than that of the width learning system model, which shows that the proposed robust incremental learning algorithm for added nodes is effective. Fig. 5 compares the root mean square errors of the model predictions when the robust width learning system model is updated by the robust incremental learning algorithm that adds both training data and enhancement nodes versus the algorithm that adds training data only. It can be seen from the figure that the prediction accuracy of the model updated by the algorithm that adds both training data and enhancement nodes is higher; adding enhancement nodes improves the learning capability of the model.

Claims (3)

1. A robust width learning system, comprising the following steps:
step 1: data preprocessing, which comprises the following steps:
step 1.1: collecting training data; let the input data matrix of the training data be X ∈ R^(N×M) and the output data matrix be Y ∈ R^(N×C);
Wherein N is the number of samples of training data;
m and C respectively correspond to the variable numbers of the input data and the output data;
R represents the real number domain;
step 1.2: performing linear transformation processing on the training data according to the transformation function in formula (1), mapping the result values to [-1, 1], the transformation function being:

X̄ = 2(X - X_min)/(X_max - X_min) - 1, Ȳ = 2(Y - Y_min)/(Y_max - Y_min) - 1 (1);
wherein X̄ and Ȳ represent the transformed data;
X and Y represent the data to be converted;
X_min and Y_min represent the minimum values in the data to be converted;
X_max and Y_max represent the maximum values in the data to be converted;
step 2: solving a residual matrix R, which comprises the following steps:
step 2.1: assuming the number of enhancement node groups of the robust width learning system is m, with q enhancement nodes in each group, solving the extended input matrix A_m according to formula (2); A_m is formed by combining the input data matrix and the enhancement node matrix, and its expression is

A_m = [X_0 | H_m] (2);
wherein X_0 represents the input data matrix formed by the transformed input data;
H_m = [H_1, ..., H_m] represents the enhancement node matrix, with H_i = ξ(X_0 W_{h_i} + β_{h_i}), i = 1, ..., m;
W_{h_i} and β_{h_i} respectively represent the weight matrix and the bias matrix of the ith enhancement node group, randomly generated by the system;
ξ(·) represents the activation function, a nonlinear function on the enhancement nodes responsible for mapping the input of an enhancement node to its output; the commonly used sigmoid function is adopted as the activation function, as shown in formula (3):

ξ(x) = 1/(1 + e^(-x)) (3);
step 2.2: solving the iterative initial connection weight matrix W_m^(0) using the ridge regression algorithm according to formula (4):

W_m^(0) = (λI + A_m^T A_m)^(-1) A_m^T Y_0 (4);
wherein I represents the identity matrix;
Y_0 represents the output data matrix formed by the transformed output data;
λ represents the regularization parameter;
the superscript T represents the transpose of a matrix;
step 2.3: solving the residual matrix R using the residual formula according to formula (5):

R = Y_0 - A_m W_m^(0) (5);

wherein R = [r_1, r_2, ..., r_N]^T;
r_i represents the residual of the ith training datum;
and step 3: calculating a weight matrix of the training data, comprising the following steps:
step 3.1: obtaining the residual probability density function using the kernel density estimation algorithm, the function formula being formula (6):

f(r) = (1/(N h)) Σ_{i=1}^{N} K((r - r_i)/h) (6);

wherein h is the kernel bandwidth, determined from the standard deviation of the residuals;
σ is the standard deviation of the residuals;
K(·) is the kernel density function, expressed as formula (7):

K(u) = (1/√(2π)) exp(-u²/2) (7);
step 3.2: calculating the weight matrix θ formed by all the training data; the weight of the lth training datum, θ_l = f(r_l), is calculated according to formula (6), and all θ_l form the weight matrix θ = [θ_1, ..., θ_N]^T;
step 4: establishing the robust width learning system model, the specific steps being as follows:
step 4.1: solving the connection weight matrix W_m^(k) of the kth iteration according to formula (8):

W_m^(k) = (λI + A_m^T Θ A_m)^(-1) A_m^T Θ Y_0 (8);

wherein k = 1, 2, 3, ..., and Θ = diag(θ_1, ..., θ_N) is the diagonal weight matrix formed from θ;
step 4.2: if the maximum of the absolute values of the differences between the output weights of two adjacent iterations is greater than the set threshold ε, i.e. max(|W_m^(k) - W_m^(k-1)|) > ε, calculating the new residual matrix R = Y_0 - A_m W_m^(k) and returning to step 3, until max(|W_m^(k) - W_m^(k-1)|) ≤ ε or the number of iterations reaches the preset maximum number; the iteration then ends, the robust width learning system stops training the model and the robust width learning system model is established, its output prediction expression, formula (9), being:

y_prediction = [x_new | H_m] W_m (9);
wherein x_new represents the new input data;
y_prediction represents the predicted output data;
H_m = [H_1, ..., H_m].
2. The robust width learning system as claimed in claim 1, wherein, when new training data are entered, the system is updated by the following steps:
s1: setting the input data matrix of new training data asThe output data matrix is
Wherein a represents the number of newly added training data;
s2: carrying out linear conversion processing on the new training data by using the formula (1) in the step 1.2, and obtaining an input data matrix X of the converted dataa0And the output data matrix Ya0
S3: calculating the residual matrix R_a corresponding to the newly added training data;
S3.1: solving the extended input matrix A_a corresponding to the newly added training data according to formula (10):

A_a = [X_a0 | ξ(X_a0 W_{h_1} + β_{h_1}) | ... | ξ(X_a0 W_{h_m} + β_{h_m})] (10);
S3.2: obtaining the new extended input matrix ᵃA_m of the system by stacking A_a below A_m, i.e. ᵃA_m = [A_m; A_a];
S3.3: solving the iterative initial new connection weight matrix ᵃW_m^(0) according to formula (11):

ᵃW_m^(0) = W_m + B(Y_a0 - A_a W_m) (11);
Wherein B is derived from equation (12):
wherein C = A_a - D^T A_m;
A_m^+ is the pseudo-inverse matrix of A_m, calculated according to formula (13):

A_m^+ = (λI + A_m^T A_m)^(-1) A_m^T (13);
s3.4: solving a residual error matrix R corresponding to newly added training data by using a residual error formula according to the formula (14)a
Wherein R isa=[r1,r2,...,ra]T
S4: calculating the weights of the newly added training data;
S4.1: obtaining the residual probability density function of the newly added training data using the kernel density estimation algorithm, the function formula being:

f_a(r) = (1/(a h_a)) Σ_{i=1}^{a} K((r - r_i)/h_a) (15);
s4.2: calculating all weights of newly added training data to form a weight matrix thetaaWeight θ of the I-th training datala=fa(ri) Calculating θ according to the formula (15)laAll of thetalaForm a weight matrix thetaa,θa=[θ1a,...,θaa]T
S5: updating the robust width learning system, calculating the pseudo-inverse matrix (ᵃA_m)^+ corresponding to ᵃA_m according to formula (16):
Wherein,
b' is derived from equation (17):
wherein C' = θ_a A_a - D' θ_a A_m;
solving the new connection weight matrix ᵃW_m^(k_a) of the k_a-th iteration according to formula (18):
wherein k_a = 1, 2, 3, ...;
S6: if the maximum of the absolute values of the differences between the new connection weight matrices of two adjacent iterations is greater than the set threshold ε, i.e. max(|ᵃW_m^(k_a) - ᵃW_m^(k_a-1)|) > ε, calculating the new residual matrix R_a by the residual formula and returning to S4 to calculate ᵃW_m and (ᵃA_m)^+, until max(|ᵃW_m^(k_a) - ᵃW_m^(k_a-1)|) ≤ ε or the number of iterations reaches the preset maximum number; the iteration then ends, the new connection weight matrix ᵃW_m is obtained and the model is updated; the output prediction expression of the new model, formula (19), is:

y_prediction = [x_new | H_m] ᵃW_m (19)
3. The robust width learning system as claimed in claim 1, wherein, when a new group of enhancement nodes is added, the system is updated by the following steps:
step a: if the number of the newly added group of enhancement nodes is q, the newly added enhancement nodes are expressed by formula (20):

H_{m+1} = ξ(X_0 W_{h_{m+1}} + β_{h_{m+1}}) (20);
wherein,is generated at random by the system and is,
randomly generated by the system
step b: after adding the enhancement nodes, the new extended input matrix is A_{m+1} = [A_m | H_{m+1}]; the pseudo-inverse matrix (A_{m+1})^+ of A_{m+1} is derived from formula (21):
Wherein,
b' is derived from equation (22):
wherein C' = θ H_{m+1} - θ A_m D';
step c: solving the new connection weight matrix W_{m+1} after adding the enhancement nodes according to formula (23):
step d: the updated model output prediction expression, formula (24), is:
y_prediction = [x_new | H_{m+1}] W_{m+1} (24);
wherein H_{m+1} = [H_1, ..., H_{m+1}].
CN201811362948.6A 2018-09-29 2018-11-16 A kind of robust width learning system Pending CN109635245A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811153248 2018-09-29
CN2018111532486 2018-09-29

Publications (1)

Publication Number Publication Date
CN109635245A true CN109635245A (en) 2019-04-16

Family

ID=66068177

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811362948.6A Pending CN109635245A (en) 2018-09-29 2018-11-16 A kind of robust width learning system

Country Status (1)

Country Link
CN (1) CN109635245A (en)


Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110222453A (en) * 2019-06-14 2019-09-10 中国矿业大学 A kind of compressor outlet parameter prediction modeling method based on width learning system
CN110472741A (en) * 2019-06-27 2019-11-19 广东工业大学 A kind of small wave width study filtering system of three-domain fuzzy and method
CN110322969A (en) * 2019-07-03 2019-10-11 北京工业大学 A kind of fMRI data classification method based on width study
CN110334775A (en) * 2019-07-12 2019-10-15 广东工业大学 A kind of recognition methods of unmanned plane line fault and device based on width study
CN110570019A (en) * 2019-08-14 2019-12-13 中国地质大学(武汉) Sintering process comprehensive coke ratio time sequence prediction method based on width learning
CN111598236A (en) * 2020-05-20 2020-08-28 中国矿业大学 Width learning system network model compression method
WO2022134268A1 (en) * 2020-12-21 2022-06-30 华南理工大学 Incremental stacked broad learning system having depth structure
CN112508192B (en) * 2020-12-21 2022-04-22 华南理工大学 Increment heap width learning system with degree of depth structure
CN112508192A (en) * 2020-12-21 2021-03-16 华南理工大学 Increment heap width learning system with degree of depth structure
CN113709782A (en) * 2021-07-30 2021-11-26 南昌航空大学 Link quality assessment method adopting lamination width learning
CN113709782B (en) * 2021-07-30 2022-05-31 南昌航空大学 Link quality assessment method adopting lamination width learning
CN114200936A (en) * 2021-12-06 2022-03-18 广东工业大学 AGV real-time path planning method based on optimal control and width learning
CN114528764A (en) * 2022-02-18 2022-05-24 清华大学 Soft measurement modeling method and device based on integral optimization and instant learning
CN114528764B (en) * 2022-02-18 2024-09-10 清华大学 Soft measurement modeling method and device based on integral optimization and instant learning
CN115392543A (en) * 2022-07-29 2022-11-25 广东工业大学 Injection product quality prediction method combining L21 norm and residual cascade width learning
CN115392543B (en) * 2022-07-29 2023-11-24 广东工业大学 Injection product quality prediction method combining L21 norm and residual cascade width learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190416