CN109635245A - A kind of robust width learning system - Google Patents

A kind of robust width learning system

Publication number: CN109635245A
Application number: CN201811362948.6A
Authority: CN (China)
Other languages: Chinese (zh)
Legal status: Pending (an assumption, not a legal conclusion)
Inventors: 褚菲, 梁涛, 王雪松, 程玉虎
Current Assignee: China University of Mining and Technology CUMT
Original Assignee: China University of Mining and Technology CUMT
Application filed by China University of Mining and Technology CUMT
Publication of CN109635245A


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/18: Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis


Abstract

A robust width learning system: training data are collected and linearly transformed; the extended input matrix is formed from the input data matrix and the enhancement node matrix, the initial connection weight matrix of the iteration is solved by a ridge regression algorithm, and the residual matrix is then obtained from the residual equation; the residual probability density function is estimated with a kernel density estimation algorithm, from which the weight matrix of all training data is computed; the connection weight matrix of the k-th iteration is then solved. If the maximum absolute difference between the output weights of two adjacent steps does not exceed a set threshold, or the number of iterations reaches the preset maximum, the iteration ends, and the robust width learning system stops training and establishes the robust width learning system model. The system improves the robustness of the width learning system, effectively suppresses the adverse effect of outliers on modeling accuracy, and facilitates the establishment of a robust width system model suitable for predicting relevant indices of complex industrial processes.

Description

Robust width learning system
Technical Field
The invention belongs to the technical field of industrial process modeling, and particularly relates to a robust width learning system.
Background
The control and optimization of complex industrial processes has long been a popular research direction, and establishing accurate models of such processes is the premise and basis for control and optimization. Mechanism modeling is based on physical and chemical analysis of the process, deriving functional relations among the operating variables, state variables and output variables. Mechanism modeling can accurately express the relations between variables, effectively explain objective phenomena and avoid violating first principles, but it is difficult, the modeling period is long, and researchers must master all the relevant theoretical knowledge. In recent years, data-driven modeling techniques have received increasing attention from researchers; their simplicity and convenience have significantly improved modeling efficiency. With the rapid development of artificial neural networks, some researchers have applied them to data-driven modeling; benefiting from their strong learning ability, data modeling methods based on artificial neural networks have been rapidly popularized and applied.
As artificial intelligence attracts more and more attention, the deep learning neural networks behind it are widely applied in advanced fields such as pattern recognition, face recognition and speech recognition. Relying on a large number of feature layers, deep learning neural networks are well suited to processing high-dimensional big data. Although a deep learning neural network has strong learning ability, its complex structure entails numerous tuning parameters, so it must go through a lengthy training process. Moreover, to obtain a good learning effect, deep learning neural networks require the support of many high-performance computers. However, actual industrial sites are cramped and their environments harsh, making it difficult to install large numbers of high-performance computers; furthermore, such computers are expensive, and deploying many of them raises production costs without improving the comprehensive benefits of the enterprise. Recently, some scholars have proposed novel neural networks that simplify the training process and can effectively reduce the dependence on computing resources.
The width learning system (broad learning system) is an effective and efficient novel artificial neural network. It fully exploits the advantages of the random vector functional-link network: a direct connection between the input layer and the output layer flattens the network structure. Compared with a deep learning neural network, the width learning system has a simple structure and few adjustable parameters. Whereas a deep learning neural network solves for its connection weights through a complex iterative process, the width learning system computes them with a ridge regression algorithm. The simple network structure makes the width learning system easy to extend, and, combined with a corresponding incremental learning algorithm, rapid model updates can be realized. The width learning system has been applied to image recognition, time-series prediction and complex industrial process modeling, with good results.
However, in actual industrial production, factors such as sensor faults and environmental noise mean that a certain number of outliers may exist in the acquired sample data. If samples containing outliers are used as the training set, the generalized approximation capability of the width learning system is impaired, and the prediction accuracy of the resulting model cannot meet industrial requirements. The existing width learning system therefore has poor robustness and cannot suppress the adverse effect of outliers on modeling accuracy.
Disclosure of the Invention
Aiming at the problems in the prior art, the invention provides a robust width learning system that improves the robustness of the width learning system, effectively suppresses the adverse effect of outliers on modeling accuracy, and facilitates the establishment of a robust width system model suitable for predicting relevant indices of complex industrial processes.
In order to achieve the above object, the present invention provides a robust width learning system, comprising the steps of:
step 1: data preprocessing, which comprises the following steps:
step 1.1: collect the training data, and let the input data matrix of the training data be X ∈ R^(N×M) and the output data matrix be Y ∈ R^(N×C);
wherein N is the number of training samples;
M and C are the numbers of input and output variables, respectively;
R denotes the field of real numbers;
step 1.2: linearly transform the training data according to the transformation function in formula (1), mapping the values into [-1, 1]:
X̄ = 2(X - X_min)/(X_max - X_min) - 1,  Ȳ = 2(Y - Y_min)/(Y_max - Y_min) - 1  (1)
wherein X̄ and Ȳ represent the transformed data;
X and Y represent the data to be converted;
X_min, Y_min represent the minimum values in the data to be converted;
X_max, Y_max represent the maximum values in the data to be converted;
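The transformation of step 1.2 is ordinary column-wise min-max scaling into [-1, 1]; a minimal sketch (function name is illustrative):

```python
import numpy as np

def scale_to_unit_interval(X, X_min=None, X_max=None):
    """Map each column of X linearly into [-1, 1], as in formula (1)."""
    X = np.asarray(X, dtype=float)
    X_min = X.min(axis=0) if X_min is None else X_min
    X_max = X.max(axis=0) if X_max is None else X_max
    return 2.0 * (X - X_min) / (X_max - X_min) - 1.0
```

Passing the training minima/maxima explicitly lets the same transform be reused on new data later.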
step 2: solving a residual matrix R, which comprises the following steps:
step 2.1: assume the robust width learning system has m groups of enhancement nodes, with q enhancement nodes per group. The extended input matrix A_m is formed by combining the input data matrix with the enhancement node matrix according to formula (2):
A_m = [X_0 | H_m]  (2);
wherein X_0 denotes the input data matrix formed by the transformed data X̄;
H_m = [H_1, ..., H_m] denotes the enhancement node matrix, with H_j = ξ(X_0 W_j + β_j), j = 1, ..., m;
W_j and β_j respectively denote the weight matrix and bias matrix of the j-th enhancement node group, randomly generated by the system;
ξ(·) denotes the activation function, a nonlinear function on the enhancement nodes responsible for mapping their input to their output; the commonly used sigmoid function is adopted as the activation function, shown in formula (3):
ξ(x) = 1/(1 + e^(-x))  (3)
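Step 2.1 can be sketched as follows; the group count m, node count q and the [-1, 1] range for the random weights are illustrative choices (the embodiment below draws them from [-10, 10]):

```python
import numpy as np

def sigmoid(x):
    """Formula (3): the activation function on the enhancement nodes."""
    return 1.0 / (1.0 + np.exp(-x))

def build_extended_input(X0, m=3, q=5, seed=0):
    """Form A_m = [X0 | H_1 ... H_m] as in formula (2).

    Each group H_j = sigmoid(X0 @ W_j + beta_j) uses randomly generated
    weights and biases, as in step 2.1."""
    rng = np.random.default_rng(seed)
    N, M = X0.shape
    groups = []
    for _ in range(m):
        W = rng.uniform(-1, 1, size=(M, q))      # enhancement weight matrix
        beta = rng.uniform(-1, 1, size=(1, q))   # bias matrix
        groups.append(sigmoid(X0 @ W + beta))
    return np.hstack([X0] + groups)
```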
step 2.2: solve the iteration's initial connection weight matrix W_m^(0) by the ridge regression algorithm according to formula (4):
W_m^(0) = (λI + A_m^T A_m)^(-1) A_m^T Y_0  (4)
wherein I denotes the identity matrix;
Y_0 denotes the output data matrix formed by the transformed data Ȳ;
λ denotes the regularization parameter;
the superscript 'T' denotes the matrix transpose;
step 2.3: solve the residual matrix R using the residual equation in formula (5):
R = Y_0 - A_m W_m^(0)  (5)
wherein R = [r_1, r_2, ..., r_N]^T;
r_i denotes the residual of the i-th training sample;
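Formulas (4) and (5) are ordinary ridge regression and its residual; a minimal sketch:

```python
import numpy as np

def ridge_weights(A, Y, lam=2**-8):
    """Formula (4): W = (lam*I + A^T A)^(-1) A^T Y, solved without an explicit inverse."""
    k = A.shape[1]
    return np.linalg.solve(lam * np.eye(k) + A.T @ A, A.T @ Y)

def residual_matrix(A, Y, W):
    """Formula (5): R = Y - A W."""
    return Y - A @ W
```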
step 3: calculate the weight matrix of the training data, comprising the following steps:
step 3.1: estimate the residual probability density function with a kernel density estimation algorithm according to formula (6):
f(r) = (1/(N σ̂)) Σ_{i=1}^{N} k((r - r_i)/σ̂)  (6)
wherein σ̂ is the standard deviation of the residuals;
k(·) is the kernel density function, expressed as formula (7):
k(u) = (1/√(2π)) e^(-u²/2)  (7)
step 3.2: calculate the weight matrix Θ formed by all training data: the weight of the i-th training sample is θ_i = f(r_i), computed with formula (6), and all θ_i form the weight matrix Θ = [θ_1, ..., θ_N]^T;
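Steps 3.1-3.2 weight each sample by the estimated density of its own residual, so that residuals far from the bulk of the data (suspected outliers) get small weights. A sketch under the assumption of a Gaussian kernel with the residual standard deviation as bandwidth, consistent with the text around formulas (6)-(7):

```python
import numpy as np

def kde_weights(R):
    """Weight theta_i = f(r_i), where f is the kernel density estimate of the
    residuals (formula (6)) with a Gaussian kernel (formula (7), assumed)."""
    r = np.asarray(R, dtype=float).ravel()
    N = r.size
    sigma = r.std()          # bandwidth: residual standard deviation

    def f(x):
        u = (x - r) / sigma
        return np.mean(np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)) / sigma

    return np.array([f(ri) for ri in r])
```

A sample whose residual sits alone far from the others receives a lower density, hence a lower weight.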
step 4: establish the robust width learning system model, which comprises the following specific steps:
step 4.1: solve the connection weight matrix W_m^(k) of the k-th iteration with the weighted ridge regression algorithm according to formula (8):
W_m^(k) = (λI + A_m^T Θ A_m)^(-1) A_m^T Θ Y_0  (8)
wherein k = 1, 2, 3, ..., and Θ is here used as the diagonal matrix with θ_1, ..., θ_N on its diagonal;
step 4.2: if the maximum absolute difference between the output weights of two adjacent steps exceeds the set threshold ε, i.e. max|W_m^(k) - W_m^(k-1)| > ε, compute a new residual matrix by R = Y_0 - A_m W_m^(k) and return to step 3, until max|W_m^(k) - W_m^(k-1)| ≤ ε or the number of iterations reaches the preset maximum; the iteration then ends, the robust width learning system stops training the model, and the robust width learning system model is established, with the model's output prediction expressed by formula (9):
y_prediction = [x_new | H_m] W_m  (9)
wherein H_m = [H_1, ..., H_m];
x_new represents new input data;
y_prediction represents the predicted output data.
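Steps 2-4 together form a fixed-point iteration: fit, compute residuals, re-weight, re-fit with weighted ridge regression. A sketch, where the weighted ridge form W = (λI + AᵀΘA)⁻¹AᵀΘY is an assumption consistent with step 4.1, and `weight_fn` stands in for the kernel-density weighting of step 3:

```python
import numpy as np

def weighted_ridge(A, Y, theta, lam=2**-8):
    """One weighted ridge step, a sketch of formula (8) with Theta = diag(theta)."""
    k = A.shape[1]
    At = A.T * theta                     # equivalent to A.T @ np.diag(theta)
    return np.linalg.solve(lam * np.eye(k) + At @ A, At @ Y)

def train_robust(A, Y, weight_fn, lam=2**-8, eps=0.1, max_iter=30):
    """Iterate steps 3-4 until the weight change falls below eps (step 4.2)."""
    k = A.shape[1]
    W = np.linalg.solve(lam * np.eye(k) + A.T @ A, A.T @ Y)   # formula (4)
    for _ in range(max_iter):
        R = Y - A @ W                    # formula (5)
        theta = weight_fn(R)             # step 3: one weight per sample
        W_new = weighted_ridge(A, Y, theta, lam)
        if np.max(np.abs(W_new - W)) <= eps:                  # step 4.2
            return W_new
        W = W_new
    return W
```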
Further, when new training data are input, the system is updated through the following steps:
S1: let the input data matrix of the new training data be X_a ∈ R^(a×M) and the output data matrix be Y_a ∈ R^(a×C), wherein a represents the number of newly added training samples;
S2: linearly transform the new training data with formula (1) of step 1.2, obtaining the transformed input data matrix X_a0 and output data matrix Y_a0;
S3: calculate the residual matrix R_a corresponding to the newly added training data:
S3.1: solve the extended input matrix A_a corresponding to the newly added training data according to formula (10):
A_a = [X_a0 | ξ(X_a0 W_1 + β_1), ..., ξ(X_a0 W_m + β_m)]  (10)
S3.2: stack A_m and A_a to obtain the system's new extended input matrix ᵃA_m = [A_m; A_a];
S3.3: solve the new initial connection weight matrix ᵃW_m^(0) of the iteration according to formula (11):
ᵃW_m^(0) = W_m + B (Y_a0 - A_a W_m)  (11)
wherein B is derived from formula (12):
B = C^+ if C ≠ 0; otherwise B = A_m^+ D (I + D^T D)^(-1)  (12)
wherein C = A_a - D^T A_m, and D^T = A_a A_m^+;
A_m^+ is the pseudo-inverse matrix of A_m, calculated according to formula (13):
A_m^+ = (λI + A_m^T A_m)^(-1) A_m^T  (13)
S3.4: solve the residual matrix R_a corresponding to the newly added training data using the residual formula according to formula (14):
R_a = Y_a0 - A_a ᵃW_m^(0)  (14)
wherein R_a = [r_1, r_2, ..., r_a]^T;
S4: calculate the weights of the newly added training data:
S4.1: obtain the residual probability density function f_a of the newly added training data with the kernel density estimation algorithm, of the same form as formula (6); this is formula (15);
S4.2: calculate all the weights of the newly added training data to form the weight matrix Θ_a: the weight of the i-th new sample is θ_ia = f_a(r_i), computed with formula (15), and all θ_ia form Θ_a = [θ_1a, ..., θ_aa]^T;
S5: update the robust width learning system: calculate the pseudo-inverse matrix (ᵃA_m)^+ corresponding to ᵃA_m according to formula (16), wherein B' is derived from formula (17) with C' = Θ_a A_a - D' Θ_a A_m; then solve the new connection weight matrix ᵃW_m^(k_a) of the k_a-th iteration according to formula (18), wherein k_a = 1, 2, 3, ...;
S6: if the maximum absolute difference between the new connection weight matrices of two adjacent steps exceeds the set threshold ε, i.e. max|ᵃW_m^(k_a) - ᵃW_m^(k_a-1)| > ε, compute a new residual matrix R_a by formula (14), return to S4, and recompute ᵃW_m and (ᵃA_m)^+, until max|ᵃW_m^(k_a) - ᵃW_m^(k_a-1)| ≤ ε or the number of iterations reaches the preset maximum; the iteration then ends, the new connection weight matrix ᵃW_m is obtained, and the model is updated; the output prediction of the new model is expressed by formula (19):
y_prediction = [x_new | H_m] ᵃW_m  (19)
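The data-increment update of S3-S5 rests on a block pseudo-inverse update for appended rows, in which C = A_a - DᵀA_m plays the central role (formula (12)). A sketch of the unweighted version of this update, checked against a direct pseudo-inverse; the robust weights Θ_a of formulas (16)-(18) are omitted, and the C = 0 branch is an assumption:

```python
import numpy as np

def pinv_append_rows(A, A_pinv, A_a):
    """Update A^+ when the rows A_a are appended below A (Greville-style).

    D^T = A_a A^+ and C = A_a - D^T A as in the text; when C = 0 the new
    rows lie in the row space of A (e.g. A has full column rank)."""
    D_T = A_a @ A_pinv                    # a x n
    C = A_a - D_T @ A                     # a x k
    if np.allclose(C, 0):
        a = A_a.shape[0]
        B = A_pinv @ D_T.T @ np.linalg.inv(np.eye(a) + D_T @ D_T.T)
    else:
        B = np.linalg.pinv(C)             # k x a
    return np.hstack([A_pinv - B @ D_T, B])
```

Only the small matrices C and B are (pseudo-)inverted, which is why the update is cheaper than recomputing the pseudo-inverse of the stacked matrix from scratch.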
Further, when a new group of enhancement nodes is added, the system is updated through the following steps:
step a: if the newly added group contains q enhancement nodes, the new enhancement nodes are expressed by formula (20):
H_{m+1} = ξ(X_0 W_{m+1} + β_{m+1})  (20)
wherein the weight matrix W_{m+1} and the bias matrix β_{m+1} are randomly generated by the system;
step b: after adding the enhancement nodes, the new extended input matrix is A_{m+1} = [A_m | H_{m+1}]; its pseudo-inverse matrix (A_{m+1})^+ is derived from formula (21),
wherein B' is derived from formula (22):
wherein C' = Θ H_{m+1} - Θ A_m D';
step c: solve the new connection weight matrix W_{m+1} after the enhancement nodes are added according to formula (23);
step d: the output prediction of the updated model is expressed by formula (24):
y_prediction = [x_new | H_{m+1}] W_{m+1}  (24)
wherein H_{m+1} = [H_1, ..., H_{m+1}];
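The node-increment update of steps a-d appends columns to the extended input matrix. A sketch of the unweighted column update and the corresponding weight update behind formulas (21)-(23), assuming W is the least-squares solution A⁺Y and omitting the robust weighting Θ of formula (22); the C = 0 branch is an assumption:

```python
import numpy as np

def pinv_append_cols(A, A_pinv, H, Y, W):
    """Update A^+ and the output weights when columns H are appended to A.

    D = A^+ H and C = H - A D; when C = 0 the new columns are linear
    combinations of the old ones.  Returns ([A|H]^+, new weights)."""
    D = A_pinv @ H                        # k x q
    C = H - A @ D                         # n x q
    if np.allclose(C, 0):
        q = D.shape[1]
        B_T = np.linalg.inv(np.eye(q) + D.T @ D) @ D.T @ A_pinv
    else:
        B_T = np.linalg.pinv(C)           # q x n
    A_new_pinv = np.vstack([A_pinv - D @ B_T, B_T])
    W_new = np.vstack([W - D @ (B_T @ Y), B_T @ Y])   # [A|H]^+ Y, incrementally
    return A_new_pinv, W_new
```

Because the old weights W are reused, adding enhancement nodes does not require retraining the whole network.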
The invention first obtains predicted values with the width learning system; then, using kernel density estimation, it computes a weight for each datum from the residual between the predicted and actual values, assigning larger weights to normal data and smaller weights to suspected outliers, so that the robust width learning system automatically learns the correct information in the training set; finally, it computes the system's connection weights with a weighted ridge regression algorithm, thereby establishing the robust width learning system model. The invention also provides two robust incremental learning algorithms, for adding training data and for adding enhancement nodes respectively, realizing rapid updates of the established robust width learning system model. Compared with the original width learning system, the method has clear advantages in prediction accuracy, robustness and generalization when outliers exist in the training data set: it improves the robustness of the width learning system and suppresses the adverse effect of outliers.
Drawings
FIG. 1 is a comparison graph of the fitting effect of the robust width learning system model and the width learning system model on the test data set;
FIG. 2 is a graph of the trend of the test root mean square error of the robust width learning system model and the width learning system model as the outlier content increases;
FIG. 3 is a graph of the trend of the root mean square error of the robust width learning system model and the width learning system model as training data are added;
FIG. 4 is a graph of the trend of the root mean square error of the robust width learning system model and the width learning system model as training data and enhancement nodes are added;
FIG. 5 is a comparison graph of the learning effects of the robust incremental learning methods for adding training data and adding enhancement nodes.
Detailed Description
The invention is further illustrated by the following examples and figures.
Example:
The embodiment is a modeling method for a large industrial multistage centrifugal compressor: a robust width learning system model is established to predict the compressor's output pressure ratio. The specific steps are as follows:
510 sets of operating data for a large industrial multistage centrifugal compressor were collected (the data come from a unit actually operating at a steel mill). The input data variables are inlet pressure, inlet temperature and inlet flow; the output data variable is the output pressure ratio. 400 of the samples are selected as the training set and 110 as the test set. To ensure the truth and validity of the test results, a certain number of outliers are added to the training set, as follows: the percentage of outliers in the total training data is set to γ. Of the outliers, 50% are high-leverage outliers (points whose input data are not within the normal range but whose corresponding output values are reasonable), generated by randomly adding or subtracting noise, not exceeding the maximum of the input data in the training set, on the input data of normal training samples; the remaining 50% are high-residual outliers (points whose output residuals are larger than those of other points), generated by randomly adding or subtracting noise, not exceeding the maximum of the output data in the training set, on the output data of normal training samples.
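The outlier-injection procedure described above can be sketched as follows; the noise scale and the clipping to the training-data range are illustrative readings of the description:

```python
import numpy as np

def inject_outliers(X, Y, gamma, seed=0):
    """Corrupt gamma percent of the samples: half become high-leverage points
    (noise on the inputs), half high-residual points (noise on the outputs).
    Noise stays within the training-set range, as in the description."""
    rng = np.random.default_rng(seed)
    X, Y = np.array(X, dtype=float), np.array(Y, dtype=float)
    lo_x, hi_x = X.min(axis=0), X.max(axis=0)
    lo_y, hi_y = Y.min(axis=0), Y.max(axis=0)
    n_out = int(round(len(X) * gamma / 100.0))
    idx = rng.choice(len(X), size=n_out, replace=False)
    half = n_out // 2
    for i in idx[:half]:      # high-leverage: perturb the inputs
        X[i] = np.clip(X[i] + rng.uniform(-1, 1, X.shape[1]) * (hi_x - lo_x), lo_x, hi_x)
    for i in idx[half:]:      # high-residual: perturb the outputs
        Y[i] = np.clip(Y[i] + rng.uniform(-1, 1, Y.shape[1]) * (hi_y - lo_y), lo_y, hi_y)
    return X, Y
```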
1. Establishing a robust width learning system model
Step 1.1: data preprocessing. Let the input data matrix of the training set containing γ outliers be X ∈ R^(N×M) and the output data matrix be Y ∈ R^(N×C), where N = 400 is the number of training samples, M = 3 and C = 1 are the numbers of input and output variables, respectively, and R denotes the field of real numbers. The training data are linearly transformed by formula (1), mapping the values into [-1, 1]. The input data matrix after linear transformation is denoted X_0 and the output data matrix Y_0.
Step 1.2: calculate the residual matrix R. Assume the robust width learning system has m = 11 groups of enhancement nodes with q = 20 nodes per group. First solve the system's extended input matrix A_m, formed by combining the input data matrix with the enhancement node matrix:
A_m = [X_0 | H_m]  (2)
wherein H_m = [H_1, ..., H_m] denotes the enhancement node matrix, with H_j = ξ(X_0 W_j + β_j); W_j and β_j respectively denote the weight matrix and bias matrix of the enhancement node group, randomly generated by the system from [-10, 10]; ξ(·) denotes the activation function, a nonlinear function on the enhancement nodes responsible for mapping their input to their output; this patent adopts the commonly used sigmoid function as the activation function, given by formula (3). Then solve the iteration's initial connection weight matrix W_m^(0) with the ridge regression algorithm of formula (4), where I denotes the identity matrix, λ = 2^(-8) is the regularization parameter, and the superscript 'T' denotes the matrix transpose. Finally solve the residual matrix R with the residual formula, wherein R = [r_1, r_2, ..., r_N]^T and r_i represents the residual of the i-th training sample.
Step 1.3: calculate the weight matrix of the training data. First obtain the residual probability density function with the kernel density estimation algorithm, formula (6), wherein σ̂ is the standard deviation of the residuals and k(·) is the kernel density function of formula (7). The weight of each training sample is then θ_i = f(r_i) from formula (6), i = 1, ..., N. The weights of all training samples form the weight matrix Θ = [θ_1, ..., θ_N]^T.
Step 1.4: establish the robust width learning system model. Using the extended input matrix A_m calculated in step 1.2 and the weight matrix Θ calculated in step 1.3, solve the connection weight matrix W_m^(k) of the k-th iteration through the weighted ridge regression algorithm, wherein k = 1, 2, 3, .... If the maximum absolute difference between the output weights of two adjacent steps exceeds the set threshold ε = 0.1, i.e. max|W_m^(k) - W_m^(k-1)| > ε (the function max(·) takes the maximum over the sequence), compute a new residual matrix R and return to step 1.3, until max|W_m^(k) - W_m^(k-1)| ≤ ε or the number of iterations reaches 30; the iteration then ends, the robust width learning system stops training, and the robust width learning system model is established, with the model's output prediction
y_prediction = [x_new | H_m] W_m
wherein H_m = [H_1, ..., H_m], x_new represents new input data and y_prediction represents the predicted output data.
The established model is verified with the test set, with the following results. First, with the outlier percentage γ = 20, the robust width learning system model is used to predict the test data; for comparison, the original width learning system model predicts the same test data. The predicted values and true values of the two models are shown in FIG. 1, from which it can be seen that the predicted values of the robust width learning system model are closer to the true values. Next, the prediction abilities of the two models are compared under different outlier contents, γ = 0, 5, 10, 15, 20, 25, 30, using the root mean square error (RMSE) as the criterion:
RMSE = sqrt((1/N) Σ_{i=1}^{N} (y_i - ȳ_i)²)
wherein y_i represents the predicted value of the output of the i-th test sample, ȳ_i represents the actual value of the output of the i-th test sample, and N represents the number of test data. FIG. 2 plots the root mean square error of the robust width learning system model and the width learning system model as the proportion of outliers in the total training data rises; the specific values are given in Table 1. It is clear from FIG. 2 and Table 1 that the prediction accuracy of the model established by the robust width learning system of the invention is significantly higher than that of the model established by the width learning system, indicating that the invention effectively improves the robustness and generalization of the width learning system.
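The RMSE criterion above as code:

```python
import numpy as np

def rmse(y_pred, y_true):
    """Root mean square error: sqrt of the mean squared prediction error."""
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    y_true = np.asarray(y_true, dtype=float).ravel()
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))
```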
Table 1 shows the root mean square errors of the robust width learning system model and of the width learning system model under different outlier contents.
2. Robust incremental learning algorithm for increasing training data
Step 2.1: data preprocessing. Let the input data matrix of the training set containing 20 outliers be X ∈ R^(N×M) and the output data matrix be Y ∈ R^(N×C) (N = 100, M = 3, C = 1). The training data are linearly transformed using formula (1), mapping the result values to [-1, 1]. The transformed input data matrix is denoted X_0 and the output data matrix Y_0.
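The preprocessing of step 2.1 can be sketched as follows. The exact form of formula (1) is not legible in this text, so the standard min-max map to [-1, 1] is assumed; the training minima and maxima are returned so that later data can be transformed consistently, as steps 2.5 and 3.5 require:

```python
import numpy as np

def linear_transform(data, d_min=None, d_max=None):
    """Map data column-wise to [-1, 1] (assumed form of formula (1)).
    Pass d_min/d_max from the training set to transform new data consistently."""
    data = np.asarray(data, dtype=float)
    d_min = data.min(axis=0) if d_min is None else d_min
    d_max = data.max(axis=0) if d_max is None else d_max
    return 2.0 * (data - d_min) / (d_max - d_min) - 1.0, d_min, d_max

X = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
X0, X_min, X_max = linear_transform(X)
print(X0.tolist())  # [[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]]
```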
Step 2.2: solve the residual matrix R. Assume the number of enhancement node groups of the robust width learning system is m (m = 3), each group containing q (q = 20) enhancement nodes. First, the extended input matrix A_m of the system is solved; it is formed by combining the input data matrix and the enhancement node matrix, as expressed in formula (2). Then the iterative initial connection weight matrix W_m^(0) is solved using formula (4). Finally, the residual matrix R is solved using formula (5), where R = [r_1, r_2, ..., r_N]^T.
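Step 2.2 can be sketched as follows (Python/NumPy). The sigmoid enhancement nodes and the regularization λ = 2^-8 follow the text; the random-weight range, shapes and names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def extended_input_matrix(X0, m=3, q=20):
    """A_m = [X0 | H_1 | ... | H_m] (formula (2)): m groups of q sigmoid
    enhancement nodes with randomly generated weights and biases.
    The (weight, bias) pairs are kept so new data can be extended identically."""
    blocks, params = [X0], []
    for _ in range(m):
        Wh = rng.uniform(-1, 1, size=(X0.shape[1], q))   # random weight matrix
        bh = rng.uniform(-1, 1, size=(1, q))             # random bias matrix
        blocks.append(sigmoid(X0 @ Wh + bh))
        params.append((Wh, bh))
    return np.hstack(blocks), params

def initial_weights(Am, Y0, lam=2**-8):
    """Ridge regression (formula (4)): W = (lam*I + A^T A)^-1 A^T Y."""
    n = Am.shape[1]
    return np.linalg.solve(lam * np.eye(n) + Am.T @ Am, Am.T @ Y0)

X0 = rng.uniform(-1, 1, size=(100, 3))   # N = 100 samples, M = 3 inputs
Y0 = rng.uniform(-1, 1, size=(100, 1))   # C = 1 output
Am, params = extended_input_matrix(X0)
W0 = initial_weights(Am, Y0)
R = Y0 - Am @ W0                         # residual matrix (formula (5))
print(Am.shape, W0.shape, R.shape)       # (100, 63) (63, 1) (100, 1)
```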
Step 2.3: calculate the weights of the training data. First, the residual probability density function is obtained using the kernel density estimation algorithm, as shown in formula (6). The weight θ_l of the lth training datum is then obtained from formula (6) as θ_l = f(r_l), l = 1, ..., N. The weights of all the training data form the weight matrix θ = [θ_1, ..., θ_N]^T.
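The kernel density weighting of step 2.3 can be sketched as follows. The Gaussian kernel and Silverman bandwidth are assumptions, since the text only indicates a kernel density estimate built on the residuals and their standard deviation, but the effect is the one described: samples with large residuals (likely outliers) fall in low-density regions and receive small weights:

```python
import numpy as np

def kde_weights(residuals, h=None):
    """theta_l = f(r_l), with f the kernel density estimate of the residuals
    (formula (6)); Gaussian kernel and Silverman bandwidth are assumptions."""
    r = np.asarray(residuals, dtype=float).ravel()
    N = r.size
    if h is None:
        h = 1.06 * r.std() * N ** (-1 / 5)      # Silverman's rule (assumption)
    u = (r[:, None] - r[None, :]) / h           # pairwise (r_l - r_i) / h
    K = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return K.sum(axis=1) / (N * h)

rng = np.random.default_rng(1)
r = np.concatenate([rng.normal(0.0, 0.1, 95),      # well-fit samples
                    [3.0, -3.0, 4.0, -4.0, 5.0]])  # 5 gross outliers
theta = kde_weights(r)
print(theta[:95].mean() > 10 * theta[95:].mean())  # outliers weighted far less: True
```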
Step 2.4: establish the robust width learning system model. Using the extended input matrix A_m calculated in step 2.2 and the weight matrix θ obtained in step 2.3, the connection weight matrix W_m^(k) of the kth iteration is solved through formula (8), where k = 1, 2, 3, ..., 30, and the pseudo-inverse matrix A_m^+ of A_m can be determined by the corresponding formula, where λ = 2^-8.
If the maximum of the absolute values of the differences between the connection weight matrices of two adjacent iterations is greater than the set threshold ε (ε = 0.1), i.e. max(|W_m^(k) - W_m^(k-1)|) > ε, then the new residual matrix R = Y_0 - A_m W_m^(k) is calculated, the procedure returns to step 2.3, and a new connection weight matrix W_m is calculated, until max(|W_m^(k) - W_m^(k-1)|) ≤ ε or the number of iterations reaches 30. The iteration then ends, the robust width learning system stops training the model, and the robust width learning system model is established; the output prediction expression of the model is as follows
wherein H_m = [H_1, ..., H_m].
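Steps 2.2 to 2.4 together form an iteratively reweighted ridge regression. The sketch below assumes the weighted solve in formula (8) takes the standard form W = (λI + A^T Θ A)^-1 A^T Θ Y with Θ = diag(θ), which is consistent with, but not explicitly legible in, this text:

```python
import numpy as np

def kde_weights(residuals):
    """Gaussian KDE weights for the residuals (see step 2.3; bandwidth assumed)."""
    r = np.asarray(residuals, dtype=float).ravel()
    N = r.size
    h = 1.06 * r.std() * N ** (-1 / 5) + 1e-12
    u = (r[:, None] - r[None, :]) / h
    return (np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)).sum(axis=1) / (N * h)

def robust_train(Am, Y0, lam=2**-8, eps=0.1, max_iter=30):
    """Alternate (residuals -> KDE weights -> weighted ridge solve) until the
    connection weights change by at most eps, or 30 iterations (steps 2.2-2.4).
    The diagonal weighting Theta = diag(theta) is an assumption."""
    n = Am.shape[1]
    W = np.linalg.solve(lam * np.eye(n) + Am.T @ Am, Am.T @ Y0)   # formula (4)
    for _ in range(max_iter):
        theta = kde_weights(Y0 - Am @ W)            # weights from residuals
        AtTheta = Am.T * theta                      # A^T Theta, Theta diagonal
        W_new = np.linalg.solve(lam * np.eye(n) + AtTheta @ Am, AtTheta @ Y0)
        if np.max(np.abs(W_new - W)) <= eps:        # convergence criterion
            return W_new
        W = W_new
    return W

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(100, 3))
Y = X @ np.array([[1.0], [-2.0], [0.5]]) + 0.01 * rng.normal(size=(100, 1))
Y[:20] += 5.0                                       # 20 outliers in the outputs
W = robust_train(X, Y)
print(W.shape)  # (3, 1)
```

Here the raw input matrix stands in for the extended matrix A_m to keep the sketch short; in the full algorithm the matrix from step 2.2 would be passed in.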
Step 2.5: preprocess the new training data. Let the input data matrix of the newly added training data be X_a and the output data matrix be Y_a, where a = 30. The new training data are linearly transformed using formula (1), where X_max, X_min, Y_max and Y_min are the maximum and minimum values determined in step 2.1. For simplicity, the linearly transformed input data matrix of the newly added training data is denoted X_a0 and the output data matrix Y_a0.
Step 2.6: calculate the residual matrix R_a corresponding to the newly added training data. First, solve the extended input matrix A_a corresponding to the newly added training data:
wherein the enhancement-node weight and bias matrices are the values generated in step 2.2. The new extended input matrix of the system is ᵃA_m, and the iterative initial new connection weight matrix ᵃW_m^(0) can be obtained from the following formula, with C = A_a - D^T A_m. Finally, the residual matrix R_a corresponding to the newly added training data is solved using the residual formula, R_a = [r_1, r_2, ..., r_a]^T.
Step 2.7: solve the weights of the newly added training data. The residual probability density function of the newly added training data is obtained using the kernel density estimation algorithm, with the following function formula:
wherein the weight θ_l of the lth newly added training datum can be obtained from formula (16) as θ_l = f_a(r_l), l = 1, ..., a; the weights of all the newly added training data form the weight matrix θ_a = [θ_1, ..., θ_a]^T.
Step 2.8: update the robust width learning system model. Using the results obtained in step 2.4, the matrix A_a obtained in step 2.6 and the weight matrix θ_a determined in step 2.7, the pseudo-inverse matrix (ᵃA_m)^+ corresponding to ᵃA_m is calculated by the following formula
Wherein
wherein C' = θ_a A_a - D' θ_a A_m. Then the new connection weight matrix ᵃW_m^(k_a) of the k_a-th iteration (k_a ≤ 30) can be calculated by
If the maximum of the absolute values of the differences between the new connection weight matrices of two adjacent iterations is greater than the set threshold ε (ε = 0.1), i.e. max(|ᵃW_m^(k_a) - ᵃW_m^(k_a-1)|) > ε, then the new residual matrix R_a is calculated by the residual formula and the procedure returns to step 2.7, calculating ᵃW_m and (ᵃA_m)^+ in turn, until max(|ᵃW_m^(k_a) - ᵃW_m^(k_a-1)|) ≤ ε or the number of iterations reaches 30. The iteration then ends, the new connection weight matrix ᵃW_m is obtained, and the model is updated. The expression of the new model's output prediction is as follows
Step 2.9: verify the model after each update using the test set and calculate the root mean square error of the model's predicted values. Return to step 2.5 and continue adding training data, 30 groups at a time, updating the model in the above order of steps until all 400 groups of training data have been added; then stop updating the model.
The results are as follows: fig. 3 compares the root mean square errors of the predicted values of the two models after the robust width learning system model and the width learning system model are updated with their respective incremental learning algorithms. It can be clearly seen from the figure that the prediction accuracy of the robust width learning system model of the present invention after each update is far higher than that of the width learning system model, which shows that the proposed robust incremental learning algorithm for added data is effective.
3. Robust incremental learning algorithm for adding enhanced nodes
After newly added training data enter the model, the model's ability to learn from the new data can be improved by adding an appropriate number of enhancement nodes. In the verification experiment, the robust incremental learning algorithms for adding training data and for adding enhancement nodes are combined: enhancement nodes are added at the same time as the training data to improve the prediction accuracy of the model. The specific implementation steps are as follows:
Step 3.1: data preprocessing. Let the input data matrix of the training set containing 20 outliers be X ∈ R^(N×M) and the output data matrix be Y ∈ R^(N×C) (N = 100, M = 3, C = 1). The training data are linearly transformed using formula (1), mapping the result values to [-1, 1]. The transformed input data matrix is denoted X_0 and the output data matrix Y_0.
Step 3.2: calculate the residual matrix R. Assume the number of enhancement node groups of the robust width learning system is m (m = 1), each group containing q (q = 20) enhancement nodes. First, the extended input matrix A_m of the system is solved; it is formed by combining the input data matrix and the enhancement node matrix, as expressed in formula (2). Then the iterative initial connection weight matrix W_m^(0) is solved using formula (4). Finally, the residual matrix R is solved using formula (5), where R = [r_1, r_2, ..., r_N]^T.
Step 3.3: calculate the weight matrix of the training data. First, the residual probability density function is obtained using the kernel density estimation algorithm, as shown in formula (6); the weight θ_l of the lth training datum is obtained from formula (6) as θ_l = f(r_l), l = 1, ..., N. The weights of all the training data form the weight matrix θ = [θ_1, ..., θ_N]^T.
Step 3.4: establish the robust width learning system model. Using the extended input matrix A_m calculated in step 3.2 and the weight matrix θ obtained in step 3.3, the connection weight matrix W_m^(k) of the kth iteration is solved through formula (8), where k = 1, 2, 3, ..., 30, and the pseudo-inverse matrix A_m^+ of A_m can be determined by the corresponding formula, where λ = 2^-8.
If the maximum of the absolute values of the differences between the connection weight matrices of two adjacent iterations is greater than the set threshold ε (ε = 0.1), i.e. max(|W_m^(k) - W_m^(k-1)|) > ε, then the new residual matrix R = Y_0 - A_m W_m^(k) is calculated, the procedure returns to step 3.3, and a new connection weight matrix W_m is calculated, until max(|W_m^(k) - W_m^(k-1)|) ≤ ε or the number of iterations reaches 30. The iteration then ends, the robust width learning system stops training the model, and the robust width learning system model is established; the output prediction expression of the model is as follows
wherein H_m = [H_1, ..., H_m].
Step 3.5: preprocess the new training data. Let the input data matrix of the newly added training data be X_a and the output data matrix be Y_a, where a = 30. The new training data are linearly transformed using formula (1), where X_max, X_min, Y_max and Y_min are the maximum and minimum values determined in step 3.1. For simplicity, the linearly transformed input data matrix of the newly added training data is denoted X_a0 and the output data matrix Y_a0.
Step 3.6: calculate the residual matrix R_a corresponding to the newly added training data. First, solve the extended input matrix A_a corresponding to the newly added training data:
wherein the enhancement-node weight and bias matrices are the values generated in step 3.2. The new extended input matrix of the system is ᵃA_m, and the iterative initial new connection weight matrix ᵃW_m^(0) can be obtained from the following formula, with C = A_a - D^T A_m. Finally, the residual matrix R_a corresponding to the newly added training data is solved using the residual formula, R_a = [r_1, r_2, ..., r_a]^T.
Step 3.7: solve the weights of the newly added training data. The residual probability density function of the newly added training data is obtained using the kernel density estimation algorithm, with the following function formula:
wherein the weight θ_l of the lth newly added training datum can be obtained from formula (16) as θ_l = f_a(r_l), l = 1, ..., a; the weights of all the newly added training data form the weight matrix θ_a = [θ_1, ..., θ_a]^T.
Step 3.8: update the robust width learning system model. Using the results obtained in step 3.4, the matrix A_a obtained in step 3.6 and the weight matrix θ_a determined in step 3.7, the pseudo-inverse matrix (ᵃA_m)^+ corresponding to ᵃA_m is calculated by the following formula
Wherein
wherein C' = θ_a A_a - D' θ_a A_m. Then the new connection weight matrix ᵃW_m^(k_a) of the k_a-th iteration (k_a ≤ 30) can be calculated by
If the maximum of the absolute values of the differences between the new connection weight matrices of two adjacent iterations is greater than the set threshold ε (ε = 0.1), i.e. max(|ᵃW_m^(k_a) - ᵃW_m^(k_a-1)|) > ε, then the new residual matrix R_a is calculated by the residual formula and the procedure returns to step 3.7, calculating ᵃW_m and (ᵃA_m)^+ in turn, until max(|ᵃW_m^(k_a) - ᵃW_m^(k_a-1)|) ≤ ε or the number of iterations reaches 30. The iteration then ends and the new connection weight matrix ᵃW_m is obtained.
Step 3.9: update the robust width learning system model after adding the enhancement nodes. Assuming the number of enhancement nodes in the newly added group is q (q = 20), the newly added enhancement nodes can be expressed as
wherein the corresponding weight and bias matrices are randomly generated from the interval [-10, 10]. Then, after adding a group of enhancement nodes, the new extended input matrix is ˣA_{m+1} = [ˣA_m | H_{m+1}], and the corresponding pseudo-inverse matrix (ˣA_{m+1})^+ can be derived from the following formula
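Step 3.9 updates the pseudo-inverse by appending columns rather than rows. Again the patent's formulas for this step are not legible here; the sketch below uses the standard block update for added columns (D = A^+ H_new, C = H_new - A D, B = C^+ when C has full column rank) and checks it against a direct recomputation. Random columns stand in for the new enhancement-node outputs:

```python
import numpy as np

def pinv_add_cols(A, A_pinv, H_new):
    """Pseudo-inverse of [A | H_new] from A^+ alone:
      D = A^+ H_new, C = H_new - A D (component of H_new outside col(A)),
      B = C^+,       ([A | H_new])^+ = [[A^+ - D B], [B]]
    Valid when C has full column rank, as for randomly generated nodes."""
    D = A_pinv @ H_new
    C = H_new - A @ D
    B = np.linalg.pinv(C)
    return np.vstack([A_pinv - D @ B, B])

rng = np.random.default_rng(4)
A = rng.normal(size=(100, 23))       # current extended input matrix
H_new = rng.normal(size=(100, 20))   # stand-in for a new enhancement-node group
incremental = pinv_add_cols(A, np.linalg.pinv(A), H_new)
direct = np.linalg.pinv(np.hstack([A, H_new]))
print(np.allclose(incremental, direct, atol=1e-8))  # True
```

In the unregularized case the corresponding block weight update is W_{m+1} = [W_m - D B Y_0 ; B Y_0], which is assumed here to match the structure of the patent's node-addition weight formula.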
Wherein
and C' = θ_c H_{m+1} - θ ˣA_m D'. Then the new connection weight matrix after adding the enhancement nodes is
Wherein
The prediction output expression of the robust width learning system model after adding the enhancement nodes is updated to
Wherein
Step 3.10: verify the model using the test set and calculate the root mean square error of the predicted values. Return to step 3.5, continue adding 30 new groups of training data and 20 enhancement nodes at a time, and update the model in turn until all 400 groups of training data have been added.
The results are as follows: fig. 4 compares the root mean square errors of the predicted values of the two models after each update, where the robust width learning system model and the width learning system model are updated by adding training data and enhancement nodes with their respective incremental learning algorithms. It can be clearly seen from the figure that the prediction accuracy of the robust width learning system model of the present invention after each update is much higher than that of the width learning system model, which shows that the proposed robust incremental learning algorithm for added nodes is effective. Fig. 5 compares the root mean square errors of the model predictions when the robust width learning system model is updated by the robust incremental learning algorithm that adds both training data and enhancement nodes versus the algorithm that adds training data only. It can be seen from the figure that the prediction accuracy of the model updated by the algorithm that adds both training data and enhancement nodes is higher; adding enhancement nodes improves the learning capability of the model.

Claims (3)

1. A robust width learning system, comprising the following steps:
step 1: data preprocessing, which comprises the following steps:
step 1.1: collecting training data; let the input data matrix of the training data be X ∈ R^(N×M) and the output data matrix be Y ∈ R^(N×C);
Wherein N is the number of samples of training data;
m and C respectively correspond to the variable numbers of the input data and the output data;
R represents the real number domain;
step 1.2: performing linear transformation processing on the training data according to the transformation function in formula (1), mapping the result values to [-1, 1], the transformation function being:

X̄ = 2(X - X_min)/(X_max - X_min) - 1, Ȳ = 2(Y - Y_min)/(Y_max - Y_min) - 1 (1);
wherein X̄ and Ȳ represent the transformed data;
X and Y represent the data to be converted;
X_min and Y_min represent the minimum values in the data to be converted;
X_max and Y_max represent the maximum values in the data to be converted;
step 2: solving a residual matrix R, which comprises the following steps:
step 2.1: assuming the number of enhancement node groups of the robust width learning system is m, with q enhancement nodes in each group, solving the extended input matrix A_m according to formula (2); A_m is formed by combining the input data matrix and the enhancement node matrix, and its expression is

A_m = [X_0 | H_m] (2);
wherein X_0 represents the input data matrix formed by the transformed input data;
H_m = [H_1, ..., H_m] represents the enhancement node matrix, with H_i = ξ(X_0 W_{h_i} + β_{h_i}), i = 1, ..., m;
W_{h_i} and β_{h_i} respectively represent the weight matrix and the bias matrix of the ith enhancement node group, randomly generated by the system;
ξ(·) represents the activation function, a nonlinear function on the enhancement nodes responsible for mapping the input of an enhancement node to its output; the commonly used sigmoid function is adopted as the activation function, as shown in formula (3):

ξ(x) = 1/(1 + e^(-x)) (3);
step 2.2: solving the iterative initial connection weight matrix W_m^(0) using the ridge regression algorithm according to formula (4):

W_m^(0) = (λI + A_m^T A_m)^(-1) A_m^T Y_0 (4);
wherein I represents the identity matrix;
Y_0 represents the output data matrix formed by the transformed output data;
λ represents the regularization parameter;
the superscript T represents the transpose of a matrix;
step 2.3: solving the residual matrix R using the residual formula according to formula (5):

R = Y_0 - A_m W_m^(0) (5);

wherein R = [r_1, r_2, ..., r_N]^T;
r_i represents the residual of the ith training datum;
and step 3: calculating a weight matrix of the training data, comprising the following steps:
step 3.1: obtaining the residual probability density function using the kernel density estimation algorithm, the function formula being formula (6):

f(r) = (1/(N h)) Σ_{i=1}^{N} K((r - r_i)/h) (6);

wherein h is the kernel bandwidth, determined from the standard deviation of the residuals;
σ is the standard deviation of the residuals;
K(·) is the kernel density function, expressed as formula (7):

K(u) = (1/√(2π)) exp(-u²/2) (7);
step 3.2: calculating the weight matrix θ formed by all the training data; the weight of the lth training datum, θ_l = f(r_l), is calculated according to formula (6), and all θ_l form the weight matrix θ = [θ_1, ..., θ_N]^T;
step 4: establishing the robust width learning system model, the specific steps being as follows:
step 4.1: solving the connection weight matrix W_m^(k) of the kth iteration according to formula (8):

W_m^(k) = (λI + A_m^T Θ A_m)^(-1) A_m^T Θ Y_0 (8);

wherein k = 1, 2, 3, ..., and Θ = diag(θ_1, ..., θ_N) is the diagonal weight matrix formed from θ;
step 4.2: if the maximum of the absolute values of the differences between the output weights of two adjacent iterations is greater than the set threshold ε, i.e. max(|W_m^(k) - W_m^(k-1)|) > ε, calculating the new residual matrix R = Y_0 - A_m W_m^(k) and returning to step 3, until max(|W_m^(k) - W_m^(k-1)|) ≤ ε or the number of iterations reaches the preset maximum number; the iteration then ends, the robust width learning system stops training the model and the robust width learning system model is established, its output prediction expression, formula (9), being:

y_prediction = [x_new | H_m] W_m (9);
wherein x_new represents the new input data;
y_prediction represents the predicted output data;
H_m = [H_1, ..., H_m].
2. The robust width learning system as claimed in claim 1, wherein, when new training data are entered, the system is updated by the following steps:
s1: setting the input data matrix of new training data asThe output data matrix is
Wherein a represents the number of newly added training data;
s2: carrying out linear conversion processing on the new training data by using the formula (1) in the step 1.2, and obtaining an input data matrix X of the converted dataa0And the output data matrix Ya0
S3: calculating the residual matrix R_a corresponding to the newly added training data;
S3.1: solving the extended input matrix A_a corresponding to the newly added training data according to formula (10):

A_a = [X_a0 | ξ(X_a0 W_{h_1} + β_{h_1}) | ... | ξ(X_a0 W_{h_m} + β_{h_m})] (10);
S3.2: obtaining the new extended input matrix ᵃA_m of the system by stacking A_a below A_m, i.e. ᵃA_m = [A_m; A_a];
S3.3: solving the iterative initial new connection weight matrix ᵃW_m^(0) according to formula (11):

ᵃW_m^(0) = W_m + B(Y_a0 - A_a W_m) (11);
Wherein B is derived from equation (12):
wherein C = A_a - D^T A_m;
A_m^+ is the pseudo-inverse matrix of A_m, calculated according to formula (13):

A_m^+ = (λI + A_m^T A_m)^(-1) A_m^T (13);
s3.4: solving a residual error matrix R corresponding to newly added training data by using a residual error formula according to the formula (14)a
Wherein R isa=[r1,r2,...,ra]T
S4: calculating the weights of the newly added training data;
S4.1: obtaining the residual probability density function of the newly added training data using the kernel density estimation algorithm, the function formula being:

f_a(r) = (1/(a h_a)) Σ_{i=1}^{a} K((r - r_i)/h_a) (15);
s4.2: calculating all weights of newly added training data to form a weight matrix thetaaWeight θ of the I-th training datala=fa(ri) Calculating θ according to the formula (15)laAll of thetalaForm a weight matrix thetaa,θa=[θ1a,...,θaa]T
S5: updating the robust width learning system, calculating the pseudo-inverse matrix (ᵃA_m)^+ corresponding to ᵃA_m according to formula (16):
Wherein,
b' is derived from equation (17):
wherein C' = θ_a A_a - D' θ_a A_m;
solving the new connection weight matrix ᵃW_m^(k_a) of the k_a-th iteration according to formula (18):
wherein k_a = 1, 2, 3, ...;
S6: if the maximum of the absolute values of the differences between the new connection weight matrices of two adjacent iterations is greater than the set threshold ε, i.e. max(|ᵃW_m^(k_a) - ᵃW_m^(k_a-1)|) > ε, calculating the new residual matrix R_a by the residual formula and returning to S4 to calculate ᵃW_m and (ᵃA_m)^+, until max(|ᵃW_m^(k_a) - ᵃW_m^(k_a-1)|) ≤ ε or the number of iterations reaches the preset maximum number; the iteration then ends, the new connection weight matrix ᵃW_m is obtained and the model is updated; the output prediction expression of the new model, formula (19), is:

y_prediction = [x_new | H_m] ᵃW_m (19)
3. The robust width learning system as claimed in claim 1, wherein, when a new group of enhancement nodes is added, the system is updated by the following steps:
step a: if the number of the newly added group of enhancement nodes is q, the newly added enhancement nodes are expressed by formula (20):

H_{m+1} = ξ(X_0 W_{h_{m+1}} + β_{h_{m+1}}) (20);
wherein,is generated at random by the system and is,
randomly generated by the system
step b: after adding the enhancement nodes, the new extended input matrix is A_{m+1} = [A_m | H_{m+1}]; the pseudo-inverse matrix (A_{m+1})^+ of A_{m+1} is derived from formula (21):
Wherein,
b' is derived from equation (22):
wherein C' = θ H_{m+1} - θ A_m D';
step c: solving the new connection weight matrix W_{m+1} after adding the enhancement nodes according to formula (23):
step d: the updated model output prediction expression, formula (24), is:
y_prediction = [x_new | H_{m+1}] W_{m+1} (24);
wherein H_{m+1} = [H_1, ..., H_{m+1}].
CN201811362948.6A 2018-09-29 2018-11-16 A kind of robust width learning system Pending CN109635245A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811153248 2018-09-29
CN2018111532486 2018-09-29

Publications (1)

Publication Number Publication Date
CN109635245A true CN109635245A (en) 2019-04-16

Family

ID=66068177

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811362948.6A Pending CN109635245A (en) 2018-09-29 2018-11-16 A kind of robust width learning system

Country Status (1)

Country Link
CN (1) CN109635245A (en)


Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110222453A (en) * 2019-06-14 2019-09-10 中国矿业大学 A kind of compressor outlet parameter prediction modeling method based on width learning system
CN110472741A (en) * 2019-06-27 2019-11-19 广东工业大学 A kind of small wave width study filtering system of three-domain fuzzy and method
CN110322969A (en) * 2019-07-03 2019-10-11 北京工业大学 A kind of fMRI data classification method based on width study
CN110334775A (en) * 2019-07-12 2019-10-15 广东工业大学 A kind of recognition methods of unmanned plane line fault and device based on width study
CN110570019A (en) * 2019-08-14 2019-12-13 中国地质大学(武汉) Sintering process comprehensive coke ratio time sequence prediction method based on width learning
CN111598236A (en) * 2020-05-20 2020-08-28 中国矿业大学 Width learning system network model compression method
WO2022134268A1 (en) * 2020-12-21 2022-06-30 华南理工大学 Incremental stacked broad learning system having depth structure
CN112508192B (en) * 2020-12-21 2022-04-22 华南理工大学 Increment heap width learning system with degree of depth structure
CN112508192A (en) * 2020-12-21 2021-03-16 华南理工大学 Increment heap width learning system with degree of depth structure
CN113709782A (en) * 2021-07-30 2021-11-26 南昌航空大学 Link quality assessment method adopting lamination width learning
CN113709782B (en) * 2021-07-30 2022-05-31 南昌航空大学 Link quality assessment method adopting lamination width learning
CN114200936A (en) * 2021-12-06 2022-03-18 广东工业大学 AGV real-time path planning method based on optimal control and width learning
CN114528764A (en) * 2022-02-18 2022-05-24 清华大学 Soft measurement modeling method and device based on integral optimization and instant learning
CN114528764B (en) * 2022-02-18 2024-09-10 清华大学 Soft measurement modeling method and device based on integral optimization and instant learning
CN115392543A (en) * 2022-07-29 2022-11-25 广东工业大学 Injection product quality prediction method combining L21 norm and residual cascade width learning
CN115392543B (en) * 2022-07-29 2023-11-24 广东工业大学 Injection product quality prediction method combining L21 norm and residual cascade width learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190416