CN112116088A - Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes - Google Patents

Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes

Info

Publication number
CN112116088A
Authority
CN
China
Prior art keywords
module
learning
hidden layer
samples
nodes
Prior art date
Legal status
Withdrawn
Application number
CN202010857885.2A
Other languages
Chinese (zh)
Inventor
卢诚波
梅颖
高源
Current Assignee
Lishui University
Original Assignee
Lishui University
Priority date
Filing date
Publication date
Application filed by Lishui University
Priority to CN202010857885.2A
Publication of CN112116088A
Legal status: Withdrawn (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N3/088 Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to the technical field of machine learning, in particular to an incremental semi-supervised over-limit learning machine system for adaptively determining the number of hidden layer nodes. The over-limit learning platform comprises a feedforward neural network unit, a semi-supervised learning unit, an over-limit learning unit and an incremental learning unit; the semi-supervised learning unit is used for carrying out pattern recognition by combining unlabeled samples with labeled samples; the over-limit learning unit is used for constructing a semi-supervised learning system; the incremental learning unit is used for adding hidden layer nodes and determining the number of hidden layer nodes. In the invention, through the provided incremental learning unit, the semi-supervised over-limit learning machine can add hidden layer nodes one by one or in batches and adaptively determine the number of hidden layer nodes.

Description

Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes
Technical Field
The invention relates to the technical field of machine learning, in particular to an incremental semi-supervised over-limit learning machine system for adaptively determining the number of hidden layer nodes; the "over-limit learning machine" is also known as the extreme learning machine (ELM).
Background
In regression problems, a "label" generally refers to the output value of a data sample; in classification problems it refers to the class label of the sample. Most current learning algorithms are trained only with labeled samples. In practice, however, collected data usually contain both labeled and unlabeled samples, often with the unlabeled samples in the majority, and turning an unlabeled sample into a labeled one typically requires special equipment or tedious, very time-consuming manual annotation.
In general, semi-supervised learning can use unlabeled samples to assist the labeled samples during learning. However, the learning machine cannot adaptively determine a reasonable number of hidden layer nodes in the learning process, and when hidden layer nodes are added the outer weight matrix of the network has to be retrained, which lengthens training, prolongs the waiting time of the learning machine and reduces learning efficiency.
Disclosure of Invention
The invention aims to provide an incremental semi-supervised over-limit learning machine system that adaptively determines the number of hidden layer nodes, so as to solve the problems described in the background.
To achieve this aim, the invention provides an incremental semi-supervised over-limit learning machine system for adaptively determining the number of hidden layer nodes, comprising an over-limit learning platform, wherein the over-limit learning platform comprises a feedforward neural network unit, a semi-supervised learning unit, an over-limit learning unit and an incremental learning unit; the feedforward neural network unit is used for receiving and outputting the signals of all units and modules; the semi-supervised learning unit is used for carrying out pattern recognition by combining unlabeled samples with labeled samples; the over-limit learning unit is used for constructing a semi-supervised learning system; the incremental learning unit is used for adding hidden layer nodes and determining the number of hidden layer nodes;
the feedforward neural network unit comprises an input module, a hidden layer module and an output module; the input module is used for receiving the feature dimensions of the learning samples and transmitting them to the hidden layer module through neurons; the hidden layer module is used for processing the features through an excitation function and transmitting the calculation result to the output module through neurons; the output module is used for packaging and outputting the calculation result;
the semi-supervised learning unit comprises an induction module, a hypothesis module and an optimization module; the induction module is used for sorting the received learning samples into unlabeled samples and labeled samples; the hypothesis module is used for making assumptions about the unlabeled and labeled samples; the optimization module is used for optimizing the outer weight matrix;
the over-limit learning unit comprises an initial module and an algorithm module; the initial module is used for initially setting the unlabeled samples, the labeled samples and the outer weight matrix in the semi-supervised learning unit; the algorithm module is used for carrying out the algorithm calculation on the initially set unlabeled samples, labeled samples and outer weight matrix;
the incremental learning unit comprises a learning module and a dynamic adjustment module; the learning module is used for progressively learning the updated knowledge and correcting and reinforcing the previous knowledge; the dynamic adjustment module is used for enabling the weight vectors of the neurons in the learning module and the topology of the network to be adjusted dynamically as input learning data arrive.
As a further improvement of the technical solution, the input module, the hidden layer module and the output module form adjacent layers, and nodes of the adjacent layers are fully connected by connection weights.
As a further improvement of the technical solution, the excitation function in the hidden layer module is:

f(X) = Σ_{i=1}^{k} v_i · g(W_i · X + b_i)

where X = (x_1, x_2, x_3, …, x_n)^T is the n-dimensional input of the network; W_i = (w_{i1}, w_{i2}, w_{i3}, …, w_{in})^T is the connection weight vector between the input layer and the i-th hidden layer node, and b_i is the threshold of the i-th hidden layer node; v_i is the connection weight from the i-th hidden layer node to the output layer; g(·) is the hidden layer activation function; f(X) is the network output.
As a further improvement of the technical solution, the feedforward neural network unit adopts a single hidden layer feedforward network learning algorithm, as follows:

the hidden layer output of a training sample x is written as a row vector:

h(x) = [G(a_1, b_1, x), G(a_2, b_2, x), …, G(a_k, b_k, x)];

where a_j, b_j (j = 1, 2, …, k) are the randomly given learning parameters of the j-th hidden layer node; k is the number of hidden layer nodes; G(·) is the excitation function;

given N training samples (x_i, t_i), x_i ∈ R^m, t_i ∈ R^n, the mathematical model of the over-limit learning machine is:

Hβ = T;

where H is the hidden layer output matrix, β is the outer weight matrix and T is the target matrix, with H = [h(x_1); h(x_2); …; h(x_N)] (an N×k matrix whose i-th row is h(x_i)) and T = [t_1^T; t_2^T; …; t_N^T];

the solution of this model is β = H†T, where H† is the Moore-Penrose generalized inverse of H;

to improve the generalization performance of the learner, the ridge-regression form of the model is:

Minimize: ||β||^2 + c||ξ||^2

Subject to: Hβ = T − ξ;

where c is a parameter, ||ξ||^2 is the empirical risk and ||β||^2 is the structural risk;

the solution of this model is:

β = (I_k/c + H^T H)^{-1} H^T T,

where c is a parameter and I_k is the identity matrix of order k.
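The ELM training just described (random inner weights, closed-form outer weights) is easy to illustrate. The following NumPy sketch is an editorial illustration only, not part of the patent: the sigmoid activation, the toy sine-fitting data and the function names (elm_train, elm_predict) are assumptions made for the example.

```python
import numpy as np

def elm_train(X, T, k, c=1.0, seed=0):
    """Basic regularized ELM: random inner weights/biases, closed-form outer weights.

    X: (N, m) inputs, T: (N, n) targets, k: number of hidden nodes, c: ridge parameter.
    Returns the random hidden parameters (a, b) and the outer weight matrix beta.
    """
    rng = np.random.default_rng(seed)
    a = rng.uniform(-1.0, 1.0, size=(X.shape[1], k))   # input-to-hidden weights
    b = rng.uniform(-1.0, 1.0, size=k)                  # hidden node biases
    H = 1.0 / (1.0 + np.exp(-(X @ a + b)))              # hidden layer output matrix
    beta = np.linalg.solve(np.eye(k) / c + H.T @ H, H.T @ T)   # (I_k/c + H^T H)^{-1} H^T T
    return a, b, beta

def elm_predict(X, a, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ a + b)))
    return H @ beta

# toy usage: fit a noisy sine curve
X = np.linspace(0, 2 * np.pi, 200).reshape(-1, 1)
T = np.sin(X) + 0.05 * np.random.default_rng(1).standard_normal(X.shape)
a, b, beta = elm_train(X, T, k=30, c=100.0)
print(np.mean((elm_predict(X, a, b, beta) - T) ** 2))   # small training MSE
```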
As a further improvement of the present technical solution, the hypothesis module holds two assumptions: the labeled sample set and the unlabeled sample set are drawn from the same marginal distribution, and if two samples from these sets are similar then their conditional probabilities are also similar.
As a further improvement of the present technical solution, the learning process of the semi-supervised learning unit includes the following steps:

S1, the induction module first sorts the received samples into l labeled samples {(x_i, t_i)}, i = 1, …, l, and u unlabeled samples {x_i}, i = l+1, …, l+u, with x_i ∈ R^m and t_i ∈ R^n;

S2, the hypothesis module makes the first assumption: the labeled sample set and the unlabeled sample set come from the same marginal distribution;

S3, the hypothesis module makes the second assumption: if two samples x_i and x_j are similar, then the conditional probabilities P(y | x_i) and P(y | x_j) are also similar;

S4, with these assumptions in place, let A = (a_ij)_{n×n} be the similarity matrix of the training samples, and determine the outer weight matrix β;

in general, for given samples, the Gaussian function

a_ij = exp(−||x_i − x_j||^2 / (2σ^2))  (σ is the kernel width parameter)

is used to compute a_ij; clearly, the closer x_i and x_j are, the larger a_ij is; let D be the diagonal matrix whose i-th diagonal element is d_ii = Σ_j a_ij; then L = D − A is called the Laplacian matrix.
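To make steps S1-S4 concrete, here is a short sketch of the similarity matrix and graph Laplacian construction; the Gaussian kernel width sigma, the toy data and the function name graph_laplacian are assumptions of this illustration, not prescribed by the patent.

```python
import numpy as np

def graph_laplacian(X, sigma=1.0):
    """Similarity a_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)), degree matrix D,
    and Laplacian L = D - A for the sample matrix X of shape (N, m)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-sq_dists / (2.0 * sigma ** 2))
    D = np.diag(A.sum(axis=1))
    return D - A

X = np.vstack([np.random.default_rng(0).standard_normal((5, 2)),
               np.random.default_rng(1).standard_normal((5, 2)) + 3.0])
L = graph_laplacian(X, sigma=1.5)
print(L.shape, np.allclose(L.sum(axis=1), 0.0))  # (10, 10) True; Laplacian rows sum to zero
```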
As a further improvement of the technical solution, the optimization problem of the optimization module is:

Minimize over β: ||β||^2 + ||C^{1/2}(Hβ − T̃)||^2 + λ Tr((Hβ)^T L (Hβ)),

where T̃ ∈ R^{(l+u)×n} is the training target matrix whose first l rows are t_1^T, …, t_l^T and whose remaining rows are 0; C is a diagonal matrix of order l+u whose first l diagonal elements equal the parameter c and whose remaining u diagonal elements are 0;

the solution of the optimization problem of the optimization module (123) is:

β = (I_k + H^T(C + λL)H)^{-1} H^T C T̃,

where I_k is the identity matrix of order k.
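A minimal sketch of the closed-form solution above, assuming the hidden layer output matrix H, the Laplacian L and the padded target matrix have already been assembled; the function name and default parameter values are illustrative assumptions.

```python
import numpy as np

def ss_elm_beta(H, Lap, T_pad, l, c=1.0, lam=0.1):
    """Semi-supervised ELM outer weights.

    H: (l+u, k) hidden outputs for the labeled then unlabeled samples,
    Lap: (l+u, l+u) graph Laplacian, T_pad: (l+u, n) targets whose last u rows are zero,
    l: number of labeled samples.
    Computes beta = (I_k + H^T (C + lam*Lap) H)^{-1} H^T C T_pad,
    where C is diagonal with first l entries c and the rest 0.
    """
    n_total, k = H.shape
    C_diag = np.zeros(n_total)
    C_diag[:l] = c
    HtC = H.T * C_diag                              # H^T @ diag(C) without forming diag(C)
    lhs = np.eye(k) + HtC @ H + lam * (H.T @ Lap @ H)
    return np.linalg.solve(lhs, HtC @ T_pad)
```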
As a further improvement of the technical scheme, the algorithm module adopts an ISS-ELM algorithm, which handles unsupervised clustering and embedding problems through manifold regularization.
As a further improvement of the technical scheme, the ISS-ELM algorithm proceeds as follows:

S1.1, for the given l labeled samples and u unlabeled samples, set the initial number of hidden layer nodes to k_0 and the initial hidden layer output matrix to H_0; considering the case l > k in the solution of the optimization problem of the optimization module, the initial outer weight matrix is

β_0 = (I_{k0} + H_0^T(C + λL)H_0)^{-1} H_0^T C T̃;

S1.2, after the initial setting is completed, when Δk_0 = k_1 − k_0 hidden layer nodes are added, the hidden layer output matrix becomes H_1 = [H_0, ΔH_0], and

I_{k1} + H_1^T(C + λL)H_1 = [ I_{k0} + H_0^T(C + λL)H_0 ,  H_0^T(C + λL)ΔH_0 ;  ΔH_0^T(C + λL)H_0 ,  I_{Δk0} + ΔH_0^T(C + λL)ΔH_0 ]    (1)

and the Schur complement of (1) with respect to its upper-left block is

P = I_{Δk0} + ΔH_0^T(C + λL)ΔH_0 − ΔH_0^T(C + λL)H_0 (I_{k0} + H_0^T(C + λL)H_0)^{-1} H_0^T(C + λL)ΔH_0;

S1.3, select a suitable parameter λ so that the matrix P is invertible; from the inverse formula for a 2×2 block matrix (writing (1) as [A, B; B^T, D] with P = D − B^T A^{-1} B),

[A, B; B^T, D]^{-1} = [ A^{-1} + A^{-1}B P^{-1} B^T A^{-1} ,  −A^{-1}B P^{-1} ;  −P^{-1}B^T A^{-1} ,  P^{-1} ];

S1.4, substituting the formula of S1.3 into the expression of S1.2 yields the updated outer weight matrix β_1, expressed through β_0 and the auxiliary matrices Q_0, R_0, U_0, V_0;

in particular, when λ = 0, i.e., there are no unlabeled samples, to avoid repeated calculation the quantities Q_0, R_0, U_0, V_0 in the formula of S1.4 can be computed in the order

P^{-1}(ΔH_0^T(C + λL)H_0) → P^{-1}(ΔH_0^T(C + λL)H_0)β_0 (= U_0) → …

(the remaining expressions for R_0, V_0 and Q_0 appear only as equation images in the source and are not reproduced).
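The incremental step above hinges on inverting the enlarged matrix in 2x2 block form via a Schur complement. The short check below verifies that identity numerically on a random symmetric positive definite matrix; the block sizes and everything else here are arbitrary choices for illustration, not part of the claimed algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
k0, dk = 6, 3                              # existing and newly added hidden nodes
M = rng.standard_normal((k0 + dk, k0 + dk))
M = M @ M.T + np.eye(k0 + dk)              # symmetric positive definite, hence invertible

A, B, D = M[:k0, :k0], M[:k0, k0:], M[k0:, k0:]
A_inv = np.linalg.inv(A)
P = D - B.T @ A_inv @ B                    # Schur complement of the upper-left block
P_inv = np.linalg.inv(P)

block_inv = np.block([[A_inv + A_inv @ B @ P_inv @ B.T @ A_inv, -A_inv @ B @ P_inv],
                      [-P_inv @ B.T @ A_inv, P_inv]])
print(np.allclose(block_inv, np.linalg.inv(M)))   # True: block formula equals the direct inverse
```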
As a further improvement of the present technical solution, the algorithm of the incremental learning unit is as follows:

(I) input and output stage:

S2.1, input the l labeled samples {(x_i, t_i)}, i = 1, …, l, and the u unlabeled samples {x_i}, i = l+1, …, l+u, with x_i ∈ R^m and t_i ∈ R^n, and output β;

(II) initial stage:

S2.2, for the given l labeled samples and u unlabeled samples, determine the initial number of hidden layer nodes k_0 and randomly give the learning parameters a_j and b_j of the j-th hidden layer node (j = 1, 2, …, k_0);

S2.3, compute the initial hidden layer output matrix H_0;

S2.4, compute the initial outer weight matrix β_0 from the output matrix;

S2.5, compute the sample output error of the current network with respect to the training targets;

S2.6, set i = 0;

(III) hidden layer node growth stage:

S2.7, while the number of hidden layer nodes satisfies k_i ≤ k_max (k_max is the preset maximum number of hidden layer nodes) and the sample output error is still above the preset target, perform S2.8 to S2.11;

S2.8, i = i + 1;

S2.9, add Δk_{i−1} hidden layer nodes so that the total number of hidden layer nodes becomes k_i, randomly give the learning parameters a_j and b_j of the newly added hidden layer nodes (j = k_{i−1}+1, k_{i−1}+2, …, k_i), and form the corresponding hidden layer output matrix H_i = [H_{i−1}, ΔH_{i−1}];

S2.10, adjust the outer weight matrix β according to the incremental update of the ISS-ELM algorithm;

S2.11, return to the hidden layer node growth stage.
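Putting the stages together, a simplified growth loop could look like the sketch below. For brevity it treats only the supervised case (lambda = 0) and re-solves beta from scratch each round instead of applying the Schur-complement update, so it illustrates the stopping logic only; eps, k_max and every name here are assumptions of the illustration.

```python
import numpy as np

def grow_elm(X, T, k0=5, dk=5, k_max=100, eps=1e-3, c=100.0, seed=0):
    """Add dk hidden nodes per round until the output error drops below eps
    or k_max nodes are reached; beta is re-solved in full each round."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(-1, 1, size=(X.shape[1], k0))
    b = rng.uniform(-1, 1, size=k0)
    while True:
        H = 1.0 / (1.0 + np.exp(-(X @ a + b)))
        beta = np.linalg.solve(np.eye(H.shape[1]) / c + H.T @ H, H.T @ T)
        err = np.linalg.norm(H @ beta - T)
        if err <= eps or H.shape[1] >= k_max:
            return a, b, beta, err
        a = np.hstack([a, rng.uniform(-1, 1, size=(X.shape[1], dk))])  # grow hidden layer
        b = np.hstack([b, rng.uniform(-1, 1, size=dk)])

X = np.linspace(0, 2 * np.pi, 200).reshape(-1, 1)
T = np.sin(X)
a, b, beta, err = grow_elm(X, T)
print(beta.shape[0], err)   # final hidden node count and residual error
```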
Compared with the prior art, the invention has the beneficial effect that, in the incremental semi-supervised over-limit learning machine system for adaptively determining the number of hidden layer nodes, the semi-supervised over-limit learning machine can, through the provided incremental learning unit, add hidden layer nodes one by one or in batches and adaptively determine the number of hidden layer nodes.
Drawings
FIG. 1 is an overall block diagram of embodiment 1;
FIG. 2 is a block diagram of a feedforward neural network unit according to embodiment 1;
FIG. 3 is a block diagram of a module of a semi-supervised learning unit of embodiment 1;
FIG. 4 is a block diagram of the over-limit learning unit of embodiment 1;
fig. 5 is a block diagram of the incremental learning unit of embodiment 1.
The various reference numbers in the figures mean:
100. an over-limit learning platform;
110. a feedforward neural network unit; 111. an input module; 112. a hidden layer module; 113. an output module;
120. a semi-supervised learning unit; 121. an induction module; 122. a hypothesis module; 123. an optimization module;
130. an over-limit learning unit; 131. an initial module; 132. an algorithm module;
140. an incremental learning unit; 141. a learning module; 142. a dynamic adjustment module.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", and the like, indicate orientations and positional relationships based on those shown in the drawings, and are used only for convenience of description and simplicity of description, and do not indicate or imply that the equipment or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be considered as limiting the present invention.
Example 1
The invention provides an incremental semi-supervised over-limit learning machine system for adaptively determining the number of hidden layer nodes; please refer to fig. 1-5. The system comprises an over-limit learning platform 100, which comprises a feedforward neural network unit 110, a semi-supervised learning unit 120, an over-limit learning unit 130 and an incremental learning unit 140. The feedforward neural network unit 110 receives and outputs the signals of all units and modules: each neuron receives signals from the neurons of the previous layer and generates an output signal for the next layer; the 0-th layer is the input layer, the last layer is called the output layer, the intermediate layers are called hidden layers, the layers interact through neurons, and the number of neurons in a hidden layer is the number of hidden layer nodes. The semi-supervised learning unit 120 performs pattern recognition by combining unlabeled samples with labeled samples. The over-limit learning unit 130 constructs the semi-supervised learning system. The incremental learning unit 140 adds hidden layer nodes and determines their number; in practical sensing applications the amount of data grows gradually, and when new data arrive the incremental learning unit 140 can adaptively modify the already trained system to learn the knowledge contained in the new data without reconstructing the whole knowledge base: only the changes caused by the newly added data are updated on the basis of the original knowledge base, so the time cost of modifying a trained system is usually lower than that of retraining a system from scratch;
the feedforward neural network unit 110 includes an input module 111, a hidden layer module 112 and an output module 113; the input module 111 receives the feature dimensions of the learning samples and transmits them to the hidden layer module 112 through neurons; the hidden layer module 112 processes the features through an excitation function and transmits the result to the output module 113 through neurons; the output module 113 packages and outputs the calculation result;
the feedforward neural network unit 110 is a single hidden layer feedforward network; the inner weights and hidden layer biases of the network are given randomly, and the outer weights are obtained by solving an optimization problem, which improves the learning speed and generalization performance of the network;
the semi-supervised learning unit 120 comprises an induction module 121, a hypothesis module 122 and an optimization module 123; the induction module 121 sorts the received learning samples into unlabeled samples and labeled samples; the hypothesis module 122 makes assumptions about the unlabeled and labeled samples; the optimization module 123 optimizes the outer weight matrix;
the over-limit learning unit 130 includes an initial module 131 and an algorithm module 132; the initial module 131 initially sets the unlabeled samples, the labeled samples and the outer weight matrix in the semi-supervised learning unit 120; the algorithm module 132 performs the algorithm calculation on the initially set unlabeled samples, labeled samples and outer weight matrix;
the incremental learning unit 140 includes a learning module 141 and a dynamic adjustment module 142; the learning module 141 learns the updated knowledge progressively and corrects and reinforces the previous knowledge, so that the system can adapt to newly arrived data without relearning all the data; the dynamic adjustment module 142 allows the weight vectors of the neurons in the learning module 141 and the topology of the network to be adjusted dynamically as input learning data arrive, so as to optimize the representation accuracy of the input data; in addition, by adding neurons at suitable times, the number of neurons can be determined adaptively to satisfy a given quantization error constraint, and the network can adapt to learning data it has not seen before without affecting the previous learning results.
In this embodiment, the input module 111, the hidden layer module 112, and the output module 113 form adjacent layers, and nodes of the adjacent layers are fully connected by connection weights.
Further, the excitation function in the hidden layer module 112 is:

f(X) = Σ_{i=1}^{k} v_i · g(W_i · X + b_i)

where X = (x_1, x_2, x_3, …, x_n)^T is the n-dimensional input of the network; W_i = (w_{i1}, w_{i2}, w_{i3}, …, w_{in})^T is the connection weight vector between the input layer and the i-th hidden layer node, and b_i is the threshold of the i-th hidden layer node; v_i is the connection weight from the i-th hidden layer node to the output layer; g(·) is the hidden layer activation function; f(X) is the network output.
Specifically, the feedforward neural network unit 110 adopts a single hidden layer feedforward network learning algorithm, as follows:

the hidden layer output of a training sample x is written as a row vector:

h(x) = [G(a_1, b_1, x), G(a_2, b_2, x), …, G(a_k, b_k, x)];

where a_j, b_j (j = 1, 2, …, k) are the randomly given learning parameters of the j-th hidden layer node; k is the number of hidden layer nodes; G(·) is the excitation function;

given N training samples (x_i, t_i), x_i ∈ R^m, t_i ∈ R^n, the mathematical model of the over-limit learning machine is:

Hβ = T;

where H is the hidden layer output matrix, β is the outer weight matrix and T is the target matrix, with H = [h(x_1); h(x_2); …; h(x_N)] and T = [t_1^T; t_2^T; …; t_N^T];

the solution of this model is β = H†T, where H† is the Moore-Penrose generalized inverse of H;

to improve the generalization performance of the learner, the ridge-regression form of the model is:

Minimize: ||β||^2 + c||ξ||^2

Subject to: Hβ = T − ξ;

where c is a parameter, ||ξ||^2 is the empirical risk and ||β||^2 is the structural risk;

the solution of this model is:

β = (I_k/c + H^T H)^{-1} H^T T,

where c is a parameter and I_k is the identity matrix of order k.
Furthermore, the hypothesis module 122 holds two assumptions: the labeled sample set and the unlabeled sample set are drawn from the same marginal distribution, and if two samples from these sets are similar then their conditional probabilities are also similar.
In addition, the learning process of the semi-supervised learning unit 120 includes the following steps:

S1, the induction module 121 first sorts the received samples into l labeled samples {(x_i, t_i)}, i = 1, …, l, and u unlabeled samples {x_i}, i = l+1, …, l+u, with x_i ∈ R^m and t_i ∈ R^n;

S2, the hypothesis module 122 makes the first assumption: the labeled sample set and the unlabeled sample set come from the same marginal distribution;

S3, the hypothesis module 122 makes the second assumption: if two samples x_i and x_j are similar, then the conditional probabilities P(y | x_i) and P(y | x_j) are also similar;

S4, with these assumptions in place, let A = (a_ij)_{n×n} be the similarity matrix of the training samples, and determine the outer weight matrix β;

in general, for given samples, the Gaussian function a_ij = exp(−||x_i − x_j||^2 / (2σ^2)) (σ is the kernel width parameter) is used to compute a_ij; clearly, the closer x_i and x_j are, the larger a_ij is; let D be the diagonal matrix whose i-th diagonal element is d_ii = Σ_j a_ij; then L = D − A is called the Laplacian matrix.
Further, the optimization problem of the optimization module 123 is:

Minimize over β: ||β||^2 + ||C^{1/2}(Hβ − T̃)||^2 + λ Tr((Hβ)^T L (Hβ)),

where T̃ ∈ R^{(l+u)×n} is the training target matrix whose first l rows are t_1^T, …, t_l^T and whose remaining rows are 0; C is a diagonal matrix of order l+u whose first l diagonal elements equal the parameter c and whose remaining u diagonal elements are 0;

the solution of the optimization problem of the optimization module 123 is:

β = (I_k + H^T(C + λL)H)^{-1} H^T C T̃,

where I_k is the identity matrix of order k.
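As a small complement to the formulas above, the sketch below assembles the padded target matrix and the diagonal of C from l labeled and u unlabeled samples; the function name and the toy call are assumptions made for illustration.

```python
import numpy as np

def build_targets(T_labeled, u, c=1.0):
    """T_labeled: (l, n) targets of the labeled samples.
    Returns the (l+u, n) padded target matrix whose last u rows are zero,
    and the (l+u,) diagonal of C: first l entries c, remaining u entries 0."""
    l, n = T_labeled.shape
    T_pad = np.vstack([T_labeled, np.zeros((u, n))])
    C_diag = np.concatenate([np.full(l, c), np.zeros(u)])
    return T_pad, C_diag

T_pad, C_diag = build_targets(np.eye(3)[[0, 1, 2, 0, 1]], u=4, c=10.0)
print(T_pad.shape, C_diag)   # (9, 3) [10. 10. 10. 10. 10. 0. 0. 0. 0.]
```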
Specifically, the algorithm module 132 adopts an ISS-ELM algorithm, which handles unsupervised clustering and embedding problems through manifold regularization.
In addition, the ISS-ELM algorithm steps are as follows:

S1.1, for the given l labeled samples and u unlabeled samples, set the initial number of hidden layer nodes to k_0 and the initial hidden layer output matrix to H_0; considering the case l > k in the solution of the optimization problem of the optimization module (123), the initial outer weight matrix is

β_0 = (I_{k0} + H_0^T(C + λL)H_0)^{-1} H_0^T C T̃;

S1.2, after the initial setting is completed, when Δk_0 = k_1 − k_0 hidden layer nodes are added, the hidden layer output matrix becomes H_1 = [H_0, ΔH_0], and

I_{k1} + H_1^T(C + λL)H_1 = [ I_{k0} + H_0^T(C + λL)H_0 ,  H_0^T(C + λL)ΔH_0 ;  ΔH_0^T(C + λL)H_0 ,  I_{Δk0} + ΔH_0^T(C + λL)ΔH_0 ]    (1)

and the Schur complement of (1) with respect to its upper-left block is

P = I_{Δk0} + ΔH_0^T(C + λL)ΔH_0 − ΔH_0^T(C + λL)H_0 (I_{k0} + H_0^T(C + λL)H_0)^{-1} H_0^T(C + λL)ΔH_0;

S1.3, select a suitable parameter λ so that the matrix P is invertible; from the inverse formula for a 2×2 block matrix (writing (1) as [A, B; B^T, D] with P = D − B^T A^{-1} B),

[A, B; B^T, D]^{-1} = [ A^{-1} + A^{-1}B P^{-1} B^T A^{-1} ,  −A^{-1}B P^{-1} ;  −P^{-1}B^T A^{-1} ,  P^{-1} ];

S1.4, substituting the formula of S1.3 into the expression of S1.2 yields the updated outer weight matrix β_1, expressed through β_0 and the auxiliary matrices Q_0, R_0, U_0, V_0;

in particular, when λ = 0, i.e., there are no unlabeled samples, to avoid repeated calculation the quantities Q_0, R_0, U_0, V_0 in the formula of S1.4 can be computed in the order

P^{-1}(ΔH_0^T(C + λL)H_0) → P^{-1}(ΔH_0^T(C + λL)H_0)β_0 (= U_0) → …

(the remaining expressions for R_0, V_0 and Q_0 appear only as equation images in the source and are not reproduced).
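To show what the Schur-complement update buys in this setting, the sketch below solves the enlarged semi-supervised system both directly and through the 2x2 block inverse that reuses the inverse already computed for the smaller network, and checks that the two outer weight matrices coincide. All matrices are random stand-ins and every name is an assumption; this is an editorial illustration, not the patented implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
l, u, k0, dk, n, lam, c = 8, 12, 5, 3, 2, 0.1, 10.0
N = l + u

X = rng.standard_normal((N, 3))
A_sim = np.exp(-np.sum((X[:, None] - X[None]) ** 2, axis=-1) / 2.0)
Lap = np.diag(A_sim.sum(axis=1)) - A_sim         # graph Laplacian L = D - A

H0 = rng.standard_normal((N, k0))                # hidden outputs of the existing nodes
dH = rng.standard_normal((N, dk))                # hidden outputs of the added nodes
H1 = np.hstack([H0, dH])
C = np.diag(np.r_[np.full(l, c), np.zeros(u)])
T_pad = np.vstack([rng.standard_normal((l, n)), np.zeros((u, n))])

M = C + lam * Lap                                 # shorthand for (C + lambda * L)
A = np.eye(k0) + H0.T @ M @ H0                    # block already inverted for the smaller net
B = H0.T @ M @ dH
D = np.eye(dk) + dH.T @ M @ dH
A_inv = np.linalg.inv(A)
P = D - B.T @ A_inv @ B                           # Schur complement
P_inv = np.linalg.inv(P)
big_inv = np.block([[A_inv + A_inv @ B @ P_inv @ B.T @ A_inv, -A_inv @ B @ P_inv],
                    [-P_inv @ B.T @ A_inv, P_inv]])

beta_blockwise = big_inv @ (H1.T @ C @ T_pad)
beta_direct = np.linalg.solve(np.eye(k0 + dk) + H1.T @ M @ H1, H1.T @ C @ T_pad)
print(np.allclose(beta_blockwise, beta_direct))   # True
```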
In addition, the algorithm of the incremental learning unit 140 is as follows:

(I) input and output stage:

S2.1, input the l labeled samples {(x_i, t_i)}, i = 1, …, l, and the u unlabeled samples {x_i}, i = l+1, …, l+u, with x_i ∈ R^m and t_i ∈ R^n, and output β;

(II) initial stage:

S2.2, for the given l labeled samples and u unlabeled samples, determine the initial number of hidden layer nodes k_0 and randomly give the learning parameters a_j and b_j of the j-th hidden layer node (j = 1, 2, …, k_0);

S2.3, compute the initial hidden layer output matrix H_0;

S2.4, compute the initial outer weight matrix β_0 from the output matrix;

S2.5, compute the sample output error of the current network with respect to the training targets;

S2.6, set i = 0;

(III) hidden layer node growth stage:

S2.7, while the number of hidden layer nodes satisfies k_i ≤ k_max (k_max is the preset maximum number of hidden layer nodes) and the sample output error is still above the preset target, perform S2.8 to S2.11;

S2.8, i = i + 1;

S2.9, add Δk_{i−1} hidden layer nodes so that the total number of hidden layer nodes becomes k_i, randomly give the learning parameters a_j and b_j of the newly added hidden layer nodes (j = k_{i−1}+1, k_{i−1}+2, …, k_i), and form the corresponding hidden layer output matrix H_i = [H_{i−1}, ΔH_{i−1}];

S2.10, adjust the outer weight matrix β according to the incremental update of the ISS-ELM algorithm;

S2.11, return to the hidden layer node growth stage.
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, and the preferred embodiments of the present invention are described in the above embodiments and the description, and are not intended to limit the present invention. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (10)

1. An incremental semi-supervised over-limit learning machine system for adaptively determining the number of hidden nodes, comprising an over-limit learning platform (100), characterized in that: the over-limit learning platform (100) comprises a feedforward neural network unit (110), a semi-supervised learning unit (120), an over-limit learning unit (130) and an incremental learning unit (140); the feedforward neural network unit (110) is used for receiving and outputting the signals of all units and modules; the semi-supervised learning unit (120) is used for carrying out pattern recognition by combining unlabeled samples with labeled samples; the over-limit learning unit (130) is used for constructing a semi-supervised learning system; the incremental learning unit (140) is used for adding hidden layer nodes and determining the number of hidden layer nodes;

the feedforward neural network unit (110) comprises an input module (111), a hidden layer module (112) and an output module (113); the input module (111) is used for receiving the feature dimensions of the learning samples and transmitting them to the hidden layer module (112) through neurons; the hidden layer module (112) is used for processing the features through an excitation function and transmitting the calculation result to the output module (113) through neurons; the output module (113) is used for packaging and outputting the calculation result;

the semi-supervised learning unit (120) comprises an induction module (121), a hypothesis module (122) and an optimization module (123); the induction module (121) is used for sorting the received learning samples into unlabeled samples and labeled samples; the hypothesis module (122) is used for making assumptions about the unlabeled samples and the labeled samples; the optimization module (123) is used for optimizing the outer weight matrix;

the over-limit learning unit (130) comprises an initial module (131) and an algorithm module (132); the initial module (131) is used for initially setting the unlabeled samples, the labeled samples and the outer weight matrix in the semi-supervised learning unit (120); the algorithm module (132) is used for carrying out the algorithm calculation on the initially set unlabeled samples, labeled samples and outer weight matrix;

the incremental learning unit (140) comprises a learning module (141) and a dynamic adjustment module (142); the learning module (141) is used for progressively learning the updated knowledge and correcting and reinforcing the previous knowledge; the dynamic adjustment module (142) is used for enabling the weight vectors of the neurons in the learning module (141) and the topology of the network to be adjusted dynamically as input learning data arrive.
2. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 1, wherein: the input module (111), the hidden layer module (112) and the output module (113) form adjacent layers, and nodes of the adjacent layers are fully connected by connection weights.
3. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 2, wherein: the excitation function in the hidden layer module (112) is:

f(X) = Σ_{i=1}^{k} v_i · g(W_i · X + b_i)

where X = (x_1, x_2, x_3, …, x_n)^T is the n-dimensional input of the network; W_i = (w_{i1}, w_{i2}, w_{i3}, …, w_{in})^T is the connection weight vector between the input layer and the i-th hidden layer node, and b_i is the threshold of the i-th hidden layer node; v_i is the connection weight from the i-th hidden layer node to the output layer; g(·) is the hidden layer activation function; f(X) is the network output.
4. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 1, wherein: the feedforward neural network unit (110) adopts a single hidden layer feedforward network learning algorithm, as follows:

the hidden layer output of a training sample x is written as a row vector:

h(x) = [G(a_1, b_1, x), G(a_2, b_2, x), …, G(a_k, b_k, x)];

where a_j, b_j (j = 1, 2, …, k) are the randomly given learning parameters of the j-th hidden layer node; k is the number of hidden layer nodes; G(·) is the excitation function;

given N training samples (x_i, t_i), x_i ∈ R^m, t_i ∈ R^n, the mathematical model of the over-limit learning machine is:

Hβ = T;

where H is the hidden layer output matrix, β is the outer weight matrix and T is the target matrix, with H = [h(x_1); …; h(x_N)] and T = [t_1^T; …; t_N^T];

the solution of this model is β = H†T, where H† is the Moore-Penrose generalized inverse of H.
5. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 1, wherein: the hypothesis module (122) holds the assumption that the labeled sample set and the unlabeled sample set are drawn from the same marginal distribution, and the assumption that if samples from these sets are similar then their conditional probabilities are also similar.
6. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 1, wherein: the learning process of the semi-supervised learning unit (120) comprises the steps of:

S1, the induction module (121) first sorts the received samples into l labeled samples {(x_i, t_i)}, i = 1, …, l, and u unlabeled samples {x_i}, i = l+1, …, l+u, with x_i ∈ R^m and t_i ∈ R^n;

S2, the hypothesis module (122) makes the first assumption: the labeled sample set and the unlabeled sample set come from the same marginal distribution;

S3, the hypothesis module (122) makes the second assumption: if two samples x_i and x_j are similar, then the conditional probabilities P(y | x_i) and P(y | x_j) are also similar;

S4, let A = (a_ij)_{n×n} be the similarity matrix of the training samples, and determine the outer weight matrix β.
7. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 1, wherein: the optimization problem of the optimization module (123) is:

Minimize over β: ||β||^2 + ||C^{1/2}(Hβ − T̃)||^2 + λ Tr((Hβ)^T L (Hβ)),

where T̃ ∈ R^{(l+u)×n} is the training target matrix whose first l rows are t_1^T, …, t_l^T and whose remaining rows are 0; C is a diagonal matrix of order l+u whose first l diagonal elements equal the parameter c and whose remaining u diagonal elements are 0;

the solution of the optimization problem of the optimization module (123) is:

β = (I_k + H^T(C + λL)H)^{-1} H^T C T̃,

where I_k is the identity matrix of order k.
8. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 1, wherein: the algorithm module (132) employs an ISS-ELM algorithm that handles unsupervised clustering and embedding problems through manifold regularization.
9. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 8, wherein: the ISS-ELM algorithm comprises the following steps:

S1.1, for the given l labeled samples and u unlabeled samples, set the initial number of hidden layer nodes to k_0 and the initial hidden layer output matrix to H_0;

S1.2, after the initial setting is completed, when Δk_0 = k_1 − k_0 hidden layer nodes are added, the hidden layer output matrix becomes H_1 = [H_0, ΔH_0], and

I_{k1} + H_1^T(C + λL)H_1 = [ I_{k0} + H_0^T(C + λL)H_0 ,  H_0^T(C + λL)ΔH_0 ;  ΔH_0^T(C + λL)H_0 ,  I_{Δk0} + ΔH_0^T(C + λL)ΔH_0 ]    (1)

and the Schur complement of (1) with respect to its upper-left block is

P = I_{Δk0} + ΔH_0^T(C + λL)ΔH_0 − ΔH_0^T(C + λL)H_0 (I_{k0} + H_0^T(C + λL)H_0)^{-1} H_0^T(C + λL)ΔH_0;

S1.3, select a suitable parameter λ so that the matrix P is invertible; from the inverse formula for a 2×2 block matrix (writing (1) as [A, B; B^T, D] with P = D − B^T A^{-1} B),

[A, B; B^T, D]^{-1} = [ A^{-1} + A^{-1}B P^{-1} B^T A^{-1} ,  −A^{-1}B P^{-1} ;  −P^{-1}B^T A^{-1} ,  P^{-1} ];

S1.4, substituting the formula of S1.3 into the expression of S1.2 yields the updated outer weight matrix β_1.
10. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 1, wherein: the algorithm of the incremental learning unit (140) is as follows:

(I) input and output stage:

S2.1, input the l labeled samples {(x_i, t_i)}, i = 1, …, l, and the u unlabeled samples {x_i}, i = l+1, …, l+u, with x_i ∈ R^m and t_i ∈ R^n, and output β;

(II) initial stage:

S2.2, for the given l labeled samples and u unlabeled samples, determine the initial number of hidden layer nodes k_0 and randomly give the learning parameters a_j and b_j of the j-th hidden layer node (j = 1, 2, …, k_0);

S2.3, compute the initial hidden layer output matrix H_0;

S2.4, compute the initial outer weight matrix β_0 from the output matrix;

S2.5, compute the sample output error of the current network with respect to the training targets;

S2.6, set i = 0;

(III) hidden layer node growth stage:

S2.7, while the number of hidden layer nodes satisfies k_i ≤ k_max (k_max is the preset maximum number of hidden layer nodes) and the sample output error is still above the preset target, perform S2.8 to S2.11;

S2.8, i = i + 1;

S2.9, add Δk_{i−1} hidden layer nodes so that the total number of hidden layer nodes becomes k_i, randomly give the learning parameters a_j and b_j of the newly added hidden layer nodes (j = k_{i−1}+1, k_{i−1}+2, …, k_i), and form the corresponding hidden layer output matrix H_i = [H_{i−1}, ΔH_{i−1}];

S2.10, adjust the outer weight matrix β according to the incremental update formula;

S2.11, return to the hidden layer node growth stage.
Application CN202010857885.2A, priority date 2020-08-24, filing date 2020-08-24: Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes. Status: Withdrawn. Publication: CN112116088A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010857885.2A CN112116088A (en) 2020-08-24 2020-08-24 Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010857885.2A CN112116088A (en) 2020-08-24 2020-08-24 Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes

Publications (1)

Publication Number Publication Date
CN112116088A true CN112116088A (en) 2020-12-22

Family

ID=73805406

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010857885.2A Withdrawn CN112116088A (en) 2020-08-24 2020-08-24 Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes

Country Status (1)

Country Link
CN (1) CN112116088A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113077388A (en) * 2021-04-25 2021-07-06 中国人民解放军国防科技大学 Data-augmented deep semi-supervised over-limit learning image classification method and system
CN113485829A (en) * 2021-07-02 2021-10-08 深圳万顺叫车云信息技术有限公司 Identification value generation method for data increment step of microservice cluster


Similar Documents

Publication Publication Date Title
Song et al. A self-organizing neural tree for large-set pattern classification
JP3088171B2 (en) Self-organizing pattern classification system and classification method
Souza Regularized fuzzy neural networks for pattern classification problems
CN112116088A (en) Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes
CN112508192B (en) Increment heap width learning system with degree of depth structure
Amit et al. Perceptron learning with sign-constrained weights
WO2014060001A1 (en) Multitransmitter model of the neural network with an internal feedback
TWI787691B (en) Apparatus and method for neural network computation
CN108875933A (en) A kind of transfinite learning machine classification method and the system of unsupervised Sparse parameter study
CN114550847B (en) Medicine oral availability and toxicity prediction method based on graph convolution neural network
Eikens et al. Process identification with multiple neural network models
Lu et al. Real-Time stencil printing optimization using a hybrid multi-layer online sequential extreme learning and evolutionary search approach
Ozyildirim et al. Logarithmic learning for generalized classifier neural network
US5274744A (en) Neural network for performing a relaxation process
CN111898799B (en) BFA-Elman-based power load prediction method
Lee et al. A genetic algorithm based robust learning credit assignment cerebellar model articulation controller
Parvin et al. Divide & conquer classification and optimization by genetic algorithm
Wu et al. High-accuracy handwriting recognition based on improved CNN algorithm
Eom et al. Alpha-Integration Pooling for Convolutional Neural Networks
CN101339615B (en) Method of image segmentation based on similar matrix approximation
Aguilar et al. Recognition algorithm using evolutionary learning on the random neural networks
CN112215272A (en) Bezier curve-based image classification neural network attack method
Elshafei et al. Fuzzification using space-filling curves
Jain et al. Perceptron learning in the domain of graphs
Bundzel et al. Combining gradient and evolutionary approaches to the artificial neural networks training according to principles of support vector machines

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20201222