CN112116088A - Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes - Google Patents

Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes

Info

Publication number
CN112116088A
Authority
CN
China
Prior art keywords
module
learning
hidden layer
samples
nodes
Prior art date
Legal status
Withdrawn
Application number
CN202010857885.2A
Other languages
Chinese (zh)
Inventor
卢诚波
梅颖
高源
Current Assignee
Lishui University
Original Assignee
Lishui University
Priority date
Filing date
Publication date
Application filed by Lishui University
Priority to CN202010857885.2A
Publication of CN112116088A
Legal status: Withdrawn (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N3/088 Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to the technical field of machine learning, in particular to an incremental semi-supervised over-limit learning machine system for adaptively determining the number of hidden layer nodes. The over-limit learning platform comprises a feedforward neural network unit, a semi-supervised learning unit, an over-limit learning unit and an incremental learning unit; the semi-supervised learning unit is used for carrying out pattern recognition by combining unlabeled samples with labeled samples; the over-limit learning unit is used for constructing a semi-supervised learning system; the incremental learning unit is used for adding hidden layer nodes and determining the number of hidden layer nodes. In the invention, through the provided incremental learning unit, the semi-supervised over-limit learning machine can add hidden layer nodes one by one or in batches and adaptively determine the number of hidden layer nodes.

Description

Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes
Technical Field
The invention relates to the technical field of machine learning, in particular to an incremental semi-supervised over-limit learning machine system for adaptively determining the number of hidden layer nodes; the "over-limit learning machine" is also known as the extreme learning machine (ELM).
Background
In regression problems, a "label" generally refers to the output value of a data sample; in classification problems it refers to the class label of the sample. Most current learning algorithms are trained only with labeled samples. In practice, however, collected data usually contain both labeled and unlabeled samples, often with the unlabeled samples in the majority, and turning an unlabeled sample into a labeled one typically requires special equipment or tedious, very time-consuming manual annotation.
In general, semi-supervised learning can use unlabeled samples to assist the labeled samples during learning. However, the learning machine cannot adaptively determine a reasonable number of hidden layer nodes in the learning process, and when hidden layer nodes are added the outer weight matrix of the network has to be retrained, which lengthens training, prolongs the waiting time of the learning machine and reduces learning efficiency.
Disclosure of Invention
The invention aims to provide an incremental semi-supervised over-limit learning machine system that adaptively determines the number of hidden layer nodes, so as to solve the problems described in the background.
To achieve this aim, the invention provides an incremental semi-supervised over-limit learning machine system for adaptively determining the number of hidden layer nodes, comprising an over-limit learning platform, wherein the over-limit learning platform comprises a feedforward neural network unit, a semi-supervised learning unit, an over-limit learning unit and an incremental learning unit; the feedforward neural network unit is used for receiving and outputting the signals of all units and modules; the semi-supervised learning unit is used for carrying out pattern recognition by combining unlabeled samples with labeled samples; the over-limit learning unit is used for constructing a semi-supervised learning system; the incremental learning unit is used for adding hidden layer nodes and determining the number of hidden layer nodes;
the feedforward neural network unit comprises an input module, a hidden layer module and an output module; the input module is used for receiving the feature dimensions of the learning samples and transmitting them to the hidden layer module through neurons; the hidden layer module is used for processing the features through an excitation function and transmitting the calculation result to the output module through neurons; the output module is used for packaging and outputting the calculation result;
the semi-supervised learning unit comprises an induction module, a hypothesis module and an optimization module; the induction module is used for sorting the received learning samples into unlabeled samples and labeled samples; the hypothesis module is used for making assumptions about the unlabeled and labeled samples; the optimization module is used for optimizing the outer weight matrix;
the over-limit learning unit comprises an initial module and an algorithm module; the initial module is used for initially setting the unlabeled samples, the labeled samples and the outer weight matrix in the semi-supervised learning unit; the algorithm module is used for carrying out the algorithm calculation on the initially set unlabeled samples, labeled samples and outer weight matrix;
the incremental learning unit comprises a learning module and a dynamic adjustment module; the learning module is used for progressively learning the updated knowledge and correcting and reinforcing the previous knowledge; the dynamic adjustment module is used for enabling the weight vectors of the neurons in the learning module and the topology of the network to be adjusted dynamically as input learning data arrive.
As a further improvement of the technical solution, the input module, the hidden layer module and the output module form adjacent layers, and nodes of the adjacent layers are fully connected by connection weights.
As a further improvement of the technical solution, the excitation function in the hidden layer module is:

f(X) = Σ_{i=1}^{k} v_i · g(W_i · X + b_i)

where X = (x_1, x_2, x_3, …, x_n)^T is the n-dimensional input of the network; W_i = (w_{i1}, w_{i2}, w_{i3}, …, w_{in})^T is the connection weight vector between the input layer and the i-th hidden layer node, and b_i is the threshold of the i-th hidden layer node; v_i is the connection weight from the i-th hidden layer node to the output layer; g(·) is the hidden layer activation function; f(X) is the network output.
As a further improvement of the technical solution, the feedforward neural network unit adopts a single hidden layer feedforward network learning algorithm, as follows:

the hidden layer output of a training sample x is written as a row vector:

h(x) = [G(a_1, b_1, x), G(a_2, b_2, x), …, G(a_k, b_k, x)];

where a_j, b_j (j = 1, 2, …, k) are the randomly given learning parameters of the j-th hidden layer node; k is the number of hidden layer nodes; G(·) is the excitation function;

given N training samples (x_i, t_i), x_i ∈ R^m, t_i ∈ R^n, the mathematical model of the over-limit learning machine is:

Hβ = T;

where H is the hidden layer output matrix, β is the outer weight matrix and T is the target matrix, with H = [h(x_1); h(x_2); …; h(x_N)] (an N×k matrix whose i-th row is h(x_i)) and T = [t_1^T; t_2^T; …; t_N^T];

the solution of this model is β = H†T, where H† is the Moore-Penrose generalized inverse of H;

to improve the generalization performance of the learner, the ridge-regression form of the model is:

Minimize: ||β||^2 + c||ξ||^2

Subject to: Hβ = T − ξ;

where c is a parameter, ||ξ||^2 is the empirical risk and ||β||^2 is the structural risk;

the solution of this model is:

β = (I_k/c + H^T H)^{-1} H^T T,

where c is a parameter and I_k is the identity matrix of order k.
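The ELM training just described (random inner weights, closed-form outer weights) is easy to illustrate. The following NumPy sketch is an editorial illustration only, not part of the patent: the sigmoid activation, the toy sine-fitting data and the function names (elm_train, elm_predict) are assumptions made for the example.

```python
import numpy as np

def elm_train(X, T, k, c=1.0, seed=0):
    """Basic regularized ELM: random inner weights/biases, closed-form outer weights.

    X: (N, m) inputs, T: (N, n) targets, k: number of hidden nodes, c: ridge parameter.
    Returns the random hidden parameters (a, b) and the outer weight matrix beta.
    """
    rng = np.random.default_rng(seed)
    a = rng.uniform(-1.0, 1.0, size=(X.shape[1], k))   # input-to-hidden weights
    b = rng.uniform(-1.0, 1.0, size=k)                  # hidden node biases
    H = 1.0 / (1.0 + np.exp(-(X @ a + b)))              # hidden layer output matrix
    beta = np.linalg.solve(np.eye(k) / c + H.T @ H, H.T @ T)   # (I_k/c + H^T H)^{-1} H^T T
    return a, b, beta

def elm_predict(X, a, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ a + b)))
    return H @ beta

# toy usage: fit a noisy sine curve
X = np.linspace(0, 2 * np.pi, 200).reshape(-1, 1)
T = np.sin(X) + 0.05 * np.random.default_rng(1).standard_normal(X.shape)
a, b, beta = elm_train(X, T, k=30, c=100.0)
print(np.mean((elm_predict(X, a, b, beta) - T) ** 2))   # small training MSE
```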
As a further improvement of the present technical solution, the hypothesis module holds two assumptions: the labeled sample set and the unlabeled sample set are drawn from the same marginal distribution, and if two samples from these sets are similar then their conditional probabilities are also similar.
As a further improvement of the present technical solution, the learning process of the semi-supervised learning unit includes the following steps:

S1, the induction module first sorts the received samples into l labeled samples {(x_i, t_i)}, i = 1, …, l, and u unlabeled samples {x_i}, i = l+1, …, l+u, with x_i ∈ R^m and t_i ∈ R^n;

S2, the hypothesis module makes the first assumption: the labeled sample set and the unlabeled sample set come from the same marginal distribution;

S3, the hypothesis module makes the second assumption: if two samples x_i and x_j are similar, then the conditional probabilities P(y | x_i) and P(y | x_j) are also similar;

S4, with these assumptions in place, let A = (a_ij)_{n×n} be the similarity matrix of the training samples, and determine the outer weight matrix β;

in general, for given samples, the Gaussian function

a_ij = exp(−||x_i − x_j||^2 / (2σ^2))  (σ is the kernel width parameter)

is used to compute a_ij; clearly, the closer x_i and x_j are, the larger a_ij is; let D be the diagonal matrix whose i-th diagonal element is d_ii = Σ_j a_ij; then L = D − A is called the Laplacian matrix.
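To make steps S1-S4 concrete, here is a short sketch of the similarity matrix and graph Laplacian construction; the Gaussian kernel width sigma, the toy data and the function name graph_laplacian are assumptions of this illustration, not prescribed by the patent.

```python
import numpy as np

def graph_laplacian(X, sigma=1.0):
    """Similarity a_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)), degree matrix D,
    and Laplacian L = D - A for the sample matrix X of shape (N, m)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-sq_dists / (2.0 * sigma ** 2))
    D = np.diag(A.sum(axis=1))
    return D - A

X = np.vstack([np.random.default_rng(0).standard_normal((5, 2)),
               np.random.default_rng(1).standard_normal((5, 2)) + 3.0])
L = graph_laplacian(X, sigma=1.5)
print(L.shape, np.allclose(L.sum(axis=1), 0.0))  # (10, 10) True; Laplacian rows sum to zero
```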
As a further improvement of the technical solution, the optimization problem of the optimization module is:

Minimize over β: ||β||^2 + ||C^{1/2}(Hβ − T̃)||^2 + λ Tr((Hβ)^T L (Hβ)),

where T̃ ∈ R^{(l+u)×n} is the training target matrix whose first l rows are t_1^T, …, t_l^T and whose remaining rows are 0; C is a diagonal matrix of order l+u whose first l diagonal elements equal the parameter c and whose remaining u diagonal elements are 0;

the solution of the optimization problem of the optimization module (123) is:

β = (I_k + H^T(C + λL)H)^{-1} H^T C T̃,

where I_k is the identity matrix of order k.
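A minimal sketch of the closed-form solution above, assuming the hidden layer output matrix H, the Laplacian L and the padded target matrix have already been assembled; the function name and default parameter values are illustrative assumptions.

```python
import numpy as np

def ss_elm_beta(H, Lap, T_pad, l, c=1.0, lam=0.1):
    """Semi-supervised ELM outer weights.

    H: (l+u, k) hidden outputs for the labeled then unlabeled samples,
    Lap: (l+u, l+u) graph Laplacian, T_pad: (l+u, n) targets whose last u rows are zero,
    l: number of labeled samples.
    Computes beta = (I_k + H^T (C + lam*Lap) H)^{-1} H^T C T_pad,
    where C is diagonal with first l entries c and the rest 0.
    """
    n_total, k = H.shape
    C_diag = np.zeros(n_total)
    C_diag[:l] = c
    HtC = H.T * C_diag                              # H^T @ diag(C) without forming diag(C)
    lhs = np.eye(k) + HtC @ H + lam * (H.T @ Lap @ H)
    return np.linalg.solve(lhs, HtC @ T_pad)
```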
As a further improvement of the technical scheme, the algorithm module adopts an ISS-ELM algorithm, which handles unsupervised clustering and embedding problems through manifold regularization.
As a further improvement of the technical scheme, the ISS-ELM algorithm proceeds as follows:

S1.1, for the given l labeled samples and u unlabeled samples, set the initial number of hidden layer nodes to k_0 and the initial hidden layer output matrix to H_0; considering the case l > k in the solution of the optimization problem of the optimization module, the initial outer weight matrix is

β_0 = (I_{k0} + H_0^T(C + λL)H_0)^{-1} H_0^T C T̃;

S1.2, after the initial setting is completed, when Δk_0 = k_1 − k_0 hidden layer nodes are added, the hidden layer output matrix becomes H_1 = [H_0, ΔH_0], and

I_{k1} + H_1^T(C + λL)H_1 = [ I_{k0} + H_0^T(C + λL)H_0 ,  H_0^T(C + λL)ΔH_0 ;  ΔH_0^T(C + λL)H_0 ,  I_{Δk0} + ΔH_0^T(C + λL)ΔH_0 ]    (1)

and the Schur complement of (1) with respect to its upper-left block is

P = I_{Δk0} + ΔH_0^T(C + λL)ΔH_0 − ΔH_0^T(C + λL)H_0 (I_{k0} + H_0^T(C + λL)H_0)^{-1} H_0^T(C + λL)ΔH_0;

S1.3, select a suitable parameter λ so that the matrix P is invertible; from the inverse formula for a 2×2 block matrix (writing (1) as [A, B; B^T, D] with P = D − B^T A^{-1} B),

[A, B; B^T, D]^{-1} = [ A^{-1} + A^{-1}B P^{-1} B^T A^{-1} ,  −A^{-1}B P^{-1} ;  −P^{-1}B^T A^{-1} ,  P^{-1} ];

S1.4, substituting the formula of S1.3 into the expression of S1.2 yields the updated outer weight matrix β_1, expressed through β_0 and the auxiliary matrices Q_0, R_0, U_0, V_0;

in particular, when λ = 0, i.e., there are no unlabeled samples, to avoid repeated calculation the quantities Q_0, R_0, U_0, V_0 in the formula of S1.4 can be computed in the order

P^{-1}(ΔH_0^T(C + λL)H_0) → P^{-1}(ΔH_0^T(C + λL)H_0)β_0 (= U_0) → …

(the remaining expressions for R_0, V_0 and Q_0 appear only as equation images in the source and are not reproduced).
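The incremental step above hinges on inverting the enlarged matrix in 2x2 block form via a Schur complement. The short check below verifies that identity numerically on a random symmetric positive definite matrix; the block sizes and everything else here are arbitrary choices for illustration, not part of the claimed algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
k0, dk = 6, 3                              # existing and newly added hidden nodes
M = rng.standard_normal((k0 + dk, k0 + dk))
M = M @ M.T + np.eye(k0 + dk)              # symmetric positive definite, hence invertible

A, B, D = M[:k0, :k0], M[:k0, k0:], M[k0:, k0:]
A_inv = np.linalg.inv(A)
P = D - B.T @ A_inv @ B                    # Schur complement of the upper-left block
P_inv = np.linalg.inv(P)

block_inv = np.block([[A_inv + A_inv @ B @ P_inv @ B.T @ A_inv, -A_inv @ B @ P_inv],
                      [-P_inv @ B.T @ A_inv, P_inv]])
print(np.allclose(block_inv, np.linalg.inv(M)))   # True: block formula equals the direct inverse
```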
As a further improvement of the present technical solution, the algorithm of the incremental learning unit is as follows:

(I) input and output stage:

S2.1, input the l labeled samples {(x_i, t_i)}, i = 1, …, l, and the u unlabeled samples {x_i}, i = l+1, …, l+u, with x_i ∈ R^m and t_i ∈ R^n, and output β;

(II) initial stage:

S2.2, for the given l labeled samples and u unlabeled samples, determine the initial number of hidden layer nodes k_0 and randomly give the learning parameters a_j and b_j of the j-th hidden layer node (j = 1, 2, …, k_0);

S2.3, compute the initial hidden layer output matrix H_0;

S2.4, compute the initial outer weight matrix β_0 from the output matrix;

S2.5, compute the sample output error of the current network with respect to the training targets;

S2.6, set i = 0;

(III) hidden layer node growth stage:

S2.7, while the number of hidden layer nodes satisfies k_i ≤ k_max (k_max is the preset maximum number of hidden layer nodes) and the sample output error is still above the preset target, perform S2.8 to S2.11;

S2.8, i = i + 1;

S2.9, add Δk_{i−1} hidden layer nodes so that the total number of hidden layer nodes becomes k_i, randomly give the learning parameters a_j and b_j of the newly added hidden layer nodes (j = k_{i−1}+1, k_{i−1}+2, …, k_i), and form the corresponding hidden layer output matrix H_i = [H_{i−1}, ΔH_{i−1}];

S2.10, adjust the outer weight matrix β according to the incremental update of the ISS-ELM algorithm;

S2.11, return to the hidden layer node growth stage.
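Putting the stages together, a simplified growth loop could look like the sketch below. For brevity it treats only the supervised case (lambda = 0) and re-solves beta from scratch each round instead of applying the Schur-complement update, so it illustrates the stopping logic only; eps, k_max and every name here are assumptions of the illustration.

```python
import numpy as np

def grow_elm(X, T, k0=5, dk=5, k_max=100, eps=1e-3, c=100.0, seed=0):
    """Add dk hidden nodes per round until the output error drops below eps
    or k_max nodes are reached; beta is re-solved in full each round."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(-1, 1, size=(X.shape[1], k0))
    b = rng.uniform(-1, 1, size=k0)
    while True:
        H = 1.0 / (1.0 + np.exp(-(X @ a + b)))
        beta = np.linalg.solve(np.eye(H.shape[1]) / c + H.T @ H, H.T @ T)
        err = np.linalg.norm(H @ beta - T)
        if err <= eps or H.shape[1] >= k_max:
            return a, b, beta, err
        a = np.hstack([a, rng.uniform(-1, 1, size=(X.shape[1], dk))])  # grow hidden layer
        b = np.hstack([b, rng.uniform(-1, 1, size=dk)])

X = np.linspace(0, 2 * np.pi, 200).reshape(-1, 1)
T = np.sin(X)
a, b, beta, err = grow_elm(X, T)
print(beta.shape[0], err)   # final hidden node count and residual error
```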
Compared with the prior art, the invention has the beneficial effect that, in the incremental semi-supervised over-limit learning machine system for adaptively determining the number of hidden layer nodes, the semi-supervised over-limit learning machine can, through the provided incremental learning unit, add hidden layer nodes one by one or in batches and adaptively determine the number of hidden layer nodes.
Drawings
FIG. 1 is an overall block diagram of embodiment 1;
FIG. 2 is a block diagram of a feedforward neural network unit according to embodiment 1;
FIG. 3 is a block diagram of a module of a semi-supervised learning unit of embodiment 1;
FIG. 4 is a block diagram of the over-limit learning unit of embodiment 1;
fig. 5 is a block diagram of the incremental learning unit of embodiment 1.
The various reference numbers in the figures mean:
100. an over-limit learning platform;
110. a feedforward neural network unit; 111. an input module; 112. a hidden layer module; 113. an output module;
120. a semi-supervised learning unit; 121. an induction module; 122. a hypothesis module; 123. an optimization module;
130. an over-limit learning unit; 131. an initial module; 132. an algorithm module;
140. an incremental learning unit; 141. a learning module; 142. a dynamic adjustment module.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", and the like, indicate orientations and positional relationships based on those shown in the drawings, and are used only for convenience of description and simplicity of description, and do not indicate or imply that the equipment or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be considered as limiting the present invention.
Example 1
The invention provides an incremental semi-supervised over-limit learning machine system for adaptively determining the number of hidden layer nodes; please refer to fig. 1-5. The system comprises an over-limit learning platform 100, which comprises a feedforward neural network unit 110, a semi-supervised learning unit 120, an over-limit learning unit 130 and an incremental learning unit 140. The feedforward neural network unit 110 receives and outputs the signals of all units and modules: each neuron receives signals from the neurons of the previous layer and generates an output signal for the next layer; the 0-th layer is the input layer, the last layer is called the output layer, the intermediate layers are called hidden layers, the layers interact through neurons, and the number of neurons in a hidden layer is the number of hidden layer nodes. The semi-supervised learning unit 120 performs pattern recognition by combining unlabeled samples with labeled samples. The over-limit learning unit 130 constructs the semi-supervised learning system. The incremental learning unit 140 adds hidden layer nodes and determines their number; in practical sensing applications the amount of data grows gradually, and when new data arrive the incremental learning unit 140 can adaptively modify the already trained system to learn the knowledge contained in the new data without reconstructing the whole knowledge base: only the changes caused by the newly added data are updated on the basis of the original knowledge base, so the time cost of modifying a trained system is usually lower than that of retraining a system from scratch;
the feedforward neural network unit 110 includes an input module 111, a hidden layer module 112 and an output module 113; the input module 111 receives the feature dimensions of the learning samples and transmits them to the hidden layer module 112 through neurons; the hidden layer module 112 processes the features through an excitation function and transmits the result to the output module 113 through neurons; the output module 113 packages and outputs the calculation result;
the feedforward neural network unit 110 is a single hidden layer feedforward network; the inner weights and hidden layer biases of the network are given randomly, and the outer weights are obtained by solving an optimization problem, which improves the learning speed and generalization performance of the network;
the semi-supervised learning unit 120 comprises an induction module 121, a hypothesis module 122 and an optimization module 123; the induction module 121 sorts the received learning samples into unlabeled samples and labeled samples; the hypothesis module 122 makes assumptions about the unlabeled and labeled samples; the optimization module 123 optimizes the outer weight matrix;
the over-limit learning unit 130 includes an initial module 131 and an algorithm module 132; the initial module 131 initially sets the unlabeled samples, the labeled samples and the outer weight matrix in the semi-supervised learning unit 120; the algorithm module 132 performs the algorithm calculation on the initially set unlabeled samples, labeled samples and outer weight matrix;
the incremental learning unit 140 includes a learning module 141 and a dynamic adjustment module 142; the learning module 141 learns the updated knowledge progressively and corrects and reinforces the previous knowledge, so that the system can adapt to newly arrived data without relearning all the data; the dynamic adjustment module 142 allows the weight vectors of the neurons in the learning module 141 and the topology of the network to be adjusted dynamically as input learning data arrive, so as to optimize the representation accuracy of the input data; in addition, by adding neurons at suitable times, the number of neurons can be determined adaptively to satisfy a given quantization error constraint, and the network can adapt to learning data it has not seen before without affecting the previous learning results.
In this embodiment, the input module 111, the hidden layer module 112, and the output module 113 form adjacent layers, and nodes of the adjacent layers are fully connected by connection weights.
Further, the excitation function in the hidden layer module 112 is:

f(X) = Σ_{i=1}^{k} v_i · g(W_i · X + b_i)

where X = (x_1, x_2, x_3, …, x_n)^T is the n-dimensional input of the network; W_i = (w_{i1}, w_{i2}, w_{i3}, …, w_{in})^T is the connection weight vector between the input layer and the i-th hidden layer node, and b_i is the threshold of the i-th hidden layer node; v_i is the connection weight from the i-th hidden layer node to the output layer; g(·) is the hidden layer activation function; f(X) is the network output.
Specifically, the feedforward neural network unit 110 adopts a single hidden layer feedforward network learning algorithm, as follows:

the hidden layer output of a training sample x is written as a row vector:

h(x) = [G(a_1, b_1, x), G(a_2, b_2, x), …, G(a_k, b_k, x)];

where a_j, b_j (j = 1, 2, …, k) are the randomly given learning parameters of the j-th hidden layer node; k is the number of hidden layer nodes; G(·) is the excitation function;

given N training samples (x_i, t_i), x_i ∈ R^m, t_i ∈ R^n, the mathematical model of the over-limit learning machine is:

Hβ = T;

where H is the hidden layer output matrix, β is the outer weight matrix and T is the target matrix, with H = [h(x_1); h(x_2); …; h(x_N)] and T = [t_1^T; t_2^T; …; t_N^T];

the solution of this model is β = H†T, where H† is the Moore-Penrose generalized inverse of H;

to improve the generalization performance of the learner, the ridge-regression form of the model is:

Minimize: ||β||^2 + c||ξ||^2

Subject to: Hβ = T − ξ;

where c is a parameter, ||ξ||^2 is the empirical risk and ||β||^2 is the structural risk;

the solution of this model is:

β = (I_k/c + H^T H)^{-1} H^T T,

where c is a parameter and I_k is the identity matrix of order k.
Furthermore, the hypothesis module 122 holds two assumptions: the labeled sample set and the unlabeled sample set are drawn from the same marginal distribution, and if two samples from these sets are similar then their conditional probabilities are also similar.
In addition, the learning process of the semi-supervised learning unit 120 includes the following steps:

S1, the induction module 121 first sorts the received samples into l labeled samples {(x_i, t_i)}, i = 1, …, l, and u unlabeled samples {x_i}, i = l+1, …, l+u, with x_i ∈ R^m and t_i ∈ R^n;

S2, the hypothesis module 122 makes the first assumption: the labeled sample set and the unlabeled sample set come from the same marginal distribution;

S3, the hypothesis module 122 makes the second assumption: if two samples x_i and x_j are similar, then the conditional probabilities P(y | x_i) and P(y | x_j) are also similar;

S4, with these assumptions in place, let A = (a_ij)_{n×n} be the similarity matrix of the training samples, and determine the outer weight matrix β;

in general, for given samples, the Gaussian function a_ij = exp(−||x_i − x_j||^2 / (2σ^2)) (σ is the kernel width parameter) is used to compute a_ij; clearly, the closer x_i and x_j are, the larger a_ij is; let D be the diagonal matrix whose i-th diagonal element is d_ii = Σ_j a_ij; then L = D − A is called the Laplacian matrix.
Further, the optimization problem of the optimization module 123 is:

Minimize over β: ||β||^2 + ||C^{1/2}(Hβ − T̃)||^2 + λ Tr((Hβ)^T L (Hβ)),

where T̃ ∈ R^{(l+u)×n} is the training target matrix whose first l rows are t_1^T, …, t_l^T and whose remaining rows are 0; C is a diagonal matrix of order l+u whose first l diagonal elements equal the parameter c and whose remaining u diagonal elements are 0;

the solution of the optimization problem of the optimization module 123 is:

β = (I_k + H^T(C + λL)H)^{-1} H^T C T̃,

where I_k is the identity matrix of order k.
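As a small complement to the formulas above, the sketch below assembles the padded target matrix and the diagonal of C from l labeled and u unlabeled samples; the function name and the toy call are assumptions made for illustration.

```python
import numpy as np

def build_targets(T_labeled, u, c=1.0):
    """T_labeled: (l, n) targets of the labeled samples.
    Returns the (l+u, n) padded target matrix whose last u rows are zero,
    and the (l+u,) diagonal of C: first l entries c, remaining u entries 0."""
    l, n = T_labeled.shape
    T_pad = np.vstack([T_labeled, np.zeros((u, n))])
    C_diag = np.concatenate([np.full(l, c), np.zeros(u)])
    return T_pad, C_diag

T_pad, C_diag = build_targets(np.eye(3)[[0, 1, 2, 0, 1]], u=4, c=10.0)
print(T_pad.shape, C_diag)   # (9, 3) [10. 10. 10. 10. 10. 0. 0. 0. 0.]
```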
Specifically, the algorithm module 132 adopts an ISS-ELM algorithm, which handles unsupervised clustering and embedding problems through manifold regularization.
In addition, the ISS-ELM algorithm steps are as follows:

S1.1, for the given l labeled samples and u unlabeled samples, set the initial number of hidden layer nodes to k_0 and the initial hidden layer output matrix to H_0; considering the case l > k in the solution of the optimization problem of the optimization module (123), the initial outer weight matrix is

β_0 = (I_{k0} + H_0^T(C + λL)H_0)^{-1} H_0^T C T̃;

S1.2, after the initial setting is completed, when Δk_0 = k_1 − k_0 hidden layer nodes are added, the hidden layer output matrix becomes H_1 = [H_0, ΔH_0], and

I_{k1} + H_1^T(C + λL)H_1 = [ I_{k0} + H_0^T(C + λL)H_0 ,  H_0^T(C + λL)ΔH_0 ;  ΔH_0^T(C + λL)H_0 ,  I_{Δk0} + ΔH_0^T(C + λL)ΔH_0 ]    (1)

and the Schur complement of (1) with respect to its upper-left block is

P = I_{Δk0} + ΔH_0^T(C + λL)ΔH_0 − ΔH_0^T(C + λL)H_0 (I_{k0} + H_0^T(C + λL)H_0)^{-1} H_0^T(C + λL)ΔH_0;

S1.3, select a suitable parameter λ so that the matrix P is invertible; from the inverse formula for a 2×2 block matrix (writing (1) as [A, B; B^T, D] with P = D − B^T A^{-1} B),

[A, B; B^T, D]^{-1} = [ A^{-1} + A^{-1}B P^{-1} B^T A^{-1} ,  −A^{-1}B P^{-1} ;  −P^{-1}B^T A^{-1} ,  P^{-1} ];

S1.4, substituting the formula of S1.3 into the expression of S1.2 yields the updated outer weight matrix β_1, expressed through β_0 and the auxiliary matrices Q_0, R_0, U_0, V_0;

in particular, when λ = 0, i.e., there are no unlabeled samples, to avoid repeated calculation the quantities Q_0, R_0, U_0, V_0 in the formula of S1.4 can be computed in the order

P^{-1}(ΔH_0^T(C + λL)H_0) → P^{-1}(ΔH_0^T(C + λL)H_0)β_0 (= U_0) → …

(the remaining expressions for R_0, V_0 and Q_0 appear only as equation images in the source and are not reproduced).
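To show what the Schur-complement update buys in this setting, the sketch below solves the enlarged semi-supervised system both directly and through the 2x2 block inverse that reuses the inverse already computed for the smaller network, and checks that the two outer weight matrices coincide. All matrices are random stand-ins and every name is an assumption; this is an editorial illustration, not the patented implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
l, u, k0, dk, n, lam, c = 8, 12, 5, 3, 2, 0.1, 10.0
N = l + u

X = rng.standard_normal((N, 3))
A_sim = np.exp(-np.sum((X[:, None] - X[None]) ** 2, axis=-1) / 2.0)
Lap = np.diag(A_sim.sum(axis=1)) - A_sim         # graph Laplacian L = D - A

H0 = rng.standard_normal((N, k0))                # hidden outputs of the existing nodes
dH = rng.standard_normal((N, dk))                # hidden outputs of the added nodes
H1 = np.hstack([H0, dH])
C = np.diag(np.r_[np.full(l, c), np.zeros(u)])
T_pad = np.vstack([rng.standard_normal((l, n)), np.zeros((u, n))])

M = C + lam * Lap                                 # shorthand for (C + lambda * L)
A = np.eye(k0) + H0.T @ M @ H0                    # block already inverted for the smaller net
B = H0.T @ M @ dH
D = np.eye(dk) + dH.T @ M @ dH
A_inv = np.linalg.inv(A)
P = D - B.T @ A_inv @ B                           # Schur complement
P_inv = np.linalg.inv(P)
big_inv = np.block([[A_inv + A_inv @ B @ P_inv @ B.T @ A_inv, -A_inv @ B @ P_inv],
                    [-P_inv @ B.T @ A_inv, P_inv]])

beta_blockwise = big_inv @ (H1.T @ C @ T_pad)
beta_direct = np.linalg.solve(np.eye(k0 + dk) + H1.T @ M @ H1, H1.T @ C @ T_pad)
print(np.allclose(beta_blockwise, beta_direct))   # True
```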
In addition, the algorithm of the incremental learning unit 140 is as follows:

(I) input and output stage:

S2.1, input the l labeled samples {(x_i, t_i)}, i = 1, …, l, and the u unlabeled samples {x_i}, i = l+1, …, l+u, with x_i ∈ R^m and t_i ∈ R^n, and output β;

(II) initial stage:

S2.2, for the given l labeled samples and u unlabeled samples, determine the initial number of hidden layer nodes k_0 and randomly give the learning parameters a_j and b_j of the j-th hidden layer node (j = 1, 2, …, k_0);

S2.3, compute the initial hidden layer output matrix H_0;

S2.4, compute the initial outer weight matrix β_0 from the output matrix;

S2.5, compute the sample output error of the current network with respect to the training targets;

S2.6, set i = 0;

(III) hidden layer node growth stage:

S2.7, while the number of hidden layer nodes satisfies k_i ≤ k_max (k_max is the preset maximum number of hidden layer nodes) and the sample output error is still above the preset target, perform S2.8 to S2.11;

S2.8, i = i + 1;

S2.9, add Δk_{i−1} hidden layer nodes so that the total number of hidden layer nodes becomes k_i, randomly give the learning parameters a_j and b_j of the newly added hidden layer nodes (j = k_{i−1}+1, k_{i−1}+2, …, k_i), and form the corresponding hidden layer output matrix H_i = [H_{i−1}, ΔH_{i−1}];

S2.10, adjust the outer weight matrix β according to the incremental update of the ISS-ELM algorithm;

S2.11, return to the hidden layer node growth stage.
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, and the preferred embodiments of the present invention are described in the above embodiments and the description, and are not intended to limit the present invention. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (10)

1. An incremental semi-supervised over-limit learning machine system for adaptively determining the number of hidden nodes, comprising an over-limit learning platform (100), characterized in that: the over-limit learning platform (100) comprises a feedforward neural network unit (110), a semi-supervised learning unit (120), an over-limit learning unit (130) and an incremental learning unit (140); the feedforward neural network unit (110) is used for receiving and outputting the signals of all units and modules; the semi-supervised learning unit (120) is used for carrying out pattern recognition by combining unlabeled samples with labeled samples; the over-limit learning unit (130) is used for constructing a semi-supervised learning system; the incremental learning unit (140) is used for adding hidden layer nodes and determining the number of hidden layer nodes;

the feedforward neural network unit (110) comprises an input module (111), a hidden layer module (112) and an output module (113); the input module (111) is used for receiving the feature dimensions of the learning samples and transmitting them to the hidden layer module (112) through neurons; the hidden layer module (112) is used for processing the features through an excitation function and transmitting the calculation result to the output module (113) through neurons; the output module (113) is used for packaging and outputting the calculation result;

the semi-supervised learning unit (120) comprises an induction module (121), a hypothesis module (122) and an optimization module (123); the induction module (121) is used for sorting the received learning samples into unlabeled samples and labeled samples; the hypothesis module (122) is used for making assumptions about the unlabeled samples and the labeled samples; the optimization module (123) is used for optimizing the outer weight matrix;

the over-limit learning unit (130) comprises an initial module (131) and an algorithm module (132); the initial module (131) is used for initially setting the unlabeled samples, the labeled samples and the outer weight matrix in the semi-supervised learning unit (120); the algorithm module (132) is used for carrying out the algorithm calculation on the initially set unlabeled samples, labeled samples and outer weight matrix;

the incremental learning unit (140) comprises a learning module (141) and a dynamic adjustment module (142); the learning module (141) is used for progressively learning the updated knowledge and correcting and reinforcing the previous knowledge; the dynamic adjustment module (142) is used for enabling the weight vectors of the neurons in the learning module (141) and the topology of the network to be adjusted dynamically as input learning data arrive.
2. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 1, wherein: the input module (111), the hidden layer module (112) and the output module (113) form adjacent layers, and nodes of the adjacent layers are fully connected by connection weights.
3. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 2, wherein: the excitation function in the hidden layer module (112) is:

f(X) = Σ_{i=1}^{k} v_i · g(W_i · X + b_i)

where X = (x_1, x_2, x_3, …, x_n)^T is the n-dimensional input of the network; W_i = (w_{i1}, w_{i2}, w_{i3}, …, w_{in})^T is the connection weight vector between the input layer and the i-th hidden layer node, and b_i is the threshold of the i-th hidden layer node; v_i is the connection weight from the i-th hidden layer node to the output layer; g(·) is the hidden layer activation function; f(X) is the network output.
4. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 1, wherein: the feedforward neural network unit (110) adopts a single hidden layer feedforward network learning algorithm, as follows:

the hidden layer output of a training sample x is written as a row vector:

h(x) = [G(a_1, b_1, x), G(a_2, b_2, x), …, G(a_k, b_k, x)];

where a_j, b_j (j = 1, 2, …, k) are the randomly given learning parameters of the j-th hidden layer node; k is the number of hidden layer nodes; G(·) is the excitation function;

given N training samples (x_i, t_i), x_i ∈ R^m, t_i ∈ R^n, the mathematical model of the over-limit learning machine is:

Hβ = T;

where H is the hidden layer output matrix, β is the outer weight matrix and T is the target matrix, with H = [h(x_1); …; h(x_N)] and T = [t_1^T; …; t_N^T];

the solution of this model is β = H†T, where H† is the Moore-Penrose generalized inverse of H.
5. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 1, wherein: the hypothesis module (122) holds the assumption that the labeled sample set and the unlabeled sample set are drawn from the same marginal distribution, and the assumption that if samples from these sets are similar then their conditional probabilities are also similar.
6. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 1, wherein: the learning process of the semi-supervised learning unit (120) comprises the steps of:

S1, the induction module (121) first sorts the received samples into l labeled samples {(x_i, t_i)}, i = 1, …, l, and u unlabeled samples {x_i}, i = l+1, …, l+u, with x_i ∈ R^m and t_i ∈ R^n;

S2, the hypothesis module (122) makes the first assumption: the labeled sample set and the unlabeled sample set come from the same marginal distribution;

S3, the hypothesis module (122) makes the second assumption: if two samples x_i and x_j are similar, then the conditional probabilities P(y | x_i) and P(y | x_j) are also similar;

S4, let A = (a_ij)_{n×n} be the similarity matrix of the training samples, and determine the outer weight matrix β.
7. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 1, wherein: the optimization problem of the optimization module (123) is:

Minimize over β: ||β||^2 + ||C^{1/2}(Hβ − T̃)||^2 + λ Tr((Hβ)^T L (Hβ)),

where T̃ ∈ R^{(l+u)×n} is the training target matrix whose first l rows are t_1^T, …, t_l^T and whose remaining rows are 0; C is a diagonal matrix of order l+u whose first l diagonal elements equal the parameter c and whose remaining u diagonal elements are 0;

the solution of the optimization problem of the optimization module (123) is:

β = (I_k + H^T(C + λL)H)^{-1} H^T C T̃,

where I_k is the identity matrix of order k.
8. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 1, wherein: the algorithm module (132) employs an ISS-ELM algorithm that handles unsupervised clustering and embedding problems through manifold regularization.
9. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 8, wherein: the ISS-ELM algorithm comprises the following steps:

S1.1, for the given l labeled samples and u unlabeled samples, set the initial number of hidden layer nodes to k_0 and the initial hidden layer output matrix to H_0;

S1.2, after the initial setting is completed, when Δk_0 = k_1 − k_0 hidden layer nodes are added, the hidden layer output matrix becomes H_1 = [H_0, ΔH_0], and

I_{k1} + H_1^T(C + λL)H_1 = [ I_{k0} + H_0^T(C + λL)H_0 ,  H_0^T(C + λL)ΔH_0 ;  ΔH_0^T(C + λL)H_0 ,  I_{Δk0} + ΔH_0^T(C + λL)ΔH_0 ]    (1)

and the Schur complement of (1) with respect to its upper-left block is

P = I_{Δk0} + ΔH_0^T(C + λL)ΔH_0 − ΔH_0^T(C + λL)H_0 (I_{k0} + H_0^T(C + λL)H_0)^{-1} H_0^T(C + λL)ΔH_0;

S1.3, select a suitable parameter λ so that the matrix P is invertible; from the inverse formula for a 2×2 block matrix (writing (1) as [A, B; B^T, D] with P = D − B^T A^{-1} B),

[A, B; B^T, D]^{-1} = [ A^{-1} + A^{-1}B P^{-1} B^T A^{-1} ,  −A^{-1}B P^{-1} ;  −P^{-1}B^T A^{-1} ,  P^{-1} ];

S1.4, substituting the formula of S1.3 into the expression of S1.2 yields the updated outer weight matrix β_1.
10. The incremental semi-supervised over-limit learning system for adaptively determining the number of hidden nodes of claim 1, wherein: the algorithm of the incremental learning unit (140) is as follows:

(I) input and output stage:

S2.1, input the l labeled samples {(x_i, t_i)}, i = 1, …, l, and the u unlabeled samples {x_i}, i = l+1, …, l+u, with x_i ∈ R^m and t_i ∈ R^n, and output β;

(II) initial stage:

S2.2, for the given l labeled samples and u unlabeled samples, determine the initial number of hidden layer nodes k_0 and randomly give the learning parameters a_j and b_j of the j-th hidden layer node (j = 1, 2, …, k_0);

S2.3, compute the initial hidden layer output matrix H_0;

S2.4, compute the initial outer weight matrix β_0 from the output matrix;

S2.5, compute the sample output error of the current network with respect to the training targets;

S2.6, set i = 0;

(III) hidden layer node growth stage:

S2.7, while the number of hidden layer nodes satisfies k_i ≤ k_max (k_max is the preset maximum number of hidden layer nodes) and the sample output error is still above the preset target, perform S2.8 to S2.11;

S2.8, i = i + 1;

S2.9, add Δk_{i−1} hidden layer nodes so that the total number of hidden layer nodes becomes k_i, randomly give the learning parameters a_j and b_j of the newly added hidden layer nodes (j = k_{i−1}+1, k_{i−1}+2, …, k_i), and form the corresponding hidden layer output matrix H_i = [H_{i−1}, ΔH_{i−1}];

S2.10, adjust the outer weight matrix β according to the incremental update formula;

S2.11, return to the hidden layer node growth stage.
Application CN202010857885.2A, priority date 2020-08-24, filing date 2020-08-24: Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes. Status: Withdrawn. Publication: CN112116088A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010857885.2A CN112116088A (en) 2020-08-24 2020-08-24 Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010857885.2A CN112116088A (en) 2020-08-24 2020-08-24 Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes

Publications (1)

Publication Number Publication Date
CN112116088A true CN112116088A (en) 2020-12-22

Family

ID=73805406

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010857885.2A Withdrawn CN112116088A (en) 2020-08-24 2020-08-24 Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes

Country Status (1)

Country Link
CN (1) CN112116088A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113077388A (en) * 2021-04-25 2021-07-06 中国人民解放军国防科技大学 Data-augmented deep semi-supervised over-limit learning image classification method and system
CN113485829A (en) * 2021-07-02 2021-10-08 深圳万顺叫车云信息技术有限公司 Identification value generation method for data increment step of microservice cluster


Similar Documents

Publication Publication Date Title
Song et al. A self-organizing neural tree for large-set pattern classification
JP3088171B2 (en) Self-organizing pattern classification system and classification method
Souza Regularized fuzzy neural networks for pattern classification problems
CN112116088A (en) Incremental semi-supervised over-limit learning machine system for adaptively determining number of hidden nodes
CN112508192B (en) Increment heap width learning system with degree of depth structure
Amit et al. Perceptron learning with sign-constrained weights
WO2014060001A1 (en) Multitransmitter model of the neural network with an internal feedback
TWI787691B (en) Apparatus and method for neural network computation
CN108875933A (en) A kind of transfinite learning machine classification method and the system of unsupervised Sparse parameter study
CN114550847B (en) Medicine oral availability and toxicity prediction method based on graph convolution neural network
Eikens et al. Process identification with multiple neural network models
Lu et al. Real-Time stencil printing optimization using a hybrid multi-layer online sequential extreme learning and evolutionary search approach
Ozyildirim et al. Logarithmic learning for generalized classifier neural network
US5274744A (en) Neural network for performing a relaxation process
CN111898799B (en) BFA-Elman-based power load prediction method
Lee et al. A genetic algorithm based robust learning credit assignment cerebellar model articulation controller
Parvin et al. Divide & conquer classification and optimization by genetic algorithm
Wu et al. High-accuracy handwriting recognition based on improved CNN algorithm
Eom et al. Alpha-Integration Pooling for Convolutional Neural Networks
CN101339615B (en) Method of image segmentation based on similar matrix approximation
Aguilar et al. Recognition algorithm using evolutionary learning on the random neural networks
CN112215272A (en) Bezier curve-based image classification neural network attack method
Elshafei et al. Fuzzification using space-filling curves
Jain et al. Perceptron learning in the domain of graphs
Bundzel et al. Combining gradient and evolutionary approaches to the artificial neural networks training according to principles of support vector machines

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20201222