CN102968663A - Unmarked sample-based neural network construction method and device

Info

Publication number
CN102968663A
CN102968663A (application number CN2012105008602A)
Authority
CN
China
Prior art keywords
neural network
network
module
hidden
sensitivity
Prior art date
Legal status: Pending
Application number
CN2012105008602A
Other languages
Chinese (zh)
Inventor
储荣
王敏
赵燕容
吴蓉
田胜
窦智
戴云峰
Current Assignee
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date: 2012-11-29
Filing date: 2012-11-29
Publication date: 2013-03-13
Application filed by Hohai University HHU filed Critical Hohai University HHU
Priority to CN2012105008602A
Publication of CN102968663A
Status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an unmarked (unlabeled) sample-based neural network construction method belonging to the field of machine learning within intelligent science and technology. Taking the sensitivities of the hidden neurons in a neural network as a guide, the method gradually simplifies the network structure and improves classifier performance by pruning, one at a time, the hidden neurons with the lowest sensitivity. Compared with traditional approaches, the method computes the hidden-neuron sensitivities from both the labeled samples and a large quantity of unlabeled samples, which greatly improves the precision of the sensitivity estimates. The invention also discloses an unlabeled-sample-based neural network construction device. The method and device effectively increase the efficiency of neural network construction and improve the performance of the resulting network.

Description

Neural network construction method based on unlabeled samples, and device therefor
Technical field
The present invention relates to methods and devices for determining the structure of a neural network at design time, and in particular to a network construction method and device that can effectively improve the classification or regression efficiency of a neural network.
Background technology
When designing a neural network classifier, determining the structure of the network is an important and critical step. Building a suitable network for a particular problem is of great help in improving classification accuracy and generalization ability. Three-layer neural networks are now widely used. The literature has proved that a three-layer neural network can approximate any continuous function as the number of neurons in the second layer (also called the hidden layer or middle layer) increases. In a concrete application, the number of first-layer neurons of a three-layer network is determined by the dimension of the input variable, and the number of third-layer neurons by the dimension of the output variable. Since the dimensions of the input and output variables are generally known, the numbers of first-layer and third-layer neurons are generally fixed as well. Constructing a three-layer neural network is therefore really the process of determining the number of second-layer (hidden) neurons.
Neural networks are usually trained with supervised learning, in which the network parameters are adjusted by presenting the network with inputs together with the corresponding desired outputs. This process requires labeled training samples. Labeling training samples is generally carried out by experts, which often costs a great deal of money and time. In practical applications, unlabeled samples are much easier to obtain than labeled ones; in some Internet applications, for example, the expert-labeled samples account for only a small fraction compared with the unlabeled ones. It therefore becomes necessary to use the unlabeled samples to help determine the optimal structure of the network.
For a three-layer neural network, the main methods for determining the number of hidden neurons are:
1) The additive (growing) method. This method starts from a very small number of hidden neurons. Because the number is so small, the network structure is too simple and training on the labeled samples fails; mathematically, the error does not converge. The number of hidden neurons is then increased one at a time, retraining the network after each addition, until the network can be trained successfully. The smallest number of hidden neurons with which training succeeds is the number sought.
2) The subtractive (pruning) method. This method is the opposite of the additive method: it first chooses a sufficiently large number of hidden neurons to construct a three-layer network, under which structure the network can easily be trained with the labeled samples. One hidden neuron is then removed, and the reduced network is retrained on the labeled samples until training completes. This removal process is repeated until training can no longer be completed, and the smallest number of hidden neurons for which training still completes is taken as the final number. The theoretical basis behind both methods is that statistical learning theory requires a classifier to have a complexity suited to the concrete classification problem, guaranteeing neither overfitting nor underfitting; only then can the classifier achieve its best generalization ability. For a classifier such as a three-layer neural network, complexity is embodied in the number of hidden neurons: with too few neurons the network underfits and training cannot be completed, while with too many it overfits and generalizes poorly. A sketch of both search strategies is given below.
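The following Python sketch summarizes the two search strategies just described; it is an illustration, not text from the patent. The callables build(m), trains_ok(net) and drop_one(net) are assumptions supplied by the caller: they respectively construct a three-layer network with m hidden neurons, train it on the labeled sample set and report whether the cost converged, and return a copy of the network with one hidden neuron removed.

```python
def grow(build, trains_ok, m_max):
    """Additive method: start tiny, add hidden neurons until training succeeds."""
    for m in range(1, m_max + 1):              # the smallest converging m is kept
        net = build(m)
        if trains_ok(net):
            return net, m
    raise RuntimeError("no structure up to m_max trained successfully")

def prune(build, trains_ok, drop_one, m_large):
    """Subtractive method: start large, remove neurons while training still succeeds."""
    net, m = build(m_large), m_large
    assert trains_ok(net)                      # the large structure must converge
    while m > 1:
        smaller = drop_one(net)                # candidate with one neuron fewer
        if not trains_ok(smaller):             # smallest trainable structure reached
            break
        net, m = smaller, m - 1
    return net, m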
3) The empirical method. Determining the number of hidden neurons this way requires a deep understanding of the field of the particular problem, so that the number can be chosen from experience. Even so, there is no guarantee that the chosen number is optimal.
Of the above methods, the subtractive method is currently the most widely used. In the pruning process, deciding which hidden neuron to remove first, and which next, is extremely important for the final network structure. It is generally accepted that the hidden neurons differ in the role or importance they play during training. Removing first the neurons that contribute least to classification should, in theory, give the finally trained network better generalization. How to use unlabeled samples to help decide which hidden neurons to remove, and thereby determine a better network, thus becomes extremely important.
Summary of the invention
Object of the invention: the object of the invention is to remedy the deficiencies of the prior art by providing a method that uses both labeled and unlabeled samples to help determine the sensitivities of the hidden neurons of a three-layer neural network, and at the same time to provide an unlabeled-sample-based neural network construction device and its method of operation.
Technical scheme: to achieve the above object, the unlabeled-sample-based neural network construction method of the present invention comprises the following steps:
(S101) choose a sufficiently large positive integer m as the number of hidden neurons, construct a three-layer neural network, and assign initial network parameters;
(S103) train the neural network with the labeled sample set until the cost function converges to a given small threshold e (e < 10^-2), obtaining a trained classifier;
(S105) compute the sensitivity of each hidden neuron using both the labeled and the unlabeled samples, and sort the neurons by sensitivity in ascending order;
(S107) remove the hidden neuron with the smallest sensitivity, obtaining a neural network with a new structure;
(S109) retrain the new network on the labeled sample set starting from the previous parameters; if the cost function converges to the small threshold e, take the parameter-updated classifier and repeat steps (S107) and (S109); if it does not converge, proceed to the next step;
(S111) take the network structure with the smallest number of hidden neurons for which training still converged as the final network structure; that network is the classifier finally output.
By providing a basis for pruning hidden neurons to determine the network structure, the method of the invention improves the classification performance of the neural network classifier.
The invention also discloses an unlabeled-sample-based neural network construction device comprising an initialization module, a training module, a hidden-neuron selection module and an output module, connected in that order.
The above unlabeled-sample-based neural network construction device operates through the following concrete steps:
(1) initialization module: choose a sufficiently large positive integer m as the number of hidden neurons, construct a three-layer neural network, and assign initial network parameters;
(2) training module: train the neural network with the labeled sample set until the cost function converges to a given small threshold e, obtaining a trained classifier;
(3) hidden-neuron selection module: compute the sensitivity of each hidden neuron from both the labeled and the unlabeled samples, sort the neurons by sensitivity in ascending order, and remove the neuron with the smallest sensitivity, forming a neural network with a new structure; retrain the new network on the labeled sample set; if the cost function converges to the small threshold e, take the parameter-updated classifier and repeat this step; if it does not converge, proceed to the next step;
(4) output module: take the network structure with the smallest number of hidden neurons for which training still converged as the final network structure, and output that network as the final classifier.
By effectively exploiting the information contained in the unlabeled samples, the unlabeled-sample-based neural network construction device of the present invention judges the importance of individual hidden neurons more accurately and conveniently than previous methods.
Beneficial effects: compared with the prior art, the present invention has the following advantages:
The unlabeled-sample-based neural network construction method of the present invention provides a basis for pruning hidden neurons to determine the network structure, improving the classification performance of the neural network classifier.
The unlabeled-sample-based neural network construction device of the present invention effectively exploits the information contained in the unlabeled samples, and thus judges the importance of individual hidden neurons more accurately and conveniently than previous methods; a neural network pruned by the device's method of operation also generalizes better.
Description of drawings
Fig. 1 is a structure diagram of a multi-layer perceptron (MLP) neural network.
Fig. 2 is a flow chart of the unlabeled-sample-based neural network construction method of the specific embodiment of the invention.
Embodiment
The present invention is further illustrated below in conjunction with a specific embodiment. It should be understood that the embodiment serves only to illustrate the invention and not to limit its scope; after reading the present disclosure, modifications of various equivalent forms by those skilled in the art all fall within the scope defined by the appended claims.
Embodiment
This embodiment illustrates the neural network construction method according to the present invention using a multilayer perceptron (MLP) network as an example. The invention is not restricted to MLP networks and can be applied to other feedforward neural networks.
An MLP is a fully connected feedforward neural network (as shown in Fig. 1) suitable for classification. As Fig. 1 shows, it is a three-layer feedforward network: the input layer MA consists of the input-pattern nodes, with x_i denoting the i-th component of the input pattern vector (i = 1, 2, …, n); the second layer is the hidden layer MB, composed of m nodes b_j (j = 1, 2, …, m); the third layer is the output layer MC, composed of p nodes c_k (k = 1, 2, …, p).
Before training, each element of the input vectors must be normalized; here each element is scaled to [-1, 1].
The above MLP network is trained with the standard BP (back-propagation) algorithm. A minimal sketch of this setup is given below.
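As a concrete illustration, the following is a minimal numpy sketch of such a three-layer MLP, including the [-1, 1] normalization and a standard BP update. The sigmoid activations, squared-error cost, learning rate and weight initialization are assumptions made for illustration; the patent does not fix these details.

```python
import numpy as np

def normalize(X):
    """Scale each input component to [-1, 1], as required before training."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return 2.0 * (X - lo) / (hi - lo + 1e-12) - 1.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class MLP:
    """Three-layer perceptron: n inputs (layer MA), m hidden nodes (MB), p outputs (MC)."""
    def __init__(self, n, m, p, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.5, (n, m))   # input -> hidden weights
        self.b1 = np.zeros(m)
        self.W2 = rng.normal(0.0, 0.5, (m, p))   # hidden -> output weights
        self.b2 = np.zeros(p)

    def forward(self, X):
        self.H = sigmoid(X @ self.W1 + self.b1)      # hidden activations b_j
        return sigmoid(self.H @ self.W2 + self.b2)   # outputs c_k

    def bp_step(self, X, Y, lr=0.5):
        """One standard BP (gradient-descent) update; returns the squared-error cost."""
        out = self.forward(X)
        err = out - Y
        d2 = err * out * (1.0 - out)                     # output-layer delta
        d1 = (d2 @ self.W2.T) * self.H * (1.0 - self.H)  # hidden-layer delta
        self.W2 -= lr * self.H.T @ d2 / len(X)
        self.b2 -= lr * d2.mean(axis=0)
        self.W1 -= lr * X.T @ d1 / len(X)
        self.b1 -= lr * d1.mean(axis=0)
        return 0.5 * np.mean(np.sum(err ** 2, axis=1))
```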
The sensitivity of a hidden neuron of the above network is defined next; the definition readily generalizes to other feedforward neural networks.
Once the network has been trained, its input-output mapping is determined; let the mapping function be F(X), where X is the input vector. Suppose the j-th hidden neuron is to be removed, and let F_j(X) denote the mapping function of the network after its removal. The sensitivity of the j-th hidden neuron is then defined as

S_j(X) = E(‖F(X) − F_j(X)‖²)   (1)

where ‖·‖ is the Euclidean norm and E is the expectation operator.
As the definition shows, the sensitivity represents the difference between the outputs of the network with and without the j-th hidden neuron: the smaller the difference, the less important the j-th neuron, and conversely the more important.
For computing S_j(X), equation (1) can be rewritten as

S_j(X) = ∫_Ω ‖F(X) − F_j(X)‖² p(X) dX   (2)
where Ω is the domain of definition and p(X) is the density function of X over Ω. In general p(X) is unknown, and since the labeled samples are few, estimating p(X) from them alone is very difficult. The unlabeled samples, however, are plentiful and also carry the distribution information of the samples. Here we therefore use the labeled and the unlabeled samples together to estimate p(X), which greatly improves the estimation precision.
Suppose (X_1, y_1), (X_2, y_2), …, (X_b, y_b) is the labeled sample set and X_{b+1}, X_{b+2}, …, X_N the unlabeled sample set; a good estimate of S_j(X) is then

S_j(X) ≈ (1/N) Σ_{s=1}^{N} ‖F(X_s) − F_j(X_s)‖²   (3)
Equation (3) is the formula used herein to compute the sensitivities of the hidden neurons; a sketch of this computation is given below.
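The following Python sketch estimates S_j per equation (3), assuming an MLP object like the one sketched above (an assumption, not an API fixed by the patent). Removing hidden neuron j is realized here by zeroing its outgoing weights, which makes the network's output identical to that of the network with neuron j deleted; the average runs over the pooled labeled and unlabeled inputs.

```python
import numpy as np

def sensitivity(net, X_all, j):
    """Estimate S_j of equation (3) over X_all = labeled inputs + unlabeled inputs."""
    full = net.forward(X_all)        # F(X_s): output of the intact network
    saved = net.W2[j].copy()
    net.W2[j] = 0.0                  # silence neuron j's contribution to the output
    reduced = net.forward(X_all)     # F_j(X_s): output without neuron j
    net.W2[j] = saved                # restore the network
    return np.mean(np.sum((full - reduced) ** 2, axis=1))
```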
Fig. 2 shows the flow chart of the unlabeled-sample-based neural network construction method of the present invention.
In step S101, a sufficiently large positive integer m is chosen as the number of hidden neurons, a three-layer neural network is constructed, and initial network parameters are assigned.
In step S103, the network is trained with the labeled sample set until the cost function converges to the given small threshold e, yielding a trained classifier.
In step S105, the sensitivity of each hidden neuron is computed from both the labeled and the unlabeled samples, and the neurons are sorted by sensitivity in ascending order.
In step S107, the hidden neuron with the smallest sensitivity is removed, yielding a neural network with a new structure.
In step S109, the new network is retrained on the labeled sample set starting from the previous parameters. If the cost function converges to the small threshold e, the parameter-updated classifier is taken and steps S107 and S109 are repeated; otherwise the flow proceeds to the next step.
In step S111, the network structure with the smallest number of hidden neurons for which training still converged is taken as the final network structure, and that network is the classifier finally output. An end-to-end sketch of this flow is given below.
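Putting the pieces together, the following is a minimal end-to-end sketch of steps S101-S111, built on the MLP and sensitivity helpers sketched above. The train_until wrapper, the epoch cap and the value of e are assumptions made for illustration.

```python
import copy
import numpy as np

def train_until(net, Xl, Yl, e, max_epochs=20000):
    """Run BP updates until the cost function drops below the threshold e."""
    for _ in range(max_epochs):
        if net.bp_step(Xl, Yl) < e:
            return True
    return False                                   # training did not converge

def construct(Xl, Yl, Xu, m, e=1e-3):
    """Xl, Yl: labeled samples and targets; Xu: unlabeled samples; m: initial hidden size."""
    net = MLP(Xl.shape[1], m, Yl.shape[1])         # S101: large initial structure
    assert train_until(net, Xl, Yl, e)             # S103: initial training succeeds
    X_all = np.vstack([Xl, Xu])                    # labeled + unlabeled inputs
    best = copy.deepcopy(net)
    while net.W1.shape[1] > 1:
        s = [sensitivity(net, X_all, j) for j in range(net.W1.shape[1])]
        j = int(np.argmin(s))                      # S105: least sensitive neuron
        net.W1 = np.delete(net.W1, j, axis=1)      # S107: delete neuron j,
        net.b1 = np.delete(net.b1, j)              # keeping all other parameters
        net.W2 = np.delete(net.W2, j, axis=0)
        if not train_until(net, Xl, Yl, e):        # S109: retrain from old weights
            break                                  # S111: stop at first failure
        best = copy.deepcopy(net)
    return best                                    # smallest converging classifier
```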
The method of this embodiment is carried out on the unlabeled-sample-based neural network construction device described above, which comprises an initialization module, a training module, a hidden-neuron selection module and an output module, connected in that order.
The device operates through the following concrete steps:
(1) initialization module: choose a sufficiently large positive integer m as the number of hidden neurons, construct a three-layer neural network, and assign initial network parameters;
(2) training module: train the neural network with the labeled sample set until the cost function converges to a given small threshold e, obtaining a trained classifier;
(3) hidden-neuron selection module: compute the sensitivity of each hidden neuron from both the labeled and the unlabeled samples, sort the neurons by sensitivity in ascending order, and remove the neuron with the smallest sensitivity, forming a neural network with a new structure; retrain the new network on the labeled sample set; if the cost function converges to the small threshold e, take the parameter-updated classifier and repeat this step; if it does not converge, proceed to the next step;
(4) output module: take the network structure with the smallest number of hidden neurons for which training still converged as the final network structure, and output that network as the final classifier.

Claims (3)

1. An unlabeled-sample-based neural network construction method, characterized by comprising the following steps: (S101) choosing a sufficiently large positive integer m as the number of hidden neurons, constructing a three-layer neural network, and assigning initial network parameters;
(S103) training the neural network with the labeled sample set until the cost function converges to a given small threshold e, obtaining a trained classifier;
(S105) computing the sensitivity of each hidden neuron from both the labeled and the unlabeled samples, and sorting the neurons by sensitivity in ascending order;
(S107) removing the hidden neuron with the smallest sensitivity, obtaining a neural network with a new structure;
(S109) retraining the new network on the labeled sample set starting from the previous parameters; if the cost function converges to the small threshold e, obtaining the parameter-updated classifier and repeating steps (S107) and (S109); if it does not converge, proceeding to the next step;
(S111) taking the network structure with the smallest number of hidden neurons for which training still converged as the final network structure, that network being the classifier finally output.
2. An unlabeled-sample-based neural network construction device, characterized by comprising: an initialization module, a training module, a hidden-neuron selection module and an output module, connected in that order.
3. A method of operating the unlabeled-sample-based neural network construction device of claim 2, characterized by the following concrete steps:
(1) initialization module: choosing a sufficiently large positive integer m as the number of hidden neurons, constructing a three-layer neural network, and assigning initial network parameters;
(2) training module: training the neural network with the labeled sample set until the cost function converges to a given small threshold e, obtaining a trained classifier;
(3) hidden-neuron selection module: computing the sensitivity of each hidden neuron from both the labeled and the unlabeled samples, sorting the neurons by sensitivity in ascending order, and removing the neuron with the smallest sensitivity to form a neural network with a new structure; retraining the new network on the labeled sample set; if the cost function converges to the small threshold e, obtaining the parameter-updated classifier and repeating this step; if it does not converge, proceeding to the next step;
(4) output module: taking the network structure with the smallest number of hidden neurons for which training still converged as the final network structure, and outputting that network as the final classifier.
CN2012105008602A 2012-11-29 2012-11-29 Unmarked sample-based neural network constructing method and device Pending CN102968663A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012105008602A CN102968663A (en) 2012-11-29 2012-11-29 Unmarked sample-based neural network constructing method and device

Publications (1)

Publication Number Publication Date
CN102968663A 2013-03-13

Family

ID=47798794

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012105008602A Pending CN102968663A (en) 2012-11-29 2012-11-29 Unmarked sample-based neutral network constructing method and device

Country Status (1)

Country Link
CN (1) CN102968663A (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1363005B1 (en) * 2002-05-15 2009-07-15 Caterpillar Inc. Engine control system using a cascaded neural network
CN102692456A (en) * 2012-05-02 2012-09-26 江苏大学 Method for identifying position of microcrack in forming metal drawing part

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
盛高斌, 姚明海: "基于半监督回归的选择性集成算法" (A selective ensemble algorithm based on semi-supervised regression), 《计算机仿真》 (Computer Simulation), vol. 26, 31 October 2009 (2009-10-31) *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103679211A (en) * 2013-12-05 2014-03-26 河海大学 Method and device for selecting characteristics based on neural network sensitivity
CN105550748A (en) * 2015-12-09 2016-05-04 四川长虹电器股份有限公司 Method for constructing novel neural network based on hyperbolic tangent function
CN105550747A (en) * 2015-12-09 2016-05-04 四川长虹电器股份有限公司 Sample training method for novel convolutional neural network
CN107622274A (en) * 2016-07-15 2018-01-23 北京市商汤科技开发有限公司 Neural network training method, device and computer equipment for image procossing
CN107622274B (en) * 2016-07-15 2020-06-02 北京市商汤科技开发有限公司 Neural network training method and device for image processing and computer equipment
US11315018B2 (en) 2016-10-21 2022-04-26 Nvidia Corporation Systems and methods for pruning neural networks for resource efficient inference
CN108932550A (en) * 2018-06-26 2018-12-04 湖北工业大学 A kind of optimization method of intensive sparse-dense algorithm
CN108932550B (en) * 2018-06-26 2020-04-24 湖北工业大学 Method for classifying images based on fuzzy dense sparse dense algorithm
CN109214386A (en) * 2018-09-14 2019-01-15 北京京东金融科技控股有限公司 Method and apparatus for generating image recognition model

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 2013-03-13