CN102968663A - Unmarked sample-based neural network construction method and device

Info

Publication number
CN102968663A
CN102968663A (application number CN2012105008602A)
Authority
CN
China
Prior art keywords
neural network
network
module
hidden
sensitivity
Prior art date
Legal status: Pending
Application number
CN2012105008602A
Other languages
Chinese (zh)
Inventor
储荣
王敏
赵燕容
吴蓉
田胜
窦智
戴云峰
Current Assignee
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date: 2012-11-29
Filing date: 2012-11-29
Publication date: 2013-03-13
Application filed by Hohai University HHU filed Critical Hohai University HHU
Priority to CN2012105008602A
Publication of CN102968663A
Status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an unmarked (unlabeled) sample-based neural network construction method belonging to the field of machine learning within intelligent science and technology. Taking the sensitivities of the hidden neurons in a neural network as a guide, the method gradually simplifies the network structure and improves classifier performance by pruning, one at a time, the hidden neurons with the lowest sensitivity. Compared with traditional approaches, the method computes the hidden-neuron sensitivities from both the labeled samples and a large quantity of unlabeled samples, which greatly improves the precision of the sensitivity estimates. The invention also discloses an unlabeled-sample-based neural network construction device. The method and device effectively increase the efficiency of neural network construction and improve the performance of the resulting network.

Description

Neural network construction method based on unlabeled samples, and device therefor
Technical field
The present invention relates to methods and devices for determining the structure of a neural network at design time, and in particular to a network construction method and device that can effectively improve the classification or regression efficiency of a neural network.
Background technology
When designing a neural network classifier, determining the structure of the network is an important and critical step. Building a suitable network for a particular problem is of great help in improving classification accuracy and generalization ability. Three-layer neural networks are now widely used. The literature has proved that a three-layer neural network can approximate any continuous function as the number of neurons in the second layer (also called the hidden layer or middle layer) increases. In a concrete application, the number of first-layer neurons of a three-layer network is determined by the dimension of the input variable, and the number of third-layer neurons by the dimension of the output variable. Since the dimensions of the input and output variables are generally known, the numbers of first-layer and third-layer neurons are generally fixed as well. Constructing a three-layer neural network is therefore really the process of determining the number of second-layer (hidden) neurons.
Neural networks are usually trained with supervised learning, in which the network parameters are adjusted by presenting the network with inputs together with the corresponding desired outputs. This process requires labeled training samples. Labeling training samples is generally carried out by experts, which often costs a great deal of money and time. In practical applications, unlabeled samples are much easier to obtain than labeled ones; in some Internet applications, for example, the expert-labeled samples account for only a small fraction compared with the unlabeled ones. It therefore becomes necessary to use the unlabeled samples to help determine the optimal structure of the network.
For a three-layer neural network, the main methods for determining the number of hidden neurons are:
1) The additive (growing) method. This method starts from a very small number of hidden neurons. Because the number is so small, the network structure is too simple and training on the labeled samples fails; mathematically, the error does not converge. The number of hidden neurons is then increased one at a time, retraining the network after each addition, until the network can be trained successfully. The smallest number of hidden neurons with which training succeeds is the number sought.
2) The subtractive (pruning) method. This method is the opposite of the additive method: it first chooses a sufficiently large number of hidden neurons to construct a three-layer network, under which structure the network can easily be trained with the labeled samples. One hidden neuron is then removed, and the reduced network is retrained on the labeled samples until training completes. This removal process is repeated until training can no longer be completed, and the smallest number of hidden neurons for which training still completes is taken as the final number. The theoretical basis behind both methods is that statistical learning theory requires a classifier to have a complexity suited to the concrete classification problem, guaranteeing neither overfitting nor underfitting; only then can the classifier achieve its best generalization ability. For a classifier such as a three-layer neural network, complexity is embodied in the number of hidden neurons: with too few neurons the network underfits and training cannot be completed, while with too many it overfits and generalizes poorly. A sketch of both search strategies is given below.
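The following Python sketch summarizes the two search strategies just described; it is an illustration, not text from the patent. The callables build(m), trains_ok(net) and drop_one(net) are assumptions supplied by the caller: they respectively construct a three-layer network with m hidden neurons, train it on the labeled sample set and report whether the cost converged, and return a copy of the network with one hidden neuron removed.

```python
def grow(build, trains_ok, m_max):
    """Additive method: start tiny, add hidden neurons until training succeeds."""
    for m in range(1, m_max + 1):              # the smallest converging m is kept
        net = build(m)
        if trains_ok(net):
            return net, m
    raise RuntimeError("no structure up to m_max trained successfully")

def prune(build, trains_ok, drop_one, m_large):
    """Subtractive method: start large, remove neurons while training still succeeds."""
    net, m = build(m_large), m_large
    assert trains_ok(net)                      # the large structure must converge
    while m > 1:
        smaller = drop_one(net)                # candidate with one neuron fewer
        if not trains_ok(smaller):             # smallest trainable structure reached
            break
        net, m = smaller, m - 1
    return net, m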
3) The empirical method. Determining the number of hidden neurons this way requires a deep understanding of the field of the particular problem, so that the number can be chosen from experience. Even so, there is no guarantee that the chosen number is optimal.
Of the above methods, the subtractive method is currently the most widely used. In the pruning process, deciding which hidden neuron to remove first, and which next, is extremely important for the final network structure. It is generally accepted that the hidden neurons differ in the role or importance they play during training. Removing first the neurons that contribute least to classification should, in theory, give the finally trained network better generalization. How to use unlabeled samples to help decide which hidden neurons to remove, and thereby determine a better network, thus becomes extremely important.
Summary of the invention
Object of the invention: the object of the invention is to remedy the deficiencies of the prior art by providing a method that uses both labeled and unlabeled samples to help determine the sensitivities of the hidden neurons of a three-layer neural network, and at the same time to provide an unlabeled-sample-based neural network construction device and its method of operation.
Technical scheme: to achieve the above object, the unlabeled-sample-based neural network construction method of the present invention comprises the following steps:
(S101) choose a sufficiently large positive integer m as the number of hidden neurons, construct a three-layer neural network, and assign initial network parameters;
(S103) train the neural network with the labeled sample set until the cost function converges to a given small threshold e (e < 10^-2), obtaining a trained classifier;
(S105) compute the sensitivity of each hidden neuron using both the labeled and the unlabeled samples, and sort the neurons by sensitivity in ascending order;
(S107) remove the hidden neuron with the smallest sensitivity, obtaining a neural network with a new structure;
(S109) retrain the new network on the labeled sample set starting from the previous parameters; if the cost function converges to the small threshold e, take the parameter-updated classifier and repeat steps (S107) and (S109); if it does not converge, proceed to the next step;
(S111) take the network structure with the smallest number of hidden neurons for which training still converged as the final network structure; that network is the classifier finally output.
By providing a basis for pruning hidden neurons to determine the network structure, the method of the invention improves the classification performance of the neural network classifier.
The invention also discloses an unlabeled-sample-based neural network construction device comprising an initialization module, a training module, a hidden-neuron selection module and an output module, connected in that order.
The above unlabeled-sample-based neural network construction device operates through the following concrete steps:
(1) initialization module: choose a sufficiently large positive integer m as the number of hidden neurons, construct a three-layer neural network, and assign initial network parameters;
(2) training module: train the neural network with the labeled sample set until the cost function converges to a given small threshold e, obtaining a trained classifier;
(3) hidden-neuron selection module: compute the sensitivity of each hidden neuron from both the labeled and the unlabeled samples, sort the neurons by sensitivity in ascending order, and remove the neuron with the smallest sensitivity, forming a neural network with a new structure; retrain the new network on the labeled sample set; if the cost function converges to the small threshold e, take the parameter-updated classifier and repeat this step; if it does not converge, proceed to the next step;
(4) output module: take the network structure with the smallest number of hidden neurons for which training still converged as the final network structure, and output that network as the final classifier.
By effectively exploiting the information contained in the unlabeled samples, the unlabeled-sample-based neural network construction device of the present invention judges the importance of individual hidden neurons more accurately and conveniently than previous methods.
Beneficial effects: compared with the prior art, the present invention has the following advantages:
The unlabeled-sample-based neural network construction method of the present invention provides a basis for pruning hidden neurons to determine the network structure, improving the classification performance of the neural network classifier.
The unlabeled-sample-based neural network construction device of the present invention effectively exploits the information contained in the unlabeled samples, and thus judges the importance of individual hidden neurons more accurately and conveniently than previous methods; a neural network pruned by the device's method of operation also generalizes better.
Description of drawings
Fig. 1 is a structure diagram of a multi-layer perceptron (MLP) neural network.
Fig. 2 is a flow chart of the unlabeled-sample-based neural network construction method of the specific embodiment of the invention.
Embodiment
The present invention is further illustrated below in conjunction with a specific embodiment. It should be understood that the embodiment serves only to illustrate the invention and not to limit its scope; after reading the present disclosure, modifications of various equivalent forms by those skilled in the art all fall within the scope defined by the appended claims.
Embodiment
This embodiment illustrates the neural network construction method according to the present invention using a multilayer perceptron (MLP) network as an example. The invention is not restricted to MLP networks and can be applied to other feedforward neural networks.
An MLP is a fully connected feedforward neural network (as shown in Fig. 1) suitable for classification. As Fig. 1 shows, it is a three-layer feedforward network: the input layer MA consists of the input-pattern nodes, with x_i denoting the i-th component of the input pattern vector (i = 1, 2, …, n); the second layer is the hidden layer MB, composed of m nodes b_j (j = 1, 2, …, m); the third layer is the output layer MC, composed of p nodes c_k (k = 1, 2, …, p).
Before training, each element of the input vectors must be normalized; here each element is scaled to [-1, 1].
The above MLP network is trained with the standard BP (back-propagation) algorithm. A minimal sketch of this setup is given below.
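As a concrete illustration, the following is a minimal numpy sketch of such a three-layer MLP, including the [-1, 1] normalization and a standard BP update. The sigmoid activations, squared-error cost, learning rate and weight initialization are assumptions made for illustration; the patent does not fix these details.

```python
import numpy as np

def normalize(X):
    """Scale each input component to [-1, 1], as required before training."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return 2.0 * (X - lo) / (hi - lo + 1e-12) - 1.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class MLP:
    """Three-layer perceptron: n inputs (layer MA), m hidden nodes (MB), p outputs (MC)."""
    def __init__(self, n, m, p, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.5, (n, m))   # input -> hidden weights
        self.b1 = np.zeros(m)
        self.W2 = rng.normal(0.0, 0.5, (m, p))   # hidden -> output weights
        self.b2 = np.zeros(p)

    def forward(self, X):
        self.H = sigmoid(X @ self.W1 + self.b1)      # hidden activations b_j
        return sigmoid(self.H @ self.W2 + self.b2)   # outputs c_k

    def bp_step(self, X, Y, lr=0.5):
        """One standard BP (gradient-descent) update; returns the squared-error cost."""
        out = self.forward(X)
        err = out - Y
        d2 = err * out * (1.0 - out)                     # output-layer delta
        d1 = (d2 @ self.W2.T) * self.H * (1.0 - self.H)  # hidden-layer delta
        self.W2 -= lr * self.H.T @ d2 / len(X)
        self.b2 -= lr * d2.mean(axis=0)
        self.W1 -= lr * X.T @ d1 / len(X)
        self.b1 -= lr * d1.mean(axis=0)
        return 0.5 * np.mean(np.sum(err ** 2, axis=1))
```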
The sensitivity of a hidden neuron of the above network is defined next; the definition readily generalizes to other feedforward neural networks.
Once the network has been trained, its input-output mapping is determined; let the mapping function be F(X), where X is the input vector. Suppose the j-th hidden neuron is to be removed, and let F_j(X) denote the mapping function of the network after its removal. The sensitivity of the j-th hidden neuron is then defined as

S_j(X) = E(‖F(X) − F_j(X)‖²)   (1)

where ‖·‖ is the Euclidean norm and E is the expectation operator.
As the definition shows, the sensitivity represents the difference between the outputs of the network with and without the j-th hidden neuron: the smaller the difference, the less important the j-th neuron, and conversely the more important.
For computing S_j(X), equation (1) can be rewritten as

S_j(X) = ∫_Ω ‖F(X) − F_j(X)‖² p(X) dX   (2)
where Ω is the domain of definition and p(X) is the density function of X over Ω. In general p(X) is unknown, and since the labeled samples are few, estimating p(X) from them alone is very difficult. The unlabeled samples, however, are plentiful and also carry the distribution information of the samples. Here we therefore use the labeled and the unlabeled samples together to estimate p(X), which greatly improves the estimation precision.
Suppose (X_1, y_1), (X_2, y_2), …, (X_b, y_b) is the labeled sample set and X_{b+1}, X_{b+2}, …, X_N the unlabeled sample set; a good estimate of S_j(X) is then

S_j(X) ≈ (1/N) Σ_{s=1}^{N} ‖F(X_s) − F_j(X_s)‖²   (3)
Equation (3) is the formula used herein to compute the sensitivities of the hidden neurons; a sketch of this computation is given below.
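The following Python sketch estimates S_j per equation (3), assuming an MLP object like the one sketched above (an assumption, not an API fixed by the patent). Removing hidden neuron j is realized here by zeroing its outgoing weights, which makes the network's output identical to that of the network with neuron j deleted; the average runs over the pooled labeled and unlabeled inputs.

```python
import numpy as np

def sensitivity(net, X_all, j):
    """Estimate S_j of equation (3) over X_all = labeled inputs + unlabeled inputs."""
    full = net.forward(X_all)        # F(X_s): output of the intact network
    saved = net.W2[j].copy()
    net.W2[j] = 0.0                  # silence neuron j's contribution to the output
    reduced = net.forward(X_all)     # F_j(X_s): output without neuron j
    net.W2[j] = saved                # restore the network
    return np.mean(np.sum((full - reduced) ** 2, axis=1))
```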
Fig. 2 shows the flow chart of the unlabeled-sample-based neural network construction method of the present invention.
In step S101, a sufficiently large positive integer m is chosen as the number of hidden neurons, a three-layer neural network is constructed, and initial network parameters are assigned.
In step S103, the network is trained with the labeled sample set until the cost function converges to the given small threshold e, yielding a trained classifier.
In step S105, the sensitivity of each hidden neuron is computed from both the labeled and the unlabeled samples, and the neurons are sorted by sensitivity in ascending order.
In step S107, the hidden neuron with the smallest sensitivity is removed, yielding a neural network with a new structure.
In step S109, the new network is retrained on the labeled sample set starting from the previous parameters. If the cost function converges to the small threshold e, the parameter-updated classifier is taken and steps S107 and S109 are repeated; otherwise the flow proceeds to the next step.
In step S111, the network structure with the smallest number of hidden neurons for which training still converged is taken as the final network structure, and that network is the classifier finally output. An end-to-end sketch of this flow is given below.
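Putting the pieces together, the following is a minimal end-to-end sketch of steps S101-S111, built on the MLP and sensitivity helpers sketched above. The train_until wrapper, the epoch cap and the value of e are assumptions made for illustration.

```python
import copy
import numpy as np

def train_until(net, Xl, Yl, e, max_epochs=20000):
    """Run BP updates until the cost function drops below the threshold e."""
    for _ in range(max_epochs):
        if net.bp_step(Xl, Yl) < e:
            return True
    return False                                   # training did not converge

def construct(Xl, Yl, Xu, m, e=1e-3):
    """Xl, Yl: labeled samples and targets; Xu: unlabeled samples; m: initial hidden size."""
    net = MLP(Xl.shape[1], m, Yl.shape[1])         # S101: large initial structure
    assert train_until(net, Xl, Yl, e)             # S103: initial training succeeds
    X_all = np.vstack([Xl, Xu])                    # labeled + unlabeled inputs
    best = copy.deepcopy(net)
    while net.W1.shape[1] > 1:
        s = [sensitivity(net, X_all, j) for j in range(net.W1.shape[1])]
        j = int(np.argmin(s))                      # S105: least sensitive neuron
        net.W1 = np.delete(net.W1, j, axis=1)      # S107: delete neuron j,
        net.b1 = np.delete(net.b1, j)              # keeping all other parameters
        net.W2 = np.delete(net.W2, j, axis=0)
        if not train_until(net, Xl, Yl, e):        # S109: retrain from old weights
            break                                  # S111: stop at first failure
        best = copy.deepcopy(net)
    return best                                    # smallest converging classifier
```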
The method of this embodiment is carried out on the unlabeled-sample-based neural network construction device described above, which comprises an initialization module, a training module, a hidden-neuron selection module and an output module, connected in that order.
The device operates through the following concrete steps:
(1) initialization module: choose a sufficiently large positive integer m as the number of hidden neurons, construct a three-layer neural network, and assign initial network parameters;
(2) training module: train the neural network with the labeled sample set until the cost function converges to a given small threshold e, obtaining a trained classifier;
(3) hidden-neuron selection module: compute the sensitivity of each hidden neuron from both the labeled and the unlabeled samples, sort the neurons by sensitivity in ascending order, and remove the neuron with the smallest sensitivity, forming a neural network with a new structure; retrain the new network on the labeled sample set; if the cost function converges to the small threshold e, take the parameter-updated classifier and repeat this step; if it does not converge, proceed to the next step;
(4) output module: take the network structure with the smallest number of hidden neurons for which training still converged as the final network structure, and output that network as the final classifier.

Claims (3)

1. An unlabeled-sample-based neural network construction method, characterized by comprising the following steps: (S101) choosing a sufficiently large positive integer m as the number of hidden neurons, constructing a three-layer neural network, and assigning initial network parameters;
(S103) training the neural network with the labeled sample set until the cost function converges to a given small threshold e, obtaining a trained classifier;
(S105) computing the sensitivity of each hidden neuron from both the labeled and the unlabeled samples, and sorting the neurons by sensitivity in ascending order;
(S107) removing the hidden neuron with the smallest sensitivity, obtaining a neural network with a new structure;
(S109) retraining the new network on the labeled sample set starting from the previous parameters; if the cost function converges to the small threshold e, obtaining the parameter-updated classifier and repeating steps (S107) and (S109); if it does not converge, proceeding to the next step;
(S111) taking the network structure with the smallest number of hidden neurons for which training still converged as the final network structure, that network being the classifier finally output.
2. An unlabeled-sample-based neural network construction device, characterized by comprising: an initialization module, a training module, a hidden-neuron selection module and an output module, connected in that order.
3. A method of operating the unlabeled-sample-based neural network construction device of claim 2, characterized by the following concrete steps:
(1) initialization module: choosing a sufficiently large positive integer m as the number of hidden neurons, constructing a three-layer neural network, and assigning initial network parameters;
(2) training module: training the neural network with the labeled sample set until the cost function converges to a given small threshold e, obtaining a trained classifier;
(3) hidden-neuron selection module: computing the sensitivity of each hidden neuron from both the labeled and the unlabeled samples, sorting the neurons by sensitivity in ascending order, and removing the neuron with the smallest sensitivity to form a neural network with a new structure; retraining the new network on the labeled sample set; if the cost function converges to the small threshold e, obtaining the parameter-updated classifier and repeating this step; if it does not converge, proceeding to the next step;
(4) output module: taking the network structure with the smallest number of hidden neurons for which training still converged as the final network structure, and outputting that network as the final classifier.
CN2012105008602A 2012-11-29 2012-11-29 Unmarked sample-based neural network constructing method and device Pending CN102968663A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012105008602A CN102968663A (en) 2012-11-29 2012-11-29 Unmarked sample-based neural network constructing method and device

Publications (1)

Publication Number Publication Date
CN102968663A 2013-03-13

Family

ID=47798794

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012105008602A Pending CN102968663A (en) 2012-11-29 2012-11-29 Unmarked sample-based neutral network constructing method and device

Country Status (1)

Country Link
CN (1) CN102968663A (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1363005B1 (en) * 2002-05-15 2009-07-15 Caterpillar Inc. Engine control system using a cascaded neural network
CN102692456A (en) * 2012-05-02 2012-09-26 江苏大学 Method for identifying position of microcrack in forming metal drawing part

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
盛高斌, 姚明海: "基于半监督回归的选择性集成算法" (A selective ensemble algorithm based on semi-supervised regression), 《计算机仿真》 (Computer Simulation), vol. 26, 31 October 2009 (2009-10-31) *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103679211A (en) * 2013-12-05 2014-03-26 河海大学 Method and device for selecting characteristics based on neural network sensitivity
CN105550748A (en) * 2015-12-09 2016-05-04 四川长虹电器股份有限公司 Method for constructing novel neural network based on hyperbolic tangent function
CN105550747A (en) * 2015-12-09 2016-05-04 四川长虹电器股份有限公司 Sample training method for novel convolutional neural network
CN107622274A (en) * 2016-07-15 2018-01-23 北京市商汤科技开发有限公司 Neural network training method, device and computer equipment for image procossing
CN107622274B (en) * 2016-07-15 2020-06-02 北京市商汤科技开发有限公司 Neural network training method and device for image processing and computer equipment
US11315018B2 (en) 2016-10-21 2022-04-26 Nvidia Corporation Systems and methods for pruning neural networks for resource efficient inference
CN108932550A (en) * 2018-06-26 2018-12-04 湖北工业大学 A kind of optimization method of intensive sparse-dense algorithm
CN108932550B (en) * 2018-06-26 2020-04-24 湖北工业大学 Method for classifying images based on fuzzy dense sparse dense algorithm
CN109214386A (en) * 2018-09-14 2019-01-15 北京京东金融科技控股有限公司 Method and apparatus for generating image recognition model

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 2013-03-13