CN102968663A - Unmarked sample-based neural network constructing method and device - Google Patents
- Publication number
- CN102968663A CN102968663A CN2012105008602A CN201210500860A CN102968663A CN 102968663 A CN102968663 A CN 102968663A CN 2012105008602 A CN2012105008602 A CN 2012105008602A CN 201210500860 A CN201210500860 A CN 201210500860A CN 102968663 A CN102968663 A CN 102968663A
- Authority
- CN
- China
- Prior art keywords
- neural network
- network
- module
- hidden
- sensitivity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses an unlabeled-sample-based neural network construction method belonging to the field of machine learning within intelligent science and technology. The method takes the sensitivity of the hidden neurons of a neural network as its criterion: by gradually pruning the hidden neurons with low sensitivity, the network structure is progressively simplified and classifier performance is improved. Compared with conventional approaches, the method computes hidden-neuron sensitivity using both the labeled samples and a large quantity of unlabeled samples, which greatly improves the precision of the sensitivity estimate. The invention also discloses an unlabeled-sample-based neural network construction device. With the method and device, the efficiency of neural network construction can be effectively increased and the performance of the resulting neural network improved.
Description
Technical field
The present invention relates to a network construction method and device used when designing neural networks, and in particular to a network construction method and device that can effectively improve the classification or regression efficiency of a neural network.
Background technology
When designing a neural network classifier, determining the structure of the network is an important and critical step. Constructing a suitable network for a particular problem greatly helps to improve classification accuracy and generalization ability. Three-layer neural networks are now widely used. The literature has proved that a three-layer neural network can approximate any continuous function as the number of neurons in the second layer (also called the hidden layer or middle layer) increases. In a concrete application, the number of neurons in the first layer of a three-layer network depends on the dimension of the input variable, and the number in the third layer depends on the dimension of the output variable. Since the dimensions of the input and output variables are generally known, the numbers of neurons in the first and third layers are generally fixed as well. Structure construction for a three-layer neural network is therefore really the process of determining the number of neurons in the second layer.
Neural networks are usually trained with supervised learning methods. Supervised learning means that, during training, the network parameters are adjusted by presenting the network with inputs together with their corresponding outputs, so as to achieve the purpose of training. This process requires labeled training samples. Labeling training samples is generally performed by experts, which often costs a great deal of money and time. In practical applications, unlabeled samples are much easier to obtain than labeled ones; in some Internet applications, for example, the expert-labeled samples account for only a small fraction compared with the unlabeled samples. It therefore becomes necessary to use those unlabeled samples to help determine the optimal structure of the network.
For a three-layer neural network, the main methods for determining the number of hidden-layer neurons are:
1) The growing method. This method starts from a very small number of hidden neurons. Because the number of hidden neurons is too small, the network structure is too simple and training the network with the training samples fails; mathematically, the error does not converge. Hidden neurons are then added one at a time, retraining the network after each addition, until the network can be trained successfully at some number of hidden neurons. The smallest number of hidden neurons at which training succeeds is the number of hidden neurons we are looking for.
2) The pruning method. This method is the opposite of the growing method. It first chooses a sufficiently large number of hidden neurons to construct a three-layer neural network; with this structure the network can easily be trained with the labeled samples. Then one hidden neuron is removed and the resulting network is retrained with the labeled samples until training finishes. This removal process is repeated until training can no longer be completed, and the smallest number of hidden neurons for which training can still be completed is taken as the final number of hidden neurons. The theoretical basis behind both the growing and pruning methods is the requirement of statistical learning theory that, for a given classification problem, the classifier must have an appropriate complexity, neither overfitting nor underfitting; only then can the classifier have the best generalization ability. For a classifier such as a three-layer neural network, the network complexity is embodied in the number of hidden neurons: a network with too few neurons underfits and training cannot be completed, while a network with too many neurons overfits and generalizes poorly.
3) The empirical method. This method requires a deep understanding of the field related to the particular problem, so that the number of hidden neurons can be determined from experience. Even so, it cannot guarantee that the chosen number of hidden neurons is optimal.
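The growing method in 1) above reduces to a search for the smallest workable hidden-layer size. A minimal sketch, where `train_converges(m)` is a hypothetical helper that trains an m-hidden-neuron network and reports whether the cost fell below the threshold:

```python
def grow_hidden_layer(train_converges, m_max=64):
    # growing method: add hidden neurons one at a time until training succeeds
    for m in range(1, m_max + 1):
        if train_converges(m):
            return m  # smallest m for which training converges
    raise ValueError("no m <= m_max trains successfully")

# toy check: pretend training starts converging once m >= 5
m = grow_hidden_layer(lambda m: m >= 5)
```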
Among the above methods, the pruning method is currently used most. In the actual pruning process, it is crucial for the final network structure to decide which hidden neuron to remove first, which next, and so on. It is generally believed that each hidden neuron plays a different role, or has a different degree of importance, during training. In theory, removing first the neurons that contribute little or nothing to classification makes the generalization performance of the finally trained network better. How to use unlabeled samples to help decide which hidden neurons to remove, and thereby determine a better network, therefore becomes extremely important.
Summary of the invention
Object of the invention: The object of the invention is to remedy the deficiencies of the prior art by providing a method that uses both labeled and unlabeled samples to help determine the sensitivity of the hidden neurons of a three-layer neural network, and at the same time to provide an unlabeled-sample-based neural network construction device and its working method.
Technical solution: To achieve the above object, an unlabeled-sample-based neural network construction method according to the invention comprises the following steps:
(S101) Choose a sufficiently large positive integer m as the number of hidden neurons, construct a three-layer neural network, and initialize the network parameters;
(S103) Train the neural network with the labeled sample set until the cost function converges to a given small threshold e (with e < 10^-2), obtaining a trained classifier;
(S105) Compute the sensitivity of each hidden neuron using both labeled and unlabeled samples, and sort the neurons by sensitivity in ascending order;
(S107) Remove the hidden neuron with the smallest sensitivity, obtaining a network with a new structure;
(S109) Retrain the new network with the labeled sample set, starting from the existing parameters; if the cost function converges to the small threshold e, take the classifier with the updated parameters and repeat steps (S107) and (S109); if it does not converge, proceed to the next step;
(S111) Take the smallest network structure for which training still converges as the final network structure; that network is the classifier finally output.
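As an illustrative sketch, not the patented implementation, the pruning loop of steps S101 to S111 can be written in Python roughly as follows. The tiny tanh MLP, the batch gradient-descent trainer standing in for standard BP, and the sensitivity routine (which averages the output change over the pooled labeled and unlabeled inputs, simulating removal of a hidden neuron by zeroing its outgoing weights) are all simplifying assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_net(n_in, m, n_out):
    # (W1, W2): hidden and output weights of a three-layer MLP
    return rng.normal(0, 0.5, (m, n_in)), rng.normal(0, 0.5, (n_out, m))

def forward(net, X):
    W1, W2 = net
    return np.tanh(X @ W1.T) @ W2.T  # tanh hidden layer, linear output

def train(net, X, y, lr=0.1, epochs=4000):
    # plain batch gradient descent on mean squared error (stand-in for BP)
    W1, W2 = (w.copy() for w in net)
    for _ in range(epochs):
        H = np.tanh(X @ W1.T)
        err = H @ W2.T - y
        gW2 = err.T @ H / len(X)
        gW1 = ((err @ W2) * (1 - H ** 2)).T @ X / len(X)
        W2 -= lr * gW2
        W1 -= lr * gW1
    return (W1, W2), float(np.mean((forward((W1, W2), X) - y) ** 2))

def sensitivity(net, X_all):
    # mean output change over labeled AND unlabeled inputs when hidden
    # neuron j is removed (here: its outgoing weights zeroed)
    W1, W2 = net
    base = forward(net, X_all)
    s = np.empty(W1.shape[0])
    for j in range(W1.shape[0]):
        W2j = W2.copy()
        W2j[:, j] = 0.0
        s[j] = np.mean(np.linalg.norm(base - forward((W1, W2j), X_all), axis=1))
    return s

def prune(X_lab, y, X_unlab, m=8, e=1e-2):
    X_all = np.vstack([X_lab, X_unlab])
    net = init_net(X_lab.shape[1], m, y.shape[1])    # S101
    net, cost = train(net, X_lab, y)                 # S103
    assert cost < e, "initial m too small"
    while net[0].shape[0] > 1:
        j = int(np.argmin(sensitivity(net, X_all)))  # S105
        cand = (np.delete(net[0], j, axis=0),        # S107
                np.delete(net[1], j, axis=1))
        cand, cost = train(cand, X_lab, y)           # S109: warm start
        if cost >= e:
            break                                    # S111: keep last good net
        net = cand
    return net
```

Retraining each candidate from the surviving weights (a warm start) mirrors step S109's retraining "on the basis of the original parameters"; the loop returns the smallest network whose retraining still converged below e.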
The method of the invention provides a basis for pruning hidden neurons to determine the network structure, and improves the classification performance of the neural network classifier.
The invention also discloses an unlabeled-sample-based neural network construction device comprising an initialization module, a training module, a hidden-neuron selection module and an output module, connected in that order.
The working method of the above unlabeled-sample-based neural network construction device has the following concrete steps:
(1) the initialization module chooses a sufficiently large positive integer m as the number of hidden neurons, constructs a three-layer neural network, and initializes the network parameters;
(2) the training module trains the neural network with the labeled sample set until the cost function converges to a given small threshold e, obtaining a trained classifier;
(3) the hidden-neuron selection module computes the sensitivity of each hidden neuron using labeled and unlabeled samples, sorts the neurons by sensitivity in ascending order, removes the hidden neuron with the smallest sensitivity, and forms a network with a new structure; the new network is then retrained with the labeled samples; if the cost function converges to the small threshold e, the classifier with the updated parameters is obtained and this step is repeated; if the cost function cannot converge to the threshold e, the procedure moves to the next step;
(4) the output module takes the smallest network structure for which training still converges as the final network structure and outputs that network as the final classifier.
The unlabeled-sample-based neural network construction device of the invention effectively exploits the information contained in the unlabeled samples, and is therefore more accurate and convenient than previous methods in judging the importance of individual hidden neurons.
Beneficial effects: Compared with the prior art, the invention has the following advantages:
The unlabeled-sample-based neural network construction method of the invention provides a basis for pruning hidden neurons to determine the network structure, improving the classification performance of the neural network classifier.
The unlabeled-sample-based neural network construction device of the invention effectively exploits the information contained in the unlabeled samples, making it more accurate and convenient than previous methods in judging the importance of individual hidden neurons; at the same time, a neural network pruned by the working method of the device generalizes better.
Description of drawings
Fig. 1 is the structure diagram of a multi-layer perceptron (MLP) neural network.
Fig. 2 is the flow chart of the unlabeled-sample-based neural network construction method of the specific embodiment of the invention.
Embodiment
The invention is further illustrated below in conjunction with a specific embodiment. It should be understood that this embodiment serves only to illustrate the invention and not to limit its scope; after reading the invention, modifications of its various equivalent forms by those skilled in the art all fall within the scope defined by the appended claims.
Embodiment
The present embodiment illustrates the method of the invention taking a multilayer perceptron (MLP) network as an example; the invention is not restricted to MLP neural networks and can be applied to other feedforward neural networks.
The MLP is a fully connected feedforward neural network (as shown in Fig. 1) suitable for target classification. As Fig. 1 shows, it is a three-layer feedforward network: the input layer MA is composed of the input-pattern nodes, where x_i denotes the i-th component of the input pattern vector (i = 1, 2, …, n); the second layer is the hidden layer MB, composed of m nodes b_j (j = 1, 2, …, m); the third layer is the output layer MC, composed of p nodes c_k (k = 1, 2, …, p).
Before training, each element of the input vectors must be standardized; here each element is scaled to [-1, 1].
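The patent does not spell out the standardization formula; one common choice, shown here as an assumption, is a per-component min-max transform onto [-1, 1]:

```python
import numpy as np

def scale_to_pm1(X):
    # linearly map each input component (column) onto [-1, 1]
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    return 2.0 * (X - lo) / span - 1.0

X = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
Xs = scale_to_pm1(X)
# each column of Xs now runs from -1 to 1
```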
The above MLP neural network is trained with the standard BP (back-propagation) algorithm.
The sensitivity of the hidden neurons of the above neural network is defined below; the definition can easily be generalized to other feedforward neural networks.
After the neural network has been trained, the mapping it realizes is fixed. Let the mapping function be F(X), where X is the input vector. Suppose the j-th hidden neuron is to be removed, and let F_j(X) denote the mapping function of the network after the j-th hidden neuron has been removed. The sensitivity of the j-th hidden neuron is then defined as:

S_j(X) = E(‖F(X) − F_j(X)‖_2)    (1)

where ‖·‖_2 is the Euclidean-norm operator and E is the expectation operator.
From this definition it can be seen that the sensitivity in fact represents the difference between the function outputs with and without the j-th hidden neuron: the smaller this difference, the less important the j-th hidden neuron, and conversely the larger the difference, the more important it is.
To compute S_j(X), it can be rewritten as:

S_j(X) = ∫_Ω ‖F(X) − F_j(X)‖_2 p(X) dX    (2)

where Ω is the domain of definition and p(X) is the density function of X on Ω. In general p(X) is unknown, and because there are very few labeled samples, it is very difficult to estimate p(X) from the labeled samples alone. Unlabeled samples, however, are plentiful and also carry the distribution information of the data. Here we estimate p(X) using the labeled and unlabeled samples together, which greatly improves the estimation accuracy.
Suppose (X_1, y_1), (X_2, y_2), …, (X_b, y_b) is the labeled sample set and X_{b+1}, X_{b+2}, …, X_N is the unlabeled sample set. A good estimate of S_j(X) is then the sample mean over all N inputs:

S_j(X) ≈ (1/N) Σ_{i=1}^{N} ‖F(X_i) − F_j(X_i)‖_2    (3)

Formula (3) is used herein to compute the sensitivity of the hidden neurons.
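A sketch of the estimate in formula (3): the sensitivity of hidden neuron j is the average output change over the pooled labeled and unlabeled inputs. `F` and `F_j` below are hypothetical stand-ins for the trained network's mapping with and without neuron j:

```python
import numpy as np

def estimate_sensitivity(F, F_j, X_labeled, X_unlabeled):
    # formula (3): mean of ||F(X_i) - F_j(X_i)||_2 over all N pooled inputs;
    # the labels of the labeled samples are not needed for this estimate
    X = np.vstack([X_labeled, X_unlabeled])
    return float(np.mean(np.linalg.norm(F(X) - F_j(X), axis=1)))

# toy check: the two mappings differ by a constant offset of length 3
F = lambda X: np.hstack([X, np.zeros((len(X), 1))])
F_j = lambda X: np.hstack([X, 3.0 * np.ones((len(X), 1))])
s = estimate_sensitivity(F, F_j, np.zeros((4, 2)), np.ones((6, 2)))
# s == 3.0, since every per-sample difference vector is (0, 0, -3)
```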
Fig. 2 shows the flow chart of the unlabeled-sample-based neural network construction method of the invention.
In step S101, choose a sufficiently large positive integer m as the number of hidden neurons, construct a three-layer neural network, and initialize the network parameters.
In step S103, train the neural network with the labeled sample set until the cost function converges to a given small threshold e, obtaining a trained classifier.
In step S105, compute the sensitivity of each hidden neuron using both labeled and unlabeled samples, and sort the neurons by sensitivity in ascending order.
In step S107, remove the hidden neuron with the smallest sensitivity, obtaining a network with a new structure.
In step S109, retrain the new network with the labeled sample set, starting from the existing parameters. If the cost function converges to the small threshold e, take the classifier with the updated parameters and repeat steps S107 and S109; if it does not converge, proceed to the next step.
In step S111, take the smallest network structure for which training still converges as the final network structure; that network is the classifier finally output.
The device accompanying the method of the present embodiment is the unlabeled-sample-based neural network construction device described above, which comprises an initialization module, a training module, a hidden-neuron selection module and an output module, connected in that order.
The working method of the above unlabeled-sample-based neural network construction device has the following concrete steps:
(1) the initialization module chooses a sufficiently large positive integer m as the number of hidden neurons, constructs a three-layer neural network, and initializes the network parameters;
(2) the training module trains the neural network with the labeled sample set until the cost function converges to a given small threshold e, obtaining a trained classifier;
(3) the hidden-neuron selection module computes the sensitivity of each hidden neuron using labeled and unlabeled samples, sorts the neurons by sensitivity in ascending order, removes the hidden neuron with the smallest sensitivity, and forms a network with a new structure; the new network is then retrained with the labeled samples; if the cost function converges to the small threshold e, the classifier with the updated parameters is obtained and this step is repeated; if the cost function cannot converge to the threshold e, the procedure moves to the next step;
(4) the output module takes the smallest network structure for which training still converges as the final network structure and outputs that network as the final classifier.
Claims (3)
1. An unlabeled-sample-based neural network construction method, characterized by comprising the following steps: (S101) choosing a sufficiently large positive integer m as the number of hidden neurons, constructing a three-layer neural network, and initializing the network parameters;
(S103) training the neural network with the labeled sample set until the cost function converges to a given small threshold e, obtaining a trained classifier;
(S105) computing the sensitivity of each hidden neuron using both labeled and unlabeled samples, and sorting the neurons by sensitivity in ascending order;
(S107) removing the hidden neuron with the smallest sensitivity, obtaining a network with a new structure;
(S109) retraining the new network with the labeled sample set, starting from the existing parameters; if the cost function converges to the small threshold e, taking the classifier with the updated parameters and repeating steps (S107) and (S109); if it does not converge, proceeding to the next step;
(S111) taking the smallest network structure for which training still converges as the final network structure, that network being the classifier finally output.
2. An unlabeled-sample-based neural network construction device, characterized by comprising: an initialization module, a training module, a hidden-neuron selection module and an output module, connected in that order.
3. A working method of the unlabeled-sample-based neural network construction device according to claim 2, characterized in that the concrete steps are as follows:
(1) the initialization module chooses a sufficiently large positive integer m as the number of hidden neurons, constructs a three-layer neural network, and initializes the network parameters;
(2) the training module trains the neural network with the labeled sample set until the cost function converges to a given small threshold e, obtaining a trained classifier;
(3) the hidden-neuron selection module computes the sensitivity of each hidden neuron using labeled and unlabeled samples, sorts the neurons by sensitivity in ascending order, removes the hidden neuron with the smallest sensitivity, and forms a network with a new structure; the new network is then retrained with the labeled samples; if the cost function converges to the small threshold e, the classifier with the updated parameters is obtained and this step is repeated; if the cost function cannot converge to the threshold e, the procedure moves to the next step;
(4) the output module takes the smallest network structure for which training still converges as the final network structure and outputs that network as the final classifier.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2012105008602A CN102968663A (en) | 2012-11-29 | 2012-11-29 | Unmarked sample-based neural network constructing method and device
Publications (1)
Publication Number | Publication Date |
---|---|
CN102968663A true CN102968663A (en) | 2013-03-13 |
Family
ID=47798794
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2012105008602A Pending CN102968663A (en) | 2012-11-29 | 2012-11-29 | Unmarked sample-based neural network constructing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102968663A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1363005B1 (en) * | 2002-05-15 | 2009-07-15 | Caterpillar Inc. | Engine control system using a cascaded neural network |
CN102692456A (en) * | 2012-05-02 | 2012-09-26 | 江苏大学 | Method for identifying position of microcrack in forming metal drawing part |
Non-Patent Citations (1)
Title |
---|
SHENG Gaobin, YAO Minghai: "A selective ensemble algorithm based on semi-supervised regression", Computer Simulation (《计算机仿真》), vol. 26, 31 October 2009 |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103679211A (en) * | 2013-12-05 | 2014-03-26 | 河海大学 | Method and device for selecting characteristics based on neural network sensitivity |
CN105550748A (en) * | 2015-12-09 | 2016-05-04 | 四川长虹电器股份有限公司 | Method for constructing novel neural network based on hyperbolic tangent function |
CN105550747A (en) * | 2015-12-09 | 2016-05-04 | 四川长虹电器股份有限公司 | Sample training method for novel convolutional neural network |
CN107622274A (en) * | 2016-07-15 | 2018-01-23 | 北京市商汤科技开发有限公司 | Neural network training method, device and computer equipment for image procossing |
CN107622274B (en) * | 2016-07-15 | 2020-06-02 | 北京市商汤科技开发有限公司 | Neural network training method and device for image processing and computer equipment |
US11315018B2 (en) | 2016-10-21 | 2022-04-26 | Nvidia Corporation | Systems and methods for pruning neural networks for resource efficient inference |
CN108932550A (en) * | 2018-06-26 | 2018-12-04 | 湖北工业大学 | A kind of optimization method of intensive sparse-dense algorithm |
CN108932550B (en) * | 2018-06-26 | 2020-04-24 | 湖北工业大学 | Method for classifying images based on fuzzy dense sparse dense algorithm |
CN109214386A (en) * | 2018-09-14 | 2019-01-15 | 北京京东金融科技控股有限公司 | Method and apparatus for generating image recognition model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C12 | Rejection of a patent application after its publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20130313 |