CN105550747A - Sample training method for novel convolutional neural network - Google Patents


Info

Publication number
CN105550747A
Authority
CN
China
Prior art keywords
training
sample
learning rate
convolutional neural
weighted value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510903228.6A
Other languages
Chinese (zh)
Inventor
游萌 (You Meng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Changhong Electric Co Ltd
Original Assignee
Sichuan Changhong Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Changhong Electric Co Ltd filed Critical Sichuan Changhong Electric Co Ltd
Priority to CN201510903228.6A
Publication of CN105550747A
Legal status: Pending

Links

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Abstract

The invention relates to a neural network algorithm and aims to solve the problem that sample collection and training consume a large amount of time in the overall structure design and computation of existing neural networks. The invention provides a sample training method for a novel convolutional neural network. The method comprises the following steps: determining a certain number of sample sets as the benchmark data set for training, applying moderate distortion to the training weights, and setting the initial and final learning rates of training; then training the sample sets with a second-order back-propagation learning algorithm starting from the initial learning rate, and ending training when the learning rate reaches the final learning rate. The sample training method of the invention is applicable to the sample training of neural network models.

Description

A sample training method for a novel convolutional neural network
Technical field
The present invention relates to neural network algorithms, and in particular to a sample training method for a novel convolutional neural network.
Background technology
Convolutional neural networks are an important research field in computer vision and pattern recognition: information processing systems in which a computer, taking inspiration from the thinking of the biological brain, handles particular objects in a human-like way. They are widely applied, and fast, accurate object detection and recognition is an important component of modern information processing. Because the amount of information has grown sharply in recent years, there is an urgent need for suitable detection and recognition technology that lets people find the information they need within a mass of data. Image retrieval and text recognition both fall into this category, and detection and recognition of text is a precondition for information retrieval. Detection and recognition technology is an important component of computer vision and human-computer interaction.
Convolutional neural networks are an algorithm model that has recently been widely applied in fields such as pattern recognition and computer vision, and as more algorithms are tested against real data, higher demands are placed on the generalization ability of applications. With respect to generalization in particular, a neural network consumes a large amount of time on sample collection and training over the course of a whole project, and in practical applications the effect of training depends heavily on the sample set. In a practical neural network training model, the adjustment of the connection weights follows external feedback from the output nodes: the weights are adjusted so that the actual outputs of the output nodes agree with the externally desired outputs. Compared with the overall framework of a convolutional neural network, its activation functions and its topology, the collection of samples and the choice of training regime are easily neglected; yet almost every trainable neural network has a training set that contains erroneous samples, and such factually wrong samples are imported into the training process as if they were correct values, without being corrected or distinguished by the network. As far as training-set samples are concerned, the real problem is how to make the neural network perform classification and recognition in the real world, rather than merely validate against a portion of an experimental data set.
Summary of the invention
The object of the present invention is to provide a fast sample training method for neural networks, so as to solve the problem that existing neural networks consume a large amount of time on sample collection and training over the course of a whole project.
To achieve the above object, the present invention provides a sample training method for a novel convolutional neural network, comprising the following steps:
determining a certain number of sample sets as the benchmark data set for training, applying moderate distortion to the training weights, and setting the initial and final learning rates of training;
training the sample sets with a second-order back-propagation learning algorithm starting from the initial learning rate, and ending training when the learning rate reaches the final learning rate.
Specifically, errors arise during the second-order back-propagation algorithm; the partial derivative of the error equals the difference between the actual output value of the following network layer and its target output value, and this difference is used to update the weight values.
Specifically, the method of updating a weight value is: the new weight value equals the weight value before the update minus the product of the learning rate and said difference.
Specifically, the initial learning rate is 0.001 and the final learning rate is 0.00005.
Optionally, the methods of distorting the training weights include scaling, reversal, and elastic deformation.
Specifically, the weight values of the network's neuron nodes are checked; when a weight value is much larger or much smaller than the range covered by the learning-rate decline, that weight value is discarded.
The beneficial effects of the invention are as follows: the method of the present invention improves the training effect of a convolutional neural network to the greatest possible extent without occupying large computational resources, by tuning parameters such as the learning rate and the number of iterations in the training algorithm and by keeping a sufficient number of training and testing samples, so that subsequent experiments and simulations can carry out classification and recognition of handwritten digits. The convolutional neural network model optimized on distorted character-set samples achieves high-accuracy recognition under incomplete sampling or in high-noise environments with disturbed samples. With generalization markedly improved, the method also broadens the applicable range of pattern recognition and computer vision for target detection and object recognition. The optimization of network learning, training and sample collection based on this novel convolutional neural network raises the basic engineering level of intelligent household appliance products, improving their intelligence and generalization in visual interaction so as to obtain a better user experience in actual product use.
Embodiment
The technical scheme of the present invention is described in further detail below.
The present invention solves the problem that existing neural networks consume a large amount of time on sample collection and training over the course of a whole project, and provides a sample training method for a novel convolutional neural network, comprising the following steps:
determining a certain number of sample sets as the benchmark data set for training, applying moderate distortion to the training weights, and setting the initial and final learning rates of training;
training the sample sets with a second-order back-propagation learning algorithm starting from the initial learning rate, and ending training when the learning rate reaches the final learning rate.
Errors arise during the second-order back-propagation algorithm; the partial derivative of the error equals the difference between the actual output value of the following network layer and its target output value, and this difference is used to update the weight values. The method of updating a weight value is: the new weight value equals the weight value before the update minus the product of the learning rate and said difference. The methods of distorting the training weights include scaling, reversal, and elastic deformation.
The initial learning rate is 0.001 and the final learning rate is 0.00005. The weight values of the network's neuron nodes are checked; when a weight value is much larger or much smaller than the range covered by the learning-rate decline, that weight value is discarded.
The technical solution adopted by the present invention is as follows:
First, a benchmark data set of 10,000 patterns is used for training. The back-propagation algorithm processes them continuously in a random sequence, and each epoch over the 10,000 patterns is simply called one iteration. The aim of training is to improve the generalization performance of the neural network. In the scheme set forth in this patent, the training algorithm lets the weights jump within a larger neighbourhood, avoiding to the greatest extent the weight characteristics contributed by erroneous samples. For the input training-set patterns in particular, no matter how carefully the data are collected and how much preliminary work is redone at the cost of time and manpower, the samples will never match the attribute characteristics of the real world exactly. The single most important technique for improving generalization is therefore to change the training weights moderately: each training pattern is slightly distorted before the network's back-propagation pass, and every iteration of the network recomputes the weights. Seen from another angle, this amplifies the training set by distortion: through repeated iterative feedback, new training patterns are created artificially from existing ones, and each suitably distorted pattern is then back-propagated again at random. Three distortion types are applied: scaling, reversal, and elastic deformation.
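In handwriting-recognition practice, distortions of this kind are applied to the input patterns before each back-propagation pass, which matches the description above of creating new training patterns artificially from existing ones. Below is a minimal sketch under that reading; the function name, the NumPy/SciPy usage, and all parameter values are illustrative assumptions, not taken from the patent:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def distort_pattern(img, max_scale=0.15, flip_prob=0.5,
                    alpha=8.0, sigma=4.0, rng=None):
    """Apply the three distortion types named in the patent (scaling,
    reversal, elastic deformation) to one training pattern.
    All parameter values here are illustrative assumptions."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = img.shape

    # Scaling: resample the image on a grid zoomed by a random factor.
    scale = 1.0 + rng.uniform(-max_scale, max_scale)
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    ys = (ys - h / 2) / scale + h / 2
    xs = (xs - w / 2) / scale + w / 2

    # Elastic deformation (Simard-style): smooth random displacement
    # fields; alpha controls strength, sigma controls smoothness.
    dy = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    dx = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    out = map_coordinates(img, [ys + dy, xs + dx], order=1, mode='constant')

    # Reversal: random horizontal flip.
    if rng.random() < flip_prob:
        out = out[:, ::-1]
    return out
```

Each call yields a new artificial variant of a pattern, so repeated iterations effectively enlarge the 10,000-pattern benchmark set as described above.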
Second, after suitably changing the training weights, other parameters must be set to further support the training results; the core item is the setting of the learning rate. The training process must specify an initial learning rate and a final learning rate. In general the initial value should be on the larger side, because a large learning rate produces large changes in the repeated feedback computations. During training the learning rate descends slowly along the gradient, and owing to the learning mechanism of the neural network this makes the weights converge to their final values. The learning rate must never be allowed to fall too far, however: below a certain value the network stops training. That is precisely why both an initial and a final learning rate are specified. For the initial value of the learning rate, this patent suggests using no more than about 0.001: a larger value changes the weights too quickly and in practice causes them to diverge rather than converge. Most neural network training treats the learning rate as a function of the iteration count. Our program takes the current learning rate as the first factor, letting it decline continuously during computation, and takes a limit on the number of iterations allowed within the available time as the second. Concretely, our training system starts with a learning rate of 0.001, which is comparatively large; the learning rate then decays to 79% of its value after every two iterations, and the final learning rate is set to 0.00005, so at this rate of decline the target value is reached, and training ends, within about 25-27 iterations.
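The schedule just described pins down the iteration count by simple arithmetic: starting at 0.001 and multiplying by 0.79 after every two iterations reaches 0.00005 after 13 decay steps, i.e. about 26 iterations, consistent with the 25-27 stated above. A small sketch of that computation (the variable names are ours, not the patent's):

```python
import math

initial_lr, final_lr, decay, period = 1e-3, 5e-5, 0.79, 2

# Number of decay steps k needed so that initial_lr * decay**k <= final_lr:
k = math.ceil(math.log(final_lr / initial_lr) / math.log(decay))
print(k, k * period)          # -> 13 decay steps, i.e. 26 iterations

# Per-iteration schedule that a training loop would follow:
lr, schedule = initial_lr, []
for it in range(1, k * period + 1):
    schedule.append(lr)
    if it % period == 0:
        lr = max(lr * decay, final_lr)   # decay to 79% every two iterations
```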
Third, to make the training process run faster, a second-order back-propagation learning algorithm is used for training. The second-order method matters most for accelerating the convergence of the neural network, because it greatly reduces the number of passes required for the weights to converge; second-order back-propagation therefore reduces the total time needed for training.
In the back-propagation process, the partial derivative of the error equals the actual output value of the following network layer minus its target output value; this result is the quantity used to update the weight change, and the error signal of a hidden-layer neuron is determined recursively from the error signals of all hidden-layer neurons directly connected to it. To speed convergence, the following computation formula is proposed:

$$\frac{\partial E_n^p}{\partial w_n^{ij}} = x_{n-1}^j \cdot \frac{\partial E_n^p}{\partial y_n^i} \quad \text{(Formula 1)}$$
In the formula, n denotes a node of the n-th network layer. The tanh activation function is involved here, and tanh was selected precisely because its derivative is easy to obtain. This yields the change values of the weights at the current network layer. For the error of the preceding layer, the following formula is then used:

$$\frac{\partial E_{n-1}^p}{\partial x_{n-1}^k} = \sum_i w_n^{ik} \cdot \frac{\partial E_n^p}{\partial y_n^i} \quad \text{(Formula 2)}$$
The value obtained from Formula 1 enters Formula 2, whose value serves as the initial error value when computing the preceding layer; at the same time it tells us how to compute the weight change sought at the current layer, taking the error propagated through the whole network and the learning rate as the basis of the computation. Concretely, the weight values are updated with the formula below:

$$(w_n^{ij})_{\text{new}} = (w_n^{ij})_{\text{old}} - \eta \cdot \frac{\partial E_n^p}{\partial w_n^{ij}} \quad \text{(Formula 3)}$$

where η is the learning rate, normally a very small number that decreases gradually during training, with the final learning rate set to 0.00005. It is Formula 3 that changes the values of the weights. With the weights held constant, the learning-rate setting can be tuned by trial and error; briefly, the learning rate η can be scaled by a constant factor.
Finally, to further speed up convergence, we adopt a strategy of discarding a small number of weights; Formula 2 can then be solved directly. If verification of the network's neuron nodes finds that the weight a given thread is using is an erroneous value, typically much larger or much smaller than the parameter range of the learning-rate decline, the situation can reasonably be ignored: propagation of the currently wrong weight is skipped, which also avoids repeated computation and time overhead. The smaller the error at each layer, the more reasonable it is to ignore such invalid values, as one strategy for improving the training results of the neural network. For a given sample set, some training networks exhibit only very small errors; for example, the digit and character sample transformations used in this patent carry only very small acquisition-phase errors, and with a data collection of 60,000 test samples a few mistakes are insignificant and make no difference to the final result of the whole training process. With few erroneous samples the error is very small, and the weights will not change much in response. According to actual tests, in a single-threaded neural computation our training algorithm processes 12-15 sample patterns (varying slightly with the performance of the specific computer hardware); because the well-performing algorithm's running time is comparatively short, only 12-18 hours are needed for a computation volume of nearly 1,000,000 samples, so the error back-propagation of a few mistakes has an extremely limited effect on the weights. With the small-weight discard strategy added, the effect can be regarded as negligible. Thus, without affecting the fast convergence of the network at all, the training precision of the algorithm is further guaranteed.
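One reading of this discard strategy is a guard inside the Formula 3 update that skips any weight whose pending step is wildly out of scale with the learning-rate decline. A minimal sketch under that assumption; the threshold is illustrative, not a value given in the patent:

```python
import numpy as np

def guarded_update(w, dE_dw, eta, tol=100.0):
    """Formula 3 with the discard strategy: an update step wildly out of
    scale with the learning-rate decline is treated as coming from an
    erroneous weight and skipped. `tol` is an illustrative threshold,
    not a value from the patent."""
    step = eta * dE_dw
    ok = np.abs(step) < tol * eta      # within a plausible decline range
    return np.where(ok, w - step, w)   # skipped weights stay unchanged
```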
Regarding the configuration of the training parameters and the optimal performance of the training sample set under test, this section briefly discloses the research experience gained while constructing the neural network of this patent. The evaluation standard is the squared error over the training set. The purpose of the first experiment was to determine whether the training process needs distortion at all: many network models iterate with no weight distortion or other artificial intervention, whereas this patent proposes, and seeks protection for, such a method. With an actual test-set size of 10,000 samples, about 140 were misidentified, an error rate of 1.40%. Although the error rate is low, room for optimization remains, which indicates the need for distorted weights; without them the convolutional neural network cannot generalize its learning behaviour effectively. Once the need for distortion is established, the network's training and generalization ability must be improved, and the focus of testing turns to optimizing the configuration of the distortion applied during training. Too much distortion prevents the network from reaching a stable point for its weights: with large weight distortion, the squared error on the training set is still high at the end of each iteration and no suitable stationary value is reached. On the other hand, too little weight distortion prevents the neural network from generalizing to the test set. In short, the distortion parameters are a trade-off between performance and stability.
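The evaluation criterion used here, squared error on the training set together with the error rate on the test set, is straightforward to reproduce when comparing distortion configurations; a small sketch with illustrative names (e.g. 140 wrong out of 10,000 gives the 1.40% quoted above):

```python
import numpy as np

def evaluate(train_outputs, train_targets, test_predictions, test_labels):
    """Score one distortion configuration the way this section describes:
    squared error on the training set, error rate on the test set."""
    train_mse = float(np.mean((train_outputs - train_targets) ** 2))
    test_error = float(np.mean(test_predictions != test_labels))
    return train_mse, test_error   # e.g. 140/10000 wrong -> 0.0140
```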

Claims (6)

1. A sample training method for a novel convolutional neural network, characterized by comprising the following steps:
determining a certain number of sample sets as the benchmark data set for training, applying moderate distortion to the training weights, and setting the initial and final learning rates of training;
training the sample sets with a second-order back-propagation learning algorithm starting from the initial learning rate, and ending training when the learning rate reaches the final learning rate.
2. The sample training method for a novel convolutional neural network according to claim 1, characterized in that errors arise during the second-order back-propagation algorithm; the partial derivative of the error equals the difference between the actual output value of the following network layer and its target output value, and this difference is used to update the weight values.
3. The sample training method for a novel convolutional neural network according to claim 2, characterized in that the method of updating a weight value is: the new weight value equals the weight value before the update minus the product of the learning rate and said difference.
4. The sample training method for a novel convolutional neural network according to claim 1, 2 or 3, characterized in that the initial learning rate is 0.001 and the final learning rate is 0.00005.
5. The sample training method for a novel convolutional neural network according to claim 4, characterized in that the methods of distorting the training weights include scaling, reversal, and elastic deformation.
6. The sample training method for a novel convolutional neural network according to claim 5, characterized in that the weight values of the network's neuron nodes are checked, and when a weight value is much larger or much smaller than the range covered by the learning-rate decline, that weight value is discarded.
CN201510903228.6A 2015-12-09 2015-12-09 Sample training method for novel convolutional neural network Pending CN105550747A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510903228.6A CN105550747A (en) 2015-12-09 2015-12-09 Sample training method for novel convolutional neural network


Publications (1)

Publication Number Publication Date
CN105550747A (en) 2016-05-04

Family

ID=55829928

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510903228.6A Pending CN105550747A (en) 2015-12-09 2015-12-09 Sample training method for novel convolutional neural network

Country Status (1)

Country Link
CN (1) CN105550747A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2014049118A (en) * 2012-08-31 2014-03-17 Fujitsu Ltd Convolution neural network classifier system, training method for the same, classifying method, and usage
CN102968663A (en) * 2012-11-29 2013-03-13 河海大学 Unmarked sample-based neutral network constructing method and device
CN104794527A (en) * 2014-01-20 2015-07-22 富士通株式会社 Method and equipment for constructing classification model based on convolutional neural network
CN104077595A (en) * 2014-06-15 2014-10-01 北京工业大学 Deep belief network image recognition method based on Bayesian regularization

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
康明 (Kang Ming): "Research on Handwritten Digit Recognition Technology" [手写体数字识别技术研究], China Master's Theses Full-text Database, Information Science and Technology Series *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109754078A (en) * 2017-11-03 2019-05-14 三星电子株式会社 Method for optimization neural network
CN108154239A (en) * 2017-12-27 2018-06-12 郑州云海信息技术有限公司 A kind of machine learning method and its device
CN108764489A (en) * 2018-06-05 2018-11-06 北京百度网讯科技有限公司 Model training method based on virtual sample and equipment
CN109670577A (en) * 2018-12-14 2019-04-23 北京字节跳动网络技术有限公司 Model generating method and device
CN109919931A (en) * 2019-03-08 2019-06-21 数坤(北京)网络科技有限公司 Coronary stenosis degree evaluation model training method and evaluation system
CN112115974A (en) * 2020-08-18 2020-12-22 郑州睿如信息技术有限公司 Intelligent visual detection method for classification treatment of municipal waste
CN112115974B (en) * 2020-08-18 2024-04-09 郑州睿如信息技术有限公司 Intelligent visual detection method for urban garbage classification treatment

Similar Documents

Publication Publication Date Title
CN105550747A (en) Sample training method for novel convolutional neural network
JP6275868B2 (en) Neural watchdog
JP6092477B2 (en) An automated method for correcting neural dynamics
US9558442B2 (en) Monitoring neural networks with shadow networks
JP2017509951A (en) Construct a sparse neural network
KR20160123309A (en) Event-based inference and learning for stochastic spiking bayesian networks
JP2017509982A (en) In-situ neural network coprocessing
CN104636801A (en) Transmission line audible noise prediction method based on BP neural network optimization
CN104636985A (en) Method for predicting radio disturbance of electric transmission line by using improved BP (back propagation) neural network
KR20160084401A (en) Implementing synaptic learning using replay in spiking neural networks
CN106796667B (en) Dynamic spatial target selection
KR20160076531A (en) Evaluation of a system including separable sub-systems over a multidimensional range
JP2017519268A (en) Modulating plasticity by global scalar values in spiking neural networks
CN108460462A (en) A kind of Interval neural networks learning method based on interval parameter optimization
JP6219509B2 (en) Assigning and examining synaptic delays dynamically
KR101825937B1 (en) Plastic synapse management
Zhao Research and application on BP neural network algorithm
KR101825933B1 (en) Phase-coding for coordinate transformation
KR20160123312A (en) Auditory source separation in a spiking neural network
JP2017513110A (en) Contextual real-time feedback for neuromorphic model development
US9342782B2 (en) Stochastic delay plasticity
Zhang et al. A deep reinforcement learning based human behavior prediction approach in smart home environments
JP2017509956A (en) Method for converting values to spikes
De Aguiar et al. Using reservoir computing for forecasting of wind power generated by a wind farm
JP2017509979A (en) Unbalanced crossing suppression mechanism for spatial target selection

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160504

RJ01 Rejection of invention patent application after publication