CN107145893A

CN107145893A - Image recognition algorithm and system based on a convolutional deep network

Info

Publication number: CN107145893A
Application number: CN201710144957.7A
Authority: CN
Inventors: Ding Shifei (丁世飞); Zhang Jian (张健)
Original assignee: China University of Mining and Technology CUMT
Current assignee: China University of Mining and Technology CUMT
Priority date: 2017-03-13
Filing date: 2017-03-13
Publication date: 2017-09-08

Abstract

The present invention provides an image recognition algorithm (the CCDBN model) and system based on a convolutional deep network. Labeled pictures are assembled into a sample set on which the CCDBN is trained, and the trained neural network is saved. A picture to be recognized is then fed in as input, and the recognition result is read from the output vector. Because the CCDBN takes the picture directly as network input, manual feature extraction is avoided and recognition accuracy is higher; once trained, the network can be reused repeatedly, so processing is efficient.

Description

An image recognition algorithm and system based on a convolutional deep network

Technical field

The present invention relates to the field of pattern recognition and machine learning, and in particular to concepts and formulas such as the deep belief network (DBN), the convolutional neural network (CNN), the restricted Boltzmann machine (RBM), the supplementary factor, and the conditional Gaussian distribution, as well as the C++ programming language.

Background art

Machine learning is the discipline that studies how to make computers simulate human learning behavior. By learning strategy, it can be divided into rote learning, analogical learning, deductive learning, explanation-based learning, inductive learning, neural-network-based learning, and so on. The focus here is the artificial neural network (ANN). A neural network is a network structure for parallel, distributed information processing, with strong nonlinear mapping ability and high fault tolerance. ANNs can be traced back to 1943, when the neuropsychologist McCulloch and the mathematician Pitts proposed a neuron model (the M-P model) from the standpoint of mathematical logic; ANNs have developed ever since. Many ANN models are in common use today, and by structure they fall into three basic types: single-layer feedforward networks, multilayer feedforward networks, and recurrent networks.

As multilayer neural networks, deep learning models can approximate complex nonlinear mappings directly from input samples and are therefore widely used in many fields. Common models include convolutional neural networks (CNN), stacked autoencoders (SAE), deep belief networks (DBN), and deep Boltzmann machines (DBM). The convolutional neural network was designed specifically for processing two-dimensional data and is regarded as the first deep learning method to use a multilayer network framework. It has achieved great success in image recognition in recent years, and many deep learning frameworks and systems have grown up around CNNs; typical frameworks include Caffe and TensorFlow. The convolution operation can also be regarded as a regularization method: because it uses local connections and weight sharing, it greatly reduces the number of network parameters while preserving the deep structure, so the model generalizes well and is easier to train. For this reason, CNNs have also been applied in other fields such as speech, and in cross-disciplinary areas such as deep reinforcement learning and deep transfer learning.

In classical pattern recognition, features are usually extracted in advance. After many features have been extracted, correlation analysis is performed on them. This extraction depends heavily on human experience and subjective judgment, and differences in the extracted features strongly affect classification performance. The quality of image preprocessing also affects the extracted features. Deep learning algorithms, by contrast, need no complicated preprocessing: they take the image directly as input and learn features from large amounts of data, avoiding explicit feature extraction and proving more reliable than traditional hand-selected features.

The restricted Boltzmann machine (RBM) is a typical unsupervised learning model derived from the undirected graphical models of probabilistic graphical modeling. Training RBMs and stacking them layer by layer yields a typical deep model structure, the deep belief network (DBN). The RBM, however, is a binary model. To model real-valued images, a conditional Gaussian distribution is introduced into the RBM, and from the conditional Gaussian distribution and the form of the energy function we obtain the following activation function:

where v denotes the visible-layer units, c denotes the supplementary factor, and h denotes the hidden-layer units. The resulting hidden-layer nodes serve as the input to the next layer of the network, on which a further RBM is built and pre-training continues.

Summary of the invention

To better solve the image recognition problem, the present invention proposes an image recognition algorithm and system based on a convolutional deep network. It avoids explicit feature extraction and takes the digitized image pixels directly as input. Training yields a convolution-based supplementary DBN model (CCDBN), to which an associative memory layer is then added to produce the recognition result, effectively realizing both recognition and reconstruction of images.

The present invention is achieved by the following scheme:

The present invention relates to an image recognition algorithm based on a convolutional deep network. A labeled training set is built as the sample set, on which the convolutional neural network is pre-trained and trained; the trained network then processes the picture to be recognized, and the recognition result is determined from the network's output vector.

The specific steps of the present invention are as follows:

Step 1, perform simple preprocessing on the training set and use the pixels as input: first divide the image data set into mini-batches of 100 samples each, then normalize the images and resize them to 32×32;
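Step 1 can be sketched in plain Python as follows. This is a minimal sketch under assumptions not stated in the patent: images are grayscale nested lists with pixel values in 0..255, resizing uses nearest-neighbour sampling, and the helper names `normalize`, `resize_nearest`, and `make_batches` are hypothetical.

```python
from typing import List, Sequence

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def normalize(image: Sequence[Sequence[float]], max_value: float = 255.0) -> Image:
    """Scale pixel values from [0, max_value] to [0, 1] (assumed normalization)."""
    return [[pixel / max_value for pixel in row] for row in image]


def resize_nearest(image: Sequence[Sequence[float]], size: int = 32) -> Image:
    """Resize to size x size using nearest-neighbour sampling (assumed method)."""
    height, width = len(image), len(image[0])
    return [
        [image[r * height // size][c * width // size] for c in range(size)]
        for r in range(size)
    ]


def make_batches(samples: Sequence, batch_size: int = 100) -> List[list]:
    """Split the samples into consecutive mini-batches of at most batch_size."""
    return [list(samples[i:i + batch_size]) for i in range(0, len(samples), batch_size)]


# Usage: 210 synthetic 48x48 images are batched first, then each image is
# normalized and resized to 32x32, following the order given in Step 1.
raw_images = [
    [[float((r * c + k) % 256) for c in range(48)] for r in range(48)]
    for k in range(210)
]
batches = [
    [resize_nearest(normalize(image)) for image in batch]
    for batch in make_batches(raw_images)
]
```

The last batch is simply shorter when the data set size is not a multiple of 100; whether the original implementation pads or drops it is not stated.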

Step 2, construct the network structure of the deep learning model: the network comprises an input layer, 3 convolutional layers, 3 pooling layers, 1 fully connected layer, and 1 output layer, wherein the input layer receives the pixels of the two-dimensional image, the first convolutional layer has 24 convolutional feature maps, the second convolutional layer has 48 convolutional feature maps, and the output layer has 10 nodes. The output is also the system's prediction;
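To make the layer stack of Step 2 concrete, the sketch below tracks the feature-map count and spatial size through the three convolution and pooling stages. Several values are assumptions, since the text does not give them: 3×3 kernels with "valid" convolution and stride 1, 2×2 non-overlapping pooling, and the feature-map count of the third convolutional layer (`THIRD_CONV_MAPS` is purely hypothetical).

```python
from dataclasses import dataclass
from typing import List, Tuple

THIRD_CONV_MAPS = 96  # hypothetical: the text gives no count for the third layer


@dataclass
class Layer:
    kind: str      # "conv" or "pool"
    size: int      # kernel size (conv) or window size (pool); assumed values
    maps: int = 0  # number of feature maps; used by conv layers only


def output_shapes(input_size: int, layers: List[Layer]) -> List[Tuple[str, int, int]]:
    """Return (kind, feature maps, spatial size) after each layer.

    Assumes 'valid' convolution with stride 1 and non-overlapping pooling.
    """
    shapes = []
    size, maps = input_size, 1
    for layer in layers:
        if layer.kind == "conv":
            size = size - layer.size + 1
            maps = layer.maps
        elif layer.kind == "pool":
            size = size // layer.size
        else:
            raise ValueError(f"unknown layer kind: {layer.kind}")
        shapes.append((layer.kind, maps, size))
    return shapes


LAYERS = [
    Layer("conv", 3, 24), Layer("pool", 2),               # 24 maps, as in Step 2
    Layer("conv", 3, 48), Layer("pool", 2),               # 48 maps, as in Step 2
    Layer("conv", 3, THIRD_CONV_MAPS), Layer("pool", 2),  # assumed count
]
shapes = output_shapes(32, LAYERS)
# Number of inputs to the fully connected layer that precedes the 10 output nodes.
flattened = shapes[-1][1] * shapes[-1][2] ** 2
```

Under these assumed sizes a 32×32 input shrinks to 2×2 after the third pooling layer; different kernel sizes or padding would change these numbers.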

Step 3, train the deep neural network: first initialize the weights, convolution kernels, and biases, then carry out pre-training. To begin, the RBM model is improved by introducing a conditional Gaussian distribution, giving a restricted Boltzmann machine based on the supplementary factor (CRBM); the convolution operation is then introduced to build a convolution-based CRBM (CCRBM), whose network topology is shown in Fig. 1. Next, the CCRBM model is trained with the weight uncertainty method and the contrastive divergence algorithm to alleviate overfitting. A convolution-based supplementary deep belief network (CCDBN) is then built from the CCRBM models; its network topology is shown in Fig. 2. Pre-training is carried out on the image data; once it finishes, the CCDBN serves as the deep neural network model, which takes an input image and produces a result. Finally, the weights and biases are adjusted with the back-propagation (BP) algorithm. The detailed process is as follows:

Step 3.1, initialize the network: randomly initialize the weights, convolution kernels, and biases;

Step 3.2, pre-train the network: feed the training samples into the initialized network for pre-training. First improve the RBM model by introducing a conditional Gaussian distribution, giving a restricted Boltzmann machine based on the supplementary factor (CRBM), then introduce the convolution operation to build a convolution-based CRBM (CCRBM). Next, train the CCRBM model with the weight uncertainty method to alleviate overfitting. Finally, build a convolution-based supplementary deep belief network (CCDBN) from the CCRBM models and pre-train it on the image data;

Step 3.3, compare the actual output with the labels to obtain the error and, treating the CCDBN as a neural network, fine-tune it with the weight uncertainty BP algorithm to obtain the trained neural network model.
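The pre-training in Step 3 relies on contrastive divergence. As an illustration only, the sketch below runs one-step contrastive divergence (CD-1) on a plain binary RBM. It deliberately omits the conditional Gaussian visible units, the supplementary factor, the convolution, and the weight uncertainty method, so it is not the CCRBM itself; all function names and the toy data are hypothetical.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, b_h):
    """P(h_j = 1 | v) for every hidden unit j."""
    return [
        sigmoid(b_h[j] + sum(v[i] * W[i][j] for i in range(len(v))))
        for j in range(len(b_h))
    ]


def visible_probs(h, W, b_v):
    """P(v_i = 1 | h) for every visible unit i."""
    return [
        sigmoid(b_v[i] + sum(W[i][j] * h[j] for j in range(len(h))))
        for i in range(len(b_v))
    ]


def cd1_step(v0, W, b_v, b_h, lr, rng):
    """One CD-1 update in place; returns the squared reconstruction error."""
    h0 = hidden_probs(v0, W, b_h)                          # positive phase
    h0_sample = [1.0 if rng.random() < p else 0.0 for p in h0]
    v1 = visible_probs(h0_sample, W, b_v)                  # reconstruction
    h1 = hidden_probs(v1, W, b_h)                          # negative phase
    for i in range(len(b_v)):
        for j in range(len(b_h)):
            W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        b_v[i] += lr * (v0[i] - v1[i])
    for j in range(len(b_h)):
        b_h[j] += lr * (h0[j] - h1[j])
    return sum((a - b) ** 2 for a, b in zip(v0, v1))


# Usage on two toy binary patterns.
rng = random.Random(0)
n_visible, n_hidden = 6, 3
W = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_visible)]
b_v, b_h = [0.0] * n_visible, [0.0] * n_hidden
patterns = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]
errors = [cd1_step(p, W, b_v, b_h, 0.1, rng) for _ in range(200) for p in patterns]
```

In a layer-wise scheme, the hidden probabilities of one trained layer would become the visible input of the next, which is how the stacked model is built up before fine-tuning.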

Step 4, the image recognition module: obtain image data belonging to one of the CIFAR-10 classes, normalize it, and resize it to 32×32 pixels. Then feed it into the trained convolutional neural network to obtain the recognition result.

As the above shows, the present application provides an image recognition algorithm and system based on a convolutional deep network. The training set and labels are prepared first, and the network parameters, such as the number of layers, are designed according to actual needs. Pre-training follows, after which the network weights and biases are adjusted with the weight uncertainty BP algorithm. Finally, an image is preprocessed and fed into the neural network to complete its recognition. Because the application recognizes images with a neural network, explicit feature extraction is avoided and the picture serves directly as network input, so recognition accuracy is higher; once trained, the network can be reused repeatedly, so processing is efficient; and the training time is short.

Brief description of the drawings

To describe the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a schematic diagram of the CCRBM model used in this application.

Fig. 2 is a schematic diagram of the structure of the deep neural network used in this application.

Embodiment

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present application without creative effort fall within the scope of protection of the present application.

Embodiment 1

The present embodiment comprises the following steps:

Step 1, preprocess the pictures:

Step 1.1, normalize the handwritten digit images;

Step 1.2, resize the images obtained in step 1.1 to 32×32 and store them in the training set, then build the corresponding label set from the training set, in which each 10×1 matrix represents one digit label.
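A 10×1 label matrix of the kind described in step 1.2 is commonly a one-hot column vector. The sketch below assumes that encoding (the text does not name it explicitly), and `one_hot` is a hypothetical helper.

```python
from typing import List


def one_hot(label: int, num_classes: int = 10) -> List[List[float]]:
    """Return a num_classes x 1 column with a 1.0 in the row of the label."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label must be in 0..{num_classes - 1}, got {label}")
    return [[1.0 if row == label else 0.0] for row in range(num_classes)]


# Usage: build the label set for a small list of class indices.
label_set = [one_hot(label) for label in [3, 0, 9]]
```

With this layout, the row that holds the 1.0 is the class index, which matches reading the prediction from the row of the largest output value at recognition time.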

Step 2, build the CCDBN deep model:

The CCDBN model used in this embodiment is a multilayer neural network composed of an input layer, intermediate layers, and an output layer, each made up of multiple node units. The multilayer neural network is constructed as shown in Fig. 2. Because every layer is a probabilistic graphical model, the activation function of each hidden-layer unit, derived from the energy function, takes the form of a sigmoid function;

Step 3, train the convolutional neural network:

Step 3.1, initialize the trainable parameters with distinct small random numbers (between 0 and 1) and initialize the biases to 0;
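The initialization in step 3.1 can be sketched as below. The uniform distribution, the dense weight-matrix layout, and the name `init_parameters` are assumptions; the text only specifies random values between 0 and 1 for the trainable parameters and zeros for the biases.

```python
import random
from typing import List, Optional, Tuple


def init_parameters(
    n_inputs: int, n_outputs: int, upper: float = 1.0, seed: Optional[int] = None
) -> Tuple[List[List[float]], List[float]]:
    """Draw weights uniformly from [0, upper] and set every bias to 0."""
    rng = random.Random(seed)
    weights = [
        [rng.uniform(0.0, upper) for _ in range(n_outputs)]
        for _ in range(n_inputs)
    ]
    biases = [0.0] * n_outputs
    return weights, biases


# Usage: a 4-input, 3-output layer with a fixed seed for repeatability.
weights, biases = init_parameters(4, 3, seed=42)
```

Passing a smaller `upper` keeps the initial weights closer to zero, which is a common practical choice, though the text itself only states the 0 to 1 range.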

Step 3.2, pre-train the network. The activation probability of the network model can be expressed as follows:
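The expression itself is not reproduced in this text. As a hedged reconstruction from the symbols defined in this step and the sigmoid form stated in Step 2, the standard RBM activation probability would read as follows. The indices i and j are introduced here for illustration, and the patent's own expression may differ, for example by including the convolution or the supplementary factor c:

```latex
\[
P(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_i W_{ij}\, v_i\Big),
\qquad
\sigma(x) = \frac{1}{1 + e^{-x}}
\]
```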

where h denotes the hidden-layer units, v the visible-layer units, W the weight matrix, and b the bias. The weight uncertainty algorithm is then introduced, which changes the computation of the derivative to the following form:

The CCDBN pre-training process is completed according to the above formulas.
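The modified derivative is not reproduced in this text. As an illustration only, the sketch below assumes the weight uncertainty method follows the widely used Gaussian reparameterization, in which each weight is sampled as w = mu + log(1 + exp(rho)) * eps with eps drawn from a standard normal, and the gradient with respect to w is mapped back to mu and rho by the chain rule. It leaves out any prior or posterior regularization terms, and the function names are hypothetical.

```python
import math
import random
from typing import Tuple


def softplus(rho: float) -> float:
    """Standard deviation sigma = log(1 + exp(rho)), always positive."""
    return math.log1p(math.exp(rho))


def sample_weight(mu: float, rho: float, rng: random.Random) -> Tuple[float, float]:
    """Sample w = mu + sigma * eps and return (w, eps) for the backward pass."""
    eps = rng.gauss(0.0, 1.0)
    return mu + softplus(rho) * eps, eps


def grads_for_mu_rho(grad_w: float, rho: float, eps: float) -> Tuple[float, float]:
    """Chain rule: dw/dmu = 1 and dw/drho = eps / (1 + exp(-rho))."""
    d_sigma_d_rho = 1.0 / (1.0 + math.exp(-rho))
    return grad_w, grad_w * eps * d_sigma_d_rho


# Usage: sample one weight, then convert a gradient on w into gradients
# on the distribution parameters mu and rho.
rng = random.Random(0)
w, eps = sample_weight(mu=0.1, rho=-3.0, rng=rng)
grad_mu, grad_rho = grads_for_mu_rho(grad_w=0.5, rho=-3.0, eps=eps)
```

Here `grad_w` would come from the contrastive divergence or back-propagation gradient on the sampled weight; learning mu and rho rather than a single point value is what lets the method act as a regularizer.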

Step 3.3, compute the residuals and update the adjustable parameters and biases with the back-propagation (BP) algorithm, completing the entire training process of the CCDBN.
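The residual computation and update of step 3.3 can be illustrated on the output layer alone. The sketch assumes sigmoid output units with a squared-error loss, which the text does not specify, and `output_layer_step` is a hypothetical helper; a full implementation would propagate the residuals back through every layer.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def output_layer_step(
    x: List[float], W: List[List[float]], b: List[float],
    target: List[float], lr: float = 0.5,
) -> float:
    """One gradient step on a sigmoid output layer; returns the loss before it."""
    n_out = len(b)
    y = [sigmoid(b[k] + sum(W[k][i] * x[i] for i in range(len(x)))) for k in range(n_out)]
    # Residual of each output unit: error times the sigmoid derivative.
    delta = [(y[k] - target[k]) * y[k] * (1.0 - y[k]) for k in range(n_out)]
    for k in range(n_out):
        for i in range(len(x)):
            W[k][i] -= lr * delta[k] * x[i]
        b[k] -= lr * delta[k]
    return 0.5 * sum((y[k] - target[k]) ** 2 for k in range(n_out))


# Usage: two inputs, two output units, zero-initialized parameters.
x = [1.0, 0.5]
W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]
target = [1.0, 0.0]
losses = [output_layer_step(x, W, b, target) for _ in range(20)]
```

The recorded losses start at 0.25 (both outputs at 0.5) and fall as the weights and biases move toward the target.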

Step 4, the image recognition module:

Step 4.1, obtain image data belonging to one of the CIFAR-10 classes, normalize it, and resize it to 32×32 pixels;

Step 4.2, feed the preprocessed picture into the trained CCDBN network and wait for the output. The row index of the largest element of the output vector is the recognition result, which completes the recognition of the real-valued image.
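Reading the result in step 4.2 amounts to an argmax over the 10-element output vector. A minimal sketch follows, assuming the output rows are ordered like the standard CIFAR-10 class list; the function name and the example vector are hypothetical.

```python
from typing import List

# Standard CIFAR-10 class order.
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def predict_class(output: List[float]) -> int:
    """Return the row index of the largest element of the output vector."""
    return max(range(len(output)), key=lambda i: output[i])


# Usage with a made-up output vector.
output_vector = [0.02, 0.01, 0.70, 0.05, 0.04, 0.03, 0.06, 0.04, 0.03, 0.02]
index = predict_class(output_vector)
label = CIFAR10_CLASSES[index]
```

If several entries tie for the maximum, this helper returns the first of them.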

Claims

1. An image recognition algorithm and system based on a convolutional deep network, characterized in that labeled pictures (the CIFAR-10 data set) are assembled into a sample set on which a neural network is trained, the trained neural network is saved, a picture to be recognized is taken as input, and the recognition result is obtained from the output vector.

2. The method according to claim 1, characterized in that the deep neural network is a multilayer convolutional deep neural network comprising an input layer, 3 convolutional layers, 3 pooling layers, and 1 output layer, wherein the input layer receives the pixels of the two-dimensional image, the first convolutional layer has 24 convolutional feature maps, the second convolutional layer has 48 convolutional feature maps, and the output layer has 10 nodes.

3. The method according to claim 1, characterized in that the training comprises: feeding the sample set (images and labels) into the configured neural network, pre-training it with an improved restricted Boltzmann machine (RBM) algorithm, and then adjusting the parameters and biases of the resulting model with the back-propagation (BP) algorithm, completing the full training process of the neural network.

4. The method according to claim 1 or 3, characterized in that the training comprises:

4.1. Initializing the network first: randomly initializing the convolution kernels, the fully connected layer weights, and the biases;

4.2. Feeding the training samples and label set into the initialized network for pre-training: first improving the RBM model by introducing a conditional Gaussian distribution, giving a restricted Boltzmann machine based on the supplementary factor (CRBM), then introducing the convolution operation to build a convolution-based CRBM (CCRBM); next, training the CCRBM model with the weight uncertainty method to alleviate overfitting; finally, building a convolution-based supplementary deep belief network (CCDBN) from the CCRBM models. Pre-training is carried out on the image data; once it finishes, the CCDBN serves as the deep neural network model, which takes an input image and produces a result.

4.3. Comparing the actual output with the labels to obtain the error and, treating the CCDBN as a neural network, fine-tuning it with the weight uncertainty BP algorithm to obtain the trained neural network model.

5. The method according to claim 4, characterized in that the training samples comprise input two-dimensional images and labels. The process is divided into a pre-training process and a global weight fine-tuning process. Pre-training proceeds layer by layer and is unsupervised; during training, the input image is transformed layer by layer and emitted at the output layer, giving the actual output vector.

6. The method according to claim 1, characterized in that the recognition comprises: feeding the picture to be recognized into the trained CCDBN model and obtaining the output vector to identify the class to which the picture belongs.

7. The method according to claim 1 or 6, characterized in that the recognition comprises:

7.1. Obtaining input image data of the relevant type and preprocessing the picture by normalizing it and adjusting its pixel dimensions (32×32 pixels is most suitable);

7.2. Feeding the preprocessed picture (taken from the classes contained in the CIFAR-10 data set) into the trained CCDBN neural network and waiting for the output; the row index of the largest element of the output vector is the recognition result, which completes the recognition of the real-valued input image.

8. A system for implementing the method of any of the above claims, characterized by comprising a CCDBN neural network module and a natural image recognition module, wherein the CCDBN network module trains the CCDBN into a classifier that can recognize a limited set of natural images (the images to be recognized belong to the classes in the CIFAR-10 sample labels), and the image recognition module obtains an image and feeds it into the trained CCDBN network for recognition.