CN109934293A - Image recognition method, apparatus, medium, and confusion-aware convolutional neural network - Google Patents
- Publication number
- CN109934293A (application CN201910198639.8A)
- Authority
- CN
- China
- Prior art keywords
- classifier
- probability
- classification
- prediction
- convolutional neural
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Image Analysis (AREA)
Abstract
The embodiments of the invention disclose an image recognition method, apparatus, device, computer-readable storage medium, and confusion-aware convolutional neural network. The confusion-aware convolutional neural network comprises a prediction classifier, which is a conventional convolutional neural network classifier trained on a training sample set, a confusion perception model, a group of correction classifiers, and a probability averaging layer. The confusion perception model is constructed from a confusion matrix obtained by cross-validation of the prediction classifier on the training sample set. Each correction classifier uses the confusion perception model as its decision system and is trained on the easily confused class samples with fuzzy boundaries in the training sample set. The probability averaging layer outputs the classification result of the image to be recognized according to the class probability output by the prediction classifier and the class probability output by the target correction classifier, where the target correction classifier is the correction classifier selected by the confusion perception model according to the predicted class of the prediction classifier. The application is conducive to improving the accuracy of image recognition.
Description
Technical field
The embodiments of the present invention relate to the technical field of image classification and recognition, and in particular to an image recognition method, apparatus, device, computer-readable storage medium, and confusion-aware convolutional neural network.
Background art
With the rapid development of computer vision technology, the requirements for image classification and recognition are increasingly high. Before classification and recognition, preprocessing steps such as binarization and normalization are generally applied to the input image. Compared with conventional machine learning methods that rely on manual feature extraction followed by a classifier, methods that use convolutional neural networks to automatically extract image features and classify them are more accurate and efficient.
Convolutional neural networks (CNN) are a class of feedforward neural networks with deep structure that involve convolutional computation. They achieve state-of-the-art performance on various computer vision tasks, such as image classification, object detection, and semantic segmentation. Previous research has mainly focused on enhancing CNN components, such as pooling layers or activation units.
Most existing convolutional neural network classifiers for image classification adopt a flat structure, treating all classes as independent categories and ignoring their visual separability. In actual classification, some classes may be harder to distinguish than others and therefore require more specialized classifiers. For example, in the CIFAR-10 data set, it is easy to distinguish "cat" from "truck", but because fuzzy boundaries exist between some classes, it may be difficult to distinguish "cat" from "dog". A ResNet-18 classifier can achieve 94.63% accuracy on this data set; however, the misclassification ratio r between "cat" and "dog" reaches 21.04%, significantly higher than that of any other pair of classes.
Hierarchical classification in the broad sense refers to assigning objects to classes within a large class hierarchy. The objects being classified can be text objects, such as entries of Baidu Baike, or multimedia objects such as video, images, or music. Hierarchical classification can be performed manually, automatically by machine learning, or automatically with expert verification; for example, the related art describes a hierarchical classification problem with three attributes: the label path depth of the class hierarchy, the number of instances per category, and the instance description. In large-scale image recognition tasks, the relationships between classes are not simply flat. For example, the large-scale visual recognition challenge ImageNet contains images of 1000 classes; some of these classes are not strongly separable visually, and similar-class relations exist between others. Hierarchical classification solves the classification problem by embedding classifiers into two or more category levels, as shown in Fig. 1: the upper classifier produces a coarse classification result, and the lower classifier produces a fine classification result. In hierarchical classification, the hierarchical structure can be predefined or learned by a top-down or bottom-up method.
Although existing hierarchical classification methods can improve classification accuracy to a certain extent, they face the problem of error propagation; that is, a classification error made by a higher-level classifier propagates to the lower-level classifiers, eventually leading to unavoidable classification errors. Taking Fig. 1 as an example, if a picture of a cat is assigned to coarse class N in the first classification step, it cannot be classified correctly by the fine classifier.
Summary of the invention
The embodiments of the present disclosure provide an image recognition method, apparatus, device, computer-readable storage medium, and confusion-aware convolutional neural network, which overcome the drawbacks of the related art and help improve image classification accuracy.
In order to solve the above technical problems, the embodiments of the present invention provide the following technical solutions:
On the one hand, an embodiment of the present invention provides a confusion-aware convolutional neural network, including a prediction classifier, a confusion perception model, multiple correction classifiers, and a probability averaging layer.
The prediction classifier is a convolutional neural network classifier trained on a training sample set.
The confusion perception model is a model constructed from the confusion matrix corresponding to the prediction classifier, the confusion matrix being obtained by cross-validation on the training sample set.
Each correction classifier uses the confusion perception model as its decision system and is trained on the easily confused class samples with fuzzy boundaries in the training sample set.
The probability averaging layer outputs the classification result of the image to be recognized according to the class probability output by the prediction classifier and the class probability output by the target correction classifier, where the target correction classifier is the correction classifier selected by the confusion perception model according to the predicted class of the prediction classifier.
Optionally, obtaining the confusion matrix by cross-validation on the training sample set includes:
dividing the training sample set into multiple sub-training sets;
for each sub-training set, using the current sub-training set as the validation set and the remaining sub-training sets as the training set to train the prediction classifier, then testing the validation set with the trained prediction classifier to obtain misclassified images;
aggregating the misclassified images obtained when each sub-training set served as the validation set to construct the confusion matrix.
Optionally, each sub-training set contains the same number of training sample images.
Optionally, the probability averaging layer includes a normalization module and a probability calculation module.
The normalization module normalizes the class probabilities output by the prediction classifier and the class probabilities output by the target correction classifier.
The probability calculation module computes the class of the image to be recognized from the normalized class probabilities.
Optionally, the normalization module normalizes the class probabilities output by the prediction classifier and by the target correction classifier using the following formula:

σ(z)_j = exp(z_j) / Σ_{k=1}^{K} exp(z_k), j = 1, …, K

where z is a K-dimensional vector and σ(z)_j is the probability in the interval (0, 1) to which its j-th element z_j is mapped.
Optionally, the probability calculation module determines the class label of the image to be recognized using the following formula:

y* = argmax_j p(y = j | X)

where X is the image to be recognized, y is the class label of the image to be recognized, B_j is the class probability output by the prediction classifier, B̂_j is the class probability output by the target correction classifier, and C_k is the set of classes corresponding to the easily confused class samples, from which p(y = j | X) is formed.
On the other hand, an embodiment of the present invention provides an image recognition method, comprising:
inputting the image to be recognized into a pre-constructed confusion-aware convolutional neural network;
invoking the prediction classifier of the confusion-aware convolutional neural network to recognize the image, obtaining the predicted class and the first class probability of the image;
invoking the confusion perception model of the confusion-aware convolutional neural network to select a target correction classifier according to the predicted class;
recognizing the image with the target correction classifier to obtain the second class probability of the image;
using the probability averaging layer of the confusion-aware convolutional neural network to output the classification result of the image according to the first class probability and the second class probability.
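As an illustration only (not part of the claimed embodiments), the four steps above can be sketched in Python; the classifier stubs, score values, and the confusion-set mapping below are hypothetical placeholders, not the patent's implementation:

```python
import numpy as np

def softmax(z):
    # sigma(z)_j = exp(z_j) / sum_k exp(z_k), shifted by max(z) for stability
    e = np.exp(z - np.max(z))
    return e / e.sum()

def recognize(x, predict_clf, confusion_sets, correctors):
    """Predict-then-correct inference following the claimed steps.

    predict_clf(x)    -> raw scores over all classes (prediction classifier)
    confusion_sets[k] -> classes easily confused with predicted class k, if any
    correctors[k](x)  -> raw scores over the classes in confusion_sets[k]
    """
    b = softmax(predict_clf(x))          # first class probability
    k = int(np.argmax(b))                # predicted class C_prediction
    cset = confusion_sets.get(k)
    if cset is None:                     # no fuzzy boundary: keep the prediction
        return k
    b_hat = softmax(correctors[k](x))    # second class probability
    p = b.copy()
    for i, j in enumerate(sorted(cset)): # average the two probabilities
        p[j] = 0.5 * (b[j] + b_hat[i])   # over the confusion set only
    return int(np.argmax(p))
```

For instance, a prediction of "cat" with a near-tied "dog" score would be re-examined by the cat/dog corrector, whose output can overturn the initial prediction.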
An embodiment of the present invention also provides an image recognition apparatus, comprising:
an image input module, for inputting the image to be recognized into a pre-constructed confusion-aware convolutional neural network;
an image recognition module, for invoking the prediction classifier of the confusion-aware convolutional neural network to recognize the image and obtain its predicted class and first class probability; invoking the confusion perception model of the confusion-aware convolutional neural network to select a target correction classifier according to the predicted class; recognizing the image with the target correction classifier to obtain its second class probability; and using the probability averaging layer of the confusion-aware convolutional neural network to output the classification result of the image according to the first class probability and the second class probability.
An embodiment of the present invention also provides an image recognition device, including a processor, the processor being configured to implement the steps of any of the foregoing image recognition methods when executing a computer program stored in memory.
Finally, an embodiment of the present invention also provides a computer-readable storage medium on which an image recognition program is stored; when the image recognition program is executed by a processor, the steps of any of the foregoing image recognition methods are implemented.
The advantages of technical solution provided by the present application is, propose a kind of prediction correction layering obscures perception convolution mind
Through network structure, confusion matrix is obtained by cross validation, smeared out boundary is carried out by the sensor model of obscuring of confusion matrix construction
The identification of data category;Layered structure is corrected by using prediction, which can distinguish the class with smeared out boundary.Its
On the one hand CNN model is improved to the processing capacity of classification smeared out boundary, can effectively promote image classification accuracy, on the other hand
The Error propagation problems of hierarchical classification generation are in turn avoided, and because it uses more accurate estimation confusion matrix, institutes
Bigger promotion can be obtained on small-scale data set.
In addition, the embodiment of the present invention provides corresponding implementation method, device, equipment and meter also directed to image-recognizing method
Calculation machine readable storage medium storing program for executing, further such that described obscure perception convolutional neural networks with more practicability, the method, dress
Set, equipment and computer readable storage medium have the advantages that it is corresponding.
It should be understood that the above general description and the following detailed description are merely exemplary, this can not be limited
It is open.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or the related art more clearly, the drawings needed in the description of the embodiments or the related art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a schematic diagram of a two-level hierarchical classification structure provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of an embodiment of a confusion-aware convolutional neural network provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of constructing a confusion matrix by cross-validation provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of image samples of the Mnist data set provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of the Mnist ConvNets network structure provided by an embodiment of the present invention;
Fig. 6 is a schematic diagram of the LeNet-5 network structure provided by an embodiment of the present invention;
Fig. 7 is a schematic diagram of the ResNet-18 network structure provided by an embodiment of the present invention;
Fig. 8 is a schematic flowchart of an image recognition method provided by an embodiment of the present invention;
Fig. 9 is a schematic flowchart of an image recognition method of an illustrative example provided by an embodiment of the present invention;
Fig. 10 is a structural diagram of a specific embodiment of an image recognition apparatus provided by an embodiment of the present invention.
Detailed description of the embodiments
In order to enable those skilled in the art to better understand the solutions of the present invention, the present invention is described in further detail below with reference to the drawings and specific embodiments. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
The terms "first", "second", "third", "fourth", etc. in the description and claims of this application and in the above drawings are used to distinguish different objects, not to describe a specific order. Furthermore, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device containing a series of steps or units is not limited to the listed steps or units, but may include steps or units that are not listed.
A related-art hierarchical deep convolutional neural network (Hierarchical Deep Convolutional Neural Networks, HD-CNN) classifies with a coarse classifier and fine classifiers by combining a coarse-to-fine class hierarchy with a fine-tuning training strategy, and uses a probability averaging layer to compute a weighted average of the recognition results of the two levels of classifiers. Its accuracy on the ImageNet and CIFAR-100 data sets is considerably higher than that of the convolutional neural networks it builds on. However, it obtains the confusion matrix from a randomly sampled validation set, which is ineffective on small-scale data sets. That is, this technique is only suitable for application scenarios with large-scale training data, and its improvement on small-scale data sets is not obvious.
In image classification, clear boundaries between classes have a large impact on precise classification, but the boundaries between certain classes are more easily confused than those between others. Convolutional neural networks, as an advanced image recognition technology, have extremely powerful classification performance; however, common convolutional neural networks do not handle fuzzy boundaries well. For example, in the CIFAR-10 data set, a convolutional neural network can easily distinguish the two classes "cat" and "truck", but has difficulty distinguishing the two classes "cat" and "dog". Training ResNet-18, one of the best-performing convolutional neural networks in the image recognition field and based on the residual network (Residual Network, ResNet), on the CIFAR-10 database with data augmentation achieves 94.63% accuracy on the test set. Counting its misclassified images yields its confusion matrix, as shown in Table 1.
Table 1. Confusion matrix of the ResNet-18 model trained on CIFAR-10 (diagonal entries set to 0)
A confusion matrix is a special table used to visualize the performance of a classifier. Each of its rows corresponds to a class predicted by the classifier, and each column to the actual class of the image. Let a_ij be the number of misclassified pictures in the table, where i is the class predicted by the classifier and j is the actual class. Then the confusion matrix F over K classes can be expressed as:

F = (a_ij), i, j = 1, …, K;    (1)

The total number of misclassified pictures in Table 1 is 537, i.e., Σ_{i≠j} a_ij = 537. The ratio r_ij of the number a_ij of pictures with predicted class i and actual class j to all misclassified pictures is computed as:

r_ij = a_ij / Σ_{i≠j} a_ij;    (2)

The error ratio between any two classes, i.e., the ratio of pictures with predicted class i and actual class j, or predicted class j and actual class i, to all misclassified pictures, is r = r_ij + r_ji.
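As an illustration of equations (1) and (2), the error ratios can be computed directly from a confusion matrix; the 3-class matrix below is made-up data, not Table 1:

```python
import numpy as np

# Toy confusion matrix F = (a_ij): row i = predicted class, column j = actual
# class, diagonal set to 0 as in Table 1 (only misclassifications are counted).
F = np.array([[0, 5, 1],
              [7, 0, 2],
              [1, 1, 0]])

total_errors = F.sum()         # sum over i != j of a_ij (diagonal is zero)
R = F / total_errors           # r_ij from equation (2)
pair_r = R + R.T               # r = r_ij + r_ji, error ratio between two classes

print(total_errors)            # 17
print(round(pair_r[0, 1], 4))  # (5 + 7) / 17 = 0.7059
```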
From the confusion matrix in Table 1 it can clearly be seen that the number of misclassifications between "cat (Cat)" and "dog (Dog)" reaches 113, with an error ratio r = 21.04%, while r for any other pair of classes is at most 7.45%. It can be seen that although ResNet-18 has a very strong classification ability, it is not effective for some easily confused classes.
In view of this, to remedy the situation that the related art is ill-suited to small-scale application scenarios and recognizes easily confused classes poorly, the present application proposes a new convolutional neural network framework: the confusion-aware convolutional neural network. The architecture of the confusion-aware convolutional neural network looks similar to the traditional hierarchical classification shown in Fig. 1; however, there are essential differences between them. The confusion-aware convolutional neural network does not follow the coarse-to-fine idea of classification, but performs hierarchical classification with a prediction-correction recognition strategy, inspired by predictor-corrector numerical methods, which are frequently used to solve various mathematical and engineering problems.
Having described the technical solutions of the embodiments of the present invention, various non-limiting embodiments of the application are described in detail below.
Referring first to Fig. 2, Fig. 2 is a schematic structural diagram of a confusion-aware convolutional neural network in a specific embodiment provided by the embodiments of the present invention; the embodiment may include the following contents:
The confusion-aware convolutional neural network may include a prediction classifier 1, a confusion perception model 2, a correction classifier group 3, and a probability averaging layer 4.
The prediction classifier 1 is a flat classifier trained on the training set with all classes. In the classification phase it generates the predicted class of the input image to be recognized, but its prediction is usually not accurate enough and therefore needs to be corrected.
The prediction classifier 1 is a convolutional neural network classifier trained on the training sample set; the convolutional neural network classifier can use any existing conventional convolutional neural network structure in the related art, and this application places no limitation on this. The prediction classifier 1 is trained on all images of all classes in the training sample set.
Let a_ij be the number of misclassified images with actual class i and predicted class j, and r_ij the ratio of a_ij to all misclassified images, defined as:

r_ij = a_ij / Σ_{i≠j} a_ij

Through the set {r_ij}, the classes that are easily confused can be identified by applying a threshold, which is subsequently used to establish the confusion perception model 2.
In practical applications, the prediction classifier 1 first performs an initial classification of the input image to be recognized, producing two outputs: the predicted class (Cprediction) of the image and the class probability vector of belonging to each class. The class probability vector is the output of the last layer of the prediction classifier 1; its components represent the confidence that the image belongs to each class, and the class corresponding to the largest component is the predicted class Cprediction.
The confusion perception model 2, the core component of the confusion-aware convolutional neural network, is constructed from the estimated confusion matrix corresponding to the prediction classifier 1. Both in the training phase (for training the correction classifiers) and in the classification phase (for selecting an appropriate correction classifier), the confusion perception model acts as a decision system whose decision object is the selection of the correction classifier; its main function is to perceive the easily confused classes in the test set.
As a common convolutional neural network classifier, the prediction classifier 1 is not strong at recognizing easily confused classes. Fuzzy boundaries often exist between such classes; for example, in CIFAR-10, the boundaries among the three classes "cat (Cat)", "dog (Dog)", and "horse (Horse)" are less clear than those between the remaining classes, and a conventional classifier often misclassifies images in these boundary-overlap regions. The confusion perception model established from the estimated confusion matrix of the prediction classifier 1 can be used to accurately perceive the sample data of easily confused classes.
Since the confusion matrix on the test set cannot be obtained directly, its distribution can be simulated by obtaining the confusion matrix on the training sample set through cross-validation. Cross-validation, also called rotation estimation, is a practical method of statistically cutting a data sample into smaller subsets.
The confusion matrix can be obtained by cross-validation on the training sample set, and the generation method can be as follows: divide the training sample set into multiple sub-training sets; for each sub-training set, use the current sub-training set as the validation set and the remaining sub-training sets as the training set to train the prediction classifier, then test the validation set with the trained prediction classifier to obtain misclassified images; aggregate the misclassified images obtained when each sub-training set served as the validation set to construct the confusion matrix. In a specific embodiment, each sub-training set can contain the same number of data sample images; that is, the training sample set can be divided equally into multiple sub-training sets.
As shown in Fig. 3, the training sample set (Trainset) can first be divided equally into 5 sub-training sets. Each time, one of them is selected as the validation set (Validation data) and the rest are used as the training set (Training data) to train the convolutional neural network classifier; that is, the ratio of Validation data to Training data is 1:4. The prediction classifier 1 is then used to test the validation set to obtain the misclassified images it outputs. This operation is repeated until all 5 sub-training sets have been used as the validation set, and finally all misclassified images are aggregated to construct the confusion matrix.
Previous hierarchical classification structures all use random sampling to pick one validation set from the training sample set and obtain the confusion matrix from it. Compared with these methods, cross-validation makes maximal use of the training set to obtain a more accurate confusion matrix.
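The five-fold procedure of Fig. 3 can be sketched as follows; the `train` and `predict` callables stand in for fitting and evaluating the prediction classifier and are placeholders, not the patent's implementation:

```python
import numpy as np

def build_confusion_matrix(X, y, n_classes, train, predict, n_folds=5):
    """Each sub-training set serves once as the validation set;
    misclassified samples from all folds are aggregated into F."""
    folds = np.array_split(np.arange(len(X)), n_folds)  # equal sub-training sets
    F = np.zeros((n_classes, n_classes), dtype=int)
    for k in range(n_folds):
        val = folds[k]                                  # current fold -> validation
        trn = np.concatenate([folds[m] for m in range(n_folds) if m != k])
        model = train(X[trn], y[trn])                   # train on remaining folds
        for p, a in zip(predict(model, X[val]), y[val]):
            if p != a:                                  # keep misclassifications only
                F[p, a] += 1                            # row = predicted, col = actual
    return F
```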
As the decision system, the confusion perception model 2 trains a group of correction classifiers dedicated to drawing clear boundaries between each pair of fuzzy classes. That is, each correction classifier uses the confusion perception model as its decision system and is trained on the easily confused class samples with fuzzy boundaries in the training sample set. In other words, the correction classifiers act as the correction part of the prediction-correction structure of the confusion-aware network; they consist of a series of convolutional neural network classifiers based on confused classes, so the number of correction classifiers in the correction classifier group 3 is generally less than or equal to the total number of classes in the training sample set.
Easily confused class samples are the data of classes with fuzzy boundaries in the training sample set, for example all image data corresponding to cats and dogs. After obtaining the estimated confusion matrix of the prediction classifier 1 on the training sample set, all misclassified images in the confusion matrix can be sorted and the top 30% selected as easily confused classes. Specifically, a threshold T is chosen such that 30% of the a_ij satisfy a_ij ≥ T and the remaining 70% satisfy a_ij < T. For example, if the number of images with predicted class k and actual class i is a_ki, and a_ki ≥ T, then i can be regarded as an easily confused class of k, used for subsequently training a correction classifier.
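The top-30% threshold rule above can be sketched as follows; the use of a quantile to pick T and the example matrix are illustrative assumptions:

```python
import numpy as np

def confused_classes(F, top_fraction=0.30):
    """Choose T so that roughly `top_fraction` of the off-diagonal a_ij satisfy
    a_ij >= T, and map each predicted class k to its easily confused classes."""
    off_diag = F[~np.eye(len(F), dtype=bool)]
    T = np.quantile(off_diag, 1.0 - top_fraction)  # top ~30% of error counts
    sets = {}
    for k in range(len(F)):
        easy = {i for i in range(len(F)) if i != k and F[k, i] >= T}
        if easy:
            sets[k] = easy          # classes for the confusion set of class k
    return T, sets
```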
In the model training phase, for each predicted class C_k of the prediction classifier 1, its easily confused classes C_i and C_j are selected (i.e., a_ki ≥ T and a_kj ≥ T); these three classes then form one confusion category set C_k, and a correction classifier is trained separately on the classes in C_k. In classification, if the prediction classifier 1 predicts the input image to be class C_k, the confusion perception model selects the corresponding target correction classifier for it; the target correction classifier then classifies the image once more to obtain its corrected class probability vector. The final classification result depends on combining the outputs of the prediction classifier 1 and the target correction classifier, and the combination can be performed by the probability averaging layer 4.
The probability averaging layer 4 outputs the final classification result of the image to be recognized according to the class probability output by the prediction classifier and the class probability output by the target correction classifier, where the target correction classifier is the correction classifier selected by the confusion perception model according to the predicted class of the prediction classifier.
The hierarchical structure uses the prediction and correction classifiers cooperatively for classification; since the outputs of all classifiers are used simultaneously, the problem of selecting the output of a single classifier is avoided to a certain extent, preventing error propagation and overcoming the drawbacks of traditional hierarchical models.
Because the class probability vectors output by prediction classifier 1 and by correction classifier group 3 are not on the same scale, the two probability vectors can be normalized to simplify subsequent data processing. That is, the probability average layer may include a normalization module, which normalizes the class probability output by prediction classifier 1 and the class probability output by the target correction classifier; optionally, the softmax function can be used to process the probability vectors, i.e. the class probability output by the prediction classifier and the class probability output by the target correction classifier are normalized with the following formula:
σ(z)_j = e^{z_j} / Σ_{k=1}^{K} e^{z_k},  j = 1, …, K

where z is a K-dimensional vector and σ(z)_j is the probability in the interval (0, 1) to which its j-th element z_j is mapped.
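The softmax normalization described above can be written as a short function; this is the standard implementation, with the usual max-subtraction for numerical stability:

```python
import numpy as np

def softmax(z):
    """Map a K-dimensional score vector to probabilities in (0, 1) summing to 1."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())   # subtracting the max avoids overflow; result unchanged
    return e / e.sum()

p = softmax([2.0, 1.0, 0.1])
```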
After the outputs are mapped into the interval (0, 1), the probability vector output by prediction classifier 1 and the probability vector output by the target correction classifier can be combined into a fused probability p(y = j | X). Here X is the image to be recognized, y is the class label of the image to be recognized, Bj is the class probability output by the prediction classifier for the j-th class (class Cj), and B̂ is the class probability output by the target correction classifier for the classes of the corresponding easily-confused class sample data. The class with the highest fused probability p(y = j | X) output by the probability average layer is the final classification of the confusion-aware convolutional neural network.
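The combination step can be sketched as below. Note that the exact fusion formula appears only as an image in the source; averaging the two classifiers' probabilities on the confusable classes is one plausible reading of the "probability average layer", and all numbers here are illustrative.

```python
import numpy as np

def fuse(pred_probs, corr_probs, confusion_set):
    """Average the prediction and correction classifiers on the confusable
    classes; outside the confusion set only the prediction classifier votes.
    (Assumed reading of the probability average layer, not the patent's
    literal formula.)"""
    fused = np.array(pred_probs, dtype=float)
    for j, c in enumerate(confusion_set):
        fused[c] = 0.5 * (fused[c] + corr_probs[j])
    return fused, int(np.argmax(fused))

# Prediction classifier over 4 classes; the selected correction classifier
# only distinguishes the confusable classes {1, 2}.
B = [0.05, 0.46, 0.44, 0.05]
B_hat = [0.2, 0.8]                       # probabilities for classes 1 and 2
fused, label = fuse(B, B_hat, confusion_set=[1, 2])
```

In this toy case the prediction classifier narrowly favours class 1, but the correction classifier's stronger vote for class 2 flips the fused decision to class 2.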
It should be noted that this application does not require prediction classifier 1 to be some fixed convolutional neural network classifier: the network structures of the prediction classifier and the correction classifiers can be replaced by any existing convolutional neural network structure, or even by other, non-convolutional-neural-network classifiers.
The technical solution provided by this embodiment of the present invention proposes a prediction-correction layered confusion-aware convolutional neural network structure: the confusion matrix is obtained by cross-validation, the confusion perception model constructed from the confusion matrix identifies the categories whose data have fuzzy boundaries, and the prediction-correction layered structure enables the network to distinguish classes with fuzzy boundaries. On the one hand this improves the CNN model's ability to handle fuzzy class boundaries and effectively raises image classification accuracy; on the other hand it avoids the error propagation problem of hierarchical classification, and because it uses a more accurately estimated confusion matrix, it can obtain a larger improvement on small-scale data sets.
To verify that the technical solution provided by this application improves the precision and accuracy of image classification, this application also provides a series of verification experiments in which the proposed confusion-aware CNN is evaluated on the Mnist and CIFAR-10 data sets. The experiments use the deep learning framework PyTorch on a single NVIDIA Titan X GPU, and the networks are trained by back-propagation. The embodiment of the present invention may include the following content:
PyTorch is a deep learning framework open-sourced on GitHub in 2017 by the Facebook AI Research lab (FAIR); its predecessor is Torch, created at New York University in 2002. PyTorch has an advanced design concept: it rebuilds all the Tensor modules on the basis of Torch and adds a state-of-the-art automatic differentiation system, making it the most popular dynamic-graph framework of the moment. In flexibility, speed and ease of use, PyTorch leads most other deep learning frameworks. Its design is concise and its source code easy to understand; its flexible, easy-to-use interfaces let researchers implement their ideas, while its running speed is not sacrificed for that flexibility and remains among the best of the deep learning frameworks. PyTorch provides complete documentation and has an active community, and it is maintained and steadily updated with the long-term support of the Facebook AI Research lab; it was therefore chosen to test the confusion-aware convolutional neural network proposed by this application.
The Mnist (Modified National Institute of Standards and Technology database) data set is a handwritten-digit data set containing 70,000 gray-scale images of handwritten digits, of which 60,000 are training data and the remaining 10,000 are test data. Mnist consists of 28 × 28 handwritten-digit images in 10 classes corresponding to the digits 0 to 9, as shown in Fig. 4. The CIFAR-10 data set is a common computer-vision data set containing 10 classes and 60,000 images of size 32 × 32 × 3 in total; 50,000 of the images can be used as the training set and the remaining images as the test set.
On the Mnist data set, the convolutional neural network MnistConvNets from the PyTorch official examples can be used to train the benchmark classifier. The Mnist ConvNets model consists of two convolutional layers and two fully connected layers, as shown in Fig. 5. The prediction classifier is trained for 500 epochs on the training set with a learning rate of 0.01 and a momentum of 0.9. The training set is divided into six parts, each sub-training set containing 10,000 images, and the confusion matrix is obtained by 6-fold cross-validation. The easily confused classes selected from the confusion matrix are used to train the correction classifiers, and the confusion-aware CNN is then evaluated on the test set. The experiments show that accuracy on the test set rises to 99.31%: the error rate of the confusion-aware CNN is a quarter lower than that of the single CNN. The confusion-aware CNN has 238K parameters, ten times those of Mnist ConvNets, yet compared with ResNet-32 and its 460K parameters it achieves better performance. The experimental results are shown in Table 2.
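The k-fold cross-validation loop that produces the confusion matrix can be sketched as follows. This is a hedged illustration: a toy nearest-class-mean classifier stands in for the Mnist ConvNets model, and the data are synthetic, but the fold structure (train on k−1 folds, collect misclassifications on the held-out fold, accumulate into one matrix) mirrors the procedure described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the training set: 60 samples, 3 classes, 2-D features
# clustered around three class means.
X = rng.normal(size=(60, 2)) + np.repeat(np.eye(3, 2) * 3, 20, axis=0)
y = np.repeat(np.arange(3), 20)
K, n_folds = 3, 6

conf = np.zeros((K, K), dtype=int)
folds = np.array_split(rng.permutation(len(X)), n_folds)
for f in range(n_folds):
    val_idx = folds[f]
    train_idx = np.concatenate([folds[g] for g in range(n_folds) if g != f])
    # Stand-in "classifier": nearest class mean, fitted on the training folds.
    means = np.stack([X[train_idx][y[train_idx] == c].mean(axis=0)
                      for c in range(K)])
    pred = np.argmin(((X[val_idx, None] - means) ** 2).sum(-1), axis=1)
    # Accumulate (true class, predicted class) counts over all folds.
    for t, p in zip(y[val_idx], pred):
        conf[t, p] += 1
```

Every sample appears in exactly one validation fold, so `conf` sums to the training-set size, and its off-diagonal entries are the misclassification counts from which the confusable classes are later selected.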
Table 2. Performance of different CNN models on Mnist (no data augmentation).
From the table above, accuracy on the test set rises from 99.05% for the benchmark Mnist classifier network to 99.31% for the confusion-aware convolutional neural network (the technical solution proposed by this application). The error rate of the confusion-aware convolutional neural network is thus a quarter lower than that of the baseline network. In terms of parameter scale, the confusion-aware convolutional neural network has about ten times the parameters of the benchmark Mnist network, namely 238K; compared with ResNet-32 and its 460K parameters, the confusion-aware convolutional neural network achieves better performance with a smaller parameter scale.
On the CIFAR-10 data set, three different CNNs can be used as basic models to test the performance of the confusion-aware CNN; the depth and total parameter counts of these networks increase progressively. The first network is structurally identical to LeNet-5, the only modification being that the input image size is adjusted to 32 × 32 × 3. The second network increases the numbers of convolution kernels and hidden nodes on the basis of the first. The structures of the first two networks are shown in Fig. 6; these classifiers are trained for 500 epochs on the training set with a learning rate of 0.1, divided by 10 every 100 epochs, a momentum of 0.9 and a weight decay of 0.0005. The last CNN uses an 18-layer residual network, ResNet-18, which contains 17 convolutional layers and one fully connected layer. Because residual networks are built from special residual blocks, they can reach depths of thousands of layers without the vanishing-gradient problem. The network structure of ResNet-18 is shown in Fig. 7. When training the classifiers with the residual network, each classifier is trained for 200 epochs with an initial learning rate of 0.01, divided by 10 every 50 epochs.
All three networks are trained with stochastic gradient descent, and additional data augmentation is used: the original training-set images are randomly cropped and flipped before training to enhance classifier performance. The CIFAR-10 training set is divided into 5 sub-training sets of 10,000 images each, and the confusion matrix is obtained by 5-fold cross-validation. The experimental results are shown in Table 3.
Table 3. Accuracy comparison of the three CNN networks and their corresponding confusion-aware CNNs.
As shown in Table 3, as the complexity of the basic model increases, the accuracy improvement drops from 4.07% to 0.21%. The inventors therefore conclude: the simpler and less complex the benchmark classifier used, the more obvious the improvement brought by the confusion-aware convolutional neural network.
Based on the above embodiments, this application also provides an image recognition method for image classification and recognition. Referring to Fig. 8 and Fig. 9, Fig. 8 is a flow diagram of an image recognition method provided by an embodiment of the present invention and Fig. 9 is a flow diagram of an illustrative example; the embodiment of the present invention may include the following content:
S801: input the image to be recognized into a pre-constructed confusion-aware convolutional neural network.
The confusion-aware convolutional neural network may include a prediction classifier, a confusion perception model, multiple correction classifiers and a probability average layer. For the functional structure and implementation of the confusion-aware convolutional neural network, see the descriptions of the embodiments above.
S802: call the prediction classifier of the confusion-aware convolutional neural network to recognize the image to be recognized, obtaining the predicted class and first class probability of the image to be recognized.
S803: call the confusion perception model of the confusion-aware convolutional neural network to select the target correction classifier according to the predicted class.
S804: recognize the image to be recognized with the target correction classifier, obtaining the second class probability of the image to be recognized.
S805: use the probability average layer of the confusion-aware convolutional neural network to output the classification result of the image to be recognized according to the first class probability and the second class probability.
The probability average layer fuses the class probability vectors obtained by the prediction classifier and the correction classifier, and the class corresponding to the largest component of the fused vector is the final output of the network, i.e. the final classification of the image to be recognized. From the above, the embodiments of the present invention overcome the drawbacks of the related art and help improve image classification accuracy.
Embodiments of the present invention also provide a corresponding implementation apparatus for the image recognition method, further making the method more practical. The image recognition apparatus provided by embodiments of the present invention is introduced below; the image recognition apparatus described below and the image recognition method described above may be cross-referenced. Referring to Fig. 10, Fig. 10 is a structural diagram of an image recognition apparatus provided by an embodiment of the present invention in a specific implementation; the apparatus may include:
an image input module 1001, for inputting the image to be recognized into a pre-constructed confusion-aware convolutional neural network;
an image recognition module 1002, for calling the prediction classifier of the confusion-aware convolutional neural network to recognize the image to be recognized, obtaining the predicted class and first class probability of the image to be recognized; calling the confusion perception model of the confusion-aware convolutional neural network to select the target correction classifier according to the predicted class; recognizing the image to be recognized with the target correction classifier, obtaining the second class probability of the image to be recognized; and using the probability average layer of the confusion-aware convolutional neural network to output the classification result of the image to be recognized according to the first class probability and the second class probability.
Optionally, in some implementations of this embodiment, the image recognition module 1002 may be a module that divides the training sample set into multiple sub-training sets; for each sub-training set, takes the current sub-training set as the validation set and the remaining sub-training sets as the training set to train the prediction classifier, and tests the validation set with the trained prediction classifier to obtain the misclassified images; and aggregates the misclassified images obtained with each sub-training set as the validation set to construct the confusion matrix.
In other implementations, the image recognition module 1002 may further include a normalization module and a probability calculation module: the normalization module normalizes the class probability output by the prediction classifier and the class probability output by the target correction classifier; the probability calculation module calculates, from the normalized class probabilities, the class to which the image to be recognized belongs.
In some implementations of the embodiment of the present invention, the normalization module may also normalize the class probability output by the prediction classifier and the class probability output by the target correction classifier with the following formula:

σ(z)_j = e^{z_j} / Σ_{k=1}^{K} e^{z_k},  j = 1, …, K

where z is a K-dimensional vector and σ(z)_j is the probability in the interval (0, 1) to which its j-th element z_j is mapped.
In other implementations of the embodiment of the present invention, the probability calculation module may determine the class label of the image to be recognized using the fused probability p(y = j | X), where X is the image to be recognized, y is the class label of the image to be recognized, Bj is the class probability output by the prediction classifier, and B̂ is the class probability output by the target correction classifier for the classes of the corresponding easily-confused class sample data.
The functions of the functional modules of the image recognition apparatus of the embodiment of the present invention can be specifically implemented according to the methods in the above method embodiments; for the specific implementation process, refer to the related descriptions of the above method embodiments, which are not repeated here. From the above, the embodiments of the present invention overcome the drawbacks of the related art and help improve image classification accuracy.
Embodiments of the present invention also provide an image recognition device, which may specifically include:
a memory, for storing a computer program;
a processor, for executing the computer program to implement the steps of the image recognition method of any of the embodiments above.
The functions of the functional modules of the image recognition device of the embodiment of the present invention can be specifically implemented according to the methods in the above method embodiments; for the specific implementation process, refer to the related descriptions of the above method embodiments, which are not repeated here. From the above, the embodiments of the present invention overcome the drawbacks of the related art and help improve image classification accuracy.
Embodiments of the present invention also provide a computer-readable storage medium storing an image recognition program; when executed by a processor, the image recognition program implements the steps of the image recognition method of any of the embodiments above.
The functions of the functional modules of the computer-readable storage medium of the embodiment of the present invention can be specifically implemented according to the methods in the above method embodiments; for the specific implementation process, refer to the related descriptions of the above method embodiments, which are not repeated here. From the above, the embodiments of the present invention overcome the drawbacks of the related art and help improve image classification accuracy.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be cross-referenced. For the apparatus disclosed in the embodiments, since it corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The image recognition method, apparatus, device, computer-readable storage medium and confusion-aware convolutional neural network provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention; the above descriptions of the embodiments are only intended to help understand the method of the present invention and its core idea. It should be pointed out that those of ordinary skill in the art can make improvements and modifications to the present invention without departing from its principles, and these improvements and modifications also fall within the protection scope of the claims of the present invention.
Claims (10)
1. A confusion-aware convolutional neural network, characterized by comprising a prediction classifier, a confusion perception model, multiple correction classifiers and a probability average layer;
the prediction classifier being a convolutional neural network classifier trained with a training sample set;
the confusion perception model being a model constructed on the basis of the confusion matrix corresponding to the prediction classifier, the confusion matrix being obtained by performing cross-validation on the training sample set;
each correction classifier being trained, with the confusion perception model as decision system, on easily-confused class sample data with fuzzy boundaries from the training sample set;
the probability average layer being used to output the classification result of an image to be recognized according to the class probability output by the prediction classifier and the class probability output by a target correction classifier, the target correction classifier being the correction classifier selected by the confusion perception model according to the predicted class of the prediction classifier.
2. The confusion-aware convolutional neural network according to claim 1, characterized in that obtaining the confusion matrix by performing cross-validation on the training sample set comprises:
dividing the training sample set into multiple sub-training sets;
for each sub-training set, taking the current sub-training set as the validation set and the other sub-training sets as the training set to train the prediction classifier, and testing the validation set with the trained prediction classifier to obtain misclassified images;
aggregating the misclassified images obtained with each sub-training set as the validation set to construct the confusion matrix.
3. The confusion-aware convolutional neural network according to claim 2, characterized in that each sub-training set contains the same number of training sample images.
4. The confusion-aware convolutional neural network according to any one of claims 1 to 3, characterized in that the probability average layer comprises a normalization module and a probability calculation module;
the normalization module being used to normalize the class probability output by the prediction classifier and the class probability output by the target correction classifier;
the probability calculation module being used to calculate, from the normalized class probabilities, the class to which the image to be recognized belongs.
5. The confusion-aware convolutional neural network according to claim 4, characterized in that the normalization module normalizes the class probability output by the prediction classifier and the class probability output by the target correction classifier using the following formula:

σ(z)_j = e^{z_j} / Σ_{k=1}^{K} e^{z_k},  j = 1, …, K

where z is a K-dimensional vector and σ(z)_j is the probability in the interval (0, 1) to which its j-th element z_j is mapped.
6. The confusion-aware convolutional neural network according to any one of claims 1 to 3, characterized in that the probability calculation module determines the class label of the image to be recognized using the fused probability p(y = j | X), where X is the image to be recognized, y is the class label of the image to be recognized, Bj is the class probability output by the prediction classifier, and B̂ is the class probability output by the target correction classifier for the classes of the corresponding easily-confused class sample data.
7. An image recognition method, characterized by comprising:
inputting an image to be recognized into a pre-constructed confusion-aware convolutional neural network;
calling the prediction classifier of the confusion-aware convolutional neural network to recognize the image to be recognized, obtaining the predicted class and first class probability of the image to be recognized;
calling the confusion perception model of the confusion-aware convolutional neural network to select a target correction classifier according to the predicted class;
recognizing the image to be recognized with the target correction classifier, obtaining the second class probability of the image to be recognized;
using the probability average layer of the confusion-aware convolutional neural network to output the classification result of the image to be recognized according to the first class probability and the second class probability.
8. An image recognition apparatus, characterized by comprising:
an image input module, for inputting an image to be recognized into a pre-constructed confusion-aware convolutional neural network;
an image recognition module, for calling the prediction classifier of the confusion-aware convolutional neural network to recognize the image to be recognized, obtaining the predicted class and first class probability of the image to be recognized; calling the confusion perception model of the confusion-aware convolutional neural network to select a target correction classifier according to the predicted class; recognizing the image to be recognized with the target correction classifier, obtaining the second class probability of the image to be recognized; and using the probability average layer of the confusion-aware convolutional neural network to output the classification result of the image to be recognized according to the first class probability and the second class probability.
9. An image recognition device, characterized by comprising a processor, the processor being configured to implement the steps of the image recognition method according to claim 7 when executing a computer program stored in a memory.
10. A computer-readable storage medium, characterized in that an image recognition program is stored on the computer-readable storage medium, and when executed by a processor the image recognition program implements the steps of the image recognition method according to claim 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910198639.8A CN109934293B (en) | 2019-03-15 | 2019-03-15 | Image recognition method, device, medium and confusion perception convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109934293A true CN109934293A (en) | 2019-06-25 |
CN109934293B CN109934293B (en) | 2023-06-13 |
Family
ID=66987372
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910198639.8A Active CN109934293B (en) | 2019-03-15 | 2019-03-15 | Image recognition method, device, medium and confusion perception convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109934293B (en) |
Cited By (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110533093A (en) * | 2019-08-24 | 2019-12-03 | 大连理工大学 | A kind of automobile front face brand family analysis method |
CN110751675A (en) * | 2019-09-03 | 2020-02-04 | 平安科技(深圳)有限公司 | Urban pet activity track monitoring method based on image recognition and related equipment |
CN110781813A (en) * | 2019-10-24 | 2020-02-11 | 北京市商汤科技开发有限公司 | Image recognition method and device, electronic equipment and storage medium |
CN110852288A (en) * | 2019-11-15 | 2020-02-28 | 苏州大学 | Cell image classification method based on two-stage convolutional neural network |
CN111027600A (en) * | 2019-11-25 | 2020-04-17 | 腾讯科技(深圳)有限公司 | Image category prediction method and device |
CN111126396A (en) * | 2019-12-25 | 2020-05-08 | 北京科技大学 | Image recognition method and device, computer equipment and storage medium |
CN111114541A (en) * | 2019-12-31 | 2020-05-08 | 华为技术有限公司 | Vehicle control method and device, controller and intelligent vehicle |
CN111260632A (en) * | 2020-01-16 | 2020-06-09 | 清华大学 | Image analysis method and device based on deep neural network |
CN111311668A (en) * | 2020-02-12 | 2020-06-19 | 东南大学 | Clear water concrete surface pore analysis method based on convolutional neural network |
CN111414951A (en) * | 2020-03-16 | 2020-07-14 | 中国人民解放军国防科技大学 | Method and device for finely classifying images |
CN111553419A (en) * | 2020-04-28 | 2020-08-18 | 腾讯科技(深圳)有限公司 | Image identification method, device, equipment and readable storage medium |
CN111815529A (en) * | 2020-06-30 | 2020-10-23 | 上海电力大学 | Low-quality image classification enhancement method based on model fusion and data enhancement |
CN111882003A (en) * | 2020-08-06 | 2020-11-03 | 北京邮电大学 | Data classification method, device and equipment |
CN112182214A (en) * | 2020-09-27 | 2021-01-05 | 中国建设银行股份有限公司 | Data classification method, device, equipment and medium |
CN112256844A (en) * | 2019-11-21 | 2021-01-22 | 北京沃东天骏信息技术有限公司 | Text classification method and device |
CN112714145A (en) * | 2019-10-25 | 2021-04-27 | 中国移动通信有限公司研究院 | Internet of things decision fusion method, identification method, equipment and storage medium |
CN113076846A (en) * | 2021-03-26 | 2021-07-06 | 山东大学 | Heart sound classification identification method and system |
CN113177527A (en) * | 2021-05-27 | 2021-07-27 | 安阳工学院 | Vehicle type recognition method and device |
CN113256865A (en) * | 2020-11-06 | 2021-08-13 | 上海兴容信息技术有限公司 | Control method and system of intelligent access control |
CN113505730A (en) * | 2021-07-26 | 2021-10-15 | 全景智联(武汉)科技有限公司 | Model evaluation method, device, equipment and storage medium based on mass data |
CN113569913A (en) * | 2021-06-29 | 2021-10-29 | 西北大学 | Image classification model establishing and classifying method and system based on hierarchical selective Adaboost-DNNs |
CN113627221A (en) * | 2020-05-09 | 2021-11-09 | 阿里巴巴集团控股有限公司 | Image processing method, image processing device, electronic equipment and computer readable storage medium |
WO2022033150A1 (en) * | 2020-08-11 | 2022-02-17 | Oppo广东移动通信有限公司 | Image recognition method, apparatus, electronic device, and storage medium |
CN114495114A (en) * | 2022-04-18 | 2022-05-13 | 华南理工大学 | Text sequence identification model calibration method based on CTC decoder |
CN114882273A (en) * | 2022-04-24 | 2022-08-09 | 电子科技大学 | Visual identification method, device, equipment and storage medium applied to narrow space |
CN116204670A (en) * | 2023-04-27 | 2023-06-02 | 菲特(天津)检测技术有限公司 | Management method and system of vehicle target detection data and electronic equipment |
CN114972834B (en) * | 2021-05-12 | 2023-09-05 | 中移互联网有限公司 | Image classification method and device of multi-level multi-classifier |
CN117576108A (en) * | 2024-01-17 | 2024-02-20 | 杭州广立微电子股份有限公司 | Visual optimization method and device for wafer defect detection and computer equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106211084A (en) * | 2016-09-07 | 2016-12-07 | 中国人民解放军国防科学技术大学 | Environment perception method based on GSM signal |
CN107077625A (en) * | 2014-10-27 | 2017-08-18 | 电子湾有限公司 | The deep convolutional neural networks of layering |
CN108320286A (en) * | 2018-02-28 | 2018-07-24 | 苏州大学 | Image significance detection method, system, equipment and computer readable storage medium |
CN108596039A (en) * | 2018-03-29 | 2018-09-28 | 南京邮电大学 | A kind of bimodal emotion recognition method and system based on 3D convolutional neural networks |
CN108921190A (en) * | 2018-05-24 | 2018-11-30 | 北京飞搜科技有限公司 | A kind of image classification method, device and electronic equipment |
Cited By (46)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110533093A (en) * | 2019-08-24 | 2019-12-03 | 大连理工大学 | A kind of automobile front face brand family analysis method |
CN110751675B (en) * | 2019-09-03 | 2023-08-11 | 平安科技(深圳)有限公司 | Urban pet activity track monitoring method based on image recognition and related equipment |
CN110751675A (en) * | 2019-09-03 | 2020-02-04 | 平安科技(深圳)有限公司 | Urban pet activity track monitoring method based on image recognition and related equipment |
WO2021043074A1 (en) * | 2019-09-03 | 2021-03-11 | 平安科技(深圳)有限公司 | Urban pet motion trajectory monitoring method based on image recognition, and related devices |
CN110781813A (en) * | 2019-10-24 | 2020-02-11 | 北京市商汤科技开发有限公司 | Image recognition method and device, electronic equipment and storage medium |
CN112714145A (en) * | 2019-10-25 | 2021-04-27 | 中国移动通信有限公司研究院 | Internet of things decision fusion method, identification method, equipment and storage medium |
CN112714145B (en) * | 2019-10-25 | 2024-02-27 | 中国移动通信有限公司研究院 | Decision fusion method, recognition method, device and storage medium for Internet of things |
CN110852288B (en) * | 2019-11-15 | 2022-07-05 | 苏州大学 | Cell image classification method based on two-stage convolutional neural network |
CN110852288A (en) * | 2019-11-15 | 2020-02-28 | 苏州大学 | Cell image classification method based on two-stage convolutional neural network |
CN112256844A (en) * | 2019-11-21 | 2021-01-22 | 北京沃东天骏信息技术有限公司 | Text classification method and device |
CN111027600A (en) * | 2019-11-25 | 2020-04-17 | 腾讯科技(深圳)有限公司 | Image category prediction method and device |
CN111126396B (en) * | 2019-12-25 | 2023-08-22 | 北京科技大学 | Image recognition method, device, computer equipment and storage medium |
CN111126396A (en) * | 2019-12-25 | 2020-05-08 | 北京科技大学 | Image recognition method and device, computer equipment and storage medium |
WO2021135566A1 (en) * | 2019-12-31 | 2021-07-08 | 华为技术有限公司 | Vehicle control method and apparatus, controller, and smart vehicle |
CN111114541A (en) * | 2019-12-31 | 2020-05-08 | 华为技术有限公司 | Vehicle control method and device, controller and intelligent vehicle |
CN111114541B (en) * | 2019-12-31 | 2021-08-20 | 华为技术有限公司 | Vehicle control method and device, controller and intelligent vehicle |
CN111260632A (en) * | 2020-01-16 | 2020-06-09 | 清华大学 | Image analysis method and device based on deep neural network |
CN111311668B (en) * | 2020-02-12 | 2024-01-05 | 东南大学 | Fair-faced concrete surface air hole analysis method based on convolutional neural network |
CN111311668A (en) * | 2020-02-12 | 2020-06-19 | 东南大学 | Clear water concrete surface pore analysis method based on convolutional neural network |
CN111414951B (en) * | 2020-03-16 | 2023-09-08 | 中国人民解放军国防科技大学 | Fine classification method and device for images |
CN111414951A (en) * | 2020-03-16 | 2020-07-14 | 中国人民解放军国防科技大学 | Method and device for finely classifying images |
CN111553419A (en) * | 2020-04-28 | 2020-08-18 | 腾讯科技(深圳)有限公司 | Image identification method, device, equipment and readable storage medium |
CN111553419B (en) * | 2020-04-28 | 2022-09-09 | 腾讯科技(深圳)有限公司 | Image identification method, device, equipment and readable storage medium |
CN113627221A (en) * | 2020-05-09 | 2021-11-09 | 阿里巴巴集团控股有限公司 | Image processing method, image processing device, electronic equipment and computer readable storage medium |
CN111815529A (en) * | 2020-06-30 | 2020-10-23 | 上海电力大学 | Low-quality image classification enhancement method based on model fusion and data enhancement |
CN111815529B (en) * | 2020-06-30 | 2023-02-07 | 上海电力大学 | Low-quality image classification enhancement method based on model fusion and data enhancement |
CN111882003B (en) * | 2020-08-06 | 2024-01-23 | 北京邮电大学 | Data classification method, device and equipment |
CN111882003A (en) * | 2020-08-06 | 2020-11-03 | 北京邮电大学 | Data classification method, device and equipment |
WO2022033150A1 (en) * | 2020-08-11 | 2022-02-17 | Oppo广东移动通信有限公司 | Image recognition method, apparatus, electronic device, and storage medium |
CN112182214B (en) * | 2020-09-27 | 2024-03-19 | 中国建设银行股份有限公司 | Data classification method, device, equipment and medium |
CN112182214A (en) * | 2020-09-27 | 2021-01-05 | 中国建设银行股份有限公司 | Data classification method, device, equipment and medium |
CN113256865A (en) * | 2020-11-06 | 2021-08-13 | 上海兴容信息技术有限公司 | Control method and system of intelligent access control |
CN113076846A (en) * | 2021-03-26 | 2021-07-06 | 山东大学 | Heart sound classification identification method and system |
CN114972834B (en) * | 2021-05-12 | 2023-09-05 | 中移互联网有限公司 | Image classification method and device of multi-level multi-classifier |
CN113177527B (en) * | 2021-05-27 | 2022-09-23 | 安阳工学院 | Vehicle type recognition method and device |
CN113177527A (en) * | 2021-05-27 | 2021-07-27 | 安阳工学院 | Vehicle type recognition method and device |
CN113569913A (en) * | 2021-06-29 | 2021-10-29 | 西北大学 | Image classification model establishing and classifying method and system based on hierarchical selective Adaboost-DNNs |
CN113569913B (en) * | 2021-06-29 | 2023-04-25 | 西北大学 | Image classification model building and classifying method and system based on hierarchical selective Adaboost-DNNs |
CN113505730A (en) * | 2021-07-26 | 2021-10-15 | 全景智联(武汉)科技有限公司 | Model evaluation method, device, equipment and storage medium based on mass data |
CN114495114B (en) * | 2022-04-18 | 2022-08-05 | 华南理工大学 | Text sequence recognition model calibration method based on CTC decoder |
CN114495114A (en) * | 2022-04-18 | 2022-05-13 | 华南理工大学 | Text sequence identification model calibration method based on CTC decoder |
CN114882273B (en) * | 2022-04-24 | 2023-04-18 | 电子科技大学 | Visual identification method, device, equipment and storage medium applied to narrow space |
CN114882273A (en) * | 2022-04-24 | 2022-08-09 | 电子科技大学 | Visual identification method, device, equipment and storage medium applied to narrow space |
CN116204670A (en) * | 2023-04-27 | 2023-06-02 | 菲特(天津)检测技术有限公司 | Management method and system of vehicle target detection data and electronic equipment |
CN117576108A (en) * | 2024-01-17 | 2024-02-20 | 杭州广立微电子股份有限公司 | Visual optimization method and device for wafer defect detection and computer equipment |
CN117576108B (en) * | 2024-01-17 | 2024-05-28 | 杭州广立微电子股份有限公司 | Visual optimization method and device for wafer defect detection and computer equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109934293B (en) | 2023-06-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109934293A (en) | Image-recognizing method, device, medium and obscure perception convolutional neural networks | |
CN110532920B (en) | Face recognition method for small-quantity data set based on FaceNet method | |
CN107609459B (en) | A kind of face identification method and device based on deep learning | |
CN108875600A (en) | A kind of information of vehicles detection and tracking method, apparatus and computer storage medium based on YOLO | |
CN104680144B (en) | Based on the lip reading recognition methods and device for projecting very fast learning machine | |
CN110287960A (en) | The detection recognition method of curve text in natural scene image | |
CN107871101A (en) | A kind of method for detecting human face and device | |
CN104933428B (en) | A kind of face identification method and device based on tensor description | |
CN109117879A (en) | Image classification method, apparatus and system | |
CN107742107A (en) | Facial image sorting technique, device and server | |
CN109377445A (en) | Model training method, the method, apparatus and electronic system for replacing image background | |
CN107808129A (en) | A kind of facial multi-characteristic points localization method based on single convolutional neural networks | |
CN109034206A (en) | Image classification recognition methods, device, electronic equipment and computer-readable medium | |
Cao et al. | Learning crisp boundaries using deep refinement network and adaptive weighting loss | |
CN111027576B (en) | Cooperative significance detection method based on cooperative significance generation type countermeasure network | |
CN112464865A (en) | Facial expression recognition method based on pixel and geometric mixed features | |
CN110490238A (en) | A kind of image processing method, device and storage medium | |
CN104834941A (en) | Offline handwriting recognition method of sparse autoencoder based on computer input | |
CN114332578A (en) | Image anomaly detection model training method, image anomaly detection method and device | |
CN106650670A (en) | Method and device for detection of living body face video | |
CN110222780A (en) | Object detecting method, device, equipment and storage medium | |
CN102156871A (en) | Image classification method based on category correlated codebook and classifier voting strategy | |
CN109993201A (en) | A kind of image processing method, device and readable storage medium storing program for executing | |
CN109858505A (en) | Classifying identification method, device and equipment | |
CN115222946B (en) | Single-stage instance image segmentation method and device and computer equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||