CN109214426A

CN109214426A - Method and deep neural network model for object appearance detection

Info

Publication number: CN109214426A
Application number: CN201810895905.8A
Authority: CN
Inventors: 王新维
Original assignee: Individual
Current assignee: Shenzhen Yujun Vision Intelligent Technology Co ltd
Priority date: 2018-08-08
Filing date: 2018-08-08
Publication date: 2019-01-15

Abstract

The invention relates to a method and a deep neural network model for object appearance detection. The method comprises the following steps: building a deep neural network model; training the model through repeated training cycles so that it acquires the ability to discriminate appearance; through a large number of such training cycles, letting the model gradually converge to the optimal weight of each feature value; and, at detection time, convolving the appearance picture of the object to be inspected and obtaining the classification result from the configured probability interval. The deep neural network model comprises a training module, an evaluation module and a prediction module. The invention addresses the problems of manually formulated judgment rules for appearance classification, and overcomes the low efficiency of manual inspection and the low accuracy of conventional automated judgment. Because the model can keep upgrading itself iteratively, its recognition performance can in theory keep improving. The modules are simple, the hardware cost is low, and the range of applications is wide.

Description

Method and deep neural network model for object appearance detection

[Technical field]

The present invention relates to the technical field of intelligent image recognition and to the automatic detection of object appearance, and more particularly to a method and a deep neural network model for object appearance detection.

[Background art]

With the continuous development of industry and people's growing material needs, the annual output of some consumer electronic products already exceeds 100 million. The mainstream method of object appearance detection still relies on visual inspection by production-line operators against predefined judgment criteria. Manual inspection requires a great deal of labor, has a high error rate and low efficiency, and easily causes capacity bottlenecks and a high rate of factory returns and customer complaints. To raise efficiency, save labor and reduce returns and complaints, the demand for automatic detection keeps growing.

Some fields have begun to replace manual inspection with automatic detection. In the existing automatic methods, technicians formulate judgment rules from previous experience and inspection criteria and implement them in a program. Although this improves working efficiency, the rules depend mainly on the operators' past experience and earlier test data, and it is difficult for people to discover the mutual mapping relations among large volumes of data. As a result the rules are one-sided, and cases the rules do not cover cannot be judged. Because the rules themselves are seriously flawed, this approach cannot solve the problem in principle.

In view of this, the field urgently needs a method and a deep neural network model for object appearance detection that can establish comprehensive and accurate decision rules, upgrade iteratively and automatically, and are efficient, easy to deploy and highly accurate.

[Summary of the invention]

The technical problem to be solved by the present invention is to provide a method for object appearance detection.

The present invention adopts the following technical scheme:

The present invention provides a method for object appearance detection, with the following specific steps:

Build a deep neural network model based on a convolutional neural network plus a fully connected layer, and input pictures of the relevant objects;

The deep neural network model learns the correlation between object appearance categories and picture features through repeated training cycles on the object pictures, and thereby acquires the ability to discriminate appearance;

Through a large number of such training cycles, the deep neural network model gradually converges to the optimal weight of each feature value;

At detection time, the deep neural network model obtains the appearance picture of the object to be inspected, convolves the image with the set of feature values obtained by training, derives the probability of each category from the activations of the feature values and their weights, and finally obtains the classification result from the configured probability interval.

Further, the method also includes automatic iteration: misjudged pictures are automatically added to the training set, the model is retrained automatically starting from the current best weights, and the accuracy of the retrained model is compared with the accuracy of the current best weights. If it is higher, the newest model is imported automatically, so that the model is iterated continuously.

Further, the pictures are processed before training. The pictures, which have been sorted manually into different folders, are first labeled by category according to their folder; all pictures are then converted to a uniform size and format; and the pixel information of each converted picture is written to a specified file together with its label.

Further, training proceeds cyclically in batches. After each training cycle, new weights for each feature value are derived from the error by differentiation and updated, so that the error decreases gradually during training. The deep neural network model automatically judges the state the model is in during training and automatically performs the corresponding operation.

Further, the state of the model during training is judged from the accuracy of the training result, or from the accuracy obtained by validating the training result against a validation picture set.

Further, the states include a non-convergence state, an underfitting state and an overfitting state. Training ends in the non-convergence state and in the overfitting state, and continues in the underfitting state;

The non-convergence state means that the accuracy of the training result is below a set value;

The overfitting state means that the accuracy of the validation result keeps decreasing;

The underfitting state means that the accuracy of the validation result keeps increasing.

Further, the deep neural network model stores the weights updated after training on every batch, restores the weights of each batch to the model and evaluates them with test data, and records the accuracy corresponding to that batch's weights. After all batch weights have been tested, the deep neural network model automatically recommends the top-ranked weights.

Further, the classification result is obtained as follows: a judgment for each face of the object is derived from the per-face class probabilities given by the deep neural network model, and the classification result is then derived from the judgments of all faces.

A deep neural network model is also disclosed, including:

A training module, for training on pictures of the relevant objects in batched cycles according to the configured settings;

An evaluation module, for storing the weights updated after all training, restoring each set of weights to the model and evaluating it with test data, and recording the accuracy corresponding to that batch's weights, after which the model automatically recommends the top-ranked weights;

A prediction module, for obtaining appearance pictures of the article, computing the class probabilities of each picture with the fully connected deep neural network, and deriving the judgment from the class probabilities;

The deep neural network model further comprises a picture preprocessing module, for reading pictures of the relevant objects, labeling them by category automatically, and processing them into data that meets the configured settings.

Further, the neural network model also includes an iteration module, for adding misjudged pictures to the training module, retraining automatically starting from the model with the current best weights, and comparing the accuracy of the retrained model with the accuracy of the current best weights. If it is higher, the newest model is imported automatically, so that the model is iterated continuously.

Compared with prior art, the beneficial effects of the present invention are:

The present invention builds a multilayer model based on a convolutional neural network (CNN) and a fully connected layer, and trains it with a large number of pictures labeled with the appearance state of the object, so that the model learns the correlation between object appearance categories and picture features and thereby acquires the ability to discriminate appearance. An iteration module is added so that, during normal prediction, the system keeps optimizing the model, evaluates the optimized model automatically, and decides automatically from the evaluation result whether to import it into normal prediction. The invention thus realizes a complete closed loop of model building, training, evaluation, import into normal prediction, and automatic iteration. Only model building, training and evaluation require simple manual intervention; subsequent normal prediction and automatic iteration are completed entirely automatically by the system.

The present invention addresses the problems of manually formulated judgment rules for object appearance classification, and overcomes the low efficiency of current manual operation and the low accuracy of conventional automated judgment. Because the model can keep upgrading itself iteratively, its recognition performance can in theory keep improving. The modules of the invention are simple, the hardware cost is low, and the range of applications is wide.

[Brief description of the drawings]

Fig. 1 is a module connection diagram of the deep neural network model;

Fig. 2 is a schematic diagram of another embodiment of Fig. 1;

Fig. 3 is a schematic diagram of another embodiment of Fig. 2;

Fig. 4 is a schematic diagram of the operation of the prediction module;

Fig. 5 is an algorithm architecture diagram of the deep neural network model of one embodiment.

[Detailed description of the embodiments]

To enable those skilled in the art to better understand the solution of the present invention, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative work fall within the scope of protection of the invention.

The terms "first", "second", "third" and the like in the description, the claims and the drawings are used to distinguish different objects, not to describe a particular order. The term "include" and any variants of it are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that comprises a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or other steps or units inherent to such a process, method, product or device.

At detection time, the deep neural network model obtains the appearance picture of the object to be inspected, convolves the image with the set of feature values obtained by training, derives the probability of each category from the activations of the feature values and their weights, and finally obtains the classification result from the configured probability interval.

The aim is for the machine to discover on its own the mapping between the various appearance categories of an object and its image features, and to classify object appearance accordingly. To this end, the scheme trains the model with a large number of labeled object pictures (for example, qualified or unqualified). Picture features are extracted layer by layer by the multi-layer convolutional neural network; a loss value is obtained from the difference between the predicted label and the training label; the loss is differentiated backwards through the network to find the weights that contribute most to the loss and update them; and the predicted value is finally computed by the fully connected layer.

The deep neural network model here includes:

A training module, for learning from pictures of the relevant objects in batched training cycles according to the configured settings;

An evaluation module, for storing the weights updated after all training, restoring each set of weights to the model and evaluating it with test data, recording the accuracy corresponding to that batch's weights, and then recommending the top-ranked weights to the model;

The deep neural network model further comprises a picture preprocessing module, for reading pictures of the relevant objects, labeling them by category, and processing them into data that meets the configured settings.

Referring to Fig. 1, in one embodiment the deep neural network model comprises only the training module, the evaluation module and the prediction module, which is sufficient to detect object appearance and judge the result.

Referring to Fig. 2, in another embodiment the deep neural network model comprises the training module, the evaluation module and the prediction module, with a picture preprocessing module added before the training module so that the whole model trains more efficiently.

Further, the neural network model also includes an iteration module, for adding misjudged pictures to the training module, retraining automatically starting from the model with the current best weights, and comparing the accuracy of the retrained model with the accuracy of the current best weights. If it is higher, the newest model is imported automatically, so that the model is iterated continuously.

Referring to Fig. 3, in another embodiment the deep neural network model comprises the training module, the evaluation module and the prediction module, with a picture preprocessing module added before the training module so that the whole model trains more efficiently, and with an iteration module also added to maintain the accuracy of the neural network model under varying conditions.

It should be pointed out that the multilayer model as built includes cameras for capturing pictures of the articles. The cameras must be able to capture pictures of the same object from multiple angles simultaneously, and the number of cameras can be increased or decreased according to the actual object.

The multilayer model is a deep neural network model, formed specifically from a multi-layer CNN network followed by a fully connected layer. The model achieves its depth by introducing the concepts of identity mapping and module-wise dimension raising, and in theory a neural network of a thousand layers can be built.

The concrete implementation is as follows. The traditional construction, in which the output of each layer is simply the input of the next layer, is changed so that the output of the first layer serves as the input of the second layer and is also added to the output of the third layer to form the final output of the third layer. The final output of the third layer then serves as the input of the fourth layer and is also added to the output of the fifth layer, and the model is built up in this way.
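
The wiring described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: `forward` and the toy `double` layer are hypothetical names, each layer is modeled as a function on a list of numbers, and every layer is assumed to keep the same output size so that two outputs can be added element-wise (the description does not say how the addition is handled where the number of convolution kernels changes).

```python
def forward(x, layers):
    """Run x through the layers, adding a shortcut at layers 3, 5, 7, ...

    layers[0] is layer 1. The output of layer 1 feeds layer 2 and is also
    added to the output of layer 3; that sum feeds layer 4 and is also
    added to the output of layer 5, and so on.
    """
    out = layers[0](x)
    shortcut = out  # most recent "final output", kept for the next addition
    for number, layer in enumerate(layers[1:], start=2):
        out = layer(out)
        if number % 2 == 1:  # odd-numbered layers receive the shortcut
            out = [a + b for a, b in zip(out, shortcut)]
            shortcut = out
    return out


def double(values):
    """Hypothetical stand-in for a convolutional layer."""
    return [2.0 * v for v in values]


result = forward([1.0], [double] * 5)
```

With five doubling layers and an input of 1.0, layer 3 yields 8 + 2 = 10 and layer 5 yields 40 + 10 = 50.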

The model also introduces the concept of modular dimension raising: apart from the first layer and the last layer, every 4 layers starting from the second layer form one module. The first layer of each module raises the number of convolution kernels by a factor of K (K is a constant), and the number of convolution kernels in the following three layers remains unchanged.

In one embodiment, referring to Fig. 5, the multilayer model consists of a 13-layer CNN network and one fully connected layer. The output of the first layer serves as the input of the second layer and is also added to the output of the third layer to form the final output of the third layer. The final output of the third layer then serves as the input of the fourth layer and is also added to the output of the fifth layer, and the model is built up in this way.

In this embodiment, referring to Fig. 5, apart from the first layer and the last layer, every 4 layers starting from the second layer form one module, so layers 2 to 13 form 3 modules. The first layers of the 3 modules, namely layers 2, 6 and 10, raise the number of convolution kernels by a factor of K (K is a constant), and the number of convolution kernels in the remaining three layers of each module is unchanged.

The 13-layer CNN network ends with a fully connected layer that computes, for every picture, the probability of each possible category; in this embodiment the softmax function is used for the calculation.
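
For reference, the softmax calculation mentioned above can be sketched as follows. The function is the standard softmax; the example scores and the two category names are assumptions made only for illustration.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for the categories "qualified" and "unqualified".
probs = softmax([2.0, 0.5])
```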

Referring to Fig. 4, during normal prediction the appearance pictures of each object are captured by the cameras and input to the multilayer model. The multilayer model convolves the image with the set of feature values obtained by training, derives the probability of each category from the activations of the feature values and their weights, and obtains the result from the configured probability interval. The invention thereby removes the inaccuracy and limitations of classification rules set manually for the machine.
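
One possible reading of "obtaining the result from the configured probability interval" is sketched below. The interval boundaries, the `classify` helper and the "undetermined" fallback are all assumptions for illustration; the description does not give concrete values.

```python
# Hypothetical intervals on the predicted probability of "qualified".
INTERVALS = [
    (0.9, 1.0, "qualified"),
    (0.0, 0.5, "unqualified"),
]


def classify(p_qualified, intervals=INTERVALS, fallback="undetermined"):
    """Return the label of the first interval that contains p_qualified."""
    for low, high, label in intervals:
        if low <= p_qualified <= high:
            return label
    return fallback  # probability falls between the configured intervals
```

For example, `classify(0.95)` gives "qualified", while `classify(0.7)` falls in no configured interval and gives the fallback.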

To ensure that the system model keeps improving and its results become more accurate, the method also includes automatic iteration: misjudged pictures are automatically added to the training set, the model is retrained automatically starting from the current best weights, and the accuracy of the retrained model is compared with the accuracy of the current best weights. If it is higher, the newest model is imported automatically, so that the model is iterated continuously. Through this sustained self-iteration, the recognition performance of the model can keep improving.
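
The automatic iteration step can be sketched as a single function. This is a minimal sketch, not the patented implementation: `iterate_once`, the dictionary layout of `best`, and the `retrain` and `evaluate` callables are hypothetical names standing in for the training and evaluation modules.

```python
def iterate_once(best, misjudged, training_set, retrain, evaluate):
    """Retrain from the current best weights and keep the better model.

    best: dict with keys "weights" and "accuracy" (current best model).
    misjudged: pictures that were classified incorrectly.
    retrain(weights, pictures) -> new weights (assumed callable).
    evaluate(weights) -> accuracy on the test data (assumed callable).
    """
    training_set.extend(misjudged)  # add the misjudged pictures
    candidate = retrain(best["weights"], training_set)
    candidate_accuracy = evaluate(candidate)
    if candidate_accuracy > best["accuracy"]:
        # The retrained model is better, so it replaces the current one.
        return {"weights": candidate, "accuracy": candidate_accuracy}
    return best  # otherwise the current best model stays in use
```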

To make the picture data uniform during training and so improve training efficiency, the input object pictures are also preprocessed. Specifically, the pictures are processed before the appearance pictures of the relevant objects are input to the model. The pictures, which have been sorted manually into different folders, are first labeled by category according to their folder; all pictures are then converted to a uniform size and format; and the pixel information of each converted picture is written to a specified file together with its label. The specified file here is a file in tfrecord format.

In one embodiment, the object pictures are first stored in specified folders as required, and the model labels them by category according to the folder names; the labels here comprise two classes, qualified and unqualified. The format of all pictures is then converted to JPG and their size to 64*64. Finally, the pixel information of each converted picture is written to a tfrecord-format file together with its label, which completes the preprocessing of the pictures.
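
The folder-based labeling step could look like the sketch below. The folder names "qualified" and "unqualified", the numeric label values and the helper `label_from_path` are assumptions; the conversion to 64*64 JPG and the writing of the tfrecord file are deliberately left out.

```python
from pathlib import PurePosixPath

# Hypothetical folder names mapped to numeric labels.
LABELS = {"qualified": 0, "unqualified": 1}


def label_from_path(picture_path):
    """Derive the label of a picture from the name of its parent folder."""
    folder = PurePosixPath(picture_path).parent.name
    return LABELS[folder]


pairs = [
    (path, label_from_path(path))
    for path in ["data/qualified/a.jpg", "data/unqualified/b.jpg"]
]
```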

After each training cycle, the deep neural network model derives new weights for each feature value from the error by differentiation and updates them. Because the error decreases gradually during training, the model can automatically judge the state of training and automatically perform the corresponding operation.

Specifically, after preprocessing the pictures are passed to the model automatically for training. The number of training cycles and the number of pictures in each training batch are configured in the system. The batches are fed to the model for training one after another in order, and after every cycle new weights for each feature value are derived from the error and updated.

The state of training is judged from the accuracy of the training result, or from the accuracy obtained by validating the training result against a validation picture set;

The states include a non-convergence state, an underfitting state and an overfitting state. Training ends in the non-convergence state and in the overfitting state, and continues in the underfitting state;

In one embodiment, the number of training cycles is set to 10. If, when the 10 cycles have finished, the loss value is still very large (here set as loss > 500) and the accuracy is very low (here set as below 55%), the system judges the model to be in the non-convergence state and ends training automatically, saving training time.

If instead the loss value and the accuracy both meet the set requirements after the 10 training cycles, then from that point on the result of every two training cycles is validated against the validation picture set, with the number of comparisons set to five here. If the validation accuracy rises across all five validations, the model is considered underfitted and continues training; conversely, if the validation accuracy falls across all five validations, the model is considered overfitted and training ends automatically.

The validation picture set here is a set of pictures prepared in advance by manual classification and used in training the initial model.

The weights updated after training on every batch are stored by the deep neural network model. The weights of each batch are restored to the model and evaluated with test data, and the accuracy corresponding to that batch's weights is recorded. After all batch weights have been tested, the system automatically recommends the top-ranked weights.

In one embodiment, the system is set to recommend the top three weights automatically after all batch weights have been tested.
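
Recommending the top-ranked weights amounts to sorting the recorded accuracies. A minimal sketch, in which the record layout and the name `recommend_weights` are hypothetical:

```python
def recommend_weights(records, top_n=3):
    """Return the top_n records with the highest test accuracy."""
    ranked = sorted(records, key=lambda r: r["accuracy"], reverse=True)
    return ranked[:top_n]


# Hypothetical evaluation records, one per stored batch of weights.
records = [
    {"batch": 1, "accuracy": 0.71},
    {"batch": 2, "accuracy": 0.84},
    {"batch": 3, "accuracy": 0.79},
    {"batch": 4, "accuracy": 0.82},
]
top_three = recommend_weights(records)
```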

During prediction, the judgment for each face of the object is obtained from the class probabilities of that face given by the deep neural network model.

In one embodiment, the system is set to judge the appearance of the object as qualified only when the results of all faces are qualified; otherwise the operator is told which faces were judged unqualified.
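
The all-faces rule of this embodiment can be sketched as follows; the face names and the helper `judge_object` are assumptions for illustration.

```python
def judge_object(face_results):
    """Combine per-face judgments into one result for the object.

    face_results maps a face name to True (qualified) or False.
    Returns the overall verdict and the list of unqualified faces.
    """
    failed = [face for face, ok in face_results.items() if not ok]
    if failed:
        return "unqualified", failed  # tell the operator which faces failed
    return "qualified", []


verdict, failed_faces = judge_object({"front": True, "back": False, "left": True})
```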

The present invention addresses at its root the inaccuracy and limitations that arise in existing automatic appearance detection from classification rules set manually for the machine. It also addresses the low accuracy, low efficiency and wasted labor of manual classification in scenarios where an article is mass-produced. The modular design of the invention is simple to deploy, the hardware cost is low, and the model can be reused across image classification fields to save labor and improve efficiency. Against the background of the country's strong promotion of the industrial internet, the method can be combined with the many production scenarios that require repetitive manual classification of object appearance, and therefore has a wide range of applications.

Specifically, a 13-layer CNN network is taken as an example:

First, based on the Tensorflow platform, a model named Multiply_Classification_CNN_Upgrade_Model is developed in Python. The main function of the model is the classification and recognition of object images.

The whole model construction is encapsulated in a Python class. The methods of the class break the model structure down into modules, from which the whole model is finally assembled.

The components from which the model is built are: get_tfrecord_data, batch_data, convolution_net, one_hot, loss, optimizer.

get_tfrecord_data: reads the picture data and their labels from the tfrecord-format file, flips the pictures left-right at random, and randomly adjusts picture brightness and contrast within a certain range.

batch_data: places a specified 100 pictures in one array and the corresponding labels in another array; the two arrays together form the data of one batch.
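
The batching described for batch_data can be sketched in plain Python as below. This is not the Tensorflow code of the embodiment; `make_batches` is a hypothetical helper, and only the batch size of 100 is taken from the text.

```python
def make_batches(pictures, labels, batch_size=100):
    """Yield (pictures, labels) pairs of at most batch_size items each."""
    for start in range(0, len(pictures), batch_size):
        stop = start + batch_size
        yield pictures[start:stop], labels[start:stop]


# Hypothetical data: 250 picture placeholders with alternating labels.
pictures = list(range(250))
labels = [i % 2 for i in range(250)]
batches = list(make_batches(pictures, labels))
```

In this sketch the last batch is smaller when the number of pictures is not a multiple of 100.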

convolution_net: performs convolution and classification on one batch of pictures.

This example has 13 layers in total, comprising 12 convolutional layers followed by a final fully connected layer. In this embodiment the output of the first layer serves as the input of the second layer, and is also added to the output of the third layer to form the final output of the third layer (this forms an identity mapping). The final output of the third layer then serves as the input of the fourth layer, and is also added to the output of the fifth layer to form the final output of the fifth layer (again forming an identity mapping), and the model is built up in this way.

Apart from the first layer and the last layer, every 4 layers starting from the second layer form one module, so layers 2 to 13 form 3 modules. The first layers of the 3 modules, namely layers 2, 6 and 10, raise the number of convolution kernels by a factor of 8, and the number of convolution kernels in the remaining three layers of each module is unchanged.

That is, the initial number of convolution kernels in the first layer is 16; layers 2 to 5 are raised to 16*8=128, layers 6 to 9 to 32*8=256, and layers 10 to 13 to 64*8=512. The learning rate decays linearly, in this example from 0.001 to 0.00001.
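
The kernel counts and the learning-rate schedule stated above can be written down as a short sketch. The kernel numbers (16, then 16*8, 32*8 and 64*8) and the learning-rate end points are taken from the text; the helper names and the idea of decaying per training step over a fixed `total_steps` are assumptions.

```python
def kernels_for_layer(layer, k=8):
    """Number of convolution kernels for layers 1 to 13 of the example."""
    if layer == 1:
        return 16
    module = (layer - 2) // 4  # 0 for layers 2-5, 1 for 6-9, 2 for 10-13
    return (16, 32, 64)[module] * k


def linear_lr(step, total_steps, start=0.001, end=0.00001):
    """Learning rate that decreases linearly from start to end."""
    fraction = min(step, total_steps) / total_steps
    return start + (end - start) * fraction


layout = [kernels_for_layer(n) for n in range(1, 14)]
```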

one_hot: converts label data to one_hot format. For example, with four categories, a picture labeled 2 has the one_hot data [0,0,1,0].
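
The conversion can be reproduced in a few lines of plain Python. This is an illustrative sketch, not the Tensorflow call used in the embodiment.

```python
def one_hot(label, num_classes):
    """Return a list of zeros with a single 1 at position `label`."""
    vector = [0] * num_classes
    vector[label] = 1
    return vector


encoded = one_hot(2, 4)  # the example from the text: [0, 0, 1, 0]
```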

loss: computes the loss value using cross entropy.
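
For a single picture, the cross-entropy loss between a one_hot label and the predicted class probabilities can be sketched as below. The function name, the small `eps` guard against log(0) and the example probabilities are assumptions.

```python
import math


def cross_entropy(one_hot_label, probabilities, eps=1e-12):
    """Cross-entropy loss for one picture: -sum(t * log(p))."""
    return -sum(
        t * math.log(max(p, eps))
        for t, p in zip(one_hot_label, probabilities)
    )


# Hypothetical prediction for a picture whose true label is 2.
loss_value = cross_entropy([0, 0, 1, 0], [0.1, 0.1, 0.7, 0.1])
```

The loss shrinks toward zero as the probability assigned to the true category approaches 1.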

optimizer: optimizes the model data with an optimization function; this model uses the tf.train.AdamOptimizer function to minimize the loss value.

This deep neural network model was compared in testing with a traditional CNN network:

The same training set (70000) and test set (4600) were used in the comparison;

A plain 13-layer traditional CNN network reached a highest accuracy of 72%;

With identity mapping added to the 13-layer traditional CNN network, the highest accuracy was 79%;

With identity mapping and modular dimension raising (k=8) added to the 13-layer traditional CNN network, the highest accuracy reached 84.7%;

With identity mapping and modular dimension raising (k=8) added to the 13-layer traditional CNN network, and the automatic upgrade iteration module also added, the highest accuracy reached so far is 87.8%.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims

1. A method for object appearance detection, characterized by the following specific steps:

building a deep neural network model based on a convolutional neural network plus a fully connected layer, and inputting pictures of the relevant objects;

the deep neural network model learning the correlation between object appearance categories and picture features through repeated training cycles on the object pictures, and thereby acquiring the ability to discriminate appearance;

through a large number of such training cycles, the deep neural network model gradually converging to the optimal weight of each feature value;

at detection time, passing the appearance picture of the object to be inspected to the model, the model convolving the image with the set of feature values obtained by training, deriving the probability of each category from the activations of the feature values and their weights, and finally obtaining the classification result from the configured probability interval.

2. The method for object appearance detection according to claim 1, characterized in that the method further comprises automatic iteration after detection: pictures misjudged in detection are automatically added to the training set, the model is retrained automatically starting from the current best weights, and the accuracy of the retrained model is compared with the accuracy under the current best weights; if it is higher, the newest model is imported automatically, so that the model is iterated continuously.

3. The method for object appearance detection according to claim 1, characterized in that picture processing is carried out before training: the pictures, which have been sorted manually into different folders, are first labeled by category automatically; all pictures are converted to a uniform size and format; and the pixel information of each converted picture is written to a specified file together with its label.

4. The method for object appearance detection according to claim 1, characterized in that the training proceeds cyclically in batches; after each training cycle, new weights for each feature value are derived from the error by differentiation and updated, so that the error decreases gradually during training; and the deep neural network model automatically judges the state the model is in during training and automatically performs the corresponding operation.

5. The method for object appearance detection according to claim 4, characterized in that the state of the model during training is judged from the accuracy of the training result, or from the accuracy obtained by validating the training result against a validation picture set.

6. The method for object appearance detection according to claim 5, characterized in that the states include a non-convergence state, an underfitting state and an overfitting state; training ends in the non-convergence state and in the overfitting state, and continues in the underfitting state;

the non-convergence state means that the accuracy of the training result is still below a set value after a certain number of cycles.

7. The method for object appearance detection according to claim 1, characterized in that the deep neural network model stores the weights updated after training on every batch, restores the weights of each batch to the model and evaluates them with test data, and records the accuracy corresponding to that batch's weights; after all batch weights have been tested, the deep neural network model automatically recommends the top-ranked weights.

8. The method for object appearance detection according to claim 1, characterized in that the classification result is obtained by deriving a judgment for each face from the per-face class probabilities given by the deep neural network model, and then deriving the classification result from the judgments of all faces.

9. A deep neural network model, characterized by including:

a convolutional neural network model to which the concepts of identity mapping and module-wise dimension raising are added;

a training module, for reading pictures of the relevant objects and learning from them in batched training cycles according to the configured settings;

an evaluation module, for storing the weights updated after all training, restoring each set of weights to the model and evaluating it with test data, recording the accuracy corresponding to that batch's weights, and then recommending the top-ranked set of weights to the model;

a prediction module, for obtaining appearance pictures of the article, computing the class probabilities of each picture with the fully connected deep neural network, and deriving the judgment from the class probabilities;

the deep neural network model further comprising a picture preprocessing module, for reading pictures of the relevant objects, labeling them by category automatically, and processing them into data that meets the configured settings.

10. The deep neural network model according to claim 9, characterized in that the neural network model further includes an iteration module, for adding misjudged pictures to the training module, retraining automatically starting from the model with the current best weights, and comparing the accuracy of the retrained model with the accuracy of the current best weights; if it is higher, the newest model is imported automatically, so that the model is iterated continuously.