CN110598752B

CN110598752B - Image classification model training method and system for automatically generating training data set

Info

Publication number: CN110598752B
Application number: CN201910759146.7A
Authority: CN
Inventors: 刘骏; 张啸宇
Original assignee: Shenzhen Yujun Vision Intelligent Technology Co ltd
Current assignee: Shenzhen Yujun Vision Intelligent Technology Co ltd
Priority date: 2019-08-16
Filing date: 2019-08-16
Publication date: 2023-04-18
Anticipated expiration: 2039-08-16
Also published as: CN110598752A

Abstract

The invention discloses an image classification model training method and system for automatically generating a training data set. The method comprises: building a key feature learning model and training it with an original training set consisting of pictures labeled only with the key features of objects; capturing the target features of each picture in the original training set and automatically labeling the captured features according to their classification to generate key feature data; training on the key feature data; and, after that training, feeding the misclassified pictures back to the feature learning model and adding them to its training set for iterative upgrading. The invention addresses several problems of current deep learning model training under big-data conditions: data that steers training toward irrelevant features, disordered data labeling, data classification errors, low data-sorting efficiency, and model accuracy that falls short of production requirements.

Description

Image classification model training method and system capable of automatically generating training data set

Technical Field

The invention relates to the technical field of image vision, in particular to an image classification model training method and system for automatically generating a training data set.

Background

With the continuous development of industrial production and people's growing material demands, the output of consumer electronics increases year by year, and some enterprises produce more than 200 million consumer electronic products annually. As output expands, enterprises have an increasingly strong need for automated inspection of product appearance. Besides traditional automation companies, a small number of AI companies have begun to invest in appearance inspection technology. However, because AI development is still at an early stage, image recognition accuracy across the industry is currently about 90% and generalization is poor, which makes industrial deployment difficult.

Current deep learning models need big data for training, and the preliminary data collection, data sorting, and data labeling must be done manually. This consumes a large amount of human resources and time, and classification and labeling errors occur from time to time. Moreover, because the amount of training data is huge, most training data only carries a simple classification label on the original picture. These conditions directly lead to poor training results: research and development personnel must spend a lot of time collecting and labeling more data and repeating experiments, which consumes considerable resources. Even if the test requirement is finally met, the high up-front research and development cost makes the product expensive, which hinders the popularization of AI technology.

Disclosure of Invention

In view of the above defects of the prior art, the technical problem to be solved by the present invention is to provide an image classification model training method and system for automatically generating a training data set. A batch of key image features is first screened out using expert experience; a key feature learning model is then used to capture the key features of the original data; finally, the resulting accurate training data set is used to train an image classification model. The model can thus quickly learn the truly key features and raise their weights, so that the training task is completed quickly and with high quality.

In order to achieve the above object, the present invention provides an image classification model training method for automatically generating a training data set, comprising:

step 1, building a key feature learning model, and training the feature learning model by using an original training set consisting of pictures only labeled with key features of objects;

step 2, capturing target features of each picture in the original training set, and automatically labeling the captured features according to the classification of the captured features to generate key feature data;

step 3, training on the key feature data;

step 4, after the key feature data are trained, feeding the misclassified pictures back to the feature learning model, and adding the misclassified pictures to the training set of the feature learning model for iterative upgrading.

Further, the step 1 specifically comprises:

1) Performing a convolution operation on the image using a WideResNet network to extract image features, which are used for subsequent candidate region screening and classification;

2) Screening out the foreground candidate regions with the highest probability, based on the image features extracted by WideResNet, through a 3×3 convolutional layer and a softmax fully connected layer;

3) Classifying the candidate regions using the image features extracted in step 1);

4) Correcting the position of the candidate region by linear regression.

Further, the step 2 specifically includes:

1) Automatic generation: the feature learning model returns the coordinates of the upper-left corner (x1, y1) and the lower-right corner (x2, y2) of the key feature region relative to the original image, and a local key feature picture is automatically generated from these coordinates;

2) Automatic labeling: automatically labeling the original picture and the local key feature picture according to the identified key feature class, thereby labeling the training data automatically;

3) Automatic data expansion: based on a comparison of the quantities of the generated classes of training data, automatically expanding the classes with less data so that the classes reach a 1:1 ratio, the expansion being performed on the original pictures by random cropping.

Further, while training the key feature data in step 3, the method further includes: storing the updated weights of every batch after training, restoring the weights of each batch to key feature learning, evaluating the weights with pre-imported test data, recording the accuracy corresponding to the weights of each batch, and, after the weights of all batches have been tested, automatically recommending the top-ranked weights.

Further, in step 4: during normal key feature capture, the misjudged pictures are automatically added to the training set, training is performed automatically on the basis of the model with the current best weights, and the accuracy of the newly trained model is compared with the accuracy of the current weights; if it is higher, the latest model is automatically imported and iteration continues.

An image classification model training system that automatically generates a training data set, comprising:

the key feature learning module is used for training the feature learning model by utilizing an original training set consisting of pictures only labeled with key features of the object;

the key feature data automatic generation module is used for capturing the target features of each picture in the original training set, automatically labeling the captured features according to the classification of the captured features and generating key feature data;

the model automatic training module is used for training on the key feature data;

and the key feature learning model automatic iteration module is used for feeding the misclassified pictures back to the feature learning model after the key feature data are trained, and adding the misclassified pictures to the training set of the feature learning model for iterative upgrading.

Further, the key feature learning module is specifically configured to:

2) Screening out the foreground candidate regions with the highest probability through a 3×3 convolutional layer and a softmax fully connected layer, based on the image features extracted by WideResNet;

Further, the key feature data automatic generation module is specifically configured to:

1) Automatic generation: the feature learning model returns the coordinates of the upper-left corner (x1, y1) and the lower-right corner (x2, y2) of the key feature region relative to the original image, and a local key feature picture is automatically generated from these coordinates;

3) Automatic data expansion: based on a comparison of the quantities of the generated classes of training data, automatically expanding the classes with less data so that the classes reach a 1:1 ratio, the expansion being performed on the original pictures by random cropping.

Further, the model auto-training module is further configured to:

storing the updated weights of every batch after training, restoring the weights of each batch to key feature learning, evaluating the weights with pre-imported test data, recording the accuracy corresponding to the weights of each batch, and, after the weights of all batches have been tested, automatically recommending the top-ranked weights.

Further, the key feature learning model automatic iteration module is specifically configured to: during normal key feature capture, automatically add the misjudged pictures to the training set, train automatically on the basis of the model with the current best weights, and compare the accuracy of the newly trained model with the accuracy of the current weights; if it is higher, automatically import the latest model and continue iterating.

The invention has the beneficial effects that:

By building a model training system that guides the learning of key image features and automatically generates an accurate training data set, the invention addresses in principle several problems of current deep learning model training under big-data conditions: data that steers training toward irrelevant features, disordered data labeling, data classification errors, low data-sorting efficiency, and model accuracy that falls short of production requirements. Because the system upgrades itself through sustained self-iteration, its preliminary data processing can in theory approach zero errors. The training data is greatly optimized, and the system is managed as modules (SDK), which makes it easy to integrate, simple to deploy, and widely applicable.

The conception, the specific structure and the technical effects of the present invention will be further described with reference to the accompanying drawings to fully understand the objects, the features and the effects of the present invention.

Drawings

FIG. 1 is a flow diagram of a key feature learning module of the present invention;

FIG. 2 is an architecture diagram of the key feature learning module of the present invention;

FIG. 3 is a system architecture diagram of the present invention;

FIG. 4 is a flow diagram of a key feature learning model of the present invention;

FIG. 5 is a schematic diagram illustrating the correction of the position of the candidate region by linear regression according to the present invention.

Detailed Description

The technical solutions in the embodiments of the present application will be described below clearly and completely with reference to the accompanying drawings in the embodiments of the present application, and it should be understood that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments, and that the present application is not limited by the exemplary embodiments disclosed and described herein. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The invention provides an image classification model training method for automatically generating a training data set, which comprises the following steps:

step 3, training on the key feature data;

Wherein, the step 1 specifically comprises the following steps:

Wherein, the step 2 specifically comprises the following steps:

Wherein, while training the key feature data in step 3, the method further comprises: storing the updated weights of every batch after training, restoring the weights of each batch to key feature learning, evaluating the weights with pre-imported test data, recording the accuracy corresponding to the weights of each batch, and, after the weights of all batches have been tested, automatically recommending the top-ranked weights.

Wherein, in step 4: during normal key feature capture, the misjudged pictures are automatically added to the training set, training is performed automatically on the basis of the model with the current best weights, and the accuracy of the newly trained model is compared with the accuracy of the current weights; if it is higher, the latest model is automatically imported and iteration continues.

The invention also provides an image classification model training system for automatically generating a training data set, which comprises:

Wherein the key feature learning module is specifically configured to:

1) Performing a convolution operation on the image using a WideResNet network to extract image features, which are used for subsequent candidate region screening and classification;

The key feature data automatic generation module is specifically configured to:

Wherein the model automatic training module is further configured to: store the updated weights of every batch after training, restore the weights of each batch to key feature learning, evaluate the weights with pre-imported test data, record the accuracy corresponding to the weights of each batch, and, after the weights of all batches have been tested, automatically recommend the top-ranked weights.

The key feature learning model automatic iteration module is specifically configured to: during normal key feature capture, automatically add the misjudged pictures to the training set, train automatically on the basis of the model with the current best weights, and compare the accuracy of the newly trained model with the accuracy of the current weights; if it is higher, automatically import the latest model and continue iterating.

The following specifically discusses the principles of the image classification model training method and system for automatically generating a training data set according to the present invention:

The whole process is as follows. First, expert experience is used to screen out some key features of the images, and the feature learning model learns these key features (small-data learning, 500 to 1000 samples). The feature learning model then automatically captures the key features of the data set used for training; the captured key features and the other training images are collected and automatically labeled, and are then automatically fed into the model for training; misclassified pictures are automatically fed back into the model for retraining. The training system enables the feature learning model to generate key feature training data automatically and guides the image classification model being trained to learn the truly key feature images quickly. Combined with the inherent characteristics of deep learning models, it extracts finer key features, distributes the weights of the various features reasonably, converges quickly, and greatly improves accuracy.

To achieve this purpose, the scheme of the invention is as follows. A key feature learning model (KFSM) is built on the basis of Faster RCNN + WideResNet, as shown in FIG. 4. A small number (500) of pictures labeled with the key features of objects (in this example: crush injury, poor clamping, exposed inner core, and good product) are used to train the model, so that it learns the correlation between key feature classes and picture features and acquires the ability to identify and extract key appearance features. The feature learning model is then used to capture the target features of each image in the original training set, and the captured features are automatically labeled according to their classification. Finally, the learning model randomly rotates the original data and the feature data and adjusts their contrast and brightness, automatically generates the final training data at a ratio of 1:2, and then automatically calls the model for training.
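The random rotation and the contrast and brightness adjustment described above can be sketched in plain Python. This is only an illustration under stated assumptions: images are modelled as lists of rows of 0..255 grey values, rotation is limited to multiples of 90 degrees, and the function names and parameter ranges (`rotate90`, `adjust`, `augment`, contrast 0.8 to 1.2, brightness ±20) are hypothetical and not taken from the source.

```python
import random


def rotate90(image, turns):
    """Rotate a list-of-rows image clockwise by 90 degrees, `turns` times."""
    for _ in range(turns % 4):
        image = [list(row) for row in zip(*image[::-1])]
    return image


def adjust(image, contrast=1.0, brightness=0):
    """Scale pixel values around mid-grey (128), shift them, clamp to 0..255."""
    return [
        [max(0, min(255, round((p - 128) * contrast + 128 + brightness)))
         for p in row]
        for row in image
    ]


def augment(image, rng=random):
    """Apply one random rotation plus a random contrast/brightness change.

    The ranges below are assumptions chosen for illustration only.
    """
    turns = rng.randint(0, 3)
    contrast = rng.uniform(0.8, 1.2)
    brightness = rng.randint(-20, 20)
    return adjust(rotate90(image, turns), contrast, brightness)
```

Calling `augment(picture)` repeatedly on the same picture yields different variants, which is how a small labelled set could be multiplied before training.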

After model training is completed, an evaluation system is used for comparing the accuracy of test data of the model, the wrongly classified pictures are fed back to the KFSM model, and the KFSM model can place the wrongly classified pictures into a training set of the KFSM model for iterative upgrading. The technology realizes the closed loop of key feature learning, key feature capturing, automatic data labeling and training and iterative whole model training, and the whole process is automatically completed by the system.

As shown in fig. 3, the present invention is composed of the following modules:

1. a key feature learning module; 2. a key feature data automatic generation module; 3. a model automatic training module; 4. a key feature learning model automatic iteration module. Each module is introduced below.

1. Key feature learning module

The core of the module is a key feature learning model built based on a FasterRCNN target detection model and a WideResNet classification model, the model consists of a candidate region generation module, a feature extraction module, a classification module and a position correction module, a flow chart of the model is shown in figure 1, and an architecture diagram of the model is shown in figure 2.

1) The image is convolved with a WideResNet network (layers CNN1 to CNN13) to extract image features, which are used for subsequent candidate region screening and classification.

The principle is as follows: a large number of object pictures labeled with their types are used to train the model; picture features are extracted layer by layer through multiple CNN (convolutional neural network) layers; the loss is obtained from the difference between the predicted labels and the training labels; the weights contributing most to the loss are identified and updated by back-propagating the derivative of the loss; and over many training cycles the model gradually converges to the optimal weight for each feature value, completing training. In operation, the model performs the convolutions on the input image one by one, and a new feature map is formed and output once features are found.

2) Based on the image features extracted by WideResNet, the 300 foreground candidate regions with the highest probability are screened out through a 3×3 convolutional layer and a softmax fully connected layer.

3) The candidate regions are classified using the image features extracted in step 1).

4) The position of the candidate region is corrected by linear regression, as shown in FIG. 5. Target detection uses a rectangular box to frame the target, so a four-dimensional vector (x, y, w, h) represents the center coordinates, width, and height of the box. The sparse dashed box represents the original candidate region and the dense dashed box represents the correct region; the original candidate region is translated and scaled to bring it closer to the correct region. Let the original region be O = (Ox, Oy, Ow, Oh) and the correct region be T = (Tx, Ty, Tw, Th). O is mapped to the target region O1 by a linear transformation C such that O1 is as close to T as possible:

C(Ox, Oy, Ow, Oh) = (O1x, O1y, O1w, O1h) ≈ (Tx, Ty, Tw, Th);

Translation formula: O1x = Ox·dx(O) + Ox, O1y = Oy·dy(O) + Oy;

Scaling formula: O1w = Ow·exp(dw(O)), O1h = Oh·exp(dh(O));

The key is to obtain dx(O), dy(O), dw(O), and dh(O) through linear regression learning.

Input: the feature vector X (the features extracted by convolution); learned parameters: Z (the transformation amounts (tx, ty, tw, th) from the original region to the correct region); output: Y = (dx(O), dy(O), dw(O), dh(O)), with Y = ZX.
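Steps 2) and 4) above can be illustrated with a small, framework-free sketch. It assumes each candidate carries a background score and a foreground score from the softmax layer, keeps the k most probable foreground candidates (k = 300 in the text), and then applies regression offsets to a box in (x, y, w, h) center form. The function names and the data layout are hypothetical. The offset formulas follow the common R-CNN parameterization, in which the translation is scaled by the box width and height; the translation formula in the text writes the scale factors as Ox and Oy instead.

```python
import math


def foreground_probability(bg_score, fg_score):
    """Two-way softmax over (background, foreground) scores."""
    m = max(bg_score, fg_score)
    e_bg = math.exp(bg_score - m)
    e_fg = math.exp(fg_score - m)
    return e_fg / (e_bg + e_fg)


def top_foreground_candidates(candidates, k=300):
    """candidates: list of (box, bg_score, fg_score).

    Returns up to k (probability, box) pairs, most probable first.
    """
    scored = [(foreground_probability(bg, fg), box)
              for box, bg, fg in candidates]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


def apply_deltas(box, deltas):
    """Move and rescale box O = (x, y, w, h) by (dx, dy, dw, dh)."""
    x, y, w, h = box
    dx, dy, dw, dh = deltas
    # Translation scaled by width/height (common R-CNN form, an assumption).
    return (w * dx + x, h * dy + y, w * math.exp(dw), h * math.exp(dh))


def target_deltas(box, target):
    """Offsets that map `box` exactly onto `target` (regression targets)."""
    x, y, w, h = box
    tx, ty, tw, th = target
    return ((tx - x) / w, (ty - y) / h, math.log(tw / w), math.log(th / h))
```

In training, `target_deltas` would supply the values the regressor is fitted to; at inference, `apply_deltas` turns the predicted offsets back into a corrected box.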

During training of the model, all classified and labeled key feature images in a specified file (pre-screened by experts) are read, and pictures in various formats are converted into JPG format with a uniform size (300 × 300 in this example); the blank areas of the scaled-down pictures are automatically filled with black. The pixel information of each converted picture is written into a record file together with the corresponding label information, and the weight of each key feature is determined by training the model (in this example, the 200 original samples of each class are automatically expanded to 2000 samples by automatic rotation, brightness adjustment, and random cropping).
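The size normalization described above (scaling to a uniform canvas and filling the blank area with black) can be sketched as follows. This is a minimal illustration, assuming a grey-value image stored as a list of rows, nearest-neighbour sampling, and a centered placement on the canvas; the source specifies none of these details, and the name `letterbox` is hypothetical. Reading and writing actual JPG files is left out.

```python
def letterbox(image, size=300, fill=0):
    """Scale `image` to fit a size x size canvas, padding the rest with `fill`.

    The aspect ratio is preserved; `fill=0` corresponds to black padding.
    """
    src_h, src_w = len(image), len(image[0])
    scale = min(size / src_w, size / src_h)
    new_w = max(1, round(src_w * scale))
    new_h = max(1, round(src_h * scale))
    left = (size - new_w) // 2   # centering is an assumption
    top = (size - new_h) // 2
    canvas = [[fill] * size for _ in range(size)]
    for row in range(new_h):
        src_row = min(src_h - 1, int(row / scale))
        for col in range(new_w):
            src_col = min(src_w - 1, int(col / scale))
            canvas[top + row][left + col] = image[src_row][src_col]
    return canvas
```

For a 600 × 300 input, the picture is halved to 300 × 150 and black bands of 75 rows appear above and below it.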

2. Key characteristic data automatic generation module

After training of the key feature learning model is completed, the generated model (pb file) is automatically used in the key feature capture module to capture the key features of all original images, as follows:

1) Automatic generation: the model returns the coordinates of the upper-left corner (x1, y1) and the lower-right corner (x2, y2) of the key feature region relative to the original image, and a local key feature picture is automatically generated from these coordinates.

2) Automatic labeling: the system automatically labels the original picture and the local key feature picture according to the identified key feature class, thereby labeling the training data automatically.

3) Automatic data expansion: based on a comparison of the quantities of the generated classes of training data, the system automatically expands the classes with less data so that the classes reach a 1:1 ratio. The expansion is performed on the original pictures by random cropping.
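The three steps above can be sketched together in plain Python. The sketch assumes images are lists of pixel rows, that a detection is a tuple `(label, x1, y1, x2, y2)` with the lower-right corner treated as exclusive, and that random crops keep 90% of each side; all names and these conventions are hypothetical. For brevity the balancing step crops any picture of the under-represented class, whereas the text applies the expansion to the original pictures.

```python
import random


def crop(image, x1, y1, x2, y2):
    """Cut the region from upper-left (x1, y1) to lower-right (x2, y2)."""
    return [row[x1:x2] for row in image[y1:y2]]


def generate_samples(image, detections):
    """Label the original picture and each local key feature picture."""
    samples = []
    for label, x1, y1, x2, y2 in detections:
        samples.append((image, label))                        # original picture
        samples.append((crop(image, x1, y1, x2, y2), label))  # local picture
    return samples


def random_crop(image, keep=0.9, rng=random):
    """Return a randomly positioned crop keeping `keep` of each side."""
    h, w = len(image), len(image[0])
    ch, cw = max(1, int(h * keep)), max(1, int(w * keep))
    top = rng.randint(0, h - ch)
    left = rng.randint(0, w - cw)
    return crop(image, left, top, left + cw, top + ch)


def balance(samples, rng=random):
    """Expand smaller classes with random crops until all classes are 1:1."""
    by_label = {}
    for picture, label in samples:
        by_label.setdefault(label, []).append(picture)
    target = max(len(pictures) for pictures in by_label.values())
    balanced = []
    for label, pictures in by_label.items():
        extra = [random_crop(rng.choice(pictures), rng=rng)
                 for _ in range(target - len(pictures))]
        balanced.extend((p, label) for p in pictures + extra)
    return balanced
```

Passing a seeded `random.Random` instance as `rng` makes the expansion reproducible, which helps when comparing training runs.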

3. Model automatic test evaluation module

The system saves the updated weights after training for every batch of the model (a batch is a fixed number of passes over all the training data; the number of passes can be set, and is 10 in this example), restores the weights of each batch to the model, evaluates them with the pre-imported test data, and records the accuracy corresponding to the weights of each batch. After the weights of all batches have been tested, the system automatically recommends the top-ranked weights (the top ten in this example).
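The evaluate-and-rank procedure above can be sketched as follows. The checkpoint store is modelled as a dictionary from a batch name to its saved weights, and `evaluate` is any callable returning an accuracy; both are assumptions for illustration. The toy `accuracy` function uses a single threshold as a stand-in for real model weights.

```python
def rank_checkpoints(checkpoints, evaluate, top_n=10):
    """Score every saved checkpoint and return the top_n (accuracy, name) pairs.

    checkpoints: dict mapping a batch name to its saved weights.
    evaluate: callable taking weights and returning accuracy on the test data.
    """
    scored = [(evaluate(weights), name) for name, weights in checkpoints.items()]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_n]


def accuracy(threshold, test_data):
    """Toy evaluator: the 'weights' are one threshold on a scalar feature."""
    correct = sum((value >= threshold) == label for value, label in test_data)
    return correct / len(test_data)


test_data = [(0.1, False), (0.4, False), (0.6, True), (0.9, True)]
checkpoints = {"batch_1": 0.0, "batch_2": 0.5, "batch_3": 0.8}
best = rank_checkpoints(checkpoints, lambda w: accuracy(w, test_data), top_n=2)
```

Here `best` lists `batch_2` first and `batch_3` second, since their thresholds classify four and three of the four test samples correctly.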

4. Automatic iteration module of key feature learning model

During normal key feature capture, the model automatically adds the misjudged pictures to the training set and trains automatically on the basis of the model with the current best weights. The accuracy of the newly trained model is then compared with the accuracy of the current weights; if it is higher, the latest model is automatically imported and iteration continues.
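One round of this accept-only-if-better iteration can be sketched as below. The `train` and `evaluate` callables are placeholders for the real training and test routines, and the toy versions (a single threshold nudged toward the midpoint between classes) are purely illustrative assumptions.

```python
def iterate_once(current, training_set, misclassified, train, evaluate):
    """Retrain with the misjudged samples; keep the new model only if better.

    Returns (model, accuracy, extended_training_set).
    """
    extended = training_set + misclassified
    candidate = train(current, extended)      # starts from the current model
    current_acc = evaluate(current)
    candidate_acc = evaluate(candidate)
    if candidate_acc > current_acc:
        return candidate, candidate_acc, extended
    return current, current_acc, extended


def toy_train(threshold, data):
    """Move the threshold halfway toward the gap between the two classes."""
    negatives = [value for value, label in data if not label]
    positives = [value for value, label in data if label]
    ideal = (max(negatives) + min(positives)) / 2
    return threshold + 0.5 * (ideal - threshold)


def toy_evaluate(threshold, test_data):
    """Fraction of test samples on the correct side of the threshold."""
    correct = sum((value >= threshold) == label for value, label in test_data)
    return correct / len(test_data)
```

Because the comparison is strict, a retrained model that merely ties the current accuracy is not imported, so the deployed model changes only when the measured accuracy actually improves.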

In conclusion, the design of the invention introduces the concepts of target detection, deep learning, Faster RCNN, CNN, automatic labeling, automatic iteration, and WideResNet, making key feature sorting of massive data possible (in theory there is no upper limit on the amount of data) and greatly shortening the time needed to sort and label data. Because the data contains key features that are defined by humans (experts) and have a definite meaning, the convergence speed and generalization ability of model training are significantly improved. The invention addresses at the root the problems of manpower demand, wrong labeling, disordered classification, and low efficiency caused by the need for large amounts of training data in current deep learning model training. The training data is optimized efficiently, and accuracy can be greatly improved in vertical, narrowly defined scenarios.

It is further noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element described by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. An image classification model training method for automatically generating a training data set, comprising:

step 3, training on the key feature data;

step 4, after the key feature data are trained, feeding back the pictures with the wrong classification to the feature learning model, and putting the pictures with the wrong classification into a training set of the feature learning model for iterative upgrade;

the step 1 specifically comprises the following steps:

4) Correcting the position of the candidate region by linear regression;

the step 2 specifically comprises the following steps:

2. The method as claimed in claim 1, wherein step 3, while training the key feature data, further comprises: storing the updated weights of every batch after training, restoring the weights of each batch to key feature learning, evaluating the weights with pre-imported test data, recording the accuracy corresponding to the weights of each batch, and, after the weights of all batches have been tested, automatically recommending the top-ranked weights.

3. The image classification model training method for automatically generating a training data set according to claim 1, wherein in step 4: during normal key feature capture, the misjudged pictures are automatically added to the training set, training is performed automatically on the basis of the model with the current best weights, and the accuracy of the newly trained model is compared with the accuracy of the current weights; if it is higher, the latest model is automatically imported and iteration continues.

4. An image classification model training system that automatically generates a training data set, comprising:

the key feature learning module is used for training the feature learning model with an original training set consisting of pictures labeled only with the key features of the object;

the key feature learning model automatic iteration module is used for feeding the misclassified pictures back to the feature learning model after the key feature data are trained, and adding the misclassified pictures to the training set of the feature learning model for iterative upgrading;

the key feature learning module is specifically configured to:

4) Correcting the position of the candidate region by linear regression;

the key feature data automatic generation module is specifically configured to:

3) Automatic data expansion: based on a comparison of the quantities of the generated classes of training data, automatically expanding the classes with less data so that the classes reach a 1:1 ratio, the expansion being performed on the original pictures by random cropping.

5. The image classification model training system for automatically generating a training data set according to claim 4, wherein the model automatic training module is further configured to:

6. The image classification model training system for automatically generating a training data set according to claim 4, wherein the key feature learning model automatic iteration module is specifically configured to: during normal key feature capture, automatically add the misjudged pictures to the training set, train automatically on the basis of the model with the current best weights, and compare the accuracy of the newly trained model with the accuracy of the current weights; if it is higher, automatically import the latest model and continue iterating.