CN112488173B - Model training method and system based on image augmentation and storage medium - Google Patents

Model training method and system based on image augmentation and storage medium

Info

Publication number
CN112488173B
CN112488173B (granted publication of application CN202011345783.9A)
Authority
CN
China
Prior art keywords
model
training
training data
image
augmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011345783.9A
Other languages
Chinese (zh)
Other versions
CN112488173A (en)
Inventor
林成创
赵淦森
李壮伟
黄润桦
彭璟
吴清蓝
张奇之
杨晋吉
罗浩宇
李双印
樊小毛
唐华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China Normal University
Original Assignee
South China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Normal University
Priority to CN202011345783.9A
Publication of CN112488173A
Application granted
Publication of CN112488173B
Legal status: Active

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The invention discloses a model training method, system and storage medium based on image augmentation. The method comprises the following steps: performing first training on a first model with pre-acquired first training data to obtain a second model; performing image augmentation on the first training data with a plurality of preset image augmentation modes to obtain second training data; predicting the second training data with the second model; determining a target augmentation mode from the plurality of image augmentation modes according to the prediction result; taking the first training data, together with the data obtained by augmenting the first training data in the target augmentation mode, as third training data; and performing second training on the first model with the third training data to obtain a target model. The method effectively improves how well image-augmented data trains the model. The invention can be widely applied in the technical field of model training.

Description

Model training method and system based on image augmentation and storage medium
Technical Field
The invention relates to the technical field of model training, in particular to a model training method and system based on image augmentation and a storage medium.
Background
Deep learning: a modeling technique built on multi-layer neural network architectures, applicable to image classification, image segmentation, object detection, and the like.
Generalization ability: the ability of a trained model to perform well on a test data set.
Machine vision applications based on deep learning are end-to-end. Taking image classification as an example, training images and their corresponding labels are prepared as a training data set; once a model is designed, classification is accomplished directly through continuous model learning, without technicians designing explicit classification logic. The greatest benefits are that no specific classification logic needs to be devised, the expressive power is strong, and the approach applies widely. However, it requires tens of thousands of labeled images as training data. In some applications, such as medical imaging, collecting and labeling images is extremely time-consuming, labor-intensive and expensive. Image augmentation can therefore effectively enlarge the training data set and improve model generalization. However, different augmentation methods differ greatly in how much they improve the model, yet at present image data is usually augmented with a fixed augmentation method, so the augmented data may fail to train the model effectively.
Disclosure of Invention
To solve the above technical problems, the present invention aims to provide a model training method, system and storage medium based on image augmentation that can effectively improve how well image-augmented data trains the model.
In a first aspect, an embodiment of the present invention provides:
a model training method based on image augmentation comprises the following steps:
performing first training on the first model by adopting first training data acquired in advance to obtain a second model;
performing image augmentation on the first training data by adopting a plurality of preset image augmentation modes to obtain second training data;
predicting the second training data using the second model;
determining a target augmentation mode from the plurality of image augmentation modes according to the prediction result;
taking the data obtained by image augmentation of the first training data in the target augmentation mode, together with the first training data, as third training data;
and carrying out second training on the first model through the third training data to obtain a target model.
Further, the performing the first training on the first model by using the pre-acquired first training data to obtain the second model includes:
acquiring first training data;
performing first training on a first model by using the first training data;
and obtaining a second model after the loss function of the first model meets a preset requirement over a number of iterations of the first training.
Further, the performing image augmentation on the first training data by adopting a plurality of preset image augmentation modes to obtain second training data specifically comprises:
sequentially adopting each of the image augmentation modes to perform image augmentation on the first training data, obtaining second training data corresponding to that image augmentation mode.
Further, the determining a target augmentation mode from the plurality of image augmentation modes according to the prediction result includes:
sequentially calculating a predicted loss value corresponding to the second training data;
the image augmentation mode corresponding to the maximum predicted loss value is taken as the target augmentation mode.
Further, the performing a second training on the first model through the third training data to obtain a target model includes:
performing second training on the first model by using the third training data;
and obtaining a target model after the loss function of the first model meets a preset requirement over a number of iterations of the second training.
Further, the number of iterations refers to a number of consecutive iterations.
Further, the plurality of image augmentation modes include random rotation, random cropping, random pasting and optical image augmentation; each image augmentation mode includes different parameters.
In a second aspect, an embodiment of the present invention provides:
a model training system based on image augmentation, comprising:
the first training module is used for performing first training on the first model by adopting first training data acquired in advance to obtain a second model;
the image augmentation module is used for performing image augmentation on the first training data by adopting a plurality of preset image augmentation modes to obtain second training data;
a prediction module configured to predict the second training data using the second model;
the determining module is used for determining a target augmentation mode from the plurality of image augmentation modes according to a prediction result;
the storage module is used for taking the data obtained by image augmentation of the first training data in the target augmentation mode, together with the first training data, as third training data;
and the second training module is used for carrying out second training on the first model through the third training data to obtain a target model.
In a third aspect, an embodiment of the present invention provides:
a model training system based on image augmentation, comprising:
at least one memory for storing a program;
at least one processor, configured to load the program to execute the model training method based on image augmentation.
In a fourth aspect, an embodiment of the present invention provides:
a storage medium having stored therein processor-executable instructions for implementing the image augmentation-based model training method when executed by a processor.
The embodiment of the invention has the following beneficial effects: first training is performed on the first model with pre-acquired first training data to obtain a second model; image augmentation is then performed on the first training data with a plurality of preset image augmentation modes to obtain second training data; the second model predicts the second training data; a target augmentation mode is determined from the plurality of image augmentation modes according to the prediction result; the first training data, together with the data obtained by augmenting it in the target augmentation mode, is used as third training data; and finally second training is performed on the first model with the third training data to obtain a target model. This avoids the situation in which a fixed image augmentation mode fails to train the model effectively, and effectively improves how well image-augmented data trains the model.
Drawings
Fig. 1 is a flowchart of a model training method based on image augmentation according to an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and specific embodiments. The step numbers in the following embodiments are set for convenience of illustration only; no order between the steps is implied, and the execution order of each step can be adapted according to the understanding of those skilled in the art.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict. Furthermore, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the invention, "a plurality" means two or more unless otherwise specified.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
Referring to fig. 1, an embodiment of the present invention provides a model training method based on image augmentation; the embodiment may be applied to a server or to each processing terminal. The embodiment comprises the following steps:
s1, performing first training on the first model by adopting first training data acquired in advance to obtain a second model; the first training data in this step is data that has not been processed in any image enhancement manner, and is raw image data. The first model may be an image classification model, an image segmentation model, or an object detection model, etc. The second model is the model after the first model is subjected to the first training.
In some embodiments, step S1 may be implemented by:
firstly, first training data is acquired; then, first training is performed on the first model with the first training data; and a second model is obtained after the loss function of the first model meets a preset requirement over a number of consecutive iterations of the first training. The preset requirement is that the loss no longer decreases over those consecutive iterations. In this embodiment, when the loss no longer decreases over several successive iterations, the model is considered to have converged, and the converged model is taken as the second model.
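The convergence test described above (stop once the loss has not decreased for a preset number of consecutive iterations) can be sketched as follows; this is an illustrative early-stopping check, with the function name and the patience value assumed rather than taken from the patent:

```python
def has_converged(loss_history, patience=5):
    """True when the loss has not decreased over the last `patience`
    consecutive iterations (the 'preset requirement' above)."""
    if len(loss_history) <= patience:
        return False
    best_before = min(loss_history[:-patience])
    # converged: no loss in the last `patience` iterations beats the best so far
    return all(loss >= best_before for loss in loss_history[-patience:])
```

A training loop would call `has_converged` after each iteration and, once it returns True, freeze the current weights as the second model.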
S2, performing image augmentation on the first training data by adopting a plurality of preset image augmentation modes to obtain second training data. Specifically, each of the image augmentation modes is applied in turn to the first training data, producing second training data corresponding to the current image augmentation mode. For example, if there are five image augmentation modes A, B, C, D and E, image augmentation is first performed on the first training data with mode A to obtain the second training data corresponding to mode A; then mode B is applied to obtain the second training data corresponding to mode B, and so on, until mode E is applied to obtain the second training data corresponding to mode E, completing the image augmentation operation. In this embodiment, the plurality of image augmentation modes include random rotation, random cropping, random pasting and optical image augmentation, where each image augmentation mode has different parameters, and different parameters correspond to different image augmentation modes.
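The per-mode augmentation loop described above can be sketched as follows. The toy 2x2 "images" and the two stand-in modes are assumptions for illustration only; a real pipeline would implement random rotation, cropping, pasting and optical augmentation with an image library:

```python
def augment_per_mode(first_training_data, modes):
    """Apply each preset augmentation mode to the whole first training
    set in turn, yielding one second-training-data set per mode."""
    return {name: [mode(img) for img in first_training_data]
            for name, mode in modes.items()}

# toy 2x2 "images" (nested lists) and two stand-in modes
modes = {
    "rotate": lambda img: [list(row) for row in zip(*img[::-1])],  # 90-degree rotation
    "crop":   lambda img: [row[:-1] for row in img[:-1]],          # drop last row and column
}
second = augment_per_mode([[[1, 2], [3, 4]]], modes)
```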
S3, predicting the second training data by adopting the second model. In this step, the second training data obtained by each image augmentation mode is predicted in turn with the second model.
S4, determining a target augmentation mode from the plurality of image augmentation modes according to the prediction result; the prediction result in this step is the prediction result in step S3.
In some embodiments, step S4 may be implemented by:
sequentially calculating the predicted loss value corresponding to each set of second training data; the calculation is performed through a loss function, and multiple calculation modes are available, such as cross-entropy loss, mean squared error loss, smooth loss, and the like. Then, the image augmentation mode corresponding to the maximum predicted loss value is taken as the target augmentation mode.
In this embodiment, the larger the predicted loss value, the poorer the currently trained first model's ability to recognize the images produced by applying that augmentation strategy to the original images. This shows that, provided the image labels are unchanged, that augmentation strategy introduces the greatest amount of new information. Using it as the final augmentation strategy therefore enables the model to learn the most discriminative features of the images.
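The selection step — compute a predicted loss per augmented set, then keep the mode with the maximum loss — can be sketched as follows. `mean_cross_entropy` is one of the loss choices named above, and the per-mode loss values in the example are hypothetical:

```python
import math

def mean_cross_entropy(probs, labels):
    """Mean cross-entropy between predicted class probabilities and true labels."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def select_target_mode(predicted_losses):
    """Return the augmentation mode with the maximum predicted loss --
    the one the current second model finds hardest to recognize."""
    return max(predicted_losses, key=predicted_losses.get)

# hypothetical per-mode losses for the five modes A-E of the earlier example
losses = {"A": 0.21, "B": 0.87, "C": 0.45, "D": 0.33, "E": 0.52}
target = select_target_mode(losses)
```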
S5, taking the data obtained by image augmentation of the first training data in the target augmentation mode, together with the first training data, as third training data, and performing second training on the first model through the third training data to obtain a target model.
In some embodiments, the second training of the first model by the third training data results in a target model, which may be implemented by:
performing second training on the first model by adopting third training data;
and obtaining a target model after the loss function of the first model meets the preset requirement over a number of consecutive iterations of the second training.
In this embodiment, the preset requirement is that the loss no longer decreases over those consecutive iterations. When the loss no longer decreases over several successive iterations, the model is considered to have converged, and the converged model is taken as the target model, which improves the prediction accuracy of the target model in application.
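Steps S1-S5 described above can be tied together in one sketch. All function arguments here (`train`, `first_data`, `modes`, `predicted_loss`) are assumed stand-ins rather than an API from the patent; `train` plays both the first and the second training, and the toy stubs below only exercise the control flow:

```python
def train_with_auto_augmentation(train, first_data, modes, predicted_loss):
    second_model = train(first_data)                                         # S1: first training
    second_sets = {n: [m(x) for x in first_data] for n, m in modes.items()}  # S2: one set per mode
    losses = {n: predicted_loss(second_model, d)
              for n, d in second_sets.items()}                               # S3: predict each set
    target_mode = max(losses, key=losses.get)                                # S4: max predicted loss
    third_data = first_data + second_sets[target_mode]                       # S5: original + augmented
    return train(third_data), target_mode                                    # S5: second training

# toy stubs: "training" returns the data set size, "loss" sums the samples
model, mode = train_with_auto_augmentation(
    train=len,
    first_data=[1, 2, 3],
    modes={"double": lambda x: 2 * x, "negate": lambda x: -x},
    predicted_loss=lambda m, d: sum(d),
)
```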
In summary, this method embodiment can quickly determine the optimal augmentation strategy at the cost of only one additional training run: given just an image training set, a designed model, and a set of configured image augmentation modes, the method automatically searches out the strongest image augmentation mode, thereby improving the model training effect and the prediction accuracy of the target model in application.
Corresponding to fig. 1, an embodiment of the invention provides a model training system based on image augmentation, comprising:
the first training module is used for carrying out first training on the first model by adopting first training data acquired in advance to obtain a second model;
the image augmentation module is used for performing image augmentation on the first training data by adopting a plurality of preset image augmentation modes to obtain second training data;
a prediction module configured to predict the second training data using the second model;
the determining module is used for determining a target augmentation mode from the plurality of image augmentation modes according to a prediction result;
the storage module is used for taking the data obtained by image augmentation of the first training data in the target augmentation mode, together with the first training data, as third training data;
and the second training module is used for carrying out second training on the first model through the third training data to obtain a target model.
The contents of the embodiment of the method of the invention are all applicable to the embodiment of the system, the functions specifically realized by the embodiment of the system are the same as those of the embodiment of the method, and the beneficial effects achieved by the embodiment of the system are also the same as those achieved by the method.
The embodiment of the invention provides a model training system based on image augmentation, which comprises:
at least one memory for storing a program;
at least one processor configured to load the program to perform the method of FIG. 1.
The contents of the embodiment of the method of the invention are all applicable to the embodiment of the system, the functions specifically realized by the embodiment of the system are the same as those of the embodiment of the method, and the beneficial effects achieved by the embodiment of the system are also the same as those achieved by the method.
Embodiments of the present invention provide a storage medium having stored therein processor-executable instructions, which when executed by a processor, are used to implement the method shown in fig. 1.
The embodiment of the invention also discloses a computer program product or a computer program, which comprises computer instructions, and the computer instructions are stored in a computer readable storage medium. The computer instructions may be read by a processor of a computer device from a computer-readable storage medium, and executed by the processor to cause the computer device to perform the method illustrated in fig. 1.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (8)

1. A model training method based on image augmentation is characterized by comprising the following steps:
performing first training on the first model by adopting first training data acquired in advance to obtain a second model;
performing image augmentation on the first training data by adopting a plurality of preset image augmentation modes to obtain second training data;
predicting the second training data using the second model;
determining a target augmentation mode from the plurality of image augmentation modes according to the prediction result;
taking the data obtained by image augmentation of the first training data in the target augmentation mode, together with the first training data, as third training data;
performing second training on the first model through the third training data to obtain a target model;
wherein the performing image augmentation on the first training data by adopting a plurality of preset image augmentation modes to obtain second training data specifically comprises:
sequentially adopting each of the plurality of image augmentation modes to perform image augmentation on the first training data, obtaining second training data corresponding to that image augmentation mode;
the determining a target augmentation mode from the plurality of image augmentation modes according to the prediction result includes:
sequentially calculating a predicted loss value corresponding to the second training data;
the image augmentation mode corresponding to the maximum predicted loss value is taken as the target augmentation mode.
2. The image augmentation-based model training method according to claim 1, wherein the first training is performed on the first model by using pre-acquired first training data to obtain a second model, and comprises:
acquiring first training data;
performing first training on a first model by using the first training data;
and obtaining a second model after the loss function of the first model meets a preset requirement over a number of iterations of the first training.
3. The image augmentation-based model training method of claim 1, wherein the second training of the first model through the third training data to obtain a target model comprises:
performing second training on the first model by adopting the third training data;
and obtaining a target model after the loss function of the first model meets a preset requirement over a number of iterations of the second training.
4. The image augmentation-based model training method according to claim 2 or 3, wherein the number of iterations refers to a number of consecutive iterations.
5. The method for model training based on image augmentation according to any one of claims 1-3, wherein the plurality of image augmentation modes include random rotation, random cropping, random pasting, and optical image augmentation; each image augmentation mode includes different parameters.
6. A model training system based on image augmentation, characterized by comprising:
the first training module is used for carrying out first training on the first model by adopting first training data acquired in advance to obtain a second model;
the image augmentation module is used for performing image augmentation on the first training data by adopting a plurality of preset image augmentation modes to obtain second training data;
a prediction module configured to predict the second training data using the second model;
the determining module is used for determining a target augmentation mode from the plurality of image augmentation modes according to a prediction result;
the storage module is used for taking the data obtained by image augmentation of the first training data in the target augmentation mode, together with the first training data, as third training data;
and the second training module is used for carrying out second training on the first model through the third training data to obtain a target model.
7. A model training system based on image augmentation, characterized by comprising:
at least one memory for storing a program;
at least one processor configured to load the program to perform the method for model training based on image augmentation of any one of claims 1-5.
8. A storage medium having stored therein processor-executable instructions, which when executed by a processor, are configured to implement the image augmentation-based model training method according to any one of claims 1-5.
CN202011345783.9A 2020-11-26 2020-11-26 Model training method and system based on image augmentation and storage medium Active CN112488173B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011345783.9A CN112488173B (en) 2020-11-26 2020-11-26 Model training method and system based on image augmentation and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011345783.9A CN112488173B (en) 2020-11-26 2020-11-26 Model training method and system based on image augmentation and storage medium

Publications (2)

Publication Number Publication Date
CN112488173A CN112488173A (en) 2021-03-12
CN112488173B (en) 2022-09-27

Family

ID=74935109

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011345783.9A Active CN112488173B (en) 2020-11-26 2020-11-26 Model training method and system based on image augmentation and storage medium

Country Status (1)

Country Link
CN (1) CN112488173B (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110163234B (en) * 2018-10-10 2023-04-18 腾讯科技(深圳)有限公司 Model training method and device and storage medium
EP3716157A1 (en) * 2019-03-28 2020-09-30 Canon Medical Systems Corporation Apparatus and method for training models
CN111241969A (en) * 2020-01-06 2020-06-05 北京三快在线科技有限公司 Target detection method and device and corresponding model training method and device
CN111507406A (en) * 2020-04-17 2020-08-07 上海眼控科技股份有限公司 Method and equipment for optimizing neural network text recognition model


Similar Documents

Publication Publication Date Title
CN110163193B (en) Image processing method, image processing device, computer-readable storage medium and computer equipment
CN113807355B (en) Image semantic segmentation method based on coding and decoding structure
CN110852349A (en) Image processing method, detection method, related equipment and storage medium
CN110245683B (en) Residual error relation network construction method for less-sample target identification and application
CN112001399B (en) Image scene classification method and device based on local feature saliency
CN116168017B (en) Deep learning-based PCB element detection method, system and storage medium
CN114708437B (en) Training method of target detection model, target detection method, device and medium
CN111444986A (en) Building drawing component classification method and device, electronic equipment and storage medium
CN112967272A (en) Welding defect detection method and device based on improved U-net and terminal equipment
CN116030237A (en) Industrial defect detection method and device, electronic equipment and storage medium
CN111488945A (en) Image processing method, image processing device, computer equipment and computer readable storage medium
CN102713974B (en) Learning device, recognition device, study recognition system and study recognition device
CN113239975B (en) Target detection method and device based on neural network
CN112926595B (en) Training device of deep learning neural network model, target detection system and method
CN110634198A (en) Industrial system layered fault diagnosis method based on regular polycell filtering
CN112488173B (en) Model training method and system based on image augmentation and storage medium
US20230410465A1 (en) Real time salient object detection in images and videos
CN101539999B (en) Method and device for recognizing plane geometrical shapes
CN110751061B (en) SAR image recognition method, device, equipment and storage medium based on SAR network
CN111222558A (en) Image processing method and storage medium
CN115861305A (en) Flexible circuit board detection method and device, computer equipment and storage medium
CN116188361A (en) Deep learning-based aluminum profile surface defect classification method and device
CN110348509B (en) Method, device and equipment for adjusting data augmentation parameters and storage medium
US20220156534A1 (en) Target object detection model
CN112862002A (en) Training method of multi-scale target detection model, target detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant