WO2021164228A1 - Method and system for selecting augmentation strategy for image data - Google Patents

Method and system for selecting augmentation strategy for image data Download PDF

Info

Publication number
WO2021164228A1
WO2021164228A1 (PCT/CN2020/111666)
Authority
WO
WIPO (PCT)
Prior art keywords
strategy
classification
sample
classification model
trained
Prior art date
Application number
PCT/CN2020/111666
Other languages
French (fr)
Chinese (zh)
Inventor
王俊
高鹏
谢国彤
杨苏辉
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2021164228A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a method and system for selecting an augmentation strategy for image data.
  • the inventor realized that many augmentation strategies are currently available, that they perform differently on different data sets, and that it is difficult to determine which augmentation strategy is most effective for the current type of image data set.
  • the embodiments of the present application provide a method and device for selecting an augmentation strategy for image data to solve the problem in the prior art that it is difficult to determine which augmentation strategy is most effective for the current type of image data set.
  • an augmentation strategy selection method for image data includes: selecting a plurality of pending strategy subsets from an augmentation strategy set and using them to perform sample augmentation on a preset sample training set to obtain multiple augmented sample training sets, wherein each pending strategy subset is composed of at least one augmentation strategy in the augmentation strategy set; training an initialized classification model with each of the augmented sample training sets to obtain multiple trained classification models; inputting a preset sample verification set into each trained classification model to obtain the classification accuracy corresponding to that model; and using a Bayesian optimization algorithm to determine the optimal strategy subset from the plurality of pending strategy subsets based on the classification accuracy corresponding to each trained classification model.
  • an augmentation strategy selection system for image data including an augmenter, a classification model, and a controller;
  • the augmenter is used to select multiple undetermined strategy subsets from the augmentation strategy set to perform sample augmentation on the preset sample training set to obtain multiple augmented sample training sets, wherein each undetermined strategy subset is composed of at least one augmentation strategy in the augmentation strategy set;
  • the classification model is used to train an initialized classification model using each of the augmented sample training sets to obtain a plurality of trained classification models, and to input a preset sample verification set into each trained classification model to obtain the classification accuracy corresponding to that model;
  • the controller is configured to use a Bayesian optimization algorithm to determine an optimal strategy subset from a plurality of pending strategy subsets based on the classification accuracy corresponding to each of the trained classification models.
  • a computer-readable storage medium includes a stored program, and when the program runs, the device where the storage medium is located is controlled to perform the following steps:
  • a Bayesian optimization algorithm is used to determine the optimal strategy subset from a plurality of pending strategy subsets based on the classification accuracy corresponding to each of the trained classification models.
  • a computer device including a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the computer program, implements the following steps:
  • a Bayesian optimization algorithm is used to determine the optimal strategy subset from a plurality of pending strategy subsets based on the classification accuracy corresponding to each of the trained classification models.
  • each augmented sample training set is used to train the initialized classification model to obtain multiple trained classification models; a sample verification set is then used to validate the trained classification models, and a suitable augmentation strategy for this type of sample is obtained from the classification accuracy of the models together with a Bayesian optimization algorithm, which can improve the efficiency of augmentation strategy selection.
  • FIG. 1 is a flowchart of an optional method for selecting an augmentation strategy for image data according to an embodiment of the present application
  • FIG. 2 is a schematic diagram of an optional image data augmentation strategy selection system provided by an embodiment of the present application.
  • FIG. 3 is a functional block diagram of an optional controller provided by an embodiment of the present application.
  • Fig. 4 is a schematic diagram of an optional computer device provided by an embodiment of the present application.
  • although the terms first, second, third, etc. may be used to describe terminals in the embodiments of the present application, the terminals should not be limited by these terms; the terms are only used to distinguish terminals from each other.
  • the first terminal may also be referred to as the second terminal, and similarly, the second terminal may also be referred to as the first terminal.
  • the word “if” as used herein can be interpreted as “when”, “upon”, “in response to determining”, or “in response to detecting”.
  • the phrase “if determined” or “if (a stated condition or event) is detected” can be interpreted as “when determined”, “in response to determining”, “when (the stated condition or event) is detected”, or “in response to detecting (the stated condition or event)”.
  • Fig. 1 is a flowchart of a method for selecting an augmentation strategy for image data according to an embodiment of the present application. As shown in Fig. 1, the method includes:
  • Step S01: select multiple pending strategy subsets from the augmentation strategy set and use them to perform sample augmentation on the preset sample training set to obtain multiple augmented sample training sets, where each pending strategy subset is composed of at least one augmentation strategy in the augmentation strategy set;
  • Step S02: use each augmented sample training set to train an initialized classification model to obtain multiple trained classification models;
  • Step S03: input the preset sample verification set into each trained classification model to obtain the classification accuracy corresponding to the trained classification model;
  • Step S04: use a Bayesian optimization algorithm to determine the optimal strategy subset from the multiple pending strategy subsets based on the classification accuracy corresponding to each trained classification model.
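  • The four steps above can be sketched end-to-end as a toy skeleton. This is an illustration, not the patented method itself: `train_and_validate` returns a simulated accuracy, and plain random search stands in for the Bayesian optimization of step S04.

```python
import random

# Toy end-to-end skeleton of steps S01-S04. The augmenter, classifier,
# and optimizer are stand-ins for illustration only.
STRATEGIES = ["rotation", "flip", "zoom", "translation", "scale", "crop",
              "noise", "affine", "mask", "edges", "contrast", "jitter",
              "mixup", "sample_pairing"]

def propose_subset(rng):
    # S01: a pending strategy subset of 3 distinct strategies
    return tuple(rng.sample(STRATEGIES, 3))

def train_and_validate(subset, rng):
    # S02 + S03: train on the augmented set, return validation accuracy
    return 0.7 + 0.3 * rng.random()  # simulated accuracy

rng = random.Random(0)
history = []
for _ in range(20):
    subset = propose_subset(rng)
    history.append((subset, train_and_validate(subset, rng)))

best_subset, best_acc = max(history, key=lambda t: t[1])  # S04
```

Replacing `propose_subset` and `train_and_validate` with a real augmenter and classifier, and the random search with the optimizer of step S04, recovers the described pipeline.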
  • the samples in the sample training set are image data samples.
  • each augmented sample training set is used to train the initialized classification model to obtain multiple trained classification models; a sample verification set is then used to validate the trained classification models, and a suitable augmentation strategy for this type of sample is obtained from the classification accuracy of the models together with a Bayesian optimization algorithm, which can improve the efficiency of augmentation strategy selection.
  • Step S01: select multiple pending strategy subsets from the augmentation strategy set and use them to perform sample augmentation on the preset sample training set to obtain multiple augmented sample training sets, where each pending strategy subset is composed of at least one augmentation strategy in the augmentation strategy set;
  • the samples in the sample training set are medical image samples of the same type, such as lung images and stomach images.
  • Each training sample has a label.
  • a training sample with a positive label is a lung image marked as having pneumonia symptoms
  • a training sample with a negative label is a lung image without pneumonia symptoms.
  • the training sample is a 512×512 medical image sample.
  • augmentation strategies include rotation transformation, flip transformation, zoom transformation, translation transformation, scale transformation, region cropping, adding noise, segmented affine, random masking, boundary detection, contrast transformation, color dithering, random mixing, and composite overlay.
  • the augmentation strategy is, for example, flip transformation.
  • Scale transformation: the image is enlarged or reduced by a preset scale factor, or a scale space is constructed by filtering the image with the preset scale factor, changing the size or degree of blur of the image content;
  • Region cropping (crop): crop the region of interest of the picture;
  • Segmented affine (piecewise affine): place a regular point grid on the image and displace these points, together with the surrounding image regions, by offsets sampled from a normal distribution;
  • Random masking: information is discarded in a rectangular area of selectable size and random location to realize the transformation; discarding the information in all channels produces black rectangular blocks, while discarding it in only some channels produces colored noise;
  • Boundary detection (edge detect): detect all edges in the image, mark them as a black-and-white image, and then superimpose the result on the original image;
  • Contrast transformation: in the HSV color space of the image, change the saturation (S) and brightness (V) components while keeping the hue (H) unchanged, applying an exponential operation to the S and V components of each pixel (with an exponent factor between 0.25 and 4) to increase illumination variation;
  • Color jitter: randomly change the exposure, saturation, and hue of the image to form pictures under different lighting and colors, so that the model becomes as insensitive as possible to varying lighting conditions;
  • Random mixing (mixup): a data augmentation method based on the principle of vicinal risk minimization, using linear interpolation to obtain new sample data;
  • Sample pairing: two images, each processed by a basic data augmentation operation, are randomly selected and superimposed into a new sample by pixel averaging;
  • the label of the new sample is one of the original sample labels.
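  • As a minimal sketch of the two mixing strategies just described (the function names and the Beta(alpha, alpha) sampling for mixup follow common convention and are not taken from this application):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Random mixing (mixup): linear interpolation of two samples and labels."""
    rng = rng or np.random.default_rng()
    lam = float(rng.beta(alpha, alpha))  # mixing coefficient in (0, 1)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

def sample_pairing(x1, x2):
    """Sample pairing: pixel-wise average of two (already augmented) images."""
    return (x1 + x2) / 2.0
```

For example, pairing an all-zero image with an all-two image yields an all-one image, and a mixup of a zero image (label 0) with a one image (label 1) gives pixel values and a soft label that both lie between the originals.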
  • any three augmentation strategies are randomly selected from the above 14 augmentation strategies to form an undetermined strategy subset; that is, an undetermined strategy subset includes 3 augmentation strategies, and each augmentation strategy includes 3 strategy parameters, namely strategy type, probability value, and amplitude.
  • a subset of pending strategies can be represented in the form of a numerical matrix:
  • each row represents an augmentation strategy.
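  • For illustration, such a matrix might look as follows (the concrete type indices, probabilities, and amplitudes are invented):

```python
import numpy as np

# One pending strategy subset as a 3x3 matrix: one row per augmentation
# strategy, columns = (strategy type index into the 14 strategies,
# application probability, amplitude). Values are illustrative only.
pending_subset = np.array([
    [ 1.0, 0.6, 0.3],   # e.g. flip transformation
    [ 7.0, 0.4, 0.8],   # e.g. segmented affine
    [10.0, 0.9, 0.5],   # e.g. contrast transformation
])
```

Encoding each subset as a fixed-size numerical matrix is what lets it serve later as the x value of a sample point for the Bayesian optimizer.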
  • Step S02 Use each augmented sample training set to train an initialized classification model to obtain multiple trained classification models.
  • the classification model is a convolutional neural network model, which is composed of a convolutional neural network and a fully connected network, and its specific structure includes at least a convolutional network layer, a pooling layer, and a fully connected network layer.
  • the specific steps during training include:
  • during training, the convolutional neural network is used to extract a feature map for each sample in the augmented sample training set input to the classification model; classification prediction is performed for the corresponding sample according to its feature map to obtain a classification result; a mean-squared-error loss function between the set of classification results and the label set of all samples in the training set is obtained; and the convolutional neural network is optimized through backpropagation so that the value of the loss function converges, yielding the optimized, trained classification model.
  • the initial convolutional neural network performs feature extraction on labeled samples and performs preset rounds of training, so that the convolutional neural network layer can effectively extract more generalized features (such as edges, textures, etc.).
  • as the gradient is repeatedly descended, the value of the loss function converges to a minimum and the weights and biases of the convolutional and fully connected layers are adjusted automatically, so that the accuracy of the model improves and the classification model is optimally trained.
  • the classification model may also be a long short-term memory (LSTM) network model, a random forest model, a support vector machine model, a maximum entropy model, etc., which is not limited here.
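  • The training step can be illustrated with a deliberately simplified stand-in: a single sigmoid unit trained by backpropagation on the mean-squared-error loss described above, with fixed random features replacing the convolutional feature extractor.

```python
import numpy as np

# Simplified stand-in for the CNN training loop: gradient descent on an
# MSE loss between sigmoid predictions and binary labels. The random
# matrix X plays the role of extracted feature maps.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))                      # 64 samples, 8 features
y = (X @ rng.normal(size=8) > 0).astype(float)    # binary labels

w = np.zeros(8)
for _ in range(500):
    pred = 1.0 / (1.0 + np.exp(-(X @ w)))         # classification prediction
    loss = np.mean((pred - y) ** 2)               # MSE vs. label set
    grad = 2 * X.T @ ((pred - y) * pred * (1 - pred)) / len(y)
    w -= 1.0 * grad                               # backpropagation step

accuracy = np.mean((pred > 0.5) == y)
```

The loss decreases toward a minimum as the weights are adjusted by gradient descent, mirroring the convergence behaviour described for the full model.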
  • Step S03 Input the preset sample verification set into each trained classification model to obtain the classification accuracy corresponding to the trained classification model.
  • the samples in the preset sample validation set are also labeled.
  • a verification sample with a positive label is a lung image marked as having pneumonia symptoms
  • a verification sample with a negative label is a lung image marked as having no pneumonia symptoms.
  • a preset sample verification set is used to verify the trained classification model.
  • the sample verification set corresponding to each classification model is different, which can achieve better model generalization performance and effectively solve the problem of overfitting that may be introduced by sample augmentation.
  • the method further includes:
  • the sample size ratio in the sample training set and the sample verification set may be 2:8, 4:6, 6:4, 8:2, etc. Understandably, each time a sample is drawn, 50% of the samples in the sample verification set are randomly selected to form a verification subset. In other embodiments, the ratio of random selection may be 30%, 40%, 60%, and so on.
  • a cross-validation method is used to validate the classification model.
  • the cross-validation method is either a ten-fold cross-validation method or a five-fold cross-validation method.
  • a five-fold cross-validation method is adopted. Specifically, the training samples are randomly divided into 10 parts; each round, 2 of them are taken as the cross-validation set and the remaining 8 parts are used as the training set. During training, the 8 parts are first used to train the initialized classification model, which then classifies and labels the 2 cross-validation parts. The training and verification process is repeated 5 times with a different cross-validation set each time, until every training sample has been classified and labeled once.
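  • A sketch of this split (the helper name and seed are illustrative): samples are divided into 10 parts, and each of 5 rounds uses 2 distinct parts for validation and the other 8 for training, so every sample is validated exactly once.

```python
import numpy as np

def five_rounds_over_ten_parts(n_samples, seed=0):
    """Yield (train, val) index arrays: 10 parts, 2 per round for validation."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    parts = np.array_split(idx, 10)
    for k in range(5):
        val = np.concatenate(parts[2 * k: 2 * k + 2])
        train = np.concatenate(parts[:2 * k] + parts[2 * k + 2:])
        yield train, val

folds = list(five_rounds_over_ten_parts(100))
```

With 100 samples, each round trains on 80 and validates on 20, and the five validation sets together cover all 100 samples.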
  • Step S03 specifically includes:
  • Step S031: input the preset sample verification set into each trained classification model;
  • Step S032: obtain the training accuracy and verification accuracy output by the classification model;
  • Step S033: judge whether the classification model fits well according to the training accuracy and the verification accuracy;
  • Step S034: determine a well-fitted classification model as a trained classification model, and use its verification accuracy as the classification accuracy of the model.
  • the training rounds of the classification model can be preset.
  • the number of training rounds is, for example, 100.
  • the sample validation set is input into the classification model to obtain the output of the classification model.
  • the training accuracy and verification accuracy of the classification model are compared to judge whether the trained classification model fits well; specifically, when (training accuracy - verification accuracy) / verification accuracy ≤ 10%, the classification model is considered to fit well.
  • the verification accuracy of a well-fitted classification model is used as the classification accuracy.
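  • The fit criterion can be written directly (the function name is illustrative):

```python
def is_well_fitted(train_acc, val_acc, tol=0.10):
    """Fit criterion above: (training accuracy - verification accuracy)
    / verification accuracy <= 10% means the model fits well."""
    return (train_acc - val_acc) / val_acc <= tol
```

For example, `is_well_fitted(0.90, 0.85)` is True (gap ratio about 5.9%), while `is_well_fitted(0.99, 0.80)` is False (about 23.8%, indicating overfitting).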
  • Step S04: a Bayesian optimization algorithm is used to determine the optimal strategy subset from the multiple pending strategy subsets based on the classification accuracy corresponding to each trained classification model.
  • the undetermined strategy subset (as a numerical matrix) is used as the x value of a sample point and the classification accuracy as its y value, forming multiple sample points; a Gaussian-process regression model is constructed from these sample points, and the objective function is learned and fitted to find the strategy subset that drives the objective function to its global optimum.
  • Step S04 specifically includes:
  • through maximization of the acquisition function, the optimal strategy subset is determined from the multiple pending strategy subsets, where the classification model trained on the sample training set augmented with the optimal strategy subset has the highest classification accuracy.
  • the optimal strategy subset is determined from a plurality of pending strategy subsets based on the classification accuracy and the Bayesian optimization algorithm. In other embodiments, other algorithms may also be used for selection, which is not limited here.
  • the Bayesian optimization algorithm learns and fits the acquisition function to find the strategy parameters that drive the objective function f(x) to its global optimum.
  • every Bayesian optimization iteration tests the objective function f(x) at a new sample point, uses this information to update the prior distribution of f(x), and then uses the posterior distribution to give the sample point where the global maximum is most likely to occur.
  • the covariance is only related to x, and has nothing to do with y.
  • the posterior probability distribution of f_{n+1} can be estimated from the first n sample points: P(f_{n+1} | D_{1:n}, x_{n+1}).
  • the probability of improvement (POI) is used as the acquisition function.
  • the acquisition function is: POI(X) = Φ((μ(x) - f(X+) - ξ) / σ(x)), where:
  • f(x) is the objective function value at x, here the verification accuracy of the classification model trained with strategy subset x;
  • f(X+) is the optimal objective function value found so far;
  • μ(x) and σ(x) are the mean and variance of the objective function obtained from the Gaussian-process posterior distribution of f(x);
  • Φ(·) denotes the normal cumulative distribution function;
  • ξ is the trade-off coefficient. Without this coefficient, the POI function would tend to pick points around X+ and converge to a position close to f(X+); that is, it would exploit rather than explore, so this term is added as a trade-off. By constantly trying new x, the next best point should be larger than, or at least equal to, the current best.
  • the next sample point is taken where the confidence region rises above f(X+); samples below f(X+) can be discarded, because we only need to search for the parameter that maximizes the objective function. Through this iterative process the observation area shrinks until the optimal solution is found, i.e., until POI(X) is maximized.
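  • A self-contained toy of this loop, hedged: a 1-D objective stands in for the strategy-subset search, and the RBF kernel, length scale, and ξ = 0.01 are illustrative choices, not taken from this application.

```python
import math
import numpy as np

def norm_cdf(z):
    # standard normal cumulative distribution function Phi(z)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def gp_posterior(X, y, Xs, length=0.2, noise=1e-6):
    """Posterior mean/std of a zero-mean GP with an RBF kernel at points Xs."""
    k = lambda a, b: np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)
    K_inv = np.linalg.inv(k(X, X) + noise * np.eye(len(X)))
    Ks = k(X, Xs)
    mu = Ks.T @ K_inv @ y
    var = 1.0 - np.einsum('ij,ji->i', Ks.T, K_inv @ Ks)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def poi(mu, sigma, best, xi=0.01):
    """Probability of improvement: Phi((mu - f(X+) - xi) / sigma)."""
    return np.array([norm_cdf(z) for z in (mu - best - xi) / sigma])

# Toy objective with its maximum at x = 0.3 (stands in for classification
# accuracy as a function of the encoded strategy subset).
f = lambda x: -(x - 0.3) ** 2
X = np.array([0.0, 1.0])
y = f(X)
grid = np.linspace(0.0, 1.0, 101)
for _ in range(10):
    mu, sd = gp_posterior(X, y, grid)
    x_next = grid[int(np.argmax(poi(mu, sd, y.max())))]   # maximize POI
    X, y = np.append(X, x_next), np.append(y, f(x_next))  # evaluate and update

best_x = X[int(np.argmax(y))]
```

Each iteration fits the Gaussian-process posterior, maximizes POI over the grid, and evaluates the objective at the chosen point; the best sampled point moves toward the optimum at x = 0.3.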
  • the embodiment of the present application provides an augmentation strategy selection system for image data.
  • the system includes an augmenter 10, a classification model 20, and a controller 30;
  • the augmenter 10 is used to select multiple undetermined strategy subsets from the augmentation strategy set to perform sample augmentation on the preset sample training set, obtaining multiple augmented sample training sets, where each undetermined strategy subset consists of at least one augmentation strategy in the augmentation strategy set.
  • the augmentation strategy set includes rotation transformation, flip transformation, zoom transformation, translation transformation, scale transformation, region cropping, noise addition, segmented affine, random masking, boundary detection, contrast transformation, color dithering, random mixing, and composite overlay.
  • the augmentation strategy is, for example, flip transformation.
  • any three of the augmentation strategies above are randomly selected to form an undetermined strategy subset.
  • each augmentation strategy includes three strategy parameters: strategy type, probability value, and amplitude. A subset of pending strategies can then be represented in the form of a numerical matrix:
  • each row represents an augmentation strategy.
  • the classification model 20 includes a training unit 210 and a verification unit 220.
  • the training unit 210 is configured to use each augmented sample training set to train an initialized classification model to obtain multiple trained classification models;
  • the verification unit 220 is configured to input a preset sample validation set into each trained classification model to obtain the classification accuracy corresponding to the trained classification model.
  • the classification model is a convolutional neural network model, which is composed of a convolutional neural network and a fully connected network, and its specific structure includes at least a convolutional network layer, a pooling layer, and a fully connected network layer.
  • the training unit 210 includes an extraction subunit, a classification subunit, a first acquisition subunit, and an optimization subunit.
  • the extraction subunit is used to extract, with the convolutional neural network, the feature map of each sample in the augmented sample training set input to the classification model; the classification subunit is used to perform classification prediction for the corresponding sample according to the feature map to obtain the classification result; the first acquisition subunit is used to obtain the mean-squared-error loss function between the classification result set and the label set of all samples in the sample training set; and the optimization subunit is used to optimize the convolutional neural network so that the value of the loss function converges, yielding the optimized, trained classification model.
  • the initial convolutional neural network performs feature extraction on labeled samples and performs preset rounds of training, so that the convolutional neural network layer can effectively extract more generalized features (such as edges, textures, etc.).
  • as the gradient is repeatedly descended, the value of the loss function converges to a minimum and the weights and biases of the convolutional and fully connected layers are adjusted automatically, so that the accuracy of the model improves and the classification model is optimally trained.
  • the samples in the preset sample validation set are also labeled.
  • a verification sample with a positive label is a lung image marked as having pneumonia symptoms
  • a verification sample with a negative label is a lung image marked as having no pneumonia symptoms.
  • a preset sample validation set is used to verify the trained classification model.
  • the sample validation set corresponding to each classification model is different, which can achieve better model generalization performance and effectively solve the problem of overfitting that may be introduced by sample augmentation.
  • the verification unit 220 includes an input subunit, a second acquisition subunit, a judgment subunit, and a determination subunit.
  • the input subunit is used to input the preset sample verification set into each trained classification model
  • the second acquisition subunit is used to acquire the training accuracy and verification accuracy of the classification model output
  • the judgment subunit is used to judge whether the classification model fits well according to the training accuracy and verification accuracy
  • the determination subunit is used to determine a well-fitted classification model as a trained classification model, and use the verification accuracy of the trained classification model as the classification accuracy of the classification model.
  • the training rounds of the classification model can be preset.
  • the number of training rounds is, for example, 100.
  • the sample validation set is input into the classification model to obtain the output of the classification model.
  • the training accuracy and verification accuracy of the classification model are compared to judge whether the trained classification model fits well; specifically, when (training accuracy - verification accuracy) / verification accuracy ≤ 10%, the classification model is considered to fit well.
  • the verification accuracy of a well-fitted classification model is used as the classification accuracy.
  • the system also includes a database 40 and a processing module 50.
  • the database 40 is used to store the sample training set and the sample verification set.
  • the processing module 50 is configured to randomly select a plurality of verification subsets from a preset sample verification set; input the plurality of verification subsets into each trained classification model respectively.
  • the sample size ratio in the sample training set and the sample verification set may be 2:8, 4:6, 6:4, 8:2, etc. Understandably, each time a sample is drawn, 50% of the samples in the sample verification set are randomly selected to form a verification subset. In other embodiments, the ratio of random selection may be 30%, 40%, 60%, and so on.
  • a cross-validation method is used to validate the classification model.
  • the cross-validation method is either a ten-fold cross-validation method or a five-fold cross-validation method.
  • a five-fold cross-validation method is adopted. Specifically, the training samples are randomly divided into 10 parts; each round, 2 of them are taken as the cross-validation set and the remaining 8 parts are used as the training set. During training, the 8 parts are first used to train the initialized classification model, which then classifies and labels the 2 cross-validation parts. The training and verification process is repeated 5 times with a different cross-validation set each time, until every training sample has been classified and labeled once.
  • the controller 30 is configured to use a Bayesian optimization algorithm to determine an optimal strategy subset from a plurality of pending strategy subsets based on the classification accuracy corresponding to each trained classification model.
  • the controller 30 determines an optimal strategy subset from a plurality of pending strategy subsets based on the classification accuracy and using a Bayesian optimization algorithm. In other embodiments, other algorithms may also be used for selection, which is not limited here.
  • the controller 30 includes a construction unit 310, a first determination unit 320, and a second determination unit 330.
  • the constructing unit 310 is configured to construct a regression model of the Gaussian process based on a plurality of sample points, where each sample point includes the classification accuracy of the trained classification model and the undetermined strategy subset used to train the classification model;
  • the first determining unit 320 is configured to determine the acquisition function of the Bayesian optimization algorithm according to the regression model
  • the second determining unit 330 is used to determine the optimal strategy subset from a plurality of pending strategy subsets through maximization of the acquisition function, wherein the classification model trained on the sample training set augmented with the optimal strategy subset has the highest classification accuracy.
  • the Bayesian optimization algorithm learns and fits the acquisition function to find the strategy parameters that drive the objective function f(x) to its global optimum.
  • every Bayesian optimization iteration tests the objective function f(x) at a new sample point, uses this information to update the prior distribution of f(x), and then uses the posterior distribution to give the sample point where the global maximum is most likely to occur.
  • the covariance is only related to x, and has nothing to do with y.
  • the posterior probability distribution of f_{n+1} can be estimated from the first n sample points: P(f_{n+1} | D_{1:n}, x_{n+1}).
  • the probability of improvement (POI) is used as the acquisition function.
  • the acquisition function is: POI(X) = Φ((μ(x) - f(X+) - ξ) / σ(x)), where:
  • f(x) is the objective function value at x, here the verification accuracy of the classification model trained with strategy subset x;
  • f(X+) is the optimal objective function value found so far;
  • μ(x) and σ(x) are the mean and variance of the objective function obtained from the Gaussian-process posterior distribution of f(x);
  • Φ(·) denotes the normal cumulative distribution function;
  • ξ is the trade-off coefficient. Without this coefficient, the POI function would tend to pick points around X+ and converge to a position close to f(X+); that is, it would exploit rather than explore, so this term is added as a trade-off. By constantly trying new x, the next best point should be larger than, or at least equal to, the current best.
  • the next sample point is taken where the confidence region rises above f(X+); samples below f(X+) can be discarded, because we only need to search for the parameter that maximizes the objective function. Through this iterative process the observation area shrinks until the optimal solution is found, i.e., until POI(X) is maximized.
  • after the controller 30 selects the optimal augmentation strategy, the controller 30 outputs the optimal augmentation strategy to the augmenter 10, and the augmenter 10 adopts the optimal augmentation strategy as the augmentation strategy for the preset sample training set. Understandably, once the augmenter 10 has obtained the optimal augmentation strategy, every subsequent sample augmentation it performs uses the optimal augmentation strategy output by the controller.
  • the embodiment of the present application provides a computer-readable storage medium.
  • the computer-readable storage medium may be non-volatile or volatile.
  • the storage medium includes a stored program, and when the program is running, the device where the storage medium is located is controlled to perform the following steps:
  • select a plurality of pending strategy subsets from the augmentation strategy set to perform sample augmentation on a preset sample training set, obtaining a plurality of augmented sample training sets, wherein each pending strategy subset consists of at least one augmentation strategy from the augmentation strategy set; use each augmented sample training set to train the initialized classification model to obtain a plurality of trained classification models; input a preset sample validation set into each trained classification model to obtain the classification accuracy corresponding to that trained classification model; and use a Bayesian optimization algorithm to determine the optimal strategy subset from the plurality of pending strategy subsets based on the classification accuracy corresponding to each trained classification model.
  • when the device where the storage medium is located is controlled to execute the step of using a Bayesian optimization algorithm to determine the optimal strategy subset from a plurality of pending strategy subsets based on the classification accuracy corresponding to each trained classification model, the step includes:
  • determining the acquisition function of the Bayesian optimization algorithm according to a regression model, and determining the optimal strategy subset from the plurality of pending strategy subsets through maximum optimization of the acquisition function, wherein the classification model trained on the sample training set augmented with the optimal strategy subset has the highest classification accuracy.
  • when the device where the storage medium is located is controlled to execute the step of inputting a preset sample validation set into each trained classification model to obtain the corresponding classification accuracy, the step includes:
  • inputting the preset sample validation set into each trained classification model to obtain the training accuracy and verification accuracy output by the classification model; judging whether the classification model fits well according to the training accuracy and verification accuracy; determining a well-fitted classification model as the trained classification model, and using its verification accuracy as the classification accuracy of that model.
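The fit check described here can be sketched as a simple comparison of training and verification accuracy. The gap and floor thresholds below are illustrative assumptions, not values taken from this application:

```python
def fits_well(train_acc, val_acc, max_gap=0.05, min_val=0.70):
    # Judge fit from training vs. verification accuracy: a well-fitted model
    # has reasonably high verification accuracy that does not lag far behind
    # its training accuracy (a large lag suggests overfitting).
    # Both thresholds are illustrative assumptions.
    return val_acc >= min_val and (train_acc - val_acc) <= max_gap

print(fits_well(0.99, 0.75))  # overfit: large train/validation gap
print(fits_well(0.90, 0.88))  # well fitted: small gap, high accuracy
```

Only models passing such a check would contribute their verification accuracy to the Bayesian optimization step.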
  • when the device where the storage medium is located is controlled to execute the step of training the initialized classification model with each augmented sample training set to obtain a plurality of trained classification models, the step includes: using a convolutional neural network to extract a feature map of each sample in the augmented sample training set input to the classification model; classifying and predicting the corresponding sample in the augmented sample training set according to the feature map to obtain a classification result; obtaining a loss function as the mean squared error between the set of classification results and the label set of all samples in the sample training set; and optimizing the convolutional neural network through backpropagation so that the value of the loss function converges, obtaining the optimized trained classification model.
  • before the device where the storage medium is located is controlled to execute the step of inputting the preset sample validation set into each trained classification model to obtain the corresponding classification accuracy, the method also includes: randomly selecting multiple validation subsets from the sample validation set; and inputting the multiple validation subsets into each trained classification model.
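Drawing the multiple validation subsets can be sketched with standard random sampling; the sample ids and sizes below are made up for illustration:

```python
import random

def make_validation_subsets(validation_set, n_subsets, subset_size, seed=0):
    # Randomly select several subsets of the sample validation set; each
    # subset is later fed to every trained classification model. Each subset
    # is drawn without replacement, independently of the others.
    rng = random.Random(seed)
    return [rng.sample(validation_set, subset_size) for _ in range(n_subsets)]

val_set = ["img_%03d" % i for i in range(100)]   # hypothetical validation sample ids
subsets = make_validation_subsets(val_set, n_subsets=5, subset_size=20)
```

Evaluating each model on several random subsets rather than one fixed set gives a less noisy accuracy estimate at little extra cost.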
  • Fig. 4 is a schematic diagram of a computer device provided by an embodiment of the present application.
  • the computer device 100 of this embodiment includes a processor 101, a memory 102, and a computer program 103 stored in the memory 102 and running on the processor 101.
  • when the processor 101 executes the computer program 103, the steps of the method for selecting an augmentation strategy for image data in the above embodiments are implemented; to avoid repetition, they are not repeated here.
  • when the computer program is executed by the processor 101, the function of each model/unit in the image data augmentation strategy selection system in the embodiments is realized. To avoid repetition, it will not be repeated here.
  • the computer device 100 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the computer device may include, but is not limited to, a processor 101 and a memory 102.
  • FIG. 4 is only an example of the computer device 100 and does not constitute a limitation on the computer device 100, which may include more or fewer components than shown, or combine certain components, or have different components.
  • computer equipment may also include input and output devices, network access devices, buses, and so on.
  • the so-called processor 101 may be a central processing unit (Central Processing Unit, CPU), another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the memory 102 may be an internal storage unit of the computer device 100, such as a hard disk or memory of the computer device 100.
  • the memory 102 may also be an external storage device of the computer device 100, such as a plug-in hard disk equipped on the computer device 100, a smart media card (Smart Media Card, SMC), a Secure Digital (SD) card, a flash card (Flash Card), and so on.
  • the memory 102 may also include both an internal storage unit of the computer device 100 and an external storage device.
  • the memory 102 is used to store the computer program and other programs and data required by the computer device.
  • the memory 102 can also be used to temporarily store data that has been output or will be output.
  • the disclosed system, device, and method can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit may be implemented in the form of hardware, or may be implemented in the form of hardware plus software functional units.
  • the above-mentioned integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium.
  • the above-mentioned software functional unit is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor execute part of the steps of the method described in each embodiment of the present application.
  • the aforementioned storage media include: U disk, mobile hard disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disks, optical disks, and other media that can store program code.


Abstract

The present application relates to the field of artificial intelligence, and provided in the embodiments thereof are a method and system for selecting an augmentation strategy for image data. The method comprises: from within an augmentation strategy set, selecting a plurality of pending strategy subsets to perform sample augmentation on a preset sample training set so as to obtain a plurality of augmented sample training sets; training an initialized classification model by using each augmented sample training set so as to obtain a plurality of trained classification models; inputting a preset sample validation set into each trained classification model to obtain the classification accuracy corresponding to each trained classification model; and determining an optimal strategy subset from among the plurality of pending strategy subsets by using a Bayesian optimization algorithm on the basis of the classification accuracy corresponding to each trained classification model. The technical solution provided in the embodiments of the present application is able to solve the problem of difficulty in determining which augmentation strategy is most effective for the current type of image sample.

Description

Method and system for selecting an augmentation strategy for image data

This application claims priority to the Chinese patent application filed with the Chinese Patent Office on February 20, 2020, with application number 202010095784.6 and invention title "A method and system for selecting an augmentation strategy for image data", the entire contents of which are incorporated herein by reference.

Technical field

This application relates to the field of artificial intelligence technology, and in particular to a method and system for selecting an augmentation strategy for image data.
Background

The success of deep learning in the field of computer vision is due in part to the availability of large amounts of labeled training data, because model performance usually improves with the quality, diversity, and quantity of the training data. However, it is often very difficult and expensive to collect enough high-quality data to train a model to good performance.

At present, data augmentation strategies such as translation, rotation, and flipping are commonly used to increase the amount of data for training computer vision models, increasing the number and diversity of training samples through random "expansion".

The inventor realizes that the existing augmentation strategies are varied and perform differently on different data sets, so it is difficult to determine which augmentation strategy is most effective for the current type of image data set.
Summary

In view of this, the embodiments of the present application provide a method and device for selecting an augmentation strategy for image data, to solve the problem in the prior art that it is difficult to determine which augmentation strategy is most effective for the current type of image data set.

In order to achieve the above objective, according to one aspect of the present application, a method for selecting an augmentation strategy for image data is provided. The method includes: selecting a plurality of pending strategy subsets from an augmentation strategy set to perform sample augmentation on a preset sample training set, obtaining a plurality of augmented sample training sets, wherein each pending strategy subset consists of at least one augmentation strategy from the augmentation strategy set; training an initialized classification model with each augmented sample training set to obtain a plurality of trained classification models; inputting a preset sample validation set into each trained classification model to obtain the classification accuracy corresponding to the trained classification model; and using a Bayesian optimization algorithm to determine the optimal strategy subset from the plurality of pending strategy subsets based on the classification accuracy corresponding to each trained classification model.

In order to achieve the above objective, according to one aspect of the present application, a system for selecting an augmentation strategy for image data is provided. The system includes an augmenter, a classification model, and a controller.

The augmenter is used to select a plurality of pending strategy subsets from the augmentation strategy set to perform sample augmentation on the preset sample training set, obtaining a plurality of augmented sample training sets, wherein each pending strategy subset consists of at least one augmentation strategy from the augmentation strategy set.

The classification model is used to train an initialized classification model with each augmented sample training set to obtain a plurality of trained classification models, and to input a preset sample validation set into each trained classification model to obtain the classification accuracy corresponding to the trained classification model.

The controller is used to determine, using a Bayesian optimization algorithm, the optimal strategy subset from the plurality of pending strategy subsets based on the classification accuracy corresponding to each trained classification model.
In order to achieve the above objective, according to one aspect of the present application, a computer-readable storage medium is provided. The storage medium includes a stored program, and when the program runs, the device where the storage medium is located is controlled to perform the following steps:

selecting a plurality of pending strategy subsets from the augmentation strategy set to perform sample augmentation on the preset sample training set, obtaining a plurality of augmented sample training sets, wherein each pending strategy subset consists of at least one augmentation strategy from the augmentation strategy set;

training an initialized classification model with each augmented sample training set to obtain a plurality of trained classification models;

inputting a preset sample validation set into each trained classification model to obtain the classification accuracy corresponding to the trained classification model;

using a Bayesian optimization algorithm to determine the optimal strategy subset from the plurality of pending strategy subsets based on the classification accuracy corresponding to each trained classification model.
In order to achieve the above objective, according to one aspect of the present application, a computer device is provided, including a memory, a processor, and a computer program stored in the memory and runnable on the processor. When the processor executes the computer program, the following steps are implemented:

selecting a plurality of pending strategy subsets from the augmentation strategy set to perform sample augmentation on the preset sample training set, obtaining a plurality of augmented sample training sets, wherein each pending strategy subset consists of at least one augmentation strategy from the augmentation strategy set;

training an initialized classification model with each augmented sample training set to obtain a plurality of trained classification models;

inputting a preset sample validation set into each trained classification model to obtain the classification accuracy corresponding to the trained classification model;

using a Bayesian optimization algorithm to determine the optimal strategy subset from the plurality of pending strategy subsets based on the classification accuracy corresponding to each trained classification model.
In this solution, different augmentation strategies are used to separately augment samples of the same type; each augmented sample training set is then used to train the initialized classification model, yielding multiple trained classification models; the sample validation set is used to validate the trained classification models; and a suitable augmentation strategy for this type of sample is then obtained according to the classification accuracy of the classification models and a Bayesian optimization algorithm, which can improve the efficiency of augmentation strategy selection.
Description of the drawings

Fig. 1 is a flowchart of an optional method for selecting an augmentation strategy for image data provided by an embodiment of the present application;

Fig. 2 is a schematic diagram of an optional system for selecting an augmentation strategy for image data provided by an embodiment of the present application;

Fig. 3 is a functional block diagram of an optional controller provided by an embodiment of the present application;

Fig. 4 is a schematic diagram of an optional computer device provided by an embodiment of the present application.
Detailed description

In order to better understand the technical solutions of the present application, the embodiments of the present application are described in detail below with reference to the accompanying drawings.

It should be clear that the described embodiments are only a part of the embodiments of the present application, rather than all of them. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of this application.

The terms used in the embodiments of the present application are only for the purpose of describing specific embodiments, and are not intended to limit the present application. The singular forms "a", "said", and "the" used in the embodiments of the present application and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise.

It should be understood that the term "and/or" used herein only describes an association relationship between associated objects, indicating that three relationships are possible; for example, A and/or B can mean: A exists alone, A and B exist at the same time, or B exists alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship.

It should be understood that although the terms first, second, third, etc. may be used to describe terminals in the embodiments of the present application, these terminals should not be limited by these terms. These terms are only used to distinguish terminals from each other. For example, without departing from the scope of the embodiments of the present application, the first terminal may also be called the second terminal, and similarly, the second terminal may also be called the first terminal.

Depending on the context, the word "if" as used herein can be interpreted as "when" or "upon" or "in response to determining" or "in response to detecting". Similarly, depending on the context, the phrase "if it is determined" or "if (a stated condition or event) is detected" can be interpreted as "when it is determined" or "in response to determining" or "when (the stated condition or event) is detected" or "in response to detecting (the stated condition or event)".
Fig. 1 is a flowchart of a method for selecting an augmentation strategy for image data according to an embodiment of the present application. As shown in Fig. 1, the method includes:

Step S01: select a plurality of pending strategy subsets from the augmentation strategy set to perform sample augmentation on the preset sample training set, obtaining a plurality of augmented sample training sets, wherein each pending strategy subset consists of at least one augmentation strategy from the augmentation strategy set;

Step S02: train the initialized classification model with each augmented sample training set to obtain a plurality of trained classification models;

Step S03: input the preset sample validation set into each trained classification model to obtain the classification accuracy corresponding to the trained classification model;

Step S04: use a Bayesian optimization algorithm to determine the optimal strategy subset from the plurality of pending strategy subsets based on the classification accuracy corresponding to each trained classification model.

Here, the samples in the sample training set are image data samples.

In this solution, different augmentation strategies are used to separately augment samples of the same type; each augmented sample training set then trains the initialized classification model, yielding multiple trained classification models; the sample validation set validates the trained models; and a suitable augmentation strategy for this type of sample is obtained from the classification accuracy of the models and a Bayesian optimization algorithm, improving the efficiency of augmentation strategy selection.
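The four steps S01 to S04 can be sketched as a single search loop. The sketch below is a toy under stated assumptions: `evaluate` stands in for the real "augment, train, validate" pipeline and just returns a fabricated accuracy, and the greedy record-keeping stands in for the Bayesian optimization of step S04.

```python
import random

STRATEGIES = ["rotate", "flip", "zoom", "shift", "scale", "crop", "noise"]

def evaluate(strategy_subset):
    # Stand-in for steps S02-S03: a real run would augment the training set
    # with this subset, train the classification model, and return its
    # validation accuracy. Here a deterministic fake accuracy is returned.
    rng = random.Random(",".join(strategy_subset))
    return 0.7 + 0.2 * rng.random()

def search_best_subset(n_trials=8, seed=1):
    rng = random.Random(seed)
    best_subset, best_acc = None, -1.0
    for _ in range(n_trials):
        # Step S01: draw a pending strategy subset (3 strategies, as in the text).
        subset = rng.sample(STRATEGIES, 3)
        acc = evaluate(subset)
        # Step S04, radically simplified: keep the best subset seen so far.
        # The application uses Bayesian optimization to propose subsets instead.
        if acc > best_acc:
            best_subset, best_acc = subset, acc
    return best_subset, best_acc

best, acc = search_best_subset()
```

The value of the Bayesian approach over this greedy loop is that each proposed subset is informed by all previous (subset, accuracy) observations rather than drawn blindly.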
The specific technical solution of the method for selecting an augmentation strategy for image data provided by this embodiment is described in detail below.

Step S01: select a plurality of pending strategy subsets from the augmentation strategy set to perform sample augmentation on the preset sample training set, obtaining a plurality of augmented sample training sets, wherein each pending strategy subset consists of at least one augmentation strategy from the augmentation strategy set.

In this embodiment, the samples in the sample training set are medical image samples of the same type, such as lung images or stomach images. Each training sample has a label: for example, a training sample with a positive label is a lung image marked as showing pneumonia symptoms, and a training sample with a negative label is a lung image marked as showing no pneumonia symptoms. Exemplarily, a training sample is a 512*512 medical image.

The augmentation strategies include rotation, flipping, zooming, translation, scale transformation, region cropping, noise addition, piecewise affine transformation, random masking (dropout), edge detection, contrast transformation, color jitter, random mixing (mix up), and sample pairing. An augmentation strategy is, for example, the flip transformation.
1) Rotation: randomly rotate the image by a preset angle, changing the orientation of the image content;

2) Flip: flip the image along the horizontal or vertical direction;

3) Zoom: enlarge or reduce the image according to a preset ratio;

4) Shift: translate the image in the image plane in a preset manner;

5) Scale: enlarge or reduce the image according to a preset scale factor, or filter the image with preset scale factors to construct a scale space, changing the size or blur level of the image content;

6) Crop: crop the region of interest of the picture;

7) Noise: randomly superimpose some noise on the original picture;

8) Piecewise Affine: place a regular grid of points on the image and move these points and the surrounding image regions by amounts sampled from a normal distribution;

9) Dropout: transform by losing information in rectangular regions of selectable size and random position; losing the information of all channels produces black rectangular blocks, while losing the information of some channels produces colored noise;

10) Edge Detect: detect all edges in the image, mark them as a black-and-white image, and then superimpose the result on the original image;

11) Contrast: in the HSV color space of the image, change the saturation S and brightness V components while keeping the hue H unchanged, performing an exponential operation on the S and V components of each pixel (with the exponent factor between 0.25 and 4) to increase illumination variation;

12) Color jitter: randomly change the exposure, saturation, and hue of the image to form pictures under different illumination and colors, so that as far as possible the model can handle differing illumination conditions;

13) Mix up: a data augmentation method based on the principle of vicinal risk minimization, using linear interpolation to obtain new sample data;

14) Sample Pairing: randomly select two pictures, process each with basic data augmentation operations, and superimpose them by averaging pixels to synthesize a new sample, whose label is one of the original sample labels.
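As a minimal sketch of applying one such strategy to an image stored as a nested list: only flip and rotation are implemented below, each strategy carries an application probability and a magnitude, and treating the magnitude as a number of 90° turns is an illustrative assumption, not this application's definition.

```python
import random

def apply_strategy(image, kind, probability, magnitude, rng):
    # Apply one augmentation strategy of the given kind, with the given
    # application probability and magnitude. Only two of the fourteen
    # strategies are sketched; "magnitude = number of 90° clockwise turns"
    # for rotation is an assumption made for illustration.
    if rng.random() >= probability:
        return image                              # strategy not applied this time
    if kind == "flip":                            # horizontal flip
        return [row[::-1] for row in image]
    if kind == "rotate":                          # rotate 90° clockwise, repeatedly
        out = image
        for _ in range(int(magnitude) % 4):
            out = [list(row) for row in zip(*out[::-1])]
        return out
    return image

img = [[1, 2],
       [3, 4]]
rng = random.Random(0)
flipped = apply_strategy(img, "flip", probability=1.0, magnitude=0, rng=rng)
rotated = apply_strategy(img, "rotate", probability=1.0, magnitude=1, rng=rng)
```

A real augmenter would dispatch over all fourteen strategies the same way, with the probability gate deciding per sample whether each strategy fires.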
In this embodiment, any 3 of the above 14 augmentation strategies are randomly selected to form a pending strategy subset; that is, a pending strategy subset includes 3 augmentation strategies, and each augmentation strategy includes 3 strategy parameters: strategy type (μ), probability value (α), and magnitude (β). A pending strategy subset can then be represented in the form of a numerical matrix:

[ μ1 α1 β1 ]
[ μ2 α2 β2 ]
[ μ3 α3 β3 ]

where each row represents one augmentation strategy. Representing pending strategy subsets as numerical matrices improves computational efficiency.
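Sampling such a pending strategy subset as a 3×3 matrix can be sketched as follows; the numeric ranges chosen for α and β are illustrative assumptions, not values from this application:

```python
import random

N_STRATEGIES = 14   # the fourteen augmentation strategies listed above

def sample_policy_matrix(rng):
    # One pending strategy subset as a 3x3 numerical matrix: each row is one
    # augmentation strategy [type μ, probability α, magnitude β].
    types = rng.sample(range(N_STRATEGIES), 3)    # 3 distinct strategy types
    return [[t,
             round(rng.uniform(0.0, 1.0), 2),     # α: application probability
             round(rng.uniform(0.0, 9.0), 1)]     # β: magnitude (assumed range)
            for t in types]

rng = random.Random(42)
policy = sample_policy_matrix(rng)
```

Flattening this matrix into nine numbers gives exactly the kind of parameter vector x that the Bayesian optimization described later searches over.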
Step S02: train the initialized classification model with each augmented sample training set to obtain a plurality of trained classification models.

In this embodiment, the classification model is a convolutional neural network model composed of a convolutional neural network and a fully connected network, and its specific structure includes at least convolutional layers, pooling layers, and fully connected layers. The specific training steps include:

using the convolutional neural network to extract a feature map of each sample in the augmented sample training set input to the classification model; classifying and predicting the corresponding sample in the augmented sample training set according to the feature map to obtain a classification result; obtaining a loss function as the mean squared error between the set of classification results and the label set of all samples in the sample training set; and optimizing the convolutional neural network through backpropagation so that the value of the loss function converges, obtaining the optimized trained classification model.
In this embodiment, there are two classification results: pneumonia and non-pneumonia. The initial convolutional neural network extracts features from the labeled samples over a preset number of training rounds, so that the convolutional layers learn to extract more generalizable features (e.g., edges and textures). During backpropagation, repeated gradient-descent steps improve the accuracy of the model until the value of the loss function converges to a minimum; in the process, the weights and biases of the convolutional and fully connected layers are adjusted automatically, optimizing the classification model.
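A minimal sketch of this predict / MSE-loss / backpropagate cycle is given below; to stay self-contained, the convolutional feature extractor is replaced by a single linear weight and bias, which is an assumption of the example rather than the model described here:

```python
def train(samples, labels, lr=0.1, epochs=500):
    # Stand-in for the classification model: one linear weight and a bias.
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(samples, labels):
            pred = w * x + b                         # forward pass (prediction)
            err = pred - y
            grad_w += 2 * err * x / len(samples)     # gradient of the MSE loss
            grad_b += 2 * err / len(samples)
        w -= lr * grad_w                             # gradient-descent update
        b -= lr * grad_b
    return w, b

w, b = train([0.0, 1.0], [0.0, 1.0])  # loss converges; model approximates y = x
```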
In other embodiments, the classification model may instead be a long short-term memory network model, a random forest model, a support vector machine model, a maximum entropy model, or the like, which is not limited here.
Step S03: input a preset sample validation set into each trained classification model to obtain the classification accuracy of the trained classification model.
Specifically, the samples in the preset sample validation set are also labeled; for example, a sample with a positive label is a lung image marked as showing pneumonia symptoms, and a sample with a negative label is a lung image marked as showing no pneumonia symptoms. The trained classification models are validated with the preset sample validation set; since the validation set differs for each classification model, better model generalization is achieved and the overfitting that sample augmentation may introduce is effectively mitigated.
Before step S03, the method further includes:
randomly drawing multiple validation subsets from the preset sample validation set; and
inputting the validation subsets into each trained classification model respectively.
In this embodiment, random sampling is used; the ratio of samples in the sample training set to the sample validation set may be 2:8, 4:6, 6:4, 8:2, and so on. Understandably, for each draw, 50% of the samples in the validation set are randomly selected to form a validation subset. In other embodiments, the sampling ratio may be 30%, 40%, 60%, and so on.
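The random drawing of validation subsets described above might be sketched as follows; the function name, subset count, and seed are illustrative assumptions:

```python
import random

def draw_validation_subsets(validation_set, k=5, frac=0.5, seed=0):
    # Each draw randomly selects frac (here 50%) of the validation samples,
    # so every trained classification model is validated on different data.
    rng = random.Random(seed)
    size = int(len(validation_set) * frac)
    return [rng.sample(validation_set, size) for _ in range(k)]

subsets = draw_validation_subsets(list(range(100)))
```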
In another embodiment, a cross-validation method is used to validate the classification model, such as ten-fold or five-fold cross-validation. Taking five-fold cross-validation as an example, the training samples are randomly divided into 10 parts; in each round, 2 parts serve as the cross-validation set and the remaining 8 parts as the training set. During training, the initialized classification model is first trained on the 8 parts and then classifies and labels the 2-part cross-validation set; this training and validation process is repeated 5 times with a different cross-validation set each time, until every training sample has been classified and labeled once.
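The described split (10 parts, 2 parts as the cross-validation set per round, 5 rounds) can be sketched as below; for determinism the example partitions the samples sequentially rather than randomly, which is an assumption of the sketch:

```python
def five_fold_splits(samples):
    # Divide into 10 parts; each round takes 2 parts as the cross-validation
    # set and the remaining 8 as the training set, for 5 rounds in total.
    n = len(samples)
    parts = [samples[i * n // 10:(i + 1) * n // 10] for i in range(10)]
    splits = []
    for fold in range(5):
        held = {2 * fold, 2 * fold + 1}
        val = parts[2 * fold] + parts[2 * fold + 1]
        train = [s for i, p in enumerate(parts) if i not in held for s in p]
        splits.append((train, val))
    return splits

splits = five_fold_splits(list(range(20)))
```

Over the 5 rounds, every sample appears in a cross-validation set exactly once, matching the text above.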
Step S03 specifically includes:
Step S031: inputting the preset sample validation set into each trained classification model;
Step S032: obtaining the training accuracy and validation accuracy output by the classification model;
Step S033: judging, based on the training accuracy and validation accuracy, whether the classification model fits well; and
Step S034: determining a well-fitted classification model as a trained classification model, and taking its validation accuracy as the classification accuracy of the model.
During the training of each classification model, the number of training rounds can be preset, for example to 100. After the 100 rounds, the sample validation set is input into the classification model to obtain its training accuracy and validation accuracy, and a fit check determines whether the trained model fits well: specifically, when (training accuracy - validation accuracy) / validation accuracy ≤ 10%, the classification model is considered to fit well. In this embodiment, the validation accuracy of a well-fitted classification model is taken as its classification accuracy.
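The fit criterion above can be expressed directly; the function name is an assumption:

```python
def is_well_fitted(train_acc, val_acc, tol=0.10):
    # Criterion from the text: (training accuracy - validation accuracy)
    # divided by validation accuracy must not exceed 10%.
    return (train_acc - val_acc) / val_acc <= tol

ok = is_well_fitted(0.95, 0.90)    # gap of ~5.6% of validation accuracy
bad = is_well_fitted(0.99, 0.80)   # gap of ~23.8%: likely overfitted
```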
Step S04: determine, using a Bayesian optimization algorithm and based on the classification accuracy of each trained classification model, the optimal strategy subset from the multiple pending strategy subsets.
When the Bayesian optimization algorithm searches for the optimal strategy subset, each pending strategy subset (a numerical matrix) is taken as the x value of a sample point and the corresponding classification accuracy as its y value, forming multiple sample points. A Gaussian-process regression model is constructed from these sample points, and by learning and fitting the objective function, the strategy subset that drives the objective function toward the global optimum is found.
Step S04 specifically includes:
constructing a Gaussian-process regression model based on multiple sample points, where each sample point comprises the classification accuracy of a trained classification model and the pending strategy subset used to train it;
determining the acquisition function of the Bayesian optimization algorithm from the regression model; and
determining, by maximizing the acquisition function, the optimal strategy subset from the multiple pending strategy subsets, where the classification model trained on the sample training set augmented with the optimal strategy subset has the highest classification accuracy.
In this embodiment, the optimal strategy subset is determined from the multiple pending strategy subsets based on classification accuracy using a Bayesian optimization algorithm. In other embodiments, other algorithms may also be used for the selection, which is not limited here.
Understandably, a functional relationship y = f(x) is assumed between y and x = (μ, α, β). The Bayesian optimization algorithm learns and fits this objective function to find the strategy parameters that raise f(x) toward the global optimum. Each time a Bayesian optimization iteration evaluates the objective function f(x) at a new sample point, this information is used to update the prior distribution of f(x); finally, the algorithm evaluates the sample point where, according to the posterior distribution, the global optimum is most likely to occur.
In this embodiment, during the Bayesian optimization iterations, the acquisition function guides the selection of sample points, and the Gaussian-process (GP) curve is continually corrected to approximate the objective function f(x). When the acquisition function is maximized, the selected sample point is optimal; that is, the optimal strategy subset maximizing the objective function f(x) has been found.
Since the form of f(x) cannot be obtained explicitly, it is approximated with a Gaussian process, i.e., f(x) ~ GP(m(x), k(x, x′)), where m(x) denotes the mathematical expectation E(f(x)) of the sample point f(x), usually taken as 0 in Bayesian optimization, and k(x, x′) is the kernel function describing the covariance of x.
For each x there is a corresponding Gaussian distribution, and for a set {x_1, x_2, ..., x_n} the y values are assumed to follow a joint normal distribution with mean 0 and covariance

$$K = \begin{pmatrix} k(x_1,x_1) & \cdots & k(x_1,x_n) \\ \vdots & \ddots & \vdots \\ k(x_n,x_1) & \cdots & k(x_n,x_n) \end{pmatrix}$$

where the covariance depends only on x and is independent of y.
For a new sample point x_{n+1}, the joint Gaussian distribution is

$$\begin{pmatrix} f_{1:n} \\ f_{n+1} \end{pmatrix} \sim N\left(0, \begin{pmatrix} K & k \\ k^T & k(x_{n+1},x_{n+1}) \end{pmatrix}\right)$$

where k = (k(x_{n+1},x_1), ..., k(x_{n+1},x_n))^T.
Therefore, the posterior probability distribution of f_{n+1} can be estimated from the first n sample points: P(f_{n+1} | D_{1:n}, x_{n+1}) ~ N(μ_n(x), σ_n²(x)), where μ_n(x) = k^T K^{-1} f_{1:n} and σ_n²(x) = k(x_{n+1}, x_{n+1}) - k^T K^{-1} k.
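These posterior formulas can be sketched for a one-dimensional input under an assumed RBF kernel, with only two sample points so that the 2×2 inverse K⁻¹ can be written in closed form (a real implementation would use a Cholesky solve):

```python
import math

def rbf(a, b, length=1.0):
    # Assumed kernel k(x, x'): squared-exponential (RBF).
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def gp_posterior(xs, fs, x_new):
    # mu_n(x) = k^T K^-1 f_{1:n};  sigma_n^2(x) = k(x,x) - k^T K^-1 k,
    # for n = 2 sample points (closed-form 2x2 inverse).
    (x1, x2), (f1, f2) = xs, fs
    k11, k12, k22 = rbf(x1, x1), rbf(x1, x2), rbf(x2, x2)
    det = k11 * k22 - k12 * k12
    k = [rbf(x_new, x1), rbf(x_new, x2)]
    w = [(k22 * k[0] - k12 * k[1]) / det,      # first component of K^-1 k
         (k11 * k[1] - k12 * k[0]) / det]      # second component of K^-1 k
    mu = w[0] * f1 + w[1] * f2
    var = rbf(x_new, x_new) - (k[0] * w[0] + k[1] * w[1])
    return mu, var

mu0, var0 = gp_posterior([0.0, 1.0], [0.2, 0.8], 0.0)   # at a known point
mu_m, var_m = gp_posterior([0.0, 1.0], [0.2, 0.8], 0.5)  # between the points
```

At an already-observed point the noiseless GP interpolates: the mean equals the observed value and the variance collapses to zero, while between the points the variance is positive.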
In this embodiment, the Probability of Improvement (POI) is used as the acquisition function.
The acquisition function is

$$POI(X) = P(f(X) \ge f(X^+) + \xi) = \Phi\left(\frac{\mu(x) - f(X^+) - \xi}{\sigma(x)}\right)$$

where f(x) is the objective-function value at x (the validation accuracy), f(X⁺) is the best objective-function value found so far, μ(x) and σ(x) are the mean and standard deviation of the objective function obtained from the Gaussian process, i.e., of the posterior distribution of f(x), and Φ(·) is the normal cumulative distribution function. ξ is a trade-off coefficient: without it, the POI function tends to pick points around X⁺ and converge near f(X⁺), favoring exploitation over exploration, so this term is added as a trade-off. By continually trying new x, the next maximum should be greater than, or at least equal to, the current one; the next sample therefore lies between the crossing point f(X⁺) and the confidence region, and samples below f(X⁺) can be assumed discardable, since only the parameters that maximize the objective function need to be searched. Iterating this process narrows the observation region until the optimal solution that maximizes POI(X) is found.
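A sketch of this acquisition function, assuming the standard POI form Φ((μ(x) − f(X⁺) − ξ)/σ(x)); the candidate points and their posterior means and standard deviations are invented for illustration:

```python
import math

def normal_cdf(z):
    # Standard normal cumulative distribution function Phi(z).
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def poi(mu, sigma, f_best, xi=0.01):
    # POI(x) = P(f(x) >= f(X+) + xi) = Phi((mu(x) - f(X+) - xi) / sigma(x)).
    # xi is the trade-off coefficient that discourages clustering around X+.
    return normal_cdf((mu - f_best - xi) / sigma)

# Assumed candidates: x -> (posterior mean, posterior std) from the GP.
candidates = {0.3: (0.90, 0.10), 0.7: (0.95, 0.20)}
best = max(candidates, key=lambda x: poi(*candidates[x], f_best=0.92))
```

Here the higher-mean, higher-uncertainty candidate wins, illustrating how ξ and σ(x) steer the search toward exploration.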
An embodiment of the present application provides a system for selecting an augmentation strategy for image data. As shown in FIG. 2, the system includes an augmenter 10, a classification model 20, and a controller 30.
The augmenter 10 is configured to select multiple pending strategy subsets from an augmentation strategy set and perform sample augmentation on a preset sample training set to obtain multiple augmented sample training sets, where each pending strategy subset consists of at least one augmentation strategy from the augmentation strategy set. Specifically, the augmentation strategy set includes rotation, flipping, zooming, translation, scale transformation, region cropping, noise addition, piecewise affine transformation, random masking, edge detection, contrast transformation, color jittering, Mixup, and Sample Pairing. An example augmentation strategy is flipping.
In this embodiment, any three of the above augmentation strategies are randomly drawn to form one pending strategy subset; each augmentation strategy has three strategy parameters: the strategy type (μ), the probability value (α), and the magnitude (β). A pending strategy subset can then be expressed as a numerical matrix:
$$\begin{pmatrix} \mu_1 & \alpha_1 & \beta_1 \\ \mu_2 & \alpha_2 & \beta_2 \\ \mu_3 & \alpha_3 & \beta_3 \end{pmatrix}$$
where each row represents one augmentation strategy. Expressing a pending strategy subset as a numerical matrix improves computational efficiency.
The classification model 20 includes a training unit 210 and a verification unit 220. The training unit 210 is configured to train an initialized classification model with each augmented sample training set to obtain multiple trained classification models; the verification unit 220 is configured to input a preset sample validation set into each trained classification model to obtain the classification accuracy of each trained classification model.
In this embodiment, the classification model is a convolutional neural network model composed of a convolutional neural network and a fully connected network; its structure includes at least a convolutional layer, a pooling layer, and a fully connected layer.
The training unit 210 includes an extraction subunit, a classification subunit, a first acquisition subunit, and an optimization subunit.
The extraction subunit is configured to use the convolutional neural network to extract a feature map for each sample in the augmented sample training set fed to the classification model; the classification subunit is configured to classify the corresponding sample based on the feature map to obtain a classification result; the first acquisition subunit is configured to obtain a mean-squared-error loss function between the set of classification results and the set of labels of all samples in the training set; and the optimization subunit is configured to optimize the convolutional neural network by backpropagation until the value of the loss function converges, yielding the trained classification model.
In this embodiment, there are two classification results: pneumonia and non-pneumonia. The initial convolutional neural network extracts features from the labeled samples over a preset number of training rounds, so that the convolutional layers learn to extract more generalizable features (e.g., edges and textures). During backpropagation, repeated gradient-descent steps improve the accuracy of the model until the value of the loss function converges to a minimum; in the process, the weights and biases of the convolutional and fully connected layers are adjusted automatically, optimizing the classification model.
Specifically, the samples in the preset sample validation set are also labeled; for example, a sample with a positive label is a lung image marked as showing pneumonia symptoms, and a sample with a negative label is a lung image marked as showing no pneumonia symptoms. The trained classification models are validated with the preset sample validation set; since the validation set differs for each classification model, better model generalization is achieved and the overfitting that sample augmentation may introduce is effectively mitigated.
The verification unit 220 includes an input subunit, a second acquisition subunit, a judgment subunit, and a determination subunit.
The input subunit is configured to input the preset sample validation set into each trained classification model;
the second acquisition subunit is configured to obtain the training accuracy and validation accuracy output by the classification model;
the judgment subunit is configured to judge, based on the training accuracy and validation accuracy, whether the classification model fits well; and
the determination subunit is configured to determine a well-fitted classification model as a trained classification model and to take its validation accuracy as the classification accuracy of the model.
During the training of each classification model, the number of training rounds can be preset, for example to 100. After the 100 rounds, the sample validation set is input into the classification model to obtain its training accuracy and validation accuracy, and a fit check determines whether the trained model fits well: specifically, when (training accuracy - validation accuracy) / validation accuracy ≤ 10%, the classification model is considered to fit well. In this embodiment, the validation accuracy of a well-fitted classification model is taken as its classification accuracy.
The system further includes a database 40 and a processing module 50. The database 40 is configured to store the sample training set and the sample validation set.
The processing module 50 is configured to randomly draw multiple validation subsets from the preset sample validation set and to input them into each trained classification model respectively.
In this embodiment, random sampling is used; the ratio of samples in the sample training set to the sample validation set may be 2:8, 4:6, 6:4, 8:2, and so on. Understandably, for each draw, 50% of the samples in the validation set are randomly selected to form a validation subset. In other embodiments, the sampling ratio may be 30%, 40%, 60%, and so on.
In another embodiment, a cross-validation method is used to validate the classification model, such as ten-fold or five-fold cross-validation. Taking five-fold cross-validation as an example, the training samples are randomly divided into 10 parts; in each round, 2 parts serve as the cross-validation set and the remaining 8 parts as the training set. During training, the initialized classification model is first trained on the 8 parts and then classifies and labels the 2-part cross-validation set; this training and validation process is repeated 5 times with a different cross-validation set each time, until every training sample has been classified and labeled once.
The controller 30 is configured to determine, using a Bayesian optimization algorithm and based on the classification accuracy of each trained classification model, the optimal strategy subset from the multiple pending strategy subsets.
In this embodiment, the controller 30 determines the optimal strategy subset from the multiple pending strategy subsets based on classification accuracy using a Bayesian optimization algorithm. In other embodiments, other algorithms may also be used for the selection, which is not limited here.
Referring to FIG. 3, the controller 30 optionally includes a construction unit 310, a first determination unit 320, and a second determination unit 330.
The construction unit 310 is configured to construct a Gaussian-process regression model based on multiple sample points, where each sample point comprises the classification accuracy of a trained classification model and the pending strategy subset used to train it;
the first determination unit 320 is configured to determine the acquisition function of the Bayesian optimization algorithm from the regression model; and
the second determination unit 330 is configured to determine, by maximizing the acquisition function, the optimal strategy subset from the multiple pending strategy subsets, where the classification model trained on the sample training set augmented with the optimal strategy subset has the highest classification accuracy.
Understandably, a functional relationship y = f(x) is assumed between y and x = (μ, α, β). The Bayesian optimization algorithm learns and fits this objective function to find the strategy parameters that raise f(x) toward the global optimum. Each time a Bayesian optimization iteration evaluates the objective function f(x) at a new sample point, this information is used to update the prior distribution of f(x); finally, the algorithm evaluates the sample point where, according to the posterior distribution, the global optimum is most likely to occur.
In this embodiment, during the Bayesian optimization iterations, the acquisition function guides the selection of sample points, and the Gaussian-process (GP) curve is continually corrected to approximate the objective function f(x). When the acquisition function is maximized, the selected sample point is optimal; that is, the optimal strategy subset maximizing the objective function f(x) has been found.
Since the form of f(x) cannot be obtained explicitly, it is approximated with a Gaussian process, i.e., f(x) ~ GP(m(x), k(x, x′)), where m(x) denotes the mathematical expectation E(f(x)) of the sample point f(x), usually taken as 0 in Bayesian optimization, and k(x, x′) is the kernel function describing the covariance of x.
For each x there is a corresponding Gaussian distribution, and for a set {x_1, x_2, ..., x_n} the y values are assumed to follow a joint normal distribution with mean 0 and covariance

$$K = \begin{pmatrix} k(x_1,x_1) & \cdots & k(x_1,x_n) \\ \vdots & \ddots & \vdots \\ k(x_n,x_1) & \cdots & k(x_n,x_n) \end{pmatrix}$$

where the covariance depends only on x and is independent of y.
For a new sample point x_{n+1}, the joint Gaussian distribution is

$$\begin{pmatrix} f_{1:n} \\ f_{n+1} \end{pmatrix} \sim N\left(0, \begin{pmatrix} K & k \\ k^T & k(x_{n+1},x_{n+1}) \end{pmatrix}\right)$$

where k = (k(x_{n+1},x_1), ..., k(x_{n+1},x_n))^T.
Therefore, the posterior probability distribution of f_{n+1} can be estimated from the first n sample points: P(f_{n+1} | D_{1:n}, x_{n+1}) ~ N(μ_n(x), σ_n²(x)), where μ_n(x) = k^T K^{-1} f_{1:n} and σ_n²(x) = k(x_{n+1}, x_{n+1}) - k^T K^{-1} k.
In this embodiment, the Probability of Improvement (POI) is used as the acquisition function.
The acquisition function is

$$POI(X) = P(f(X) \ge f(X^+) + \xi) = \Phi\left(\frac{\mu(x) - f(X^+) - \xi}{\sigma(x)}\right)$$

where f(x) is the objective-function value at x (the validation accuracy), f(X⁺) is the best objective-function value found so far, μ(x) and σ(x) are the mean and standard deviation of the objective function obtained from the Gaussian process, i.e., of the posterior distribution of f(x), and Φ(·) is the normal cumulative distribution function. ξ is a trade-off coefficient: without it, the POI function tends to pick points around X⁺ and converge near f(X⁺), favoring exploitation over exploration, so this term is added as a trade-off. By continually trying new x, the next maximum should be greater than, or at least equal to, the current one; the next sample therefore lies between the crossing point f(X⁺) and the confidence region, and samples below f(X⁺) can be assumed discardable, since only the parameters that maximize the objective function need to be searched. Iterating this process narrows the observation region until the optimal solution that maximizes POI(X) is found.
Further, after the controller 30 selects the optimal augmentation strategy, the controller 30 is further configured to output the optimal augmentation strategy to the augmenter 10, and the augmenter 10 adopts the optimal augmentation strategy as the augmentation strategy for the preset sample training set. Understandably, once the augmenter 10 has obtained the optimal augmentation strategy, each subsequent sample augmentation by the augmenter uses the optimal augmentation strategy output by the controller.
An embodiment of the present application provides a computer-readable storage medium, which may be non-volatile or volatile. The storage medium stores a program, and when the program runs, the device on which the storage medium resides is controlled to perform the following steps:
selecting multiple pending strategy subsets from an augmentation strategy set and performing sample augmentation on a preset sample training set to obtain multiple augmented sample training sets, where each pending strategy subset consists of at least one augmentation strategy from the augmentation strategy set; training an initialized classification model with each augmented sample training set to obtain multiple trained classification models; inputting a preset sample validation set into each trained classification model to obtain the classification accuracy of each trained classification model; and determining, using a Bayesian optimization algorithm and based on the classification accuracy of each trained classification model, the optimal strategy subset from the multiple pending strategy subsets.
Optionally, when the program runs, the device on which the storage medium resides is controlled to perform the step of determining, using a Bayesian optimization algorithm and based on the classification accuracy of each trained classification model, the optimal strategy subset from the multiple pending strategy subsets, which includes:
constructing a Gaussian-process regression model based on multiple sample points, where each sample point comprises the classification accuracy of a trained classification model and the pending strategy subset used to train it; determining the acquisition function of the Bayesian optimization algorithm from the regression model; and determining, by maximizing the acquisition function, the optimal strategy subset from the multiple pending strategy subsets, where the classification model trained on the sample training set augmented with the optimal strategy subset has the highest classification accuracy.
可选地,在程序运行时控制存储介质所在设备执行将预设的样本验证集输入每个训练后的分类模型,得到训练好的分类模型对应的分类准确度,包括:Optionally, when the program is running, the device where the storage medium is located is controlled to execute the input of a preset sample verification set into each trained classification model to obtain the classification accuracy corresponding to the trained classification model, including:
将预设的样本验证集输入每个训练后的分类模型；获取分类模型输出的训练精度及验证精度；根据训练精度和验证精度判断分类模型是否拟合良好；将拟合良好的分类模型确定为训练好的分类模型，并将训练好的分类模型的验证精度作为分类模型的分类准确度。Input the preset sample validation set into each trained classification model; obtain the training accuracy and validation accuracy output by the classification model; judge whether the classification model fits well according to the training accuracy and validation accuracy; determine a well-fitted classification model as a trained classification model, and take the validation accuracy of the trained classification model as its classification accuracy.
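The fit check above can be made concrete with a simple threshold rule; the minimum validation accuracy and the maximum train/validation gap used here are hypothetical values, since the text does not specify them:

```python
def is_well_fitted(train_acc, val_acc, min_val_acc=0.6, max_gap=0.05):
    """Judge fit quality from the two accuracies the model reports.
    A large train/validation gap suggests overfitting; a low validation
    accuracy suggests underfitting. Thresholds are illustrative."""
    return val_acc >= min_val_acc and (train_acc - val_acc) <= max_gap

# Keep only well-fitted models; each one's validation accuracy then serves
# as its classification accuracy, as the text above describes.
reports = [(0.91, 0.89), (0.99, 0.71), (0.55, 0.54)]  # (train, val) pairs
accepted = [val for train, val in reports if is_well_fitted(train, val)]
```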
可选地，在程序运行时控制存储介质所在设备执行利用每个增广后的样本训练集训练初始化的分类模型，得到多个训练后的分类模型的步骤，包括：利用卷积神经网络提取输入分类模型的增广后的样本训练集中的每个样本的特征图；根据特征图，对增广后的样本训练集中的对应一个样本进行分类预测，得到分类结果；获取分类结果集合与样本训练集中的所有样本的标签集合的均方误差的损失函数；通过反向传播对卷积神经网络进行优化，以使得损失函数的值收敛，得到优化训练后的分类模型。Optionally, when the program runs, the device on which the storage medium resides is controlled to execute the step of training an initialized classification model with each augmented sample training set to obtain multiple trained classification models, which includes: using a convolutional neural network to extract a feature map of each sample in the augmented sample training set input into the classification model; performing classification prediction on the corresponding sample in the augmented sample training set according to the feature map to obtain a classification result; obtaining a loss function of the mean square error between the set of classification results and the label set of all samples in the sample training set; and optimizing the convolutional neural network through backpropagation so that the value of the loss function converges, yielding the optimized, trained classification model.
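The training step above (predict, take the mean-squared error against the one-hot label set, backpropagate until the loss converges) can be sketched on a single linear layer. The random "feature maps" below stand in for convolutional features, so this is an illustrative skeleton rather than the patent's network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in feature maps: in the patent these come from convolutional layers.
X = rng.normal(size=(64, 16))            # 64 samples, 16 features each
labels = rng.integers(0, 3, size=64)     # 3 classes
Y = np.eye(3)[labels]                    # one-hot label set

W = np.zeros((16, 3))                    # classifier weights to optimize
lr = 0.1
losses = []
for _ in range(200):
    P = X @ W                            # classification predictions
    loss = np.mean((P - Y) ** 2)         # mean-squared-error loss function
    losses.append(loss)
    grad = 2.0 * X.T @ (P - Y) / X.shape[0]  # backprop through the layer
    W -= lr * grad                       # gradient-descent update
```

In the full method the gradient would flow back through the convolutional layers as well; the loop structure and the MSE loss are the same.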
可选地，在程序运行时控制存储介质所在设备在执行将预设的样本验证集输入每个训练后的分类模型，得到训练好的分类模型对应的分类准确度之前，还包括：从预设的样本验证集随机抽取多个验证子集；将多个验证子集分别输入每个训练后的分类模型。Optionally, when the program runs, before executing the step of inputting the preset sample validation set into each trained classification model to obtain the corresponding classification accuracy, the device on which the storage medium resides is further controlled to: randomly draw multiple validation subsets from the preset sample validation set; and input the multiple validation subsets into each trained classification model.
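Drawing the random validation subsets mentioned above is straightforward; fixing a seed (an assumption made here for reproducibility) keeps the subsets identical across the models being compared:

```python
import random

def draw_validation_subsets(validation_set, n_subsets, subset_size, seed=0):
    """Randomly draw n_subsets subsets (each sampled without replacement)
    from the preset sample validation set."""
    rng = random.Random(seed)
    return [rng.sample(validation_set, subset_size) for _ in range(n_subsets)]

val_set = [f"sample_{i}" for i in range(100)]   # stand-in validation samples
subsets = draw_validation_subsets(val_set, n_subsets=5, subset_size=20)
# Each subset would then be fed to every trained classification model.
```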
图4是本申请实施例提供的一种计算机设备的示意图。如图4所示，该实施例的计算机设备100包括：处理器101、存储器102以及存储在存储器102中并可在处理器101上运行的计算机程序103，处理器101执行计算机程序103时实现实施例中的图像数据的增广策略选取方法，为避免重复，此处不一一赘述。或者，该计算机程序被处理器101执行时实现实施例中图像数据的增广策略选取系统中各模型/单元的功能，为避免重复，此处不一一赘述。Fig. 4 is a schematic diagram of a computer device provided by an embodiment of the present application. As shown in Fig. 4, the computer device 100 of this embodiment includes a processor 101, a memory 102, and a computer program 103 stored in the memory 102 and runnable on the processor 101. When the processor 101 executes the computer program 103, the method for selecting an augmentation strategy for image data in the embodiments is implemented; to avoid repetition, the details are not repeated here. Alternatively, when the computer program is executed by the processor 101, the functions of the models/units in the image data augmentation strategy selection system of the embodiments are realized; to avoid repetition, they are likewise not repeated here.
计算机设备100可以是桌上型计算机、笔记本、掌上电脑及云端服务器等计算设备。计算机设备可包括，但不仅限于，处理器101、存储器102。本领域技术人员可以理解，图4仅仅是计算机设备100的示例，并不构成对计算机设备100的限定，可以包括比图示更多或更少的部件，或者组合某些部件，或者不同的部件，例如计算机设备还可以包括输入输出设备、网络接入设备、总线等。The computer device 100 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The computer device may include, but is not limited to, the processor 101 and the memory 102. Those skilled in the art will understand that Fig. 4 is only an example of the computer device 100 and does not constitute a limitation on it: the computer device may include more or fewer components than shown, combine certain components, or use different components; for example, it may also include input/output devices, network access devices, a bus, and so on.
所称处理器101可以是中央处理单元(Central Processing Unit,CPU),还可以是其他通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。The so-called processor 101 may be a central processing unit (Central Processing Unit, CPU), other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc. The general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
存储器102可以是计算机设备100的内部存储单元，例如计算机设备100的硬盘或内存。存储器102也可以是计算机设备100的外部存储设备，例如计算机设备100上配备的插接式硬盘，智能存储卡（Smart Media Card，SMC），安全数字（Secure Digital，SD）卡，闪存卡（Flash Card）等。进一步地，存储器102还可以既包括计算机设备100的内部存储单元也包括外部存储设备。存储器102用于存储计算机程序以及计算机设备所需的其他程序和数据。存储器102还可以用于暂时地存储已经输出或者将要输出的数据。The memory 102 may be an internal storage unit of the computer device 100, such as a hard disk or main memory of the computer device 100. The memory 102 may also be an external storage device of the computer device 100, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card (Flash Card) equipped on the computer device 100. Further, the memory 102 may include both an internal storage unit of the computer device 100 and an external storage device. The memory 102 is used to store the computer program and other programs and data required by the computer device, and may also be used to temporarily store data that has been output or is about to be output.
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统,装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。Those skilled in the art can clearly understand that, for the convenience and conciseness of the description, the specific working process of the above-described system, device, and unit can refer to the corresponding process in the foregoing method embodiment, which will not be repeated here.
在本申请所提供的几个实施例中，应该理解到，所揭露的系统，装置和方法，可以通过其它的方式实现。例如，以上所描述的装置实施例仅仅是示意性的，例如，所述单元的划分，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式，例如，多个单元或组件可以结合或者可以集成到另一个系统，或一些特征可以忽略，或不执行。另一点，所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口，装置或单元的间接耦合或通信连接，可以是电性，机械或其它的形式。In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division of the units is only a division by logical function, and there may be other divisions in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用硬件加软件功能单元的形式实现。In addition, the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above-mentioned integrated unit may be implemented in the form of hardware, or may be implemented in the form of hardware plus software functional units.
上述以软件功能单元的形式实现的集成的单元，可以存储在一个计算机可读取存储介质中。上述软件功能单元存储在一个存储介质中，包括若干指令用以使得一台计算机装置（可以是个人计算机，服务器，或者网络装置等）或处理器（Processor）执行本申请各个实施例所述方法的部分步骤。而前述的存储介质包括：U盘、移动硬盘、只读存储器（Read-Only Memory，ROM）、随机存取存储器（Random Access Memory，RAM）、磁碟或者光盘等各种可以存储程序代码的介质。The above integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer apparatus (which may be a personal computer, a server, a network apparatus, or the like) or a processor to execute some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
以上所述，仅为本申请的具体实施方式，但本申请的保护范围并不局限于此，任何熟悉本技术领域的技术人员在本申请揭露的技术范围内，可轻易想到变化或替换，都应涵盖在本申请的保护范围之内。因此，本申请的保护范围应以所述权利要求的保护范围为准。The above are only specific implementations of this application, but the protection scope of this application is not limited thereto; any changes or substitutions that a person skilled in the art could readily conceive within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (20)

  1. 一种图像数据的增广策略选取方法,其中,所述方法包括:A method for selecting an augmentation strategy for image data, wherein the method includes:
    从增广策略集合中选取多个待定策略子集对预设的样本训练集进行样本增广，得到多个增广后的样本训练集，其中，每个所述待定策略子集由所述增广策略集合中至少一个增广策略组成；Select multiple pending strategy subsets from the augmentation strategy set and use them to perform sample augmentation on a preset sample training set to obtain multiple augmented sample training sets, wherein each of the pending strategy subsets consists of at least one augmentation strategy from the augmentation strategy set;
    利用每个所述增广后的样本训练集训练初始化的分类模型,得到多个训练后的分类模型;Training an initialized classification model using each of the augmented sample training sets to obtain multiple trained classification models;
    将预设的样本验证集输入每个所述训练后的分类模型,得到训练好的所述分类模型对应的分类准确度;Input a preset sample verification set into each of the trained classification models to obtain the classification accuracy corresponding to the trained classification model;
    利用贝叶斯优化算法基于每个所述训练好的分类模型对应的分类准确度,从多个待定策略子集中确定最优策略子集。A Bayesian optimization algorithm is used to determine the optimal strategy subset from a plurality of pending strategy subsets based on the classification accuracy corresponding to each of the trained classification models.
  2. 根据权利要求1所述的方法，其中，所述利用贝叶斯优化算法基于每个所述训练好的分类模型对应的分类准确度，从多个待定策略子集中确定最优策略子集的步骤，包括：The method according to claim 1, wherein the step of using the Bayesian optimization algorithm to determine the optimal strategy subset from the plurality of pending strategy subsets based on the classification accuracy corresponding to each of the trained classification models includes:
    基于多个样本点构建高斯过程的回归模型,其中,每个样本点包括所述训练好的分类模型的分类准确度及训练所述分类模型所采用的待定策略子集;Constructing a regression model of the Gaussian process based on multiple sample points, where each sample point includes the classification accuracy of the trained classification model and the undetermined strategy subset used to train the classification model;
    根据所述回归模型确定贝叶斯优化算法的获取函数;Determining the acquisition function of the Bayesian optimization algorithm according to the regression model;
    通过对所述获取函数的最大优化，从多个所述待定策略子集中确定最优策略子集，其中，利用所述最优策略子集增广后的样本训练集训练得到的分类模型的分类准确度最高。Determining an optimal strategy subset from the plurality of pending strategy subsets by maximally optimizing the acquisition function, wherein the classification model trained on the sample training set augmented with the optimal strategy subset has the highest classification accuracy.
  3. 根据权利要求1所述的方法,其中,所述将预设的样本验证集输入每个所述训练后的分类模型,得到训练好的所述分类模型对应的分类准确度,包括:The method according to claim 1, wherein the inputting a preset sample verification set into each of the trained classification models to obtain the classification accuracy corresponding to the trained classification model comprises:
    将预设的样本验证集输入每个所述训练后的分类模型;Input a preset sample verification set into each of the trained classification models;
    获取所述分类模型输出的训练精度及验证精度;Acquiring training accuracy and verification accuracy output by the classification model;
    根据所述训练精度和所述验证精度判断所述分类模型是否拟合良好;Judging whether the classification model fits well according to the training accuracy and the verification accuracy;
    将拟合良好的所述分类模型确定为训练好的分类模型,并将所述训练好的分类模型的验证精度作为所述分类模型的分类准确度。The well-fitted classification model is determined as the trained classification model, and the verification accuracy of the trained classification model is used as the classification accuracy of the classification model.
  4. 根据权利要求1所述的方法,其中,所述利用每个所述增广后的样本训练集训练初始化的分类模型,得到多个训练后的分类模型,包括:The method according to claim 1, wherein the training an initialized classification model using each of the augmented sample training sets to obtain a plurality of trained classification models comprises:
    利用卷积神经网络提取输入分类模型的所述增广后的样本训练集中的每个样本的特征图;Extracting a feature map of each sample in the augmented sample training set of the input classification model by using a convolutional neural network;
    根据所述特征图,对所述增广后的样本训练集中的对应一个样本进行分类预测,得到分类结果;Perform classification prediction on a corresponding sample in the augmented sample training set according to the feature map to obtain a classification result;
    获取所述分类结果集合与所述样本训练集中的所有样本的标签集合的均方误差的损失函数;Acquiring a loss function of the mean square error of the classification result set and the label set of all samples in the sample training set;
    通过反向传播对所述卷积神经网络进行优化,以使得所述损失函数的值收敛,得到优化训练后的所述分类模型。The convolutional neural network is optimized by back propagation, so that the value of the loss function converges, and the optimized and trained classification model is obtained.
  5. 根据权利要求1所述的方法，其中，在所述将预设的样本验证集输入每个所述训练后的分类模型，得到训练好的所述分类模型对应的分类准确度之前，所述方法还包括：The method according to claim 1, wherein, before the preset sample validation set is input into each of the trained classification models to obtain the classification accuracy corresponding to the trained classification model, the method further comprises:
    从所述预设的样本验证集随机抽取多个验证子集;Randomly select a plurality of verification subsets from the preset sample verification set;
    将所述多个验证子集分别输入每个所述训练后的分类模型。The multiple validation subsets are input into each of the trained classification models respectively.
  6. 根据权利要求1所述的方法，其中，所述增广策略集合包括旋转变换、翻转变换、缩放变换、平移变换、尺度变换、区域裁剪、噪声添加、分段仿射、随机掩盖、边界检测、对比度变换、颜色抖动、随机混合及复合叠加。The method according to claim 1, wherein the augmentation strategy set includes rotation, flipping, zooming, translation, scale transformation, region cropping, noise addition, piecewise affine transformation, random masking, boundary detection, contrast transformation, color jittering, random mixing, and composite overlay.
  7. 一种图像数据的增广策略选取系统,其中,所述系统包括增广器、分类模型及控制器;An augmentation strategy selection system for image data, wherein the system includes an augmenter, a classification model, and a controller;
    所述增广器,用于从增广策略集合中选取多个待定策略子集对预设的样本训练集进行样本增广,得到多个增广后的样本训练集,其中,每个所述待定策略子集由所述增广策略集合中至少一个增广策略组成;The augmenter is used to select multiple undetermined strategy subsets from the augmentation strategy set to perform sample augmentation on the preset sample training set to obtain multiple augmented sample training sets, wherein each of the The undetermined strategy subset is composed of at least one augmentation strategy in the augmentation strategy set;
    所述分类模型，用于利用每个所述增广后的样本训练集训练初始化的分类模型，得到多个训练后的分类模型；并将预设的样本验证集输入每个所述训练后的分类模型，得到训练好的所述分类模型对应的分类准确度；The classification model is used to train an initialized classification model with each of the augmented sample training sets to obtain a plurality of trained classification models, and to input a preset sample validation set into each of the trained classification models to obtain the classification accuracy corresponding to the trained classification model;
    所述控制器,用于利用贝叶斯优化算法基于每个所述训练好的分类模型对应的分类准确度,从多个待定策略子集中确定最优策略子集。The controller is configured to use a Bayesian optimization algorithm to determine an optimal strategy subset from a plurality of pending strategy subsets based on the classification accuracy corresponding to each of the trained classification models.
  8. 根据权利要求7所述的系统,其中,所述控制器包括构建单元、第一确定单元、第二确定单元;The system according to claim 7, wherein the controller includes a construction unit, a first determination unit, and a second determination unit;
    所述构建单元，用于基于多个样本点构建高斯过程的回归模型，其中，每个样本点包括所述训练好的分类模型的分类准确度及训练所述分类模型所采用的待定策略子集；The construction unit is configured to construct a Gaussian-process regression model based on a plurality of sample points, wherein each sample point includes the classification accuracy of the trained classification model and the pending strategy subset used to train the classification model;
    所述第一确定单元,用于根据所述回归模型确定贝叶斯优化算法的获取函数;The first determining unit is configured to determine the acquisition function of the Bayesian optimization algorithm according to the regression model;
    所述第二确定单元，用于通过对所述获取函数的最大优化，从多个所述待定策略子集中确定最优策略子集，其中，利用所述最优策略子集增广后的样本训练集训练得到的分类模型的分类准确度最高。The second determining unit is configured to determine an optimal strategy subset from the plurality of pending strategy subsets by maximally optimizing the acquisition function, wherein the classification model trained on the sample training set augmented with the optimal strategy subset has the highest classification accuracy.
  9. 一种计算机设备，其中，所述计算机设备包括存储器和处理器，所述存储器和所述处理器相互连接，所述存储器用于存储计算机程序，所述计算机程序被配置为由所述处理器执行，所述计算机程序配置用于执行一种图像数据的增广策略选取方法：A computer device, wherein the computer device includes a memory and a processor connected to each other, the memory is configured to store a computer program, and the computer program is configured to be executed by the processor to perform a method for selecting an augmentation strategy for image data:
    其中,所述方法包括:Wherein, the method includes:
    从增广策略集合中选取多个待定策略子集对预设的样本训练集进行样本增广，得到多个增广后的样本训练集，其中，每个所述待定策略子集由所述增广策略集合中至少一个增广策略组成；Select multiple pending strategy subsets from the augmentation strategy set and use them to perform sample augmentation on a preset sample training set to obtain multiple augmented sample training sets, wherein each of the pending strategy subsets consists of at least one augmentation strategy from the augmentation strategy set;
    利用每个所述增广后的样本训练集训练初始化的分类模型,得到多个训练后的分类模型;Training an initialized classification model using each of the augmented sample training sets to obtain multiple trained classification models;
    将预设的样本验证集输入每个所述训练后的分类模型,得到训练好的所述分类模型对应的分类准确度;Input a preset sample verification set into each of the trained classification models to obtain the classification accuracy corresponding to the trained classification model;
    利用贝叶斯优化算法基于每个所述训练好的分类模型对应的分类准确度，从多个待定策略子集中确定最优策略子集。The Bayesian optimization algorithm is used to determine the optimal strategy subset from multiple pending strategy subsets based on the classification accuracy corresponding to each of the trained classification models.
  10. 根据权利要求9所述的计算机设备，其中，所述利用贝叶斯优化算法基于每个所述训练好的分类模型对应的分类准确度，从多个待定策略子集中确定最优策略子集的步骤，包括：The computer device according to claim 9, wherein the step of using the Bayesian optimization algorithm to determine the optimal strategy subset from the plurality of pending strategy subsets based on the classification accuracy corresponding to each of the trained classification models includes:
    基于多个样本点构建高斯过程的回归模型,其中,每个样本点包括所述训练好的分类模型的分类准确度及训练所述分类模型所采用的待定策略子集;Constructing a regression model of the Gaussian process based on multiple sample points, where each sample point includes the classification accuracy of the trained classification model and the undetermined strategy subset used to train the classification model;
    根据所述回归模型确定贝叶斯优化算法的获取函数;Determining the acquisition function of the Bayesian optimization algorithm according to the regression model;
    通过对所述获取函数的最大优化，从多个所述待定策略子集中确定最优策略子集，其中，利用所述最优策略子集增广后的样本训练集训练得到的分类模型的分类准确度最高。Determining an optimal strategy subset from the plurality of pending strategy subsets by maximally optimizing the acquisition function, wherein the classification model trained on the sample training set augmented with the optimal strategy subset has the highest classification accuracy.
  11. 根据权利要求9所述的计算机设备,其中,所述将预设的样本验证集输入每个所述训练后的分类模型,得到训练好的所述分类模型对应的分类准确度,包括:The computer device according to claim 9, wherein the inputting a preset sample verification set into each of the trained classification models to obtain the classification accuracy corresponding to the trained classification model comprises:
    将预设的样本验证集输入每个所述训练后的分类模型;Input a preset sample verification set into each of the trained classification models;
    获取所述分类模型输出的训练精度及验证精度;Acquiring training accuracy and verification accuracy output by the classification model;
    根据所述训练精度和所述验证精度判断所述分类模型是否拟合良好;Judging whether the classification model fits well according to the training accuracy and the verification accuracy;
    将拟合良好的所述分类模型确定为训练好的分类模型,并将所述训练好的分类模型的验证精度作为所述分类模型的分类准确度。The well-fitted classification model is determined as the trained classification model, and the verification accuracy of the trained classification model is used as the classification accuracy of the classification model.
  12. 根据权利要求9所述的计算机设备，其中，所述利用每个所述增广后的样本训练集训练初始化的分类模型，得到多个训练后的分类模型，包括：The computer device according to claim 9, wherein training an initialized classification model with each of the augmented sample training sets to obtain a plurality of trained classification models comprises:
    利用卷积神经网络提取输入分类模型的所述增广后的样本训练集中的每个样本的特征图;Extracting a feature map of each sample in the augmented sample training set of the input classification model by using a convolutional neural network;
    根据所述特征图,对所述增广后的样本训练集中的对应一个样本进行分类预测,得到分类结果;Perform classification prediction on a corresponding sample in the augmented sample training set according to the feature map to obtain a classification result;
    获取所述分类结果集合与所述样本训练集中的所有样本的标签集合的均方误差的损失函数;Acquiring a loss function of the mean square error of the classification result set and the label set of all samples in the sample training set;
    通过反向传播对所述卷积神经网络进行优化,以使得所述损失函数的值收敛,得到优化训练后的所述分类模型。The convolutional neural network is optimized by back propagation, so that the value of the loss function converges, and the optimized and trained classification model is obtained.
  13. 根据权利要求9所述的计算机设备，其中，在所述将预设的样本验证集输入每个所述训练后的分类模型，得到训练好的所述分类模型对应的分类准确度之前，所述方法还包括：The computer device according to claim 9, wherein, before the preset sample validation set is input into each of the trained classification models to obtain the classification accuracy corresponding to the trained classification model, the method further comprises:
    从所述预设的样本验证集随机抽取多个验证子集;Randomly select a plurality of verification subsets from the preset sample verification set;
    将所述多个验证子集分别输入每个所述训练后的分类模型。The multiple validation subsets are input into each of the trained classification models respectively.
  14. 根据权利要求9所述的计算机设备，其中，所述增广策略集合包括旋转变换、翻转变换、缩放变换、平移变换、尺度变换、区域裁剪、噪声添加、分段仿射、随机掩盖、边界检测、对比度变换、颜色抖动、随机混合及复合叠加。The computer device according to claim 9, wherein the augmentation strategy set includes rotation, flipping, zooming, translation, scale transformation, region cropping, noise addition, piecewise affine transformation, random masking, boundary detection, contrast transformation, color jittering, random mixing, and composite overlay.
  15. 一种计算机可读存储介质，其中，所述计算机可读存储介质存储有计算机程序，所述计算机程序被处理器执行时用于实现一种图像数据的增广策略选取方法，所述方法包括以下步骤：A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, implements a method for selecting an augmentation strategy for image data, the method comprising the following steps:
    从增广策略集合中选取多个待定策略子集对预设的样本训练集进行样本增广，得到多个增广后的样本训练集，其中，每个所述待定策略子集由所述增广策略集合中至少一个增广策略组成；Select multiple pending strategy subsets from the augmentation strategy set and use them to perform sample augmentation on a preset sample training set to obtain multiple augmented sample training sets, wherein each of the pending strategy subsets consists of at least one augmentation strategy from the augmentation strategy set;
    利用每个所述增广后的样本训练集训练初始化的分类模型,得到多个训练后的分类模型;Training an initialized classification model using each of the augmented sample training sets to obtain multiple trained classification models;
    将预设的样本验证集输入每个所述训练后的分类模型,得到训练好的所述分类模型对应的分类准确度;Input a preset sample verification set into each of the trained classification models to obtain the classification accuracy corresponding to the trained classification model;
    利用贝叶斯优化算法基于每个所述训练好的分类模型对应的分类准确度,从多个待定策略子集中确定最优策略子集。A Bayesian optimization algorithm is used to determine the optimal strategy subset from a plurality of pending strategy subsets based on the classification accuracy corresponding to each of the trained classification models.
  16. 根据权利要求15所述的计算机可读存储介质，其中，所述利用贝叶斯优化算法基于每个所述训练好的分类模型对应的分类准确度，从多个待定策略子集中确定最优策略子集的步骤，包括：The computer-readable storage medium according to claim 15, wherein the step of using the Bayesian optimization algorithm to determine the optimal strategy subset from the plurality of pending strategy subsets based on the classification accuracy corresponding to each of the trained classification models includes:
    基于多个样本点构建高斯过程的回归模型,其中,每个样本点包括所述训练好的分类模型的分类准确度及训练所述分类模型所采用的待定策略子集;Constructing a regression model of the Gaussian process based on multiple sample points, where each sample point includes the classification accuracy of the trained classification model and the undetermined strategy subset used to train the classification model;
    根据所述回归模型确定贝叶斯优化算法的获取函数;Determining the acquisition function of the Bayesian optimization algorithm according to the regression model;
    通过对所述获取函数的最大优化，从多个所述待定策略子集中确定最优策略子集，其中，利用所述最优策略子集增广后的样本训练集训练得到的分类模型的分类准确度最高。Determining an optimal strategy subset from the plurality of pending strategy subsets by maximally optimizing the acquisition function, wherein the classification model trained on the sample training set augmented with the optimal strategy subset has the highest classification accuracy.
  17. 根据权利要求15所述的计算机可读存储介质，其中，所述将预设的样本验证集输入每个所述训练后的分类模型，得到训练好的所述分类模型对应的分类准确度，包括：The computer-readable storage medium according to claim 15, wherein inputting a preset sample validation set into each of the trained classification models to obtain the classification accuracy corresponding to the trained classification model comprises:
    将预设的样本验证集输入每个所述训练后的分类模型;Input a preset sample verification set into each of the trained classification models;
    获取所述分类模型输出的训练精度及验证精度;Acquiring training accuracy and verification accuracy output by the classification model;
    根据所述训练精度和所述验证精度判断所述分类模型是否拟合良好;Judging whether the classification model fits well according to the training accuracy and the verification accuracy;
    将拟合良好的所述分类模型确定为训练好的分类模型,并将所述训练好的分类模型的验证精度作为所述分类模型的分类准确度。The well-fitted classification model is determined as the trained classification model, and the verification accuracy of the trained classification model is used as the classification accuracy of the classification model.
  18. 根据权利要求15所述的计算机可读存储介质，其中，所述利用每个所述增广后的样本训练集训练初始化的分类模型，得到多个训练后的分类模型，包括：The computer-readable storage medium according to claim 15, wherein training an initialized classification model with each of the augmented sample training sets to obtain a plurality of trained classification models comprises:
    利用卷积神经网络提取输入分类模型的所述增广后的样本训练集中的每个样本的特征图;Extracting a feature map of each sample in the augmented sample training set of the input classification model by using a convolutional neural network;
    根据所述特征图,对所述增广后的样本训练集中的对应一个样本进行分类预测,得到分类结果;Perform classification prediction on a corresponding sample in the augmented sample training set according to the feature map to obtain a classification result;
    获取所述分类结果集合与所述样本训练集中的所有样本的标签集合的均方误差的损失函数;Acquiring a loss function of the mean square error of the classification result set and the label set of all samples in the sample training set;
    通过反向传播对所述卷积神经网络进行优化,以使得所述损失函数的值收敛,得到优化训练后的所述分类模型。The convolutional neural network is optimized by back propagation, so that the value of the loss function converges, and the optimized and trained classification model is obtained.
  19. 根据权利要求15所述的计算机可读存储介质，其中，在所述将预设的样本验证集输入每个所述训练后的分类模型，得到训练好的所述分类模型对应的分类准确度之前，所述方法还包括：The computer-readable storage medium according to claim 15, wherein, before the preset sample validation set is input into each of the trained classification models to obtain the classification accuracy corresponding to the trained classification model, the method further comprises:
    从所述预设的样本验证集随机抽取多个验证子集;Randomly select a plurality of verification subsets from the preset sample verification set;
    将所述多个验证子集分别输入每个所述训练后的分类模型。The multiple validation subsets are input into each of the trained classification models respectively.
  20. 根据权利要求15所述的计算机可读存储介质，其中，所述增广策略集合包括旋转变换、翻转变换、缩放变换、平移变换、尺度变换、区域裁剪、噪声添加、分段仿射、随机掩盖、边界检测、对比度变换、颜色抖动、随机混合及复合叠加。The computer-readable storage medium according to claim 15, wherein the augmentation strategy set includes rotation, flipping, zooming, translation, scale transformation, region cropping, noise addition, piecewise affine transformation, random masking, boundary detection, contrast transformation, color jittering, random mixing, and composite overlay.
PCT/CN2020/111666 2020-02-17 2020-08-27 Method and system for selecting augmentation strategy for image data WO2021164228A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010095784.6A CN111275129B (en) 2020-02-17 2020-02-17 Image data augmentation policy selection method and system
CN202010095784.6 2020-02-17

Publications (1)

Publication Number Publication Date
WO2021164228A1 true WO2021164228A1 (en) 2021-08-26

Family

ID=71003628

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/111666 WO2021164228A1 (en) 2020-02-17 2020-08-27 Method and system for selecting augmentation strategy for image data

Country Status (2)

Country Link
CN (1) CN111275129B (en)
WO (1) WO2021164228A1 (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113642667A (en) * 2021-08-30 2021-11-12 重庆紫光华山智安科技有限公司 Enhancement strategy determination method and device, electronic equipment and storage medium
CN113685972A (en) * 2021-09-07 2021-11-23 广东电网有限责任公司 Air conditioning system control strategy identification method, device, equipment and medium
CN114078218A (en) * 2021-11-24 2022-02-22 南京林业大学 Self-adaptive fusion forest smoke and fire identification data augmentation method
CN114662623A (en) * 2022-05-25 2022-06-24 山东师范大学 XGboost-based blood sample classification method and system in blood coagulation detection
CN114757104A (en) * 2022-04-28 2022-07-15 中国水利水电科学研究院 Construction method of series gate group water transfer engineering hydraulic real-time regulation model based on data driving
CN114942410A (en) * 2022-05-31 2022-08-26 哈尔滨工业大学 Interference signal identification method based on data amplification
CN115426048A (en) * 2022-07-22 2022-12-02 北京大学 Method for detecting augmented space signal, receiving device and optical communication system
CN115600121A (en) * 2022-04-26 2023-01-13 南京天洑软件有限公司(Cn) Data hierarchical classification method and device, electronic equipment and storage medium
CN115935802A (en) * 2022-11-23 2023-04-07 中国人民解放军军事科学院国防科技创新研究院 Electromagnetic scattering boundary element calculation method and device, electronic equipment and storage medium
CN115983369A (en) * 2023-02-03 2023-04-18 电子科技大学 Method for rapidly estimating uncertainty of automatic driving depth visual perception neural network
WO2023155298A1 (en) * 2022-02-21 2023-08-24 平安科技(深圳)有限公司 Data augmentation processing method and apparatus, computer device, and storage medium
WO2024125380A1 (en) * 2022-12-13 2024-06-20 广电运通集团股份有限公司 Classification recognition method, computer device, and storage medium

Families Citing this family (35)

Publication number Priority date Publication date Assignee Title
WO2018176000A1 (en) 2017-03-23 2018-09-27 DeepScale, Inc. Data synthesis for autonomous control systems
US11893393B2 (en) 2017-07-24 2024-02-06 Tesla, Inc. Computational array microprocessor system with hardware arbiter managing memory requests
US10671349B2 (en) 2017-07-24 2020-06-02 Tesla, Inc. Accelerated mathematical engine
US11157441B2 (en) 2017-07-24 2021-10-26 Tesla, Inc. Computational array microprocessor system using non-consecutive data formatting
US11409692B2 (en) 2017-07-24 2022-08-09 Tesla, Inc. Vector computational unit
US11561791B2 (en) 2018-02-01 2023-01-24 Tesla, Inc. Vector computational unit receiving data elements in parallel from a last row of a computational array
US11215999B2 (en) 2018-06-20 2022-01-04 Tesla, Inc. Data pipeline and deep learning system for autonomous driving
US11361457B2 (en) 2018-07-20 2022-06-14 Tesla, Inc. Annotation cross-labeling for autonomous control systems
US11636333B2 (en) 2018-07-26 2023-04-25 Tesla, Inc. Optimizing neural network structures for embedded systems
US11562231B2 (en) 2018-09-03 2023-01-24 Tesla, Inc. Neural networks for embedded devices
WO2020077117A1 (en) 2018-10-11 2020-04-16 Tesla, Inc. Systems and methods for training machine models with augmented data
US11196678B2 (en) 2018-10-25 2021-12-07 Tesla, Inc. QOS manager for system on a chip communications
US11816585B2 (en) 2018-12-03 2023-11-14 Tesla, Inc. Machine learning models operating at different frequencies for autonomous vehicles
US11537811B2 (en) 2018-12-04 2022-12-27 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
US11610117B2 (en) 2018-12-27 2023-03-21 Tesla, Inc. System and method for adapting a neural network model on a hardware platform
US10997461B2 (en) 2019-02-01 2021-05-04 Tesla, Inc. Generating ground truth for machine learning from time series elements
US11150664B2 (en) 2019-02-01 2021-10-19 Tesla, Inc. Predicting three-dimensional features for autonomous driving
US11567514B2 (en) 2019-02-11 2023-01-31 Tesla, Inc. Autonomous and user controlled vehicle summon to a target
US10956755B2 (en) 2019-02-19 2021-03-23 Tesla, Inc. Estimating object properties using visual image data
CN111275129B (en) * 2020-02-17 2024-08-20 平安科技(深圳)有限公司 Image data augmentation policy selection method and system
CN111797571B (en) * 2020-07-02 2024-05-28 杭州鲁尔物联科技有限公司 Landslide susceptibility evaluation method, landslide susceptibility evaluation device, landslide susceptibility evaluation equipment and storage medium
CN111815182B (en) * 2020-07-10 2024-06-14 积成电子股份有限公司 Power grid power outage overhaul plan arrangement method based on deep learning
CN113628403A (en) * 2020-07-28 2021-11-09 威海北洋光电信息技术股份公司 Optical fiber vibration sensing perimeter security intrusion behavior recognition algorithm based on multi-core support vector machine
CN111783902B (en) * 2020-07-30 2023-11-07 腾讯科技(深圳)有限公司 Data augmentation, service processing method, device, computer equipment and storage medium
CN111832666B (en) * 2020-09-15 2020-12-25 平安国际智慧城市科技股份有限公司 Medical image data amplification method, device, medium, and electronic apparatus
CN112233194B (en) * 2020-10-15 2023-06-02 平安科技(深圳)有限公司 Medical picture optimization method, device, equipment and computer readable storage medium
CN112381148B (en) * 2020-11-17 2022-06-14 华南理工大学 Semi-supervised image classification method based on random regional interpolation
CN112613543B (en) * 2020-12-15 2023-05-30 重庆紫光华山智安科技有限公司 Enhanced policy verification method, enhanced policy verification device, electronic equipment and storage medium
CN112651458B (en) * 2020-12-31 2024-04-02 深圳云天励飞技术股份有限公司 Classification model training method and device, electronic equipment and storage medium
CN113673501B (en) * 2021-08-23 2023-01-13 广东电网有限责任公司 OCR classification method, system, electronic device and storage medium
CN113869398B (en) * 2021-09-26 2024-06-21 平安科技(深圳)有限公司 Unbalanced text classification method, device, equipment and storage medium
CN114037864A (en) * 2021-10-31 2022-02-11 际络科技(上海)有限公司 Method and device for constructing image classification model, electronic equipment and storage medium
CN114627102B (en) * 2022-03-31 2024-02-13 苏州浪潮智能科技有限公司 Image anomaly detection method, device and system and readable storage medium
CN114693935A (en) * 2022-04-15 2022-07-01 湖南大学 Medical image segmentation method based on automatic data augmentation
CN116416492B (en) * 2023-03-20 2023-12-01 湖南大学 Automatic data augmentation method based on characteristic self-adaption

Citations (4)

Publication number Priority date Publication date Assignee Title
US20120166379A1 (en) * 2010-12-23 2012-06-28 Yahoo! Inc. Clustering cookies for identifying unique mobile devices
CN106021524A (en) * 2016-05-24 2016-10-12 成都希盟泰克科技发展有限公司 Working method for tree-augmented Navie Bayes classifier used for large data mining based on second-order dependence
CN108959395A (en) * 2018-06-04 2018-12-07 广西大学 A kind of level towards multi-source heterogeneous big data about subtracts combined cleaning method
CN111275129A (en) * 2020-02-17 2020-06-12 平安科技(深圳)有限公司 Method and system for selecting image data augmentation strategy

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
KR101528235B1 (en) * 2013-11-25 2015-06-12 에스케이텔레콤 주식회사 Method for path-based mobility prediction, and apparatus therefor


Non-Patent Citations (1)

Title
ANONYMOUS: "DeepAugment: Discover augmentation strategies tailored for your dataset" [online], BARISOZMEN, 19 May 2019 (2019-05-19), XP055838576, Retrieved from the Internet <URL:https://github.com/barisozmen/deepaugment> *

Cited By (18)

Publication number Priority date Publication date Assignee Title
CN113642667A (en) * 2021-08-30 2021-11-12 重庆紫光华山智安科技有限公司 Enhancement strategy determination method and device, electronic equipment and storage medium
CN113642667B (en) * 2021-08-30 2024-02-02 重庆紫光华山智安科技有限公司 Picture enhancement strategy determination method and device, electronic equipment and storage medium
CN113685972A (en) * 2021-09-07 2021-11-23 广东电网有限责任公司 Air conditioning system control strategy identification method, device, equipment and medium
CN114078218A (en) * 2021-11-24 2022-02-22 南京林业大学 Self-adaptive fusion forest smoke and fire identification data augmentation method
CN114078218B (en) * 2021-11-24 2024-03-29 南京林业大学 Adaptive fusion forest smoke and fire identification data augmentation method
WO2023155298A1 (en) * 2022-02-21 2023-08-24 平安科技(深圳)有限公司 Data augmentation processing method and apparatus, computer device, and storage medium
CN115600121A (en) * 2022-04-26 2023-01-13 南京天洑软件有限公司(Cn) Data hierarchical classification method and device, electronic equipment and storage medium
CN115600121B (en) * 2022-04-26 2023-11-07 南京天洑软件有限公司 Data hierarchical classification method and device, electronic equipment and storage medium
CN114757104A (en) * 2022-04-28 2022-07-15 中国水利水电科学研究院 Construction method of series gate group water transfer engineering hydraulic real-time regulation model based on data driving
CN114757104B (en) * 2022-04-28 2022-11-18 中国水利水电科学研究院 Method for constructing hydraulic real-time regulation and control model of series gate group water transfer project
CN114662623A (en) * 2022-05-25 2022-06-24 山东师范大学 XGboost-based blood sample classification method and system in blood coagulation detection
CN114942410B (en) * 2022-05-31 2022-12-20 哈尔滨工业大学 Interference signal identification method based on data amplification
CN114942410A (en) * 2022-05-31 2022-08-26 哈尔滨工业大学 Interference signal identification method based on data amplification
CN115426048A (en) * 2022-07-22 2022-12-02 北京大学 Method for detecting augmented space signal, receiving device and optical communication system
CN115935802A (en) * 2022-11-23 2023-04-07 中国人民解放军军事科学院国防科技创新研究院 Electromagnetic scattering boundary element calculation method and device, electronic equipment and storage medium
CN115935802B (en) * 2022-11-23 2023-08-29 中国人民解放军军事科学院国防科技创新研究院 Electromagnetic scattering boundary element calculation method, device, electronic equipment and storage medium
WO2024125380A1 (en) * 2022-12-13 2024-06-20 广电运通集团股份有限公司 Classification recognition method, computer device, and storage medium
CN115983369A (en) * 2023-02-03 2023-04-18 电子科技大学 Method for rapidly estimating uncertainty of automatic driving depth visual perception neural network

Also Published As

Publication number Publication date
CN111275129A (en) 2020-06-12
CN111275129B (en) 2024-08-20

Similar Documents

Publication Publication Date Title
WO2021164228A1 (en) Method and system for selecting augmentation strategy for image data
CN110163080B (en) Face key point detection method and device, storage medium and electronic equipment
US10290112B2 (en) Planar region guided 3D geometry estimation from a single image
US11481869B2 (en) Cross-domain image translation
WO2019100724A1 (en) Method and device for training multi-label classification model
WO2020199468A1 (en) Image classification method and device, and computer readable storage medium
US10984272B1 (en) Defense against adversarial attacks on neural networks
WO2017148265A1 (en) Word segmentation method and apparatus
WO2017096753A1 (en) Facial key point tracking method, terminal, and nonvolatile computer readable storage medium
WO2019011249A1 (en) Method, apparatus, and device for determining pose of object in image, and storage medium
US20190164312A1 (en) Neural network-based camera calibration
CN109271930B (en) Micro-expression recognition method, device and storage medium
CN111860439A (en) Unmanned aerial vehicle inspection image defect detection method, system and equipment
US20230237771A1 (en) Self-supervised learning method and apparatus for image features, device, and storage medium
CN109413510B (en) Video abstract generation method and device, electronic equipment and computer storage medium
CN110598703B (en) OCR (optical character recognition) method and device based on deep neural network
WO2022127333A1 (en) Training method and apparatus for image segmentation model, image segmentation method and apparatus, and device
JP2014032623A (en) Image processor
WO2021043023A1 (en) Image processing method and device, classifier training method, and readable storage medium
CN114742750A (en) Abnormal cell detection method, abnormal cell detection device, terminal device and readable storage medium
CN113610016A (en) Training method, system, equipment and storage medium of video frame feature extraction model
CN111476226B (en) Text positioning method and device and model training method
CN112348008A (en) Certificate information identification method and device, terminal equipment and storage medium
Wang et al. MetaScleraSeg: an effective meta-learning framework for generalized sclera segmentation
Osuna-Coutiño et al. Structure extraction in urbanized aerial images from a single view using a CNN-based approach

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20920023

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20920023

Country of ref document: EP

Kind code of ref document: A1