CN114638997A - Data augmentation method, data augmentation device, computer device, and storage medium


Info

Publication number: CN114638997A
Application number: CN202011473235.4A
Authority: CN (China)
Prior art keywords: image, augmented, original image, augmentation, under different
Legal status: Pending
Original language: Chinese (zh)
Inventor: 杨小平 (Yang Xiaoping)
Applicant/Assignee: SF Technology Co Ltd
Priority: CN202011473235.4A

Classifications

    • G06F 18/214: Pattern recognition; analysing; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/24: Pattern recognition; analysing; classification techniques
    • G06N 3/08: Computing arrangements based on biological models; neural networks; learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The application discloses a data augmentation method, a data augmentation device, computer equipment and a storage medium. The data augmentation method comprises the following steps: acquiring a captured original image; calculating an augmentation intermediate parameter for augmenting the original image, wherein the augmentation intermediate parameter is an intermediate parameter, generated based on a preset augmentation condition, that is used to augment the original image, and comprises a matrix relationship corresponding to the original image under different exposure parameters and/or a derivative value obtained by differentiation after the original image is input into a preset network model; and generating an augmented image corresponding to the original image according to the augmentation intermediate parameter. By calculating the augmentation intermediate parameters used to augment the original image and generating the corresponding augmented images, the method and device enrich the diversity of the training samples and improve the classification accuracy of the network model after later-stage training.

Description

Data augmentation method, data augmentation device, computer device, and storage medium
Technical Field
The present application relates to the field of communications technologies, and in particular, to a data augmentation method and apparatus, a computer device, and a storage medium.
Background
The classification algorithm is a very general algorithm in deep learning and can realize functions such as scene classification and object classification. Because the task is simple and the network models are small, it can be applied at scale in a wide variety of products. However, current classification algorithms come in many varieties and are difficult to optimize.
Unlike traditional machine learning, which can extract enough features from a small number of samples to generalize (drawing inferences about new cases from a single instance), deep learning requires massive numbers of positive samples for training; otherwise, overfitting easily occurs when the data set is too small. When a deep learning network is used to solve a practical problem, the quantity and quality of the training data are a bottleneck that constrains the network's effectiveness. For categories of people or data that are difficult to collect, data augmentation is required to meet the training requirements of the deep learning network.
Traditional data augmentation methods comprise operations such as rotation, mirroring, random cropping, and noise addition. These methods do not essentially change the appearance variety of the target and have limited effect on enriching the diversity of the training samples.
Disclosure of Invention
The embodiment of the application provides a data augmentation method and device, computer equipment and a storage medium, wherein augmentation intermediate parameters for augmenting original images are calculated to generate augmented images corresponding to the original images, so that the diversity of training samples is enriched and the classification accuracy of the network model after later-stage training is improved.
In one aspect, the present application provides a data augmentation method, including:
acquiring a captured original image;
calculating an augmentation intermediate parameter for augmenting the original image, wherein the augmentation intermediate parameter is an intermediate parameter, generated based on a preset augmentation condition, that is used to augment the original image, and the augmentation intermediate parameter comprises a matrix relationship corresponding to the original image under different exposure parameters and/or a derivative value obtained by differentiation after the original image is input into a preset network model;
and generating an augmented image corresponding to the original image according to the augmented intermediate parameter.
In some embodiments of the present application, the augmentation intermediate parameter includes a matrix relationship under different exposure parameters corresponding to the original image, and the calculating of the augmentation intermediate parameter for augmenting the original image includes:
acquiring an augmented image of the original image under different preset exposure parameters to obtain an augmented image library;
and determining the matrix relation under different exposure parameters corresponding to the original image according to the augmented image library.
In some embodiments of the present application, the determining, according to the augmented image library, a matrix relationship under different exposure parameters corresponding to the original image includes:
determining images of each image under different exposure parameters for the images in the augmented image library;
respectively determining the mapping relation between each image in the augmented image library and the images of the images under different exposure parameters;
determining an image which is most similar to the original image in the augmented image library according to each mapping relation, and determining a matrix relation under different exposure parameters corresponding to the original image;
generating an augmented image corresponding to the original image according to the augmented intermediate parameter, wherein the method comprises the following steps:
and generating the augmented images under different exposure parameters corresponding to the original images based on the matrix relation.
In some embodiments of the present application, the separately determining a mapping relationship between each image in the augmented image library and the image of the image under different exposure parameters includes:
respectively taking each image in the augmented image library as a target augmented image, and mapping the pixel value of the target augmented image into a pixel value with a preset dimension, wherein the preset dimension is higher than the dimension of the pixel value of the target augmented image;
after the pixel value of the target augmented image is mapped into the pixel value of a preset dimension, determining the mapping relation between the target augmented image and the image of the target augmented image under different exposure parameters;
determining an image which is closest to the original image in the augmented image library according to each mapping relation, and determining a matrix relation corresponding to the original image under different exposure parameters, wherein the matrix relation comprises the following steps:
determining an image matrix after the target augmented image is transformed into a pixel value with a preset dimension according to the mapping relation;
and obtaining the matrix relation under different exposure parameters corresponding to the target augmented image by minimizing the change of the image matrix.
In some embodiments of the present application, the obtaining, by minimizing the change of the image matrix, a matrix relationship under different exposure parameters corresponding to the target augmented image includes:
calculating the distance between the target augmented image and the color distribution of the images in the augmented image library;
determining the image which is closest to the target augmented image in the augmented image library in the color distribution distance as the closest image;
determining a target exposure parameter corresponding to the most similar image;
and acquiring a matrix relation under the target exposure parameters corresponding to the target augmented image.
In some embodiments of the present application, the generating, based on the matrix relationship, an augmented image under different exposure parameters corresponding to the original image includes:
acquiring the distance between the original image and the closest image in the augmented image library;
calculating an augmentation parameter of the original image according to the distance, the matrix relation and a preset fixed parameter;
and according to the augmentation parameters, augmenting the original image to obtain augmented images corresponding to the original image under different exposure parameters.
In some embodiments of the present application, the augmented images under different exposure parameters corresponding to the original image are generated based on the matrix relationship using the following formulas:

I^{(out)} = M \Phi(I^{(in)})

M = \alpha M_s

\alpha = \exp(-d^2 / 2\sigma^2)

where d is the distance between the original image and the closest image in the augmented image library, \sigma is a preset fixed parameter, M_s is the matrix relationship of the original image under different exposure parameters, \Phi(\cdot) is the mapping of pixel values to the preset dimension, I^{(in)} is the original image, I^{(out)} is the augmented image under different exposure parameters corresponding to the original image, and M is the augmentation parameter.
In some embodiments of the present application, the augmentation intermediate parameter includes a derivative value obtained by differentiation after the original image is input into a preset network model, and the calculating of the augmentation intermediate parameter for augmenting the original image includes:
inputting the original image into a preset network model, and performing forward processing to obtain a loss value;
differentiating the loss value to obtain a derivative value;
generating an augmented image corresponding to the original image according to the augmented intermediate parameter, wherein the method comprises the following steps:
generating a noise image of the original image according to the derivative value;
and taking the noise image as an augmented image corresponding to the original image.
In another aspect, the present application provides a data augmentation apparatus comprising:
the acquisition module is used for acquiring a captured original image;
the calculation module is used for calculating an augmentation intermediate parameter for augmenting the original image, wherein the augmentation intermediate parameter is an intermediate parameter, generated based on a preset augmentation condition, that is used to augment the original image, and comprises a matrix relationship under different exposure parameters corresponding to the original image and/or a derivative value obtained by differentiation after the original image is input into a preset network model;
and the generating module is used for generating an augmented image corresponding to the original image according to the augmented intermediate parameter.
In some embodiments of the present application, the augmentation intermediate parameter includes a matrix relationship under different exposure parameters corresponding to the original image, and the calculation module is specifically configured to:
acquiring an augmented image of the original image under different preset exposure parameters to obtain an augmented image library;
and determining the matrix relation under different exposure parameters corresponding to the original image according to the augmented image library.
In some embodiments of the present application, the calculation module is specifically configured to:
determining images of each image under different exposure parameters for the images in the augmented image library;
respectively determining the mapping relation between each image in the augmented image library and the images of the images under different exposure parameters;
determining an image which is most similar to the original image in the augmented image library according to each mapping relation, and determining a matrix relation under different exposure parameters corresponding to the original image;
the generation module is specifically configured to:
and generating the augmented images under different exposure parameters corresponding to the original images based on the matrix relation.
In some embodiments of the present application, the calculation module is specifically configured to:
respectively taking each image in the augmented image library as a target augmented image, and mapping the pixel value of the target augmented image into a pixel value with a preset dimension, wherein the preset dimension is higher than the dimension of the pixel value of the target augmented image;
after the pixel value of the target augmented image is mapped into the pixel value of a preset dimension, determining the mapping relation between the target augmented image and the image of the target augmented image under different exposure parameters;
determining an image matrix after the target augmented image is transformed into a pixel value with a preset dimension according to the mapping relation;
and obtaining the matrix relation under different exposure parameters corresponding to the target augmented image by minimizing the change of the image matrix.
In some embodiments of the present application, the calculation module is specifically configured to:
calculating the distance between the target augmented image and the color distribution of the images in the augmented image library;
determining the image which is closest to the target augmented image in color distribution in the augmented image library as the closest image;
determining a target exposure parameter corresponding to the closest image;
and acquiring a matrix relation under the target exposure parameters corresponding to the target augmented image.
In some embodiments of the present application, the calculation module is specifically configured to:
acquiring the distance between the original image and the closest image in the augmented image library;
calculating an augmentation parameter of the original image according to the distance, the matrix relation and a preset fixed parameter;
and according to the augmentation parameters, augmenting the original image to obtain augmented images corresponding to the original image under different exposure parameters.
In some embodiments of the present application, when generating the augmented images under different exposure parameters corresponding to the original image based on the matrix relationship, the calculation module is specifically configured to use the following formulas:

I^{(out)} = M \Phi(I^{(in)})

M = \alpha M_s

\alpha = \exp(-d^2 / 2\sigma^2)

where d is the distance between the original image and the closest image in the augmented image library, \sigma is a preset fixed parameter, M_s is the matrix relationship of the original image under different exposure parameters, \Phi(\cdot) is the mapping of pixel values to the preset dimension, I^{(in)} is the original image, I^{(out)} is the augmented image under different exposure parameters corresponding to the original image, and M is the augmentation parameter.
In some embodiments of the present application, the augmentation intermediate parameter includes a derivative value obtained by differentiation after the original image is input into a preset network model, and the calculation module is specifically configured to:
input the original image into a preset network model, and perform forward processing to obtain a loss value;
differentiate the loss value to obtain a derivative value;
the generation module is specifically configured to:
generating a noise image of the original image according to the derivative value;
and taking the noise image as an augmented image corresponding to the original image.
In another aspect, the present application further provides a computer device, including:
one or more processors;
a memory; and
one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the processor to implement the data augmentation method of any one of the first aspects.
In a fourth aspect, the present application further provides a computer-readable storage medium, on which a computer program is stored, the computer program being loaded by a processor to perform the steps of the data augmentation method according to any one of the first aspect.
The method comprises the steps of acquiring a captured original image; calculating an augmentation intermediate parameter for augmenting the original image, wherein the augmentation intermediate parameter is an intermediate parameter, generated based on a preset augmentation condition, that is used to augment the original image, and comprises a matrix relationship corresponding to the original image under different exposure parameters and/or a derivative value obtained by differentiation after the original image is input into a preset network model; and generating an augmented image corresponding to the original image according to the augmentation intermediate parameter. Given that traditional data augmentation methods of the prior art have limited effect on enriching the diversity of training samples, the method calculates augmentation intermediate parameters, such as the matrix relationships under different exposure parameters corresponding to the original image and/or the derivative values obtained by differentiation after the original image is input into the preset network model, to generate augmented images corresponding to the original image. This enriches the diversity of the training samples and improves the classification accuracy of the network model after later-stage training.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic view of a data augmentation system according to an embodiment of the present disclosure;
FIG. 2 is a flow chart illustrating an embodiment of a data augmentation method provided in an embodiment of the present application;
FIG. 3 is a flowchart illustrating an embodiment of step 202 in the present application;
FIG. 4 is a flowchart illustrating an embodiment of step 302 in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an embodiment of a data augmentation device provided in the embodiments of the present application;
fig. 6 is a schematic structural diagram of an embodiment of a computer device provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the description of the present application, it is to be understood that terms such as "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", and "outer" indicate orientations or positional relationships based on those shown in the drawings, are used merely for convenience and simplicity of description, and do not indicate or imply that the referenced device or element must have a particular orientation or be constructed and operated in a particular orientation; they should therefore not be construed as limiting the present application. Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, features defined as "first" or "second" may explicitly or implicitly include one or more of the described features. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
In this application, the word "exemplary" is used to mean "serving as an example, instance, or illustration. Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments. The following description is presented to enable any person skilled in the art to make and use the application. In the following description, details are set forth for the purpose of explanation. It will be apparent to one of ordinary skill in the art that the present application may be practiced without these specific details. In other instances, well-known structures and processes are not set forth in detail in order to avoid obscuring the description of the present application with unnecessary detail. Thus, the present application is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
It should be noted that, since the method in the embodiment of the present application is executed in a computer device, processing objects of each computer device all exist in the form of data or information, for example, time, which is substantially time information, and it is understood that, if size, number, position, and the like are mentioned in the following embodiments, all exist in the form of corresponding data or information, so that the computer device can process the data, which is not described herein again in detail.
Some basic concepts related to the embodiments of the present application are first introduced below, specifically as follows:
One-Hot encoding: also known as one-bit-effective encoding, it uses an N-bit status register to encode N states; each state has its own independent register bit, and only one bit is active at any time. One-Hot encoding represents categorical variables as binary vectors. This first requires mapping the categorical values to integer values. Each integer value is then represented as a binary vector that is all zeros except for a 1 at the index of the integer.
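For illustration only (not part of the patent disclosure), a minimal Python sketch of this encoding:

```python
def one_hot(labels, num_classes):
    """Map integer class values to binary vectors that are all zeros
    except for a 1 at the index of the integer."""
    vectors = []
    for label in labels:
        vec = [0] * num_classes
        vec[label] = 1
        vectors.append(vec)
    return vectors

# e.g. classes {cat: 0, dog: 1, bird: 2}
print(one_hot([0, 2, 1], 3))  # [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
```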
Batch: the batch size is a hyper-parameter that defines the number of samples to be processed before the internal model parameters are updated. A batch can be thought of as a loop that iterates over one or more samples and makes predictions. At the end of the batch, the predictions are compared with the expected output variables and an error is calculated. From this error, an update algorithm is used to improve the model, for example by moving down the error gradient. The training data set may be divided into one or more batches. When all training samples are used to create one batch, the learning algorithm is called batch gradient descent. When the batch is one sample in size, the learning algorithm is called stochastic gradient descent. When the batch size is more than one sample and less than the size of the training data set, the learning algorithm is called mini-batch gradient descent.
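Again for illustration, a small Python sketch of splitting a training set into batches; the batch-size extremes correspond to the three gradient-descent variants named above:

```python
def iter_batches(samples, batch_size):
    """batch_size == len(samples) gives batch gradient descent,
    batch_size == 1 gives stochastic gradient descent, and anything
    in between gives mini-batch gradient descent."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

data = list(range(10))
for batch in iter_batches(data, batch_size=4):
    print(batch)  # [0..3], [4..7], [8, 9]
```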
Embodiments of the present application provide a data augmentation method, apparatus, computer device, and storage medium, which are described in detail below.
Referring to fig. 1, fig. 1 is a schematic view of a data augmentation system according to an embodiment of the present disclosure, where the data augmentation system may include a computer device 100, and a data augmentation apparatus, such as the computer device in fig. 1, is integrated in the computer device 100.
In the embodiment of the present application, the computer device 100 is mainly used for acquiring a captured original image; calculating an augmentation intermediate parameter for augmenting the original image, wherein the augmentation intermediate parameter is an intermediate parameter, generated based on a preset augmentation condition, that is used to augment the original image, and comprises a matrix relationship corresponding to the original image under different exposure parameters and/or a derivative value obtained by differentiation after the original image is input into a preset network model; and generating an augmented image corresponding to the original image according to the augmentation intermediate parameter.
In this embodiment, the computer device 100 may be an independent server, or may be a server network or a server cluster composed of servers, for example, the computer device 100 described in this embodiment includes, but is not limited to, a computer, a network host, a single network server, a plurality of network server sets, or a cloud server composed of a plurality of servers. Among them, the Cloud server is constituted by a large number of computers or web servers based on Cloud Computing (Cloud Computing).
It will be appreciated that the computer device 100 used in the embodiments of the present application may be a device that includes both receiving and transmitting hardware, i.e., a device capable of performing two-way communication over a two-way communication link. Such a device may include a cellular or other communication device with a single-line display or a multi-line display, or without a multi-line display. The computer device 100 may specifically be a desktop terminal or a mobile terminal, such as a mobile phone, a tablet computer, or a notebook computer.
Those skilled in the art will appreciate that the application environment shown in fig. 1 is only one application scenario related to the present application, and does not constitute a limitation on the application scenario of the present application, and that other application environments may further include more or less computer devices than those shown in fig. 1, for example, only 1 computer device is shown in fig. 1, and it is understood that the data augmentation system may further include one or more other services, which are not limited herein.
In addition, as shown in fig. 1, the data augmentation system may further include a memory 200 for storing data, such as stored data, for example, the acquired original image, the augmented image corresponding to the original image, and the like.
It should be noted that the scenario diagram of the data augmentation system shown in fig. 1 is only an example, and the data augmentation system and the scenario described in the embodiment of the present application are for more clearly illustrating the technical solution of the embodiment of the present application, and do not form a limitation on the technical solution provided in the embodiment of the present application.
First, an embodiment of the present application provides a data augmentation method. The execution subject of the data augmentation method is a data augmentation device, and the data augmentation device is applied to a computer device. The data augmentation method includes: acquiring a captured original image; calculating an augmentation intermediate parameter for augmenting the original image, wherein the augmentation intermediate parameter is an intermediate parameter, generated based on a preset augmentation condition, that is used to augment the original image, and comprises a matrix relationship corresponding to the original image under different exposure parameters and/or a derivative value obtained by differentiation after the original image is input into a preset network model; and generating an augmented image corresponding to the original image according to the augmentation intermediate parameter.
As shown in fig. 2, which is a schematic flow chart of an embodiment of a data augmentation method in the embodiment of the present application, the data augmentation method includes the following steps 201 to 203:
201. Acquiring a captured original image.
The original image can be captured by various shooting devices communicatively connected to the computer device. Taking the logistics field as an example, shooting devices are arranged at logistics nodes to capture images of preset areas of conveying devices (such as conveyor belts and conveying plates) and sorting devices; the captured images can serve as the captured original images. It is to be understood that the original image may also be an image acquired by the computer device through a network, or an image received from another computer device, which is not limited herein.
202. And calculating an augmentation intermediate parameter for augmenting the original image.
Augmentation parameters are parameters that can be directly used to augment an original image and correspond to actual augmentation operations, such as enlargement and reduction. Augmentation intermediate parameters are intermediate parameters, generated under a preset augmentation condition, that are used to augment the original image but do not correspond to actual augmentation operations. The augmentation intermediate parameters comprise a matrix relationship under different exposure parameters corresponding to the original image, and/or a derivative value obtained by differentiation after the original image is input into a preset network model.
203. And generating an augmented image corresponding to the original image according to the augmented intermediate parameter.
The embodiment of the application acquires a captured original image; calculates an augmentation intermediate parameter for augmenting the original image, wherein the augmentation intermediate parameter is an intermediate parameter, generated based on a preset augmentation condition, that is used to augment the original image, and comprises a matrix relationship corresponding to the original image under different exposure parameters and/or a derivative value obtained by differentiation after the original image is input into a preset network model; and generates an augmented image corresponding to the original image according to the augmentation intermediate parameter. Given that traditional data augmentation methods of the prior art have limited effect on enriching the diversity of training samples, the embodiment calculates such augmentation intermediate parameters to generate augmented images corresponding to the original image, which enriches the diversity of the training samples and improves the classification accuracy of the network model after later-stage training.
In the embodiment of the present application, there are various kinds of augmentation intermediate parameters, and correspondingly, there are various ways to calculate the augmentation intermediate parameters for augmenting the original image in step 202, which will be described below by way of example.
(1) The augmentation intermediate parameter includes a matrix relationship under different exposure parameters corresponding to the original image. In this scheme, a manual-white-balance data augmentation method is used to simulate the camera imaging process and generate effect images similar to those produced when a camera shoots normally; the manual-white-balance data are data generated under different exposure parameters. In this case, as shown in fig. 3, the step 202 of calculating the augmentation intermediate parameter for augmenting the original image may specifically include the following steps 301 to 302:
301. and acquiring an augmented image of the original image under different preset exposure parameters to obtain an augmented image library.
The preset different exposure parameters may be configured in advance. In a specific embodiment, they may include the following 5 exposure parameters: 2850K, 3800K, 5500K, 6500K, and 7500K.
Augmented images of the original image under the preset different exposure parameters are obtained to build an augmented image library. For example, different effect images of the original image under the 2850K, 3800K, 5500K, 6500K, and 7500K exposure parameters are obtained, generating 5 augmented images for each original image, so that 6 images in total are finally obtained per original. The augmented image library includes the original images and the augmented images under the preset different exposure parameters; for example, 2000 original images yield 10000 augmented images, for 12000 images in total in the augmented image library.
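The patent gives no code for building the library; the following Python sketch is a hypothetical illustration, in which `simulate_white_balance` is a toy stand-in for whatever white-balance/exposure emulation is actually used:

```python
import numpy as np

COLOR_TEMPS_K = [2850, 3800, 5500, 6500, 7500]  # the preset exposure parameters

def simulate_white_balance(image: np.ndarray, temp_k: int) -> np.ndarray:
    """Toy stand-in: warm or cool the image by scaling the R and B channels
    with the color temperature. NOT the patent's method; it only makes
    the sketch executable."""
    warm = 5500.0 / temp_k
    gains = np.array([warm, 1.0, 1.0 / warm])
    return np.clip(image * gains, 0.0, 1.0)

def build_augmented_library(originals):
    """Each original contributes itself plus 5 augmented versions,
    i.e., 6 images in total, as described above."""
    library = []
    for img in originals:  # img: H x W x 3 array in [0, 1]
        library.append(img)
        library.extend(simulate_white_balance(img, t) for t in COLOR_TEMPS_K)
    return library
```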
302. And determining the matrix relation under different exposure parameters corresponding to the original image according to the augmented image library.
Specifically, as shown in fig. 4, the step 302 determines, according to the augmented image library, a matrix relationship under different exposure parameters corresponding to the original image, and may further include:
401. and determining images of each image under different exposure parameters for the images in the augmented image library.
402. And respectively determining the mapping relation between each image in the augmented image library and the image of the image under different exposure parameters.
403. And determining an image which is most similar to the original image in the augmented image library according to each mapping relation, and determining a matrix relation corresponding to the original image under different exposure parameters.
Generating an augmented image corresponding to the original image according to the augmented intermediate parameter may include: and generating the augmented images under different exposure parameters corresponding to the original images based on the matrix relation.
In some embodiments of the present application, the separately determining a mapping relationship between each image in the augmented image library and the image of the image under different exposure parameters includes: respectively taking each image in the augmented image library as a target augmented image, and mapping the pixel value of the target augmented image to a pixel value of a preset dimension; and after the pixel value of the target augmented image is mapped to the pixel value of a preset dimension, determining the mapping relation between the target augmented image and the image of the target augmented image under different exposure parameters.
At this time, the determining, according to each mapping relationship, an image in the augmented image library that is closest to the original image, and determining a matrix relationship under different exposure parameters corresponding to the original image, includes: determining an image matrix after the target augmented image is transformed to a pixel value on a preset dimension according to the mapping relation; and obtaining the matrix relation under different exposure parameters corresponding to the target augmented image by minimizing the change of the image matrix.
Specifically, the pixel values of the target augmented image are mapped to pixel values of a preset dimension. The pixel values of the target augmented image are three-dimensional, i.e., they include the three parameters R, G, B, and the preset dimension is higher than three, i.e., it may include more parameters than R, G, B. For example, the target augmented image [R, G, B] may be mapped to the higher-dimensional representation [R, G, B, RG, RB, GB, R², G², B²] (9 dimensions).
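A minimal Python sketch of this 3-to-9-dimensional mapping (illustrative; the function name `phi` is our choice):

```python
import numpy as np

def phi(rgb: np.ndarray) -> np.ndarray:
    """Map a 3 x n matrix of [R, G, B] pixel values to the 9-dimensional
    representation [R, G, B, RG, RB, GB, R^2, G^2, B^2] described above."""
    r, g, b = rgb
    return np.stack([r, g, b, r * g, r * b, g * b, r ** 2, g ** 2, b ** 2])

pixels = np.random.rand(3, 5)   # 5 pixels, rows are R, G, B
print(phi(pixels).shape)        # (9, 5)
```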
In some embodiments of the present application, the obtaining, by minimizing the change of the image matrix, a matrix relationship under different exposure parameters corresponding to the target augmented image includes: calculating the distance between the target augmented image and the color distribution of the images in the augmented image library; determining the image which is closest to the target augmented image in the augmented image library in the color distribution distance as the closest image; determining a target exposure parameter corresponding to the most similar image; and acquiring a matrix relation under the target exposure parameters corresponding to the target augmented image.
Specifically, the distance between the color distribution of the target augmented image and those of the images in the augmented image library is calculated; the image in the augmented image library with the smallest color-distribution distance to the target augmented image is determined as the closest image; and the M_s^{(t)} matrix under the exposure parameter corresponding to the closest image is taken out.
Because the images in the augmented image library and the target augmented image are all represented by image matrices, the corresponding matrices represent the color distributions of the images. The distance between an image in the augmented image library and the color distribution of the target augmented image is thus a distance operation on the corresponding image matrices, and the minimum of all the computed distances determines the closest image.
Specifically, the matrix relationship under different exposure parameters corresponding to the target augmented image, obtained by minimizing the change of the image matrix, is as follows:

M_s^{(t)} = \arg\min_M \| M \Phi(I_s) - I^{(t)} \|

where I^{(t)} and I_s are each a 3 x n color matrix of the image (3 for RGB, n for the number of pixels), t denotes the different exposure parameters, M is a 3 x 9 nonlinear matrix, and \Phi is the mapping after transformation to the pixel values of the preset dimension. Minimizing the transformed mapping (i.e., taking the minimum distance between the image in the augmented image library and the color distribution of the target augmented image) yields the M_s^{(t)} matrix.
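Since the expression above is linear in M, it can be solved as an ordinary least-squares problem. A hedged NumPy sketch (function names are ours; `phi` is the mapping sketched earlier):

```python
import numpy as np

def phi(rgb):
    """The 3 x n -> 9 x n mapping sketched earlier."""
    r, g, b = rgb
    return np.stack([r, g, b, r * g, r * b, g * b, r ** 2, g ** 2, b ** 2])

def solve_exposure_matrix(img_s: np.ndarray, img_t: np.ndarray) -> np.ndarray:
    """Least-squares sketch of M_s^(t) = argmin_M || M phi(img_s) - img_t ||,
    where img_s and img_t are 3 x n color matrices of the same scene under
    two exposure parameters; the result M is 3 x 9."""
    A = phi(img_s)                                       # 9 x n
    M_t, *_ = np.linalg.lstsq(A.T, img_t.T, rcond=None)  # solves A.T @ X = img_t.T
    return M_t.T                                         # X.T is the 3 x 9 matrix M
```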
In some embodiments of the present application, the generating, based on the matrix relationship, an augmented image under different exposure parameters corresponding to the original image includes: acquiring the distance between the original image and the closest image in the augmented image library; calculating an augmentation parameter of the original image according to the distance, the matrix relation and a preset fixed parameter; and according to the augmentation parameters, augmenting the original image to obtain augmented images corresponding to the original image under different exposure parameters.
Specifically, the augmented images under different exposure parameters corresponding to the original image are generated based on the matrix relationship using the following formulas:

I^{(out)} = M \Phi(I^{(in)})

M = \alpha M_s

\alpha = \exp(-d^2 / 2\sigma^2)

where d is the distance between the original image and the closest image in the augmented image library, \sigma is a preset fixed parameter, M_s is the matrix relationship under different exposure parameters corresponding to the original image, \Phi(\cdot) is the mapping of pixel values to the preset dimension, I^{(in)} is the original image, I^{(out)} is the augmented image under different exposure parameters corresponding to the original image, and M is the calculated augmentation parameter.
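Putting the three formulas together, a hypothetical NumPy sketch of the generation step (reusing the `phi` sketch above; clipping to [0, 1] is our added assumption):

```python
import numpy as np

def augment_with_exposure(img_in, M_s, d, sigma):
    """Sketch of I_out = M * phi(I_in) with M = alpha * M_s and
    alpha = exp(-d^2 / (2 sigma^2)). img_in is a 3 x n color matrix,
    M_s the 3 x 9 matrix found for the closest library image, d the
    color-distribution distance to that image, sigma the preset
    fixed parameter."""
    alpha = np.exp(-d ** 2 / (2 * sigma ** 2))
    M = alpha * M_s
    out = M @ phi(img_in)            # 3 x n augmented color matrix
    return np.clip(out, 0.0, 1.0)    # keep values in a valid range (assumption)
```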
The classification network has the widest application scenarios of any network in deep learning. Existing classification networks are numerous and scattered across different git repositories on the network, and no general platform integrates most of them. A typical deep learning network comprises a base network and a corresponding functional network: the base network extracts features, and the functional network performs the corresponding processing. The base network can be generic, and different functional networks, such as a classification network or an image segmentation network, can be attached to it. Integrating most classification networks on one platform has the advantage of reducing the cost for algorithm engineers of trying different networks, determining the base network of the classification network as quickly as possible, and speeding up the deployment of algorithms to services.
In the embodiment of the application, the base network can be configured through parameter configuration. Once configurations such as EfficientNet-B0 to B7, ResNet-50, ResNet-101, ResNeXt, ResNeSt, SE-ResNet, MobileNetV2, DarkNet, HRNet, VGG, and ShuffleNet are completed at the base-network level, the base network can later be invoked by configuring different parameters.
After network type selection and data augmentation are completed, network training needs to be optimized to obtain a better network. The embodiment of the application designs a set of network optimization strategies; the main innovation is that various loss functions, such as cross-entropy loss, ArcFace loss, focal loss, label smoothing, smoothing loss, circle loss, and AM-Softmax loss, are added to a loss design block, and these loss functions can be called through one-touch parameter configuration.
Label smoothing is a common method for preventing overfitting in classification problems. Furthermore, in the embodiment of the present application, label smoothing may be modified to fit all of the loss function types mentioned above; whether label smoothing is loaded can be configured when a loss function is called, so that two loss-function optimizations can be brought up simultaneously on one training task.
For classification problems, the label is usually converted into one-hot encoded form before the loss function is finally computed.
Classifying with one-hot encoding easily produces the following problems: 1) the generalization ability of the model cannot be guaranteed, and overfitting occurs easily; 2) the hard probabilities of 1 and 0 encourage the gap between the true category and the other categories to be as large as possible, which is difficult to achieve because the gradient is bounded, causing the model to be over-confident about the predicted category.
Label smoothing softens the label by modifying the one-hot encoding, enhancing the generalization ability of the network. For example, the standard cross-entropy loss function is as follows:

-[ y \log p + (1 - y) \log(1 - p) ]

where p is the predicted probability and y is the ground truth, taking the value 0 or 1. Label smoothing modifies the hard targets 1 and 0 into 1 - \epsilon and \epsilon, where \epsilon is a preset fixed parameter (which can be determined in advance based on the actual application scenario), so that when a sample's result is of a certain class, information from the other classes is compensated and the label information is softened.
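An illustrative Python sketch of this softening, following the 1 - ε / ε description above (a common variant distributes ε / num_classes over the zeros instead; both are shown only as assumptions):

```python
import numpy as np

def smooth_labels(one_hot, eps=0.1):
    """Replace the hard 1 with 1 - eps and the hard 0s with eps,
    per the description above."""
    return np.where(one_hot == 1.0, 1.0 - eps, eps)

y = np.array([0.0, 1.0, 0.0])
print(smooth_labels(y))  # [0.1 0.9 0.1]
```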
The embodiment of the application designs a system that can call a classification network, a data augmentation strategy, and different optimization strategies in a parameterized manner, i.e., an algorithm platform that lets users independently define network design and network optimization. Its advantages are that it shortens the time algorithm engineers spend trying different classification networks and improves the deployment of algorithms to different services.
(2) The augmentation intermediate parameter includes a derivative value obtained by differentiation after the original image is input into a preset network model. In this case, the calculating of the augmentation intermediate parameter for augmenting the original image in step 202 may further include: inputting the original image into a preset network model and performing forward processing to obtain a loss value; and differentiating the loss value to obtain a derivative value.
at this time, the generating an augmented image corresponding to the original image according to the augmented intermediate parameter includes: generating a noise image of the original image according to the derivative value; and taking the noise image as an augmented image corresponding to the original image.
One innovation of the embodiment of the application is the design and addition of two methods for generating training samples using adversarial examples, which improves the network's accuracy in recognizing abnormal scenes. When the input is perturbed, the image's appearance does not change much, but the model outputs a wrong answer with high confidence, which reduces the network's accuracy. It is difficult to address this purely through network design, so adversarial example generation methods are introduced to generate perturbed data and train on them, improving the network's accuracy on such data. Adversarial example methods such as DeepFool, StepLL, FGSM, and the CW algorithm are added to the platform and can be called in a parameterized manner.
DeepFool is a classical adversarial attack method; it was the first to define sample robustness and model robustness metrics, and it can accurately compute the perturbations of deep classifiers on large-scale data sets, thereby reliably quantifying the robustness of a classifier. FGSM stands for Fast Gradient Sign Method, a single-step attack in the white-box setting: an FGSM adversarial sample is obtained by computing the derivative of the model with respect to the input, obtaining the gradient direction with a sign function, and multiplying by a step size to obtain the "perturbation", which is added to the original input. StepLL, the Single-Step Least-Likely Class Method, is also a single-step attack; unlike FGSM, which increases the distance between the image and its true label, StepLL constrains the distance between the image and the class with the lowest classification probability.
The CW algorithm is generally considered one of the white-box attack algorithms with the strongest attack capability. Most of the literature classifies it as a gradient-based attack algorithm, like DeepFool and FGSM, but it is in fact an optimization-based adversarial sample generation algorithm. The innovation of the CW algorithm lies in its definition of the loss (objective) function. In a targeted attack, the objective function often uses cross entropy, and the iterative optimization process continually reduces the objective function.
Adversarial sample generation methods generally aim to solve the following problem:

\rho(\theta) = \min_\theta \mathbb{E}_{(x, y) \sim D} [ \max_{\delta \in S} L(\theta, x + \delta, y) ]

where \delta is the perturbation, S is the perturbation space, x is the training input, y is the training output (label), L(\theta, x, y) is the loss function of the neural network model, D is the sample distribution of the training data, \theta is the overall parameter (or parameter set) of the neural network model, and \rho(\theta) is the overall objective function, expressed as the minimum over the innermost objective \max_{\delta \in S} L(\theta, x + \delta, y).
The innermost objective, \max_{\delta \in S} L(\theta, x + \delta, y), is the objective function of an attacker without a target label. Its physical meaning is that the larger the loss function at the sample point (x + \delta, y), the better, so that the neural network model's loss on its correct label is particularly large and the logistic-regression value corresponding to the correct label is small; the outer min function means that standard training then minimizes this loss value.
The basic steps of adversarial sample generation are as follows (see the sketch after this list):
A. Send the training image into the network and perform a forward pass.
B. Calculate the loss value obtained in the forward pass.
C. Differentiate the forward loss value, and generate a noise image from the derivative using one of several preset noise-image generation algorithms.
D. Add the noise image to the original image to generate a new image, and start normal training.
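A hedged PyTorch sketch of steps A to D with a single FGSM-style step (the model, the loss choice, and the step size are assumptions, not the patent's specification):

```python
import torch
import torch.nn.functional as F

def make_adversarial_batch(model, images, labels, alpha=2 / 255):
    """Steps A-D: forward pass (A), loss (B), derivative of the loss
    w.r.t. the input and a sign-based noise image (C), noise added to
    the original images (D)."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)   # A + B
    loss.backward()                                 # C: differentiate
    noise = alpha * images.grad.sign()              # C: noise image
    return (images + noise).clamp(0, 1).detach()    # D: new training image
```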
The application designs two noise image generation methods, which are specifically as follows:
(1) The first adversarial sample generation method: a method that generates the noise image by iteration, with the number of iterations set in advance. It is implemented with the following formula:

x^{t+1} = \Pi_{x+S} ( x^t + \alpha \, \mathrm{sign}( \nabla_x L(\theta, x^t, y) ) )

where t is the iteration number, \nabla_x L denotes differentiation of the loss value, \Pi_{x+S} is a projection function, and \alpha is a preset fixed value (2/255 is preferred in the embodiment of the application). The noise generated in each iteration is added to the previous training data (the noise generated the first time is added directly to the original training image; the generated data is truncated so that it stays within the original data distribution, and the truncated data is used as the new training data), which is then sent into the network to compute the loss value again, after which new noise is generated from that loss value. Through multiple iterations, a nonlinear perturbation noise image can be generated.
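A hedged PyTorch sketch of this iterative method; the perturbation bound `eps` that realizes the projection \Pi_{x+S} is an assumed implementation detail:

```python
import torch
import torch.nn.functional as F

def iterative_noise(model, x_orig, labels, steps, alpha=2 / 255, eps=8 / 255):
    """Method (1): re-compute the loss on the last perturbed data each
    round and truncate back into the allowed neighborhood of the
    original data (the projection pi_{x+S})."""
    x0 = x_orig.detach()
    x = x0.clone()
    for _ in range(steps):
        x.requires_grad_(True)
        loss = F.cross_entropy(model(x), labels)
        grad, = torch.autograd.grad(loss, x)
        x = x.detach() + alpha * grad.sign()
        x = (x0 + (x - x0).clamp(-eps, eps)).clamp(0, 1)  # truncate / project
    return x
```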
(2) The second method is a modification of method (1):

x^{t+1} = \Pi_{x+S} ( x^t + \alpha \, \mathrm{sign}( \mathbb{E}[ \nabla_x L(\theta, x^t, y) ] ) )

where \mathbb{E} denotes the mean. The difference from the previous version lies in the gradient computation: during iteration, the mean of the gradients must be computed, and the perturbation is then computed from this mean. Because the mean of the gradients computed so far is taken each time, the noise of each iteration is equalized and noise generation is optimized; the final experimental results show that this works better than the previous method.
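The same loop, modified per method (2) to step along the sign of the running gradient mean (again a sketch under the same assumptions):

```python
import torch
import torch.nn.functional as F

def iterative_noise_mean(model, x_orig, labels, steps, alpha=2 / 255, eps=8 / 255):
    """Method (2): identical loop, but the step follows the sign of the
    running mean of all gradients computed so far, equalizing the noise
    across iterations."""
    x0 = x_orig.detach()
    x = x0.clone()
    grad_sum = torch.zeros_like(x0)
    for t in range(1, steps + 1):
        x.requires_grad_(True)
        loss = F.cross_entropy(model(x), labels)
        grad, = torch.autograd.grad(loss, x)
        grad_sum += grad
        x = x.detach() + alpha * (grad_sum / t).sign()   # E[gradient]
        x = (x0 + (x - x0).clamp(-eps, eps)).clamp(0, 1)
    return x
```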
Similarly, in this embodiment, network training needs to be optimized to obtain a better network. The embodiment of the application designs a set of network tuning strategies; the main innovation is the block designed for the loss function. Loss functions such as cross-entropy, ArcFace loss, focal loss, label smoothing, smoothing loss, circle loss, and AM-Softmax loss are designed and added, and these loss functions can be called through one-touch parameter configuration.
In general, large deep neural networks are very powerful, but they exhibit memorization of training data and undesirable sensitivity to adversarial samples. Mixup is a simple solution that alleviates both problems. In essence, mixup trains the neural network on convex combinations of pairs of samples and their labels. In doing so, mixup regularizes the neural network to favor linear behavior between training samples. Research results show that mixup improves the generalization ability of current state-of-the-art neural network architectures; in addition, mixup reduces the memorization of erroneous labels, increases robustness to adversarial samples, and stabilizes the training process of generative adversarial networks. Furthermore, the embodiment of the application modifies mixup across multiple networks so that it can fit all of the loss functions mentioned above; whether mixup is loaded can be configured when a loss function is called, bringing up this optimization for the training task.
The general mix up uses the steps of:
(1) taking data of a batch, namely an image;
(2) combining the sequential batch and the reverse batch one by one to obtain two images, applying the two images to i and j in the formula, wherein the two images correspond to each other (x)i,yi),(xj,yj);
(3) Generating lambda by using a random generator, taking the value of the lambda as (0, 1), randomly taking the value, and generating a new batch image according to the following formula, namely the mix up fused image
Figure BDA0002836656610000181
Figure BDA0002836656610000182
Figure BDA0002836656610000183
(4) Softening labels, wherein the softening method comprises the steps of firstly converting labels into a one hot vector form, then arranging the labels in sequence and reverse order, and generating a new one hot vector label result according to a formula;
(5) sending the newly generated images into the network, performing a forward pass, and solving the loss value according to the newly generated one-hot vectors;
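A minimal sketch of steps (1) to (5), assuming PyTorch tensors, integer class labels, and a Beta-distributed λ (the function and parameter names are illustrative, not part of the embodiment):

```python
import torch
import torch.nn.functional as F

def mixup_batch(images, labels, num_classes, alpha=1.0):
    """Steps (1)-(4): fuse a batch with its reverse-order copy and soften the labels."""
    lam = float(torch.distributions.Beta(alpha, alpha).sample())  # λ in (0, 1)
    rev = torch.arange(images.size(0) - 1, -1, -1)     # reverse-order batch indices
    onehot = F.one_hot(labels, num_classes).float()    # labels as one-hot vectors
    mixed_x = lam * images + (1 - lam) * images[rev]   # x̃ = λ·x_i + (1−λ)·x_j
    mixed_y = lam * onehot + (1 - lam) * onehot[rev]   # ỹ = λ·y_i + (1−λ)·y_j
    return mixed_x, mixed_y

# Step (5): forward pass and loss against the softened one-hot labels, e.g.
#   logits = model(mixed_x)
#   loss = -(mixed_y * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
```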
another innovation point of the embodiment of the present application is that 1, the original image is not changed, and the loss value of the last layer of the network is calculated by using the above formula, and finally the loss value is solved. 2. The integrated mix up is made into a general basic component, and when other loss value function design methods are called, the methods can be synchronously called to optimize the network.
It should be noted that, in the embodiment of the application, the augmentation intermediate parameters may also include both the matrix relationship under different exposure parameters corresponding to the original image and the derivative value obtained by derivation after the original image is input into the preset network model. In this case, the finally obtained augmented images include effect images that simulate the camera imaging process, similar to those produced when a camera shoots normally, as well as disturbance noise images generated by the adversarial-sample generation method, so that the image recognition capability and generalization performance of the neural network can be improved.
The system can call the classification network, the data augmentation strategies, and the different optimization strategies in a parameterized manner, so that one algorithm platform lets a user customize the network to be designed and optimized. This reduces the trial time of algorithm engineers on different classification networks and improves the deployment of the algorithm in different services.
It should be noted that the embodiment of the application only illustrates the above two data augmentation manners; it is understood that other data augmentation manners may also be included, for example more conventional data augmentation manners that augment data to improve the accuracy and generalization performance of the network. These include image-level methods (flipping, rotation, cropping, etc.), pixel-level methods (addition, subtraction, multiplication, blurring, contrast enhancement, noise, etc.), and drop-based methods (random pixel dropping, image-block dropping, etc.) that perform data augmentation on the original image, and no limitation is made here.
In order to better implement the data augmentation method in the embodiment of the present application, based on the data augmentation method, the embodiment of the present application further provides a data augmentation device, as shown in fig. 5, the data augmentation device 500 includes an obtaining module 501, a calculating module 502, and a generating module 503, which are specifically as follows:
an obtaining module 501, configured to obtain an acquired original image;
a calculating module 502, configured to calculate an augmentation intermediate parameter for augmenting the original image, wherein the augmentation intermediate parameter is an intermediate parameter, generated based on a preset augmentation condition, for augmenting the original image, and comprises a matrix relationship under different exposure parameters corresponding to the original image or/and a derivative value obtained by derivation after the original image is input into a preset network model;
a generating module 503, configured to generate an augmented image corresponding to the original image according to the augmented intermediate parameter.
In the embodiment of the application, the obtaining module 501 obtains the acquired original image; the calculating module 502 calculates an augmentation intermediate parameter for augmenting the original image, wherein the augmentation intermediate parameter is an intermediate parameter, generated based on a preset augmentation condition, for augmenting the original image, and comprises a matrix relationship under different exposure parameters corresponding to the original image or/and a derivative value obtained by derivation after the original image is input into a preset network model; the generating module 503 generates an augmented image corresponding to the original image according to the augmentation intermediate parameter. Whereas the diversity gained from conventional data augmentation methods in the prior art is limited, this scheme generates augmented images corresponding to the original image from augmentation intermediate parameters, such as the matrix relationship under different exposure parameters corresponding to the original image or/and the derivative value obtained by derivation after the original image is input into the preset network model, which enriches the diversity of the training samples and improves the classification accuracy of the network model after later-stage training.
In some embodiments of the present application, the augmented intermediate parameter includes a matrix relationship under different exposure parameters corresponding to the original image, and the calculating module 502 is specifically configured to:
acquiring an augmented image of the original image under different preset exposure parameters to obtain an augmented image library;
and determining the matrix relation under different exposure parameters corresponding to the original image according to the augmented image library.
In some embodiments of the present application, the calculating module 502 is specifically configured to:
determining, for each image in the augmented image library, the images of that image under different exposure parameters;
respectively determining the mapping relationship between each image in the augmented image library and its images under different exposure parameters;
determining, according to each mapping relationship, the image in the augmented image library that is most similar to the original image, and determining the matrix relationship under different exposure parameters corresponding to the original image;
the generating module 503 is specifically configured to:
and generating the augmented images under different exposure parameters corresponding to the original images based on the matrix relation.
In some embodiments of the present application, the calculating module 502 is specifically configured to:
respectively taking each image in the augmented image library as a target augmented image, and mapping the pixel values of the target augmented image to pixel values of a preset dimension;
after the pixel values of the target augmented image are mapped to pixel values of the preset dimension, determining the mapping relationship between the target augmented image and the images of the target augmented image under different exposure parameters;
determining, according to the mapping relationship, the image matrix after the target augmented image is transformed into pixel values of the preset dimension;
and obtaining the matrix relationship under different exposure parameters corresponding to the target augmented image by minimizing the change of the image matrix (a minimal sketch of this fitting step follows).
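As a hedged illustration of this fitting step, assume the preset-dimension mapping is a polynomial feature expansion of the RGB pixel values (an assumption for illustration; the embodiment does not fix the exact mapping here). The matrix relationship can then be obtained with an ordinary least-squares fit that minimizes the change of the image matrix:

```python
import numpy as np

def poly_features(img):
    """Map RGB pixel values (H, W, 3) to a preset higher dimension (assumed polynomial expansion)."""
    r, g, b = img.reshape(-1, 3).T.astype(np.float64)
    ones = np.ones_like(r)
    return np.stack([r, g, b, r*g, g*b, r*b, r*r, g*g, b*b, ones], axis=1)

def fit_matrix_relation(src_img, dst_img):
    """Solve M minimizing ||phi(src)·M - dst||^2, i.e. minimizing the change of the image matrix."""
    phi = poly_features(src_img)                         # (num_pixels, preset_dim)
    target = dst_img.reshape(-1, 3).astype(np.float64)   # same image under another exposure
    M, *_ = np.linalg.lstsq(phi, target, rcond=None)
    return M                                             # (preset_dim, 3) matrix relationship
```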
In some embodiments of the present application, the calculating module 502 is specifically configured to:
calculating the distance between the color distribution of the target augmented image and the color distributions of the images in the augmented image library;
determining the image in the augmented image library whose color distribution is closest to that of the target augmented image as the closest image;
determining the target exposure parameter corresponding to the closest image;
and acquiring the matrix relationship under the target exposure parameter corresponding to the target augmented image (see the sketch below).
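A minimal sketch of this closest-image selection, under the assumption that the color distribution is summarized by a normalized RGB histogram and compared with a Euclidean distance (the embodiment does not pin down the exact distance measure here; numpy is imported in the sketch above):

```python
def color_histogram(img, bins=8):
    """Normalized RGB color-distribution histogram of an 8-bit image."""
    hist, _ = np.histogramdd(img.reshape(-1, 3), bins=(bins,) * 3,
                             range=((0, 256),) * 3)
    return hist.ravel() / hist.sum()

def closest_image(target_img, library):
    """Return the index of the library image with the closest color distribution, and its distance."""
    t_hist = color_histogram(target_img)
    dists = [np.linalg.norm(t_hist - color_histogram(img)) for img in library]
    best = int(np.argmin(dists))
    return best, dists[best]
```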
In some embodiments of the present application, the calculation module is specifically configured to:
acquiring the distance between the original image and the closest image in the augmented image library;
calculating an augmentation parameter of the original image according to the distance, the matrix relation and a preset fixed parameter;
and according to the augmentation parameters, augmenting the original image to obtain augmented images corresponding to the original image under different exposure parameters.
In some embodiments of the application, the calculation module is specifically configured to, when generating the augmented images under different exposure parameters corresponding to the original image based on the matrix relationship, use the following formulas:

I^(out) = M·φ(I^(in))

M = α·M_s

α = exp(−d²/2σ²)

wherein d is the distance between the original image and the closest image in the augmented image library, σ is a preset fixed parameter, M_s is the matrix relationship under different exposure parameters corresponding to the original image, φ(·) is the mapping of pixel values to the preset dimension, I^(in) is the original image, and I^(out) is the augmented image under different exposure parameters corresponding to the original image.
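Putting these formulas together, a hedged sketch of the final augmentation step (reusing poly_features and closest_image from the sketches above; matrix_relations is an assumed list holding the fitted M_s for each library image):

```python
def augment_exposure(original_img, library, matrix_relations, sigma=0.5):
    """Generate an augmented image under different exposure parameters from the original image."""
    idx, d = closest_image(original_img, library)   # d: distance to the closest image
    alpha = np.exp(-d**2 / (2 * sigma**2))          # α = exp(−d²/2σ²)
    M = alpha * matrix_relations[idx]               # M = α·M_s
    out = poly_features(original_img) @ M           # I_out = M·φ(I_in)
    return out.reshape(original_img.shape).clip(0, 255).astype(np.uint8)
```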
In some embodiments of the present application, the augmented intermediate parameter includes a derivative value obtained by derivation after the original image is input into a preset network model, and the calculating module 502 is specifically configured to:
inputting the original image into a preset network model and performing forward propagation to obtain a loss value;
deriving the loss value to obtain a derivative value;
the generating module 503 is specifically configured to:
generating a noise image of the original image according to the derivative value;
and taking the noise image as an augmented image corresponding to the original image.
An embodiment of the present application further provides a computer device, which integrates any one of the data augmentation apparatuses provided in the embodiments of the present application, where the computer device includes:
one or more processors;
a memory; and
one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the processor to perform the steps of the data augmentation method described in any of the above data augmentation method embodiments.
The embodiment of the application also provides a computer device, which integrates any data augmentation apparatus provided by the embodiments of the application. As shown in fig. 6, it shows a schematic structural diagram of a computer device according to an embodiment of the present application, specifically:
the computer device may include components such as a processor 601 of one or more processing cores, memory 602 of one or more computer-readable storage media, a power supply 603, and an input unit 604. Those skilled in the art will appreciate that the computer device configuration illustrated in FIG. 6 does not constitute a limitation of computer devices, and may include more or fewer components than those illustrated, or some components may be combined, or a different arrangement of components. Wherein:
the processor 601 is a control center of the computer device, connects various parts of the whole computer device by using various interfaces and lines, and performs various functions of the computer device and processes data by running or executing software programs and/or modules stored in the memory 602 and calling data stored in the memory 602, thereby monitoring the computer device as a whole. Optionally, processor 601 may include one or more processing cores; preferably, the processor 601 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 601.
The memory 602 may be used to store software programs and modules, and the processor 601 executes various functional applications and data processing by operating the software programs and modules stored in the memory 602. The memory 602 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created according to use of the computer device, and the like. Further, the memory 602 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 602 may also include a memory controller to provide the processor 601 with access to the memory 602.
The computer device further comprises a power supply 603 for supplying power to the various components. Preferably, the power supply 603 is logically connected to the processor 601 through a power management system, so that charging, discharging, and power-consumption management functions are realized through the power management system. The power supply 603 may also include one or more DC or AC power sources, a recharging system, a power-failure detection circuit, a power converter or inverter, a power status indicator, or other components.
The computer device may also include an input unit 604, the input unit 604 being operable to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
Although not shown, the computer device may further include a display unit and the like, which are not described in detail herein. Specifically, in this embodiment, the processor 601 in the computer device loads the executable file corresponding to the process of one or more application programs into the memory 602 according to the following instructions, and the processor 601 runs the application programs stored in the memory 602, thereby implementing various functions as follows:
acquiring an acquired original image;
calculating an augmentation intermediate parameter for augmenting the original image, wherein the augmentation intermediate parameter is an intermediate parameter, generated based on a preset augmentation condition, for augmenting the original image, and comprises a matrix relationship under different exposure parameters corresponding to the original image or/and a derivative value obtained by derivation after the original image is input into a preset network model;
and generating an augmented image corresponding to the original image according to the augmented intermediate parameter.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by instructions or by associated hardware controlled by the instructions, which may be stored in a computer readable storage medium and loaded and executed by a processor.
To this end, an embodiment of the present application provides a computer-readable storage medium, which may include: a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and the like. A computer program is stored on the computer-readable medium and is loaded by a processor to execute the steps of any of the data augmentation methods provided by the embodiments of the application. For example, the computer program may be loaded by a processor to perform the following steps:
acquiring an acquired original image;
calculating an augmentation intermediate parameter for augmenting the original image, wherein the augmentation intermediate parameter is an intermediate parameter, generated based on a preset augmentation condition, for augmenting the original image, and comprises a matrix relationship under different exposure parameters corresponding to the original image or/and a derivative value obtained by derivation after the original image is input into a preset network model;
and generating an augmented image corresponding to the original image according to the augmented intermediate parameter.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and parts that are not described in detail in a certain embodiment may refer to the above detailed descriptions of other embodiments, and are not described herein again.
In a specific implementation, each unit or structure may be implemented as an independent entity, or may be combined arbitrarily to be implemented as one or several entities, and the specific implementation of each unit or structure may refer to the foregoing method embodiment, which is not described herein again.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
The data augmentation method, apparatus, computer device, and storage medium provided by the embodiments of the application are described in detail above. Specific examples are used herein to explain the principles and implementation of the application, and the description of the above embodiments is only intended to help understand the method and core ideas of the application. Meanwhile, for those skilled in the art, there may be variations in the specific embodiments and the application scope according to the idea of the application. In summary, the content of this specification should not be construed as limiting the application.

Claims (10)

1. A method of data augmentation, the method comprising:
acquiring an acquired original image;
calculating an augmentation intermediate parameter for augmenting the original image, wherein the augmentation intermediate parameter is an intermediate parameter, generated based on a preset augmentation condition, for augmenting the original image, and comprises a matrix relationship under different exposure parameters corresponding to the original image or/and a derivative value obtained by derivation after the original image is input into a preset network model;
and generating an augmented image corresponding to the original image according to the augmented intermediate parameter.
2. The data augmentation method of claim 1, wherein the augmented intermediate parameters include matrix relationships under different exposure parameters corresponding to the original image, and the calculating augmented intermediate parameters for augmenting the original image includes:
acquiring an augmented image of the original image under different preset exposure parameters to obtain an augmented image library;
and determining the matrix relation under different exposure parameters corresponding to the original image according to the augmented image library.
3. The data augmentation method of claim 2, wherein the determining, according to the augmented image library, a matrix relationship under different exposure parameters corresponding to the original image comprises:
determining images of each image under different exposure parameters for the images in the augmented image library;
respectively determining the mapping relation between each image in the augmented image library and the images of the images under different exposure parameters;
determining an image which is most similar to the original image in the augmented image library according to each mapping relation, and determining a matrix relation under different exposure parameters corresponding to the original image;
generating an augmented image corresponding to the original image according to the augmented intermediate parameter, wherein the method comprises the following steps:
and generating the augmented images under different exposure parameters corresponding to the original images based on the matrix relation.
4. The data augmentation method of claim 3, wherein the separately determining a mapping relationship between each image in the augmented image library and the image of the image under different exposure parameters comprises:
respectively taking each image in the augmented image library as a target augmented image, and mapping the pixel value of the target augmented image into a pixel value with a preset dimension, wherein the preset dimension is higher than the dimension of the pixel value of the target augmented image;
after the pixel value of the target augmented image is mapped into the pixel value of a preset dimension, determining the mapping relation between the target augmented image and the image of the target augmented image under different exposure parameters;
determining an image which is closest to the original image in the augmented image library according to each mapping relation, and determining a matrix relation corresponding to the original image under different exposure parameters, wherein the matrix relation comprises the following steps:
determining an image matrix after the target augmented image is transformed into a pixel value with a preset dimension according to the mapping relation;
and obtaining the matrix relation under different exposure parameters corresponding to the target augmented image by minimizing the change of the image matrix.
5. The data augmentation method of claim 4, wherein the minimizing the change to the image matrix to obtain the matrix relationship under different exposure parameters corresponding to the target augmented image comprises:
calculating the distance between the target augmented image and the color distribution of the images in the augmented image library;
determining the image which is closest to the target augmented image in the augmented image library in the color distribution distance as the closest image;
determining a target exposure parameter corresponding to the closest image;
and acquiring a matrix relation under the target exposure parameters corresponding to the target augmented image.
6. The data augmentation method of claim 3, wherein the generating augmented images under different exposure parameters corresponding to the original image based on the matrix relationship comprises:
acquiring the distance between the original image and the closest image in the augmented image library;
calculating an augmentation parameter of the original image according to the distance, the matrix relation and a preset fixed parameter;
and according to the augmentation parameters, augmenting the original image to obtain augmented images corresponding to the original image under different exposure parameters.
7. The data augmentation method according to claim 1, wherein the augmentation intermediate parameters include derivative values obtained by derivation after the original image is input into a preset network model, and the calculating of the augmentation intermediate parameters for augmenting the original image includes:
inputting an original image into a preset network model, and performing forward processing to obtain a loss value;
carrying out derivation on the loss value to obtain a derivation value;
generating an augmented image corresponding to the original image according to the augmented intermediate parameter, wherein the augmented image comprises:
generating a noise image of the original image according to the derivative value;
and taking the noise image as an augmented image corresponding to the original image.
8. A data augmentation apparatus, the apparatus comprising:
the acquisition module is used for acquiring the acquired original image;
the calculation module is used for calculating an augmentation intermediate parameter for augmenting the original image, wherein the augmentation intermediate parameter is an intermediate parameter, generated based on a preset augmentation condition, for augmenting the original image, and comprises a matrix relationship under different exposure parameters corresponding to the original image or/and a derivative value obtained by derivation after the original image is input into a preset network model;
and the generating module is used for generating an augmented image corresponding to the original image according to the augmented intermediate parameter.
9. A computer device, characterized in that the computer device comprises:
one or more processors;
a memory; and
one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the processor to implement the data augmentation method of any one of claims 1 to 7.
10. A computer-readable storage medium, having stored thereon a computer program which is loaded by a processor to perform the steps of the data augmentation method of any one of claims 1 to 7.
CN202011473235.4A 2020-12-15 2020-12-15 Data augmentation method, data augmentation device, computer device, and storage medium Pending CN114638997A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011473235.4A CN114638997A (en) 2020-12-15 2020-12-15 Data augmentation method, data augmentation device, computer device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011473235.4A CN114638997A (en) 2020-12-15 2020-12-15 Data augmentation method, data augmentation device, computer device, and storage medium

Publications (1)

Publication Number Publication Date
CN114638997A true CN114638997A (en) 2022-06-17

Family

ID=81945261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011473235.4A Pending CN114638997A (en) 2020-12-15 2020-12-15 Data augmentation method, data augmentation device, computer device, and storage medium

Country Status (1)

Country Link
CN (1) CN114638997A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination