CN114937187A - Image optimization method, device, equipment and storage medium

Info

Publication number: CN114937187A
Application number: CN202210684795.7A
Authority: CN
Inventors: 曹琼; 陈夏宁; 陶大程
Original assignee: Jingdong Technology Information Technology Co Ltd
Current assignee: Jingdong Technology Information Technology Co Ltd
Priority date: 2022-06-16
Filing date: 2022-06-16
Publication date: 2022-08-23

Abstract

The embodiment of the invention discloses an image optimization method, an image optimization device, image optimization equipment and a storage medium. The method comprises the following steps: determining a prediction classification of a noise image based on a pre-training network, and determining a classification loss of the noise image according to the prediction classification and a classification label set for the noise image, wherein the pre-training network is used for image classification; carrying out distortion constraint and regularization processing on the noise image to obtain a regularization loss; determining a loss function according to the classification loss and the regularization loss; and optimizing the noise image, calculating a function value of the loss function according to the processed noise image, and determining the processed noise image as an optimized image when the function value of the loss function meets a preset condition. According to the technical scheme, the optimized image obtained by optimizing the noise image can be used as a training image to train the pre-training network again, which improves the classification accuracy of the pre-training network and solves the prior-art problem that real images for model training are difficult to obtain.

Description

Image optimization method, device, equipment and storage medium

Technical Field

Embodiments of the present invention relate to image processing technologies, and in particular, to an image optimization method, an apparatus, a device, and a storage medium.

Background

Image classification is an active research topic and is widely applied in many fields. In recent years, deep learning techniques have been widely used in image classification. Deep learning requires strong hardware computing power, a large amount of training data, and a large number of network layers to extract features from the data; these requirements are both the key to its advantages and the factors that limit its range of application.

In the prior art, a deep learning model needs to be trained by large-scale labeled real images so as to improve the classification accuracy of the deep learning model.

In the process of implementing the invention, at least the following technical problems are found in the prior art:

due to many limitations of privacy, storage, and transmission, it is often difficult to acquire real images for model training.

Disclosure of Invention

The invention provides an image optimization method, an image optimization device, image optimization equipment and a storage medium, which are used for optimizing a noise image to obtain an optimized image for training a model and solving the problem that a real image for training the model is difficult to obtain in the prior art.

In a first aspect, an embodiment of the present invention provides an image optimization method, including:

determining a prediction classification of a noise image based on a pre-training network, and determining a classification loss of the noise image according to the prediction classification and a classification label set for the noise image; wherein the pre-training network is used for image classification;

carrying out distortion constraint and regularization processing on the noise image to obtain a regularization loss; determining a loss function according to the classification loss and the regularization loss;

and optimizing the noise image, calculating a function value of the loss function according to the processed noise image, and determining the processed noise image as an optimized image when the function value of the loss function meets a preset condition.

In a second aspect, an embodiment of the present invention further provides an image optimization apparatus, including:

the determining module is used for determining the prediction classification of the noise image based on a pre-training network, and determining the classification loss of the noise image according to the prediction classification and the classification label set for the noise image; wherein the pre-training network is used for image classification;

the processing module is used for carrying out distortion constraint and regularization processing on the noise image to obtain regularization loss;

an execution module to determine a loss function based on the classification loss and the regularization loss;

and the optimization module is used for optimizing the noise image, calculating a function value of the loss function according to the processed noise image, and determining the processed noise image as an optimized image when the function value of the loss function meets a preset condition.

In a third aspect, an embodiment of the present invention further provides a computer device, where the computer device includes:

at least one processor; and a memory communicatively coupled to the at least one processor;

wherein the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to perform the image optimization method of any one of the first aspects.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the image optimization method according to any one of the first aspect.

In a fifth aspect, the embodiment of the present invention further provides a computer program product, where the computer program product includes a computer program, and when the computer program is executed by a processor, the computer program implements the image optimization method according to any one of the first aspect.

The embodiment of the invention has the following advantages or beneficial effects:

the embodiment of the invention provides an image optimization method, which comprises the following steps: determining a prediction classification of a noise image based on a pre-training network, and determining a classification loss of the noise image according to the prediction classification and a classification label set for the noise image, wherein the pre-training network is used for image classification; carrying out distortion constraint and regularization processing on the noise image to obtain a regularization loss; determining a loss function according to the classification loss and the regularization loss; and optimizing the noise image, calculating a function value of the loss function according to the processed noise image, and determining the processed noise image as an optimized image when the function value of the loss function meets a preset condition. In this technical scheme, the image classification accuracy of the pre-training network is low and a large number of real images would be required to train it again, but real images are difficult to acquire. Therefore, the prediction classification of a noise image is first determined by the pre-training network; next, the classification loss of the noise image is determined according to the prediction classification and a classification label randomly set for the noise image; after the regularization loss of the noise image is determined, the loss function is determined according to the classification loss and the regularization loss; the noise image is then optimized, the function value of the loss function is calculated from the processed noise image, and the processed noise image is determined as the optimized image when the function value meets a preset condition.
When the loss function meets the preset condition, the prediction classification is close to the classification label, the texture distribution of the image is natural, and the image feature distribution is close to that of real images. The optimized image can therefore be used as a training image to train the pre-training network again, which improves the classification accuracy of the pre-training network and solves the prior-art problem that real images for model training are difficult to obtain.

Drawings

Fig. 1 is a flowchart of an image optimization method according to an embodiment of the present invention;

Fig. 2 is a flowchart of another image optimization method provided by the embodiment of the invention;

Fig. 3 is a flowchart of step 240 of another image optimization method provided in an embodiment of the present invention;

Fig. 4 is a schematic diagram illustrating a determination manner of a loss function in another image optimization method according to an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of an image optimization apparatus according to an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of a computer device according to an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.

The term "and/or" herein merely describes an association between associated objects and means that three relationships may exist; e.g., "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone.

The terms "first" and "second" and the like in the description and drawings of the present application are used for distinguishing different objects or for distinguishing different processes for the same object, and are not used for describing a specific order of the objects.

Furthermore, the terms "including" and "having," and any variations thereof, as referred to in the description of the present application, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.

Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations may be re-arranged. The process may be terminated when its operations are completed, but could have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like. In addition, the embodiments and features of the embodiments in the present invention may be combined with each other without conflict.

It should be noted that in the embodiments of the present application, words such as "exemplary" or "for example" are used to indicate examples, illustrations or explanations. Any embodiment or design described herein as "exemplary" or "e.g.," is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, use of the word "exemplary" or "such as" is intended to present concepts related in a concrete fashion.

In the description of the present application, the meaning of "a plurality" means two or more unless otherwise specified.

Fig. 1 is a flowchart of an image optimization method according to an embodiment of the present invention. The embodiment is applicable to a situation where it is difficult to obtain a real image for training an image classification model, and the method may be performed by an image optimization apparatus. As shown in Fig. 1, the method specifically includes the following steps:

Step 110, determining a prediction classification of a noise image based on a pre-training network, and determining a classification loss of the noise image according to the prediction classification and a classification label set for the noise image.

The pre-trained network is used for image classification, and may be a pre-trained convolutional neural network.

Specifically, the pre-training network may be configured to classify images, and when a noise image is input to the pre-training network as input information, the obtained output information may be a prediction classification corresponding to the noise image. While the predictive classification of the noise image is determined based on the pre-trained network, a classification label may also be randomly set for the noise image, which may be of any class, for example, a cat. Since the pre-training network only completes the pre-training, the classification accuracy is not high when the pre-training network is used for classifying images. Therefore, the predictive classification of noisy images determined from the pre-training network is not accurate.

The classification loss of the noise image can be determined from the classification label and the prediction classification using a cross-entropy loss function, so that during optimization the noise image is progressively optimized into an image that falls within the discrimination domain of the classification label. In the embodiment of the invention, the classification loss can be determined from the prediction classification of the noise image determined by the pre-training network and the classification label set for the noise image, and the noise image can be optimized based on the classification loss, so that the optimized noise image is judged by the image classification model to belong to the class of the classification label.

Step 120, carrying out distortion constraint and regularization processing on the noise image to obtain the regularization loss.

Specifically, the regularization loss may include two parts, a prior part and a regularization part, wherein the prior part performs distortion constraint on the noise image and the regularization part performs regularization processing on the noise image. The distortion constraint may be applied based on a TV (total variation) regularization loss function and an L2 regularization loss function, so that the optimized noise image does not exhibit textures, such as distortion, that do not conform to the natural image distribution. The regularization processing may be based on BN (batch normalization) regularization, so that the optimized noise image conforms to the feature distribution of real images and a meaningful image is generated.

In the embodiment of the invention, the regularization loss is determined in order to ensure that, during optimization, the noise image does not develop textures, such as distortion, that do not conform to the natural image distribution, and that the noise image is optimized into a meaningful image conforming to the feature distribution of real images.

Step 130, determining a loss function according to the classification loss and the regularization loss.

Specifically, the classification loss and the regularization loss may be summed, and the result of the summation may be determined as a loss function.

Step 140, performing optimization processing on the noise image, calculating a function value of the loss function according to the processed noise image, and determining the processed noise image as an optimized image when the function value of the loss function meets a preset condition.

Specifically, the color feature, texture feature, shape feature, and spatial relationship feature of the noise image may be adjusted on a pixel basis to perform optimization processing on the noise image, and after the optimization processing is completed, the function value of the loss function may be calculated from the processed noise image. If the function value of the loss function meets a preset condition, determining the processed noise image as an optimized image; and if the function value of the loss function does not meet the preset condition, continuing to optimize the noise image until the function value of the loss function meets the preset condition.
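The loop described in step 140 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the text does not name an optimizer, so plain gradient descent with a numerical gradient is assumed; the preset condition is assumed to be a loss threshold; and `optimize_image`, `numerical_grad`, and `toy_loss` are hypothetical names, with `toy_loss` standing in for the sum of the classification loss and the regularization loss.

```python
def numerical_grad(loss_fn, pixels, eps=1e-5):
    """Central-difference gradient of loss_fn with respect to each pixel."""
    grad = []
    for i in range(len(pixels)):
        up = pixels[:i] + [pixels[i] + eps] + pixels[i + 1:]
        down = pixels[:i] + [pixels[i] - eps] + pixels[i + 1:]
        grad.append((loss_fn(up) - loss_fn(down)) / (2 * eps))
    return grad


def optimize_image(pixels, loss_fn, threshold, lr=0.1, max_steps=1000):
    """Adjust pixels until the loss meets the (assumed) threshold condition."""
    pixels = list(pixels)
    for _ in range(max_steps):
        if loss_fn(pixels) <= threshold:  # preset condition met: stop optimizing
            break
        grad = numerical_grad(loss_fn, pixels)
        pixels = [p - lr * g for p, g in zip(pixels, grad)]
    return pixels, loss_fn(pixels)


# Toy stand-in for "classification loss + regularization loss" (assumption).
target = [0.2, 0.8, 0.5]


def toy_loss(pixels):
    return sum((p - t) ** 2 for p, t in zip(pixels, target))


optimized, final_loss = optimize_image([0.9, 0.1, 0.4], toy_loss, threshold=1e-6)
```

In practice the gradient would be obtained by back-propagating through the pre-training network with its weights held fixed, so that only the pixels of the noise image change.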

In the embodiment of the invention, the pre-training network is inverted to optimize the noise image into the optimized image. The optimized image can then be used as a training image to train the pre-training network again, which improves the classification accuracy of the pre-training network and solves the prior-art problem that real images for model training are difficult to obtain.

The embodiment of the invention provides an image optimization method, which comprises the following steps: determining a prediction classification of a noise image based on a pre-training network, and determining a classification loss of the noise image according to the prediction classification and a classification label set for the noise image, wherein the pre-training network is used for image classification; carrying out distortion constraint and regularization processing on the noise image to obtain a regularization loss; determining a loss function according to the classification loss and the regularization loss; and optimizing the noise image, calculating a function value of the loss function according to the processed noise image, and determining the processed noise image as an optimized image when the function value of the loss function meets a preset condition. In this technical scheme, the image classification accuracy of the pre-training network is low and a large number of real images would be required to train it again, but real images are difficult to acquire. Therefore, the prediction classification of a noise image is first determined by the pre-training network; next, the classification loss of the noise image is determined according to the prediction classification and a classification label randomly set for the noise image; after the regularization loss of the noise image is determined, the loss function is determined according to the classification loss and the regularization loss; the noise image is then optimized, the function value of the loss function is calculated from the processed noise image, and the processed noise image is determined as the optimized image when the function value meets a preset condition.
When the loss function has converged, the prediction classification is close to the classification label, the texture distribution of the image is natural, and the image feature distribution is close to that of real images. The optimized image can therefore be used as a training image to train the pre-training network again, which improves the classification accuracy of the pre-training network and solves the prior-art problem that real images for model training are difficult to obtain.

Fig. 2 is a flowchart of another image optimization method provided in the embodiment of the present invention, which is applicable to a situation where it is difficult to obtain a real image for training an image classification model. On the basis of the above embodiment, the step of "constructing a network model for image classification, and pre-training the network model to obtain the pre-training network" is added before the prediction classification of the noise image is determined based on the pre-training network, and the step of "determining a diversification loss of the noise image according to a simplest positive sample, a most difficult positive sample, and a most difficult negative sample of the noise image" is added before the loss function is determined according to the classification loss and the regularization loss. Explanations of terms that are the same as or correspond to those in the above embodiment are omitted. As shown in Fig. 2, the image optimization method provided in the embodiment of the present invention specifically includes the following steps:

step 210, constructing a network model for image classification, and pre-training the network model to obtain the pre-training network.

In one embodiment, step 210 may specifically include:

constructing the network model for image classification based on a convolutional neural network; pre-training the network model through a public data set, and calculating a pre-loss function; and optimizing the network based on a back propagation algorithm until the pre-loss function is converged to obtain the pre-training network.

Specifically, a deep-learning convolutional neural network can be used for image classification, so a network model for image classification can be constructed based on a convolutional neural network. After the network model is constructed, it can be pre-trained on a public data set, specifically the ImageNet data set, while a pre-loss function is calculated.

In the embodiment of the invention, a network model for image classification can be constructed before the prediction classification of the noise image is determined, and the network model can be pre-trained on a public data set to improve its classification accuracy. However, the classification accuracy of the pre-training network obtained in this way is still low, and the network cannot be directly applied to image classification.

Step 220, determining a prediction classification of the noise image based on a pre-training network, and determining a classification loss of the noise image according to the prediction classification and a classification label set for the noise image.

The noise image can be represented by $x$, with $x \in \mathbb{R}^{H \times W \times C}$, where $H$ denotes the width of the noise image, $W$ denotes the height of the noise image, and $C$ denotes the number of channels of the noise image.

In one embodiment, step 220 may specifically include:

determining a cross entropy loss function of the classification label and the prediction classification, and determining the cross entropy loss function as the classification loss.

Specifically, the classification loss may be determined based on equation 1.

$L_{CE}(m, n) = \mathrm{distance}(n, p(m \mid x))$ (Equation 1)

Where m denotes a prediction classification, n denotes a classification label, and x denotes a noise image.

In the embodiment of the invention, the classification loss of the noise image is determined from the classification label and the prediction classification using the cross-entropy loss function, so that during optimization the noise image is progressively optimized into an image that falls within the discrimination domain of the classification label. For example, when the classification label is cat, the noise image is finally optimized such that the image classification model judges the image to be a cat.
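As a rough illustration of the classification loss in Equation 1, the following sketch computes a cross-entropy loss for a single noise image. It assumes that the pre-training network outputs raw class scores (logits) and that the classification label is a class index; the function names and the example scores are hypothetical and do not come from the patent.

```python
import math


def softmax(logits):
    """Convert raw class scores into a probability distribution p(m | x)."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classification_loss(logits, label_index):
    """Cross-entropy between a one-hot classification label and p(m | x)."""
    probs = softmax(logits)
    return -math.log(probs[label_index])


# Hypothetical scores for three classes, e.g. (cat, dog, bird).
logits = [2.0, 0.5, -1.0]
loss_if_label_is_cat = classification_loss(logits, 0)
loss_if_label_is_bird = classification_loss(logits, 2)
```

The loss is small when the network already assigns high probability to the chosen label and large otherwise, which is what drives the noise image toward the labelled class during optimization.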

Step 230, carrying out distortion constraint and regularization processing on the noise image to obtain the regularization loss.

In one embodiment, step 230 may specifically include:

determining a regular loss function of the noise image, and determining a distortion constraint result according to the regular loss function; determining an input statistic and a storage statistic of the image features of the noise images according to the pre-training network, and determining a regularization processing result according to the input statistic and the storage statistic, wherein the input statistic comprises a mean value and/or a variance of the image features of each image in the noise images of the batch to which the noise images belong, and the storage statistic comprises a mean value and/or a variance of the image features stored in the pre-training network; determining a sum of the distortion constraint result and the regularization processing result as the regularization loss.

The regularization loss may consist of two parts, the prior part $R_{prior}$ and the regularization part $R_{BN}$. The prior part performs distortion constraint on the noise image, and the regularization part performs regularization processing on the noise image.

In particular, the noise image may be distortion constrained based on a regularization loss function, which includes a TV regularization loss function and an L2 regularization loss function; the distortion constraint may be determined based on Equation 2. The noise image may be regularized based on BN regularization; the regularization may be determined based on Equation 3.

$R_{prior}(x) = \alpha_{TV} R_{TV}(x) + \alpha_{L2} R_{L2}(x)$ (Equation 2)

where $\alpha_{TV}$ denotes the preset coefficient of the TV regularization loss function in the prior part, $R_{TV}(x)$ denotes the TV regularization loss function of the noise image, $\alpha_{L2}$ denotes the preset coefficient of the L2 regularization loss function in the prior part, and $R_{L2}(x)$ denotes the L2 regularization loss function of the noise image. In practical application, the specific values of $\alpha_{TV}$ and $\alpha_{L2}$ can be determined according to actual requirements; for example, $\alpha_{TV}$ may be $10^{-2}$ and $\alpha_{L2}$ may be $10^{-3}$.
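The prior part in Equation 2 can be sketched as below. The patent text does not spell out the exact form of the TV and L2 terms, so this sketch assumes an anisotropic total variation (sum of absolute differences between neighbouring pixels) and the Euclidean norm of the pixel values, applied to a single-channel image stored as nested lists. The coefficients use the example values from the text; the function names and sample images are hypothetical.

```python
import math

ALPHA_TV = 1e-2  # example coefficient given in the text
ALPHA_L2 = 1e-3  # example coefficient given in the text


def tv_loss(img):
    """Anisotropic total variation: absolute differences of adjacent pixels."""
    h, w = len(img), len(img[0])
    total = 0.0
    for i in range(h):
        for j in range(w):
            if i + 1 < h:
                total += abs(img[i + 1][j] - img[i][j])
            if j + 1 < w:
                total += abs(img[i][j + 1] - img[i][j])
    return total


def l2_loss(img):
    """Euclidean norm of all pixel values."""
    return math.sqrt(sum(v * v for row in img for v in row))


def prior_loss(img):
    """R_prior(x) = alpha_TV * R_TV(x) + alpha_L2 * R_L2(x)."""
    return ALPHA_TV * tv_loss(img) + ALPHA_L2 * l2_loss(img)


smooth = [[0.5, 0.5], [0.5, 0.5]]
checkerboard = [[0.0, 1.0], [1.0, 0.0]]
```

A flat image has zero total variation, whereas a checkerboard pattern is penalized heavily, which illustrates how the prior part discourages high-frequency, unnatural textures.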

where $\alpha_{BN}$ denotes the preset coefficient of the regularization part, $L$ denotes the number of BN layers in the pre-training network, $f(x)$ denotes the image features of the noise image output by the pooling layer after the noise image is input into the pre-training network, $\mu_l(f(x))$ denotes the input mean of the image features, $\mu_l^{BN}$ denotes the stored mean of the pre-training network, $\mu_l^{2}(f(x))$ denotes the input variance of the image features, and $\mu_l^{2BN}$ denotes the stored variance stored in the pre-training network. In practical application, the specific value of $\alpha_{BN}$ can also be determined according to actual requirements; for example, $\alpha_{BN}$ may be 10.

Because the mean and variance information stored in the pre-training network represents the feature distribution of real images learned by the pre-training network, optimizing the noise image to conform to this feature distribution yields a meaningful image that can be used as a training image to train the pre-training network.
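The comparison between input statistics and stored statistics can be sketched as follows. The exact form of Equation 3 is not reproduced in this text, so the sketch assumes one common form: for each BN layer, the Euclidean distance between the batch mean and the stored mean plus the Euclidean distance between the batch variance and the stored variance, summed over layers and scaled by $\alpha_{BN}$. The function names and the data layout (one feature vector per image, one value per channel) are hypothetical.

```python
import math

ALPHA_BN = 10.0  # example coefficient given in the text


def channel_stats(batch_features):
    """Per-channel mean and variance over a batch of feature vectors."""
    n = len(batch_features)
    c = len(batch_features[0])
    means = [sum(f[k] for f in batch_features) / n for k in range(c)]
    variances = [
        sum((f[k] - means[k]) ** 2 for f in batch_features) / n for k in range(c)
    ]
    return means, variances


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bn_loss(layers):
    """layers: list of (batch_features, stored_mean, stored_var) per BN layer."""
    total = 0.0
    for batch_features, stored_mean, stored_var in layers:
        mean, var = channel_stats(batch_features)
        total += l2_distance(mean, stored_mean) + l2_distance(var, stored_var)
    return ALPHA_BN * total


features = [[1.0, 2.0], [3.0, 4.0]]  # two images, two channels
matching = bn_loss([(features, [2.0, 3.0], [1.0, 1.0])])
mismatched = bn_loss([(features, [0.0, 0.0], [1.0, 1.0])])
```

When the batch statistics equal the stored statistics the loss is zero; any deviation increases it, pulling the features of the noise images toward the distribution the network learned from real images.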

Thus, the regularization loss may be determined as $R(x) = R_{prior}(x) + R_{BN}(f(x))$.

In the embodiment of the invention, the distortion constraint prevents the optimized noise image from exhibiting textures, such as distortion, that do not conform to the natural image distribution, and the regularization processing makes the optimized noise image conform to the feature distribution of real images, so that a meaningful image is generated.

Step 240, determining the loss of diversity of the noise image according to the simplest positive sample, the most difficult positive sample and the most difficult negative sample of the noise image.

In order to ensure the diversity of the optimized images, the diversification loss maintains both boundary diversity and interior diversity within the batch of optimized images. The diversification loss allows the noise images in a batch to be optimized into images of various categories, while ensuring that the image categories in the resulting batch of optimized images are correct and that the semantic features of the optimized images are preserved.

Fig. 3 is a flowchart of step 240 in another image optimization method provided in an embodiment of the present invention, and as shown in fig. 3, in an implementation, step 240 may specifically include:

step 2410, determining the simplest positive samples, the most difficult positive samples, and the most difficult negative samples of the noise image.

The simplest positive sample is the image that has the minimum characteristic Euclidean distance from the current noise image among the other images of the batch of noise images to which the current noise image belongs; the most difficult positive sample is the image that has the maximum characteristic Euclidean distance from the current noise image among the other noise images in the batch that are of the same class as the current noise image; the most difficult negative sample is the image that has the largest characteristic Euclidean distance from the current noise image among the other noise images in the batch that are not of the same class as the current noise image.

Specifically, in the batch of noise images input to the pre-training network, for each noise image, its simplest positive sample may be determined based on Equation 4, its most difficult positive sample may be determined based on Equation 5, and its most difficult negative sample may be determined based on Equation 6.

where $x_{ep}$ denotes the simplest positive sample, $x_a$ denotes the current noise image, $x_b$ denotes a noise image other than the current noise image in the batch of noise images, $f(x_a)$ denotes the image features of the current noise image output by the pooling layer after the current noise image is input into the pre-training network, $f(x_b)$ denotes the image features of $x_b$ output by the pooling layer after $x_b$ is input into the pre-training network, and $\mathrm{dist}(f(x_a), f(x_b)) = \lVert f(x_a) - f(x_b) \rVert_2$ denotes the Euclidean distance between the image features of the current noise image and those of the other noise image, that is, the characteristic Euclidean distance between the two noise images.

where $x_p$ denotes another noise image in the batch of noise images that is of the same class as the current noise image, $f(x_p)$ denotes the image features of $x_p$ output by the pooling layer after $x_p$ is input into the pre-training network, and $\mathrm{dist}(f(x_a), f(x_p)) = \lVert f(x_a) - f(x_p) \rVert_2$ denotes the Euclidean distance between the image features of the current noise image and those of the other same-class noise image, that is, the characteristic Euclidean distance between the two noise images.

where $x_n$ denotes another noise image in the batch of noise images that is not of the same class as the current noise image, $f(x_n)$ denotes the image features of $x_n$ output by the pooling layer after $x_n$ is input into the pre-training network, and $\mathrm{dist}(f(x_a), f(x_n)) = \lVert f(x_a) - f(x_n) \rVert_2$ denotes the Euclidean distance between the image features of the current noise image and those of the other different-class noise image, that is, the characteristic Euclidean distance between the two noise images.

In the embodiment of the invention, for the batch of noise images input into the pre-training network, the simplest positive sample, the most difficult positive sample and the most difficult negative sample of each noise image can be determined, and a data basis is prepared for the diversification loss of the noise images.
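The sample selection of step 2410 can be sketched as follows, working on feature vectors that are assumed to have already been extracted by the pre-training network. The selection rules follow the definitions given in the text, including taking the most difficult negative sample as the farthest different-class image; conventional batch-hard triplet mining would instead take the nearest different-class image, so this point may deserve checking against the original equations. The function names and the example features are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mine_samples(features, labels, anchor):
    """Return indices (simplest positive, hardest positive, hardest negative)."""
    others = [i for i in range(len(features)) if i != anchor]
    same = [i for i in others if labels[i] == labels[anchor]]
    different = [i for i in others if labels[i] != labels[anchor]]

    def d(i):
        return dist(features[anchor], features[i])

    simplest_pos = min(others, key=d)  # nearest image in the batch
    hardest_pos = max(same, key=d) if same else None  # farthest same-class image
    # Farthest different-class image, as stated in the text.
    hardest_neg = max(different, key=d) if different else None
    return simplest_pos, hardest_pos, hardest_neg


features = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 5.0]]
labels = ["cat", "cat", "cat", "dog", "dog"]
selected = mine_samples(features, labels, anchor=0)
```

For the first image, the distances to the other four images are 1, 3, 2 and 5, so the function selects index 1 as the simplest positive, index 2 as the most difficult positive, and index 4 as the most difficult negative.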

Step 2420, determining a first diversity loss according to the noise image and the simplest positive sample; determining a second diversity loss from the noise image, the most difficult positive samples, and the most difficult negative samples.

In one embodiment, step 2420 may specifically include:

determining a first characteristic Euclidean distance between the noise image and the simplest positive sample, and determining the first diversity loss according to the first characteristic Euclidean distance; determining a second characteristic Euclidean distance between the noise image and the most difficult positive sample and a third characteristic Euclidean distance between the noise image and the most difficult negative sample, and determining the second diversity loss according to the second characteristic Euclidean distance and the third characteristic Euclidean distance.

Specifically, the closer two images are in the latent space, the more similar they are, so explicitly increasing the distance between images increases their diversity. After the simplest positive sample of the current noise image in the batch of noise images is obtained, the characteristic Euclidean distance between the current noise image and the simplest positive sample can be increased, which ensures the diversity of the batch of optimized images obtained by optimizing the batch of noise images. Specifically, the diversity increase result, i.e., the first diversity loss, may be determined based on Equation 7.

$$L_{ep}(x) = -\mathrm{dist}\big(f(x_a), f(x_{ep})\big) \qquad \text{(Equation 7)}$$

In one embodiment, determining the second diversity loss from the second characteristic euclidean distance and the third characteristic euclidean distance comprises:

determining the difference between the second characteristic Euclidean distance and the third characteristic Euclidean distance, and calculating the sum of the difference and a preset hyper-parameter; if the sum is greater than zero, determining the sum as the second diversity loss; otherwise, determining the second diversity loss to be zero.

Specifically, to ensure that the batch of optimized images remains within the discrimination domain of the classifier after the distance is increased, so that the semantic discrimination information of the optimized images is not destroyed, a semantic optimization result, that is, a second diversity loss, may be introduced. The second diversity loss may be determined based on Equation 8.

$$L_{triplet}(x) = \max\big(0,\ \mathrm{dist}_{ap} - \mathrm{dist}_{an} + \mathrm{margin}\big) \qquad \text{(Equation 8)}$$

Here, $\mathrm{dist}_{ap} = \mathrm{dist}(f(x_a), f(x_p)) = \|f(x_a) - f(x_p)\|_2$, $\mathrm{dist}_{an} = \mathrm{dist}(f(x_a), f(x_n)) = \|f(x_a) - f(x_n)\|_2$, and $\mathrm{margin}$ is a preset hyper-parameter.

Equation 8 requires the distance in the latent space between the noise image and its nearest negative sample to exceed the distance between the noise image and its farthest positive sample by at least the preset hyper-parameter; that is, the loss is zero only when the distance to the most difficult negative sample minus the distance to the most difficult positive sample is no less than the margin. As a result, the noise image cannot be mistakenly optimized into the discrimination space of another category, so the category of the optimized image remains accurate and its semantic features are preserved while the diversity is increased.
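A small numeric sketch may help show when the second diversity loss of Equation 8 is active. The function name `triplet_loss` and the two-dimensional feature vectors are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math

def triplet_loss(f_a, f_p, f_n, margin):
    """Second diversity loss: max(0, dist_ap - dist_an + margin)."""
    dist_ap = math.dist(f_a, f_p)  # anchor to most difficult positive sample
    dist_an = math.dist(f_a, f_n)  # anchor to most difficult negative sample
    return max(0.0, dist_ap - dist_an + margin)

f_a = [0.0, 0.0]
f_p = [3.0, 0.0]  # dist_ap = 3

# Negative sample closer than the positive sample: the loss is active.
active = triplet_loss(f_a, f_p, [0.0, 2.0], margin=0.5)    # 3 - 2 + 0.5 = 1.5

# Negative sample farther than the positive sample by more than the margin.
inactive = triplet_loss(f_a, f_p, [0.0, 4.0], margin=0.5)  # max(0, -0.5) = 0.0
```

In the first case the negative sample lies inside the margin, so optimization is pushed to move the image away from the other class; in the second case the constraint is already met and the term contributes nothing.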

In the embodiment of the invention, the first diversity loss can be determined based on the current noise image and its simplest positive sample in the batch of noise images to which it belongs; the second diversity loss can be determined based on the current noise image and its most difficult positive sample and most difficult negative sample in that batch.

Step 2430, determining the diversity loss by the first diversity loss and the second diversity loss.

Specifically, the diversity loss may be determined based on equation 9.

$$L_{intra\text{-}div}(x) = \alpha_{ep} L_{ep}(x) + \alpha_{triplet} L_{triplet}(x) \qquad \text{(Equation 9)}$$

Here, $\alpha_{ep}$ is the preset coefficient of the first diversity loss, and $\alpha_{triplet}$ is the preset coefficient of the second diversity loss. In practical applications, the specific values of $\alpha_{ep}$ and $\alpha_{triplet}$ can be determined according to actual requirements; for example, $\alpha_{ep}$ may be 50 and $\alpha_{triplet}$ may be 0.5.

The diversity loss combines the first diversity loss and the second diversity loss, so that the diversity of the batch of optimized images is increased while the semantic features of each optimized image are preserved.
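Equations 7 to 9 can be put together in a few lines. The sketch below is illustrative only: the function name `diversity_loss` and the toy feature vectors are hypothetical, and the default coefficients are simply the example values 50 and 0.5 mentioned in the text.

```python
import math

def diversity_loss(f_a, f_ep, f_p, f_n, margin, alpha_ep=50.0, alpha_triplet=0.5):
    """Weighted sum of the first and second diversity losses (Equation 9)."""
    # Equation 7: negative distance to the simplest positive sample,
    # so minimizing the loss pushes the two images apart.
    l_ep = -math.dist(f_a, f_ep)
    # Equation 8: keeps the image inside its own class region.
    l_triplet = max(0.0, math.dist(f_a, f_p) - math.dist(f_a, f_n) + margin)
    return alpha_ep * l_ep + alpha_triplet * l_triplet

loss = diversity_loss(
    f_a=[0.0, 0.0],
    f_ep=[1.0, 0.0],  # distance 1 -> L_ep = -1
    f_p=[3.0, 0.0],   # distance 3
    f_n=[0.0, 2.0],   # distance 2 -> L_triplet = 3 - 2 + 0.5 = 1.5
    margin=0.5,
)
# loss = 50 * (-1) + 0.5 * 1.5 = -49.25
```

With these example coefficients the first term dominates, which matches the stated goal of increasing intra-class spacing, while the second term only contributes when the margin condition is violated.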

In the embodiment of the invention, the diversity loss is determined in order to increase the diversity of the batch of optimized images while keeping their image categories correct, so that their semantic features are preserved as the diversity increases. The diversity loss increases the spacing between samples of the same class while keeping the samples within the boundary of the classification domain, thereby generating high-quality, diverse images.

Step 250, determining a loss function according to the classification loss, the regularization loss and the diversity loss.

Fig. 4 is a schematic diagram of a determination manner of a loss function in another image optimization method provided by an embodiment of the present invention, and an exemplary determination manner is given. As shown in fig. 4, includes:

Step 410, calculating the prior regularization loss of the noise image, i.e., the distortion constraint $R_{prior}(x)$.

Step 420, inputting the noise image into the pre-training model, and determining the regularization loss $R_{BN}f(x)$ according to the image features $f(x)$ output by the pooling layer contained in the pre-training model.

Step 430, performing sample mining on the noise image to determine the simplest positive sample, the most difficult positive sample and the most difficult negative sample of the noise image in the batch of noise images to which it belongs, and determining the diversity loss $L_{intra\text{-}div}(x)$ from the noise image, the simplest positive sample, the most difficult positive sample and the most difficult negative sample.

Step 440, determining the classification loss $L_{CE}(m, n)$ according to the prediction classification of the noise image output by the pre-training model and the classification label set for the noise image.

Step 450, determining the loss function $LOSS(x)$ according to the prior regularization loss $R_{prior}(x)$, the regularization loss $R_{BN}f(x)$, the diversity loss $L_{intra\text{-}div}(x)$ and the classification loss $L_{CE}(m, n)$.

Specifically, the loss function may be determined as the sum of the classification loss, the regularization loss and the diversity loss: $LOSS(x) = L_{CE}(m, n) + R(x) + L_{diversity}(x)$.
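Expanding each term with the definitions given elsewhere in this description yields the following sketch. It rests on two assumptions: that $R(x)$ is the sum of the distortion constraint $R_{prior}(x)$ and the regularization processing result $R_{BN}f(x)$, and that $L_{diversity}(x)$ denotes the same quantity as $L_{intra\text{-}div}(x)$ in Equation 9, which Step 450 implies but the text does not state outright.

```latex
\begin{aligned}
LOSS(x) &= L_{CE}(m, n) + R(x) + L_{diversity}(x) \\
        &= L_{CE}(m, n) + R_{prior}(x) + R_{BN}f(x)
           + \alpha_{ep} L_{ep}(x) + \alpha_{triplet} L_{triplet}(x)
\end{aligned}
```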

Step 260, optimizing the noise image, calculating a function value of the loss function according to the processed noise image, and determining the processed noise image as an optimized image when the function value of the loss function meets a preset condition.

In one embodiment, step 260 may specifically include:

and optimizing the noise image, calculating a function value of the loss function according to the processed noise image, and determining the processed noise image as the optimized image when the function value of the loss function is converged.

After determining the loss function, an optimized image may be determined based on equation 10.

Here, $x_y$ denotes the optimized image.

Specifically, the color feature, texture feature, shape feature, and spatial relationship feature of the noise image may be adjusted on a pixel basis to perform optimization processing on the noise image, and after the optimization processing is completed, the function value of the loss function may be calculated from the processed noise image. If the function value of the loss function shows that the loss function is converged, determining the processed noise image as an optimized image; and if the function value of the loss function indicates that the loss function has not converged, continuing to perform optimization processing on the noise image until the loss function converges.

In practical application, if the function value of the loss function indicates that the loss function is not converged, optimizing the noise image to obtain a processed noise image, calculating the function value of the loss function according to the processed noise image, and if the function value indicates that the loss function is converged, determining the processed noise image as an optimized image; if the function value indicates that the loss function has not converged, the optimization of the noise image is continued.
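The iterate-until-convergence procedure described above can be sketched as a simple loop over pixel values. Everything specific in this sketch is an assumption made for illustration: the toy quadratic `toy_loss` stands in for the full loss function, the finite-difference gradient stands in for backpropagation through the pre-training network, and the learning rate `lr` and the tolerance `tol` used as the convergence test are hypothetical choices, since the text only requires that the loss function converge.

```python
def optimize_image(pixels, loss_fn, lr=0.1, tol=1e-8, max_steps=10_000, eps=1e-6):
    """Adjust pixel values by gradient descent until the loss stops changing."""
    x = list(pixels)
    loss = loss_fn(x)
    for _ in range(max_steps):
        # Central finite-difference gradient of the loss w.r.t. each pixel.
        grad = []
        for i in range(len(x)):
            up = x[:i] + [x[i] + eps] + x[i + 1:]
            down = x[:i] + [x[i] - eps] + x[i + 1:]
            grad.append((loss_fn(up) - loss_fn(down)) / (2 * eps))
        # One optimization step on the noise image.
        x = [xi - lr * gi for xi, gi in zip(x, grad)]
        new_loss = loss_fn(x)
        converged = abs(loss - new_loss) < tol  # assumed convergence criterion
        loss = new_loss
        if converged:
            break
    return x, loss

def toy_loss(x, target=(0.2, 0.8)):
    """Hypothetical stand-in loss: squared distance to a fixed target."""
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))

optimized, final_loss = optimize_image([1.0, 0.0], toy_loss)
```

In a real setting the gradient would normally come from automatic differentiation of the classification, regularization and diversity losses with the network weights held fixed, so that only the image is updated.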

In the embodiment of the invention, the classification loss is the loss function between the classification label and the prediction classification, the regularization loss serves as the regularization term of the noise image, and the diversity loss ensures the diversity of the optimized images generated by optimizing the noise images.

The embodiment of the invention provides an image optimization method, which comprises the following steps: constructing a network model for image classification, and pre-training the network model to obtain a pre-training network; determining a prediction classification of a noise image based on a pre-training network, and determining a classification loss of the noise image according to the prediction classification and a classification label set for the noise image; carrying out distortion constraint and regularization processing on the noise image to obtain regularization loss; determining a diversity loss for the noise image from a simplest positive sample, a most difficult positive sample, and a most difficult negative sample of the noise image; determining a loss function according to the classification loss, the regularization loss, and the diversification loss; and optimizing the noise image, calculating a function value of the loss function according to the processed noise image, and determining the processed noise image as an optimized image when the function value of the loss function meets a preset condition. 
According to this technical solution, after the network model for image classification is constructed, it can be pre-trained to obtain a pre-training network that can classify images. However, the classification accuracy of the pre-training network is low, retraining it requires a large amount of real data, and real data is difficult to obtain. To address this, the prediction classification of a noise image is first determined by the pre-training network. The classification loss of the noise image is then determined according to the prediction classification and a classification label randomly set for the noise image. After the regularization loss and the diversity loss of the noise image are determined, the loss function is determined according to the classification loss, the regularization loss and the diversity loss. The noise image is then optimized, the function value of the loss function is calculated from the processed noise image, and the processed noise image is determined as the optimized image when the function value shows that the loss function has converged. When the loss function converges, the prediction classification approaches the classification label, the texture distribution of the image is natural, the image feature distribution approaches that of real images, and the image diversity is strong. The optimized image can therefore be used as a training image to retrain the pre-training network, which improves its classification accuracy and solves the prior-art problem that real images for model training are difficult to obtain.

Fig. 5 is a schematic structural diagram of an image optimization apparatus according to an embodiment of the present invention, which belongs to the same inventive concept as the image optimization method, and reference may be made to the related description of the image optimization method for details that are not described in detail in this embodiment.

The specific structure of the image optimization apparatus is shown in fig. 5, and includes:

a determining module 510, configured to determine a prediction classification of a noise image based on a pre-training network, and determine a classification loss of the noise image according to the prediction classification and a classification label set for the noise image; wherein the pre-training network is used for image classification;

a processing module 520, configured to perform distortion constraint and regularization on the noise image to obtain a regularization loss;

an execution module 530, configured to determine a loss function according to the classification loss and the regularization loss;

and an optimizing module 540, configured to perform optimization processing on the noise image, calculate a function value of the loss function according to the processed noise image, and determine the processed noise image as an optimized image when the function value of the loss function satisfies a preset condition.

On the basis of the above embodiment, the apparatus further includes a building module configured to:

constructing a network model for image classification, and pre-training the network model to obtain a pre-training network;

The building module is specifically configured to: construct the network model for image classification based on a convolutional neural network; pre-train the network model through a public data set, and calculate a pre-loss function; and optimize the network based on a back propagation algorithm until the pre-loss function converges, to obtain the pre-training network.

On the basis of the foregoing embodiment, the determining module 510 is specifically configured to:

determining a predictive classification of a noisy image based on a pre-training network, determining a cross entropy loss function of the classification label and the predictive classification, and determining the cross entropy loss function as the classification loss.
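As a minimal sketch of the cross entropy between a classification label and a prediction classification, the function below assumes that the pre-training network outputs one raw score (logit) per class and that the label is an integer class index; both are assumptions for illustration, and `cross_entropy` is a hypothetical name.

```python
import math

def cross_entropy(logits, label):
    """Cross entropy between a one-hot label and softmax(logits)."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # -log softmax(logits)[label]
    return log_sum_exp - logits[label]

uniform = cross_entropy([0.0, 0.0], label=0)        # equals log(2)
confident = cross_entropy([5.0, 0.0, 0.0], label=0)  # prediction agrees with label
wrong = cross_entropy([0.0, 5.0, 0.0], label=0)      # prediction disagrees with label
```

The loss shrinks as the predicted distribution concentrates on the labelled class, which is what drives the noise image toward its assigned category during optimization.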

On the basis of the foregoing embodiment, the processing module 520 is specifically configured to:

determining a regular loss function of the noise image, and determining a distortion constraint result according to the regular loss function;

determining an input statistic and a storage statistic of the image features of the noise images according to the pre-training network, and determining a regularization processing result according to the input statistic and the storage statistic, wherein the input statistic comprises a mean value and/or a variance of the image features of each image in the noise images of the batch to which the noise images belong, and the storage statistic comprises a mean value and/or a variance of the image features stored in the pre-training network;

determining a sum of the distortion constraint result and the regularization processing result as the regularization loss.
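One way to compare the input statistic with the storage statistic is sketched below. The text states only that the regularization processing result is determined from the two statistics, so the use of the L2 norm of the differences, the population variance, and the name `bn_statistic_loss` are assumptions made for illustration.

```python
import math

def bn_statistic_loss(batch_features, stored_mean, stored_var):
    """Distance between batch statistics and statistics stored in the network."""
    n = len(batch_features)
    dims = len(stored_mean)
    # Input statistic: per-dimension mean and variance over the batch.
    mean = [sum(f[d] for f in batch_features) / n for d in range(dims)]
    var = [sum((f[d] - mean[d]) ** 2 for f in batch_features) / n for d in range(dims)]
    # Assumed form: L2 distance of the means plus L2 distance of the variances.
    return math.dist(mean, stored_mean) + math.dist(var, stored_var)

batch = [[1.0, 2.0], [3.0, 2.0]]  # batch mean [2, 2], batch variance [1, 0]
matched = bn_statistic_loss(batch, stored_mean=[2.0, 2.0], stored_var=[1.0, 0.0])
shifted = bn_statistic_loss(batch, stored_mean=[0.0, 2.0], stored_var=[1.0, 0.0])
```

Here `matched` is zero because the batch statistics coincide with the stored ones, while `shifted` is 2.0 because the first mean differs by 2. The regularization loss would then be this value added to the distortion constraint result.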

On the basis of the above embodiment, the apparatus further includes:

a diversification module to determine a diversification loss of the noise image from a simplest positive sample, a most difficult positive sample, and a most difficult negative sample of the noise image.

Accordingly, the executing module 530 is specifically configured to:

summing the classification loss, the regularization loss, and the diversity loss, and determining a result of the summing as the loss function.

On the basis of the above embodiment, the diversification module is specifically configured to:

determining the simplest positive samples, the most difficult positive samples, and the most difficult negative samples of the noise image;

determining a first diversity loss from the noisy image and the simplest positive sample; determining a second diversity loss from the noise image, the most difficult positive samples, and the most difficult negative samples;

determining the diversity loss by the first diversity loss and the second diversity loss.

In one embodiment, determining the simplest positive sample, the most difficult positive sample, and the most difficult negative sample of the noise image comprises:

determining the Euclidean distance between the image characteristics of the noise image and the image characteristics of other images in the batch of noise images to which the noise image belongs;

and determining the image with the minimum characteristic Euclidean distance as the simplest positive sample, determining the same type of image with the maximum characteristic Euclidean distance as the most difficult positive sample, and determining the different type of image with the minimum characteristic Euclidean distance as the most difficult negative sample.

In one embodiment, a first diversity loss is determined from the noise image and the simplest positive samples; determining a second diversity loss from the noise image, the most difficult positive samples, and the most difficult negative samples, comprising:

determining a first characteristic Euclidean distance between the noise image and the simplest positive sample, and determining the first diversity loss according to the first characteristic Euclidean distance;

determining a second characteristic Euclidean distance between the noise image and the most difficult positive sample and a third characteristic Euclidean distance between the noise image and the most difficult negative sample, and determining the second diversity loss according to the second characteristic Euclidean distance and the third characteristic Euclidean distance.

On the basis of the foregoing embodiment, the optimization module 540 is specifically configured to:

The image optimization device provided by the embodiment of the invention can execute the image optimization method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.

It should be noted that, in the embodiment of the image optimization apparatus, the included units and modules are only divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, the specific names of the functional units are only for the convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.

Fig. 6 is a schematic structural diagram of a computer device according to an embodiment of the present invention. FIG. 6 illustrates a block diagram of an exemplary computer device 6 suitable for use in implementing embodiments of the present invention. The computer device 6 shown in fig. 6 is only an example and should not bring any limitation to the function and the scope of use of the embodiments of the present invention.

As shown in fig. 6, the computer device 6 is in the form of a general purpose computing device. The components of computer device 6 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.

Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.

Computer device 6 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by computer device 6 and includes both volatile and nonvolatile media, removable and non-removable media.

The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)30 and/or cache memory 32. The computer device 6 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 6, and commonly referred to as a "hard drive"). Although not shown in FIG. 6, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. System memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.

A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in system memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.

Computer device 6 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with computer device 6, and/or with any devices (e.g., network card, modem, etc.) that enable computer device 6 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, computer device 6 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) through network adapter 20. As shown in FIG. 6, network adapter 20 communicates with the other modules of computer device 6 via bus 18. It should be understood that although not shown in FIG. 6, other hardware and/or software modules may be used in conjunction with computer device 6, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.

The processing unit 16 executes various functional applications and page display by running a program stored in the system memory 28, for example, to implement the image optimization method provided by the present embodiment, the method includes:

Of course, those skilled in the art can understand that the processor can also implement the technical solution of the image optimization method provided in any embodiment of the present invention.

Embodiments of the present invention further provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements an image optimization method provided in the embodiments of the present invention, for example, the method includes:

carrying out distortion constraint and regularization processing on the noise image to obtain regularization loss; determining a loss function according to the classification loss and the regularization loss;

Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer-readable storage medium may be, for example but not limited to: an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, or the like, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

It will be understood by those skilled in the art that the modules or steps of the invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of computing devices, and optionally they may be implemented by program code executable by a computing device, such that it may be stored in a memory device and executed by a computing device, or it may be separately fabricated into various integrated circuit modules, or it may be fabricated by fabricating a plurality of modules or steps thereof into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments illustrated herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims

1. An image optimization method, comprising:

carrying out distortion constraint and regularization processing on the noise image to obtain regularization loss;

determining a loss function according to the classification loss and the regularization loss;

2. The image optimization method of claim 1, prior to determining the predictive classification of the noisy image based on the pre-trained network, further comprising:

and constructing a network model for image classification, and pre-training the network model to obtain the pre-training network.

3. The image optimization method of claim 1, wherein determining the classification loss of the noise image according to the prediction classification and a classification label set for the noise image comprises:

4. The image optimization method of claim 1, wherein the distortion constraint and regularization processing is performed on the noise image to obtain a regularization loss, and the method comprises:

and determining the sum of the distortion constraint result and the regularization processing result as the regularization loss.

5. The image optimization method according to claim 1, before determining a loss function according to the classification loss and the regularization loss, further comprising:

determining a diversity loss for the noise image from a simplest positive sample, a most difficult positive sample, and a most difficult negative sample of the noise image;

accordingly, determining a loss function according to the classification loss and the regularization loss includes:

and summing the classification loss, the regularization loss and the diversity loss, and determining a summation result as the loss function.

6. The image optimization method of claim 5, wherein determining the loss of diversity of the noise image based on the simplest positive sample, the most difficult positive sample, and the most difficult negative sample of the noise image comprises:

7. The image optimization method of claim 6, wherein determining the simplest positive sample, the most difficult positive sample, and the most difficult negative sample of the noise image comprises:

8. The image optimization method of claim 6, wherein a first diversity loss is determined from the noise image and the simplest positive sample; determining a second diversity loss from the noise image, the most difficult positive samples, and the most difficult negative samples, comprising:

9. The image optimization method of claim 1, wherein determining the processed noise image as an optimized image when the function value of the loss function satisfies a preset condition comprises:

determining the processed noise image as the optimized image when the function value of the loss function converges.

10. The image optimization method according to claim 2, wherein constructing a network model for image classification, and pre-training the network model to obtain the pre-training network comprises:

constructing the network model for image classification based on a convolutional neural network;

pre-training the network model through a public data set, and calculating a pre-loss function;

and optimizing the network based on a back propagation algorithm until the pre-loss function is converged to obtain the pre-training network.

11. An image optimization apparatus, comprising:

12. A computer device, the device comprising:

wherein the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to perform the image optimization method of any one of claims 1-10.

13. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the image optimization method of any one of claims 1 to 10.

14. A computer program product, characterized in that the computer program product comprises a computer program which, when being executed by a processor, carries out the image optimization method according to any one of claims 1-10.