CN116863261A - Image processing method, device, terminal, storage medium and program product - Google Patents


Info

Publication number
CN116863261A
Authority
CN
China
Prior art keywords
image
image processing
original
sample pair
strategy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210297738.3A
Other languages
Chinese (zh)
Inventor
董旭炯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202210297738.3A priority Critical patent/CN116863261A/en
Publication of CN116863261A publication Critical patent/CN116863261A/en
Pending legal-status Critical Current


Abstract

The embodiment of the application discloses an image processing method, an image processing device, a terminal, a storage medium and a program product, belonging to the technical field of artificial intelligence. The method comprises the following steps: acquiring an original training set, the original training set comprising at least one original sample pair; performing image preprocessing on the at least one original sample pair according to the respective preprocessing strategy of the at least one original sample pair to obtain each target sample pair, wherein the preprocessing strategy comprises an image enhancement strategy, the image enhancement strategy being a strategy for enhancing an image by at least one image processing mode; acquiring a quantization loss function value of an image processing model based on each target sample pair, the image processing model being a neural network model for performing an image reconstruction task; and updating model parameters in the image processing model based on the quantization loss function value. The scheme improves the quantization training effect, and further improves the quality of the output image obtained when the model obtained through quantization training performs image processing.

Description

Image processing method, device, terminal, storage medium and program product
Technical Field
The present disclosure relates to the field of artificial intelligence, and in particular, to an image processing method, an apparatus, a terminal, a storage medium, and a program product.
Background
Currently, in order to take a deep learning neural network model from algorithm development to chip deployment, the model needs to be quantized. Model quantization can be divided into Quantization-Aware Training (QAT) and Post-Training Quantization (PTQ), where QAT performs better at reducing the loss introduced by quantization and preserving the accuracy of the original model.
In the related art, the training data set, training conditions and the like of the original model training are reused in the quantization process, and the pseudo-quantized (Fake Quantization) weights are fine-tuned (Finetune) to reduce or eliminate the error caused by quantization. However, converting the model weights from floating point to low-bit integer values greatly reduces the expression ability of the model, and the final output image suffers from quantization noise; the image quality of images output by a model quantization-trained in this way is therefore poor.
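By way of illustration, the pseudo-quantization (fake quantization) operation mentioned above can be sketched as a quantize-dequantize round trip: weights are mapped to an integer grid and immediately mapped back, so the rounding error appears in the forward pass while the stored values stay floating point. This is a minimal symmetric per-tensor sketch under assumed defaults, not the patent's concrete quantization scheme:

```python
import numpy as np

def fake_quantize(x, num_bits=8):
    # Symmetric per-tensor fake quantization: map to the integer grid
    # and straight back, so rounding error shows up in the forward pass.
    qmax = 2 ** (num_bits - 1) - 1                      # 127 for int8
    scale = max(float(np.max(np.abs(x))), 1e-12) / qmax
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale
```

The quantization error of such a round trip is bounded by half a quantization step, which is what QAT fine-tuning tries to compensate for.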
Disclosure of Invention
The embodiment of the application provides an image processing method, an image processing device, a terminal, a storage medium and a program product, which improve the model quantization training effect of an image processing model. The technical scheme is as follows:
In one aspect, an embodiment of the present application provides an image processing method, including:
acquiring an original training set; the original training set comprises at least one original sample pair, and the original sample pair comprises an original sample image and an original label image;
performing image preprocessing on the at least one original sample pair according to the respective preprocessing strategy of the at least one original sample pair to obtain each target sample pair; the target sample pair comprises a target sample image and a target label image; the preprocessing strategy comprises an image enhancement strategy; the image enhancement strategy is a strategy for enhancing an image by at least one image processing mode;
acquiring a quantization loss function value of an image processing model based on each target sample pair; the image processing model is a neural network model for performing an image reconstruction task;
and updating model parameters in the image processing model based on the quantization loss function value.
In another aspect, an embodiment of the present application provides an image processing apparatus, including:
the training set acquisition module is used for acquiring an original training set; the original training set comprises at least one original sample pair, and the original sample pair comprises an original sample image and an original label image;
The target acquisition module is used for carrying out image preprocessing on at least one original sample pair according to the respective preprocessing strategy of the at least one original sample pair to obtain each target sample pair; the target sample pair comprises a target sample image and a target label image; the preprocessing strategy comprises an image enhancement strategy; the image enhancement strategy is a strategy for enhancing an image by at least one image processing mode;
the quantization loss acquisition module is used for acquiring a quantization loss function value of the image processing model based on each target sample pair; the image processing model is a neural network model for performing an image reconstruction task;
and the parameter updating module is used for updating the model parameters in the image processing model based on the quantization loss function value.
In one possible implementation, the image processing manner includes at least one of image flipping, image rotation, affine transformation, image blending, image clipping blending, and color gamut scaling.
In one possible implementation, the image reconstruction task includes at least one of an image denoising task, an image restoration task, and an image super-resolution reconstruction task.
In one possible implementation manner, the target obtaining module includes:
the first acquisition submodule is used for acquiring a first original sample pair from the original training set; the first original sample pair is any one of at least one of the original sample pairs;
a policy acquisition sub-module, configured to acquire the preprocessing policy of the first original sample pair;
and the first target acquisition submodule is used for carrying out image processing on the first original sample pair according to the preprocessing strategy of the first original sample pair to obtain a first target sample pair in each target sample pair.
In one possible implementation manner, the policy obtaining sub-module includes:
a random number generation unit for assigning a first random number value within a first threshold range to the first original sample pair;
a first policy obtaining unit, configured to determine, in response to the first random value being greater than a target threshold, that the preprocessing policy of the first original sample pair is an original policy; the original strategy is a strategy for acquiring the first original sample pair as the target sample pair;
a second policy obtaining unit configured to determine, in response to the first random number value being equal to or smaller than the target threshold value, the preprocessing policy of the first original sample pair as the image enhancement policy, and determine a combination of processing manners included in the image enhancement policy; the combination of processing modes includes at least one of the image processing modes.
In a possible implementation manner, the second policy obtaining unit is configured to,
and determining the image processing modes included in the image enhancement strategy based on the respective selection probabilities of at least one image processing mode.
In a possible implementation manner, the second policy obtaining unit is configured to,
and randomly combining at least one image processing mode to determine the processing mode combination included in the image enhancement strategy.
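The strategy selection described above (a random draw against a target threshold, then sampling a combination of processing modes by per-mode selection probabilities) can be sketched as follows; the mode names, threshold defaults and the fallback to at least one mode are illustrative assumptions, not values from the patent:

```python
import random

# Hypothetical mode names for the six processing modes in the description.
ENHANCEMENT_MODES = ["flip", "rotate", "affine", "mix", "cutmix", "gamut_scale"]

def choose_strategy(target_threshold=0.5, select_prob=0.5, rng=random):
    # Draw the first random value in [0, 1); above the target threshold the
    # pair is kept as-is (original strategy), otherwise an enhancement
    # combination is sampled by giving each processing mode an independent
    # selection probability.
    if rng.random() > target_threshold:
        return ("original", [])
    combo = [m for m in ENHANCEMENT_MODES if rng.random() <= select_prob]
    if not combo:                           # guarantee at least one mode
        combo = [rng.choice(ENHANCEMENT_MODES)]
    return ("enhance", combo)
```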
In one possible implementation, the apparatus further includes:
the training set updating module is used for updating the original training set based on the target sample pairs before the quantized loss function values of the image processing model are acquired based on the target sample pairs, so as to obtain a target training set; the target training set comprises each target sample pair;
the quantization loss acquisition module includes:
a quantization loss acquisition sub-module for acquiring the quantization loss function value of the image processing model based on each of the target sample pairs acquired from the target training set.
In one possible implementation manner, the quantization loss acquisition module includes:
The output sub-module is used for sequentially inputting each target sample image into the image processing model and outputting each image processing result of each target sample image;
and a quantization loss calculation sub-module, configured to calculate the quantization loss function value of the image processing model based on the image processing result of each of the target sample images and the target label image corresponding to each of the target sample images.
In another aspect, an embodiment of the present application provides a terminal, where the terminal includes a processor and a memory; the memory has stored therein at least one computer instruction that is loaded and executed by the processor to implement the image processing method as described in the above aspect.
In another aspect, embodiments of the present application provide a computer readable storage medium having stored therein at least one computer instruction that is loaded and executed by a processor to implement the image processing method as described in the above aspect.
According to one aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the terminal reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions so that the terminal performs the image processing method provided in various alternative implementations of the above aspect.
The technical scheme provided by the embodiment of the application has the beneficial effects that at least:
on the basis of an original training set, image preprocessing is performed on each original sample pair in the original training set according to its corresponding preprocessing strategy, so as to obtain target sample pairs each containing an image-preprocessed target sample image and target label image; a quantization loss function value of an image processing model for performing an image reconstruction task is then acquired based on the target sample pairs, and the model is updated using the quantization loss function value. Because each original sample pair has a corresponding preprocessing strategy, and the preprocessing strategy is used for enhancing images, the images in each original sample pair can be preprocessed according to different image enhancement modes, thereby expanding the sample images used for quantization-aware training. This solves the problem that, when training a model for performing an image reconstruction task, the small number of original samples leaves insufficient training resources to reduce the quantization error, and thus improves the quality of the output image obtained when the quantization-trained model performs image processing while improving the effect of quantization training.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a schematic illustration of an application scenario in accordance with an exemplary embodiment;
FIG. 2 is a flowchart illustrating a method of image processing according to an exemplary embodiment;
FIG. 3 is a flowchart illustrating a method of image processing according to an exemplary embodiment;
FIG. 4 is a schematic diagram of an image flipping process according to the embodiment of FIG. 3;
FIG. 5 is a schematic diagram of an image rotation process according to the embodiment of FIG. 3;
FIG. 6 is a schematic diagram of an affine transformation processing of an image according to the embodiment shown in FIG. 3;
FIG. 7 is a schematic diagram of an image blending process according to the embodiment of FIG. 3;
FIG. 8 is a schematic diagram of an image cropping blending process according to the embodiment of FIG. 3;
FIG. 9 is a schematic diagram of a gamut scaling process in accordance with the embodiment of FIG. 3;
FIG. 10 is a schematic diagram of a quantized perceptual training flow scheme in accordance with the embodiment of FIG. 3;
fig. 11 is a block diagram of an image processing apparatus provided in an exemplary embodiment of the present application;
Fig. 12 is a block diagram showing the structure of a computer device according to an exemplary embodiment of the present application.
Specific embodiments of the present disclosure have been shown by way of the above drawings and will be described in more detail below. These drawings and the written description are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the disclosed concepts to those skilled in the art by reference to specific embodiments.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.
References herein to "a plurality" mean two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate: A exists alone, A and B exist together, or B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
The subsequent embodiment of the application provides an image processing scheme, and the quantization training effect of the image processing model can be improved by improving the preprocessing scheme.
Please refer to fig. 1, which illustrates an application scenario diagram according to various embodiments of the present application. As shown in fig. 1, computer device 12 may be used to train an image processing model; the computer device 12 may send the trained image processing model to the terminal device 14 after the offline training has completed the image processing model; the terminal device 14 is internally provided with an operation module and a storage module, the terminal device 14 can store the received image processing model through the storage module, and calculate and process the image input to the image processing model through the operation module, wherein the image input to the image processing model can be acquired by the terminal device 14 through an image acquisition module in real time or stored locally. For example, the terminal device 14 may be a smart phone, tablet computer, electronic book reader, personal portable computer, smart wearable device, or the like.
The computer device 12 may be the same device as the terminal device 14 or may be a different device from the terminal device 14.
That is, the image processing model training process and the image processing model application process may be on the same terminal device 14 or on different terminal devices 14.
Fig. 2 shows a flowchart of an image processing method provided by an exemplary embodiment of the present application. The image processing method may be performed by a computer device, for example, the computer device 12 in the system shown in fig. 1 described above. The image processing method comprises the following steps:
step 201, acquiring an original training set; the original training set includes at least one original sample pair comprising an original sample image and an original label image.
In an embodiment of the application, a computer device obtains an original training set comprising at least one original sample pair comprising an original sample image and an original label image.
Step 202, performing image preprocessing on at least one original sample pair according to respective preprocessing strategies of the at least one original sample pair to obtain each target sample pair; the target sample pair comprises a target sample image and a target label image; the preprocessing strategy comprises an image enhancement strategy; the image enhancement policy is a policy for enhancing an image by at least one image processing means.
In the embodiment of the application, the computer equipment can acquire the corresponding preprocessing strategies, namely, acquire the strategies for enhancing the image included in the corresponding preprocessing strategies, and can acquire the corresponding target sample pairs of each original sample pair by preprocessing the image according to the corresponding preprocessing strategies.
In one possible implementation, the preprocessing strategy includes at least one of an image enhancement strategy and an original strategy.
Step 203, obtaining a quantization loss function value of an image processing model based on each target sample pair; the image processing model is a neural network model for performing image reconstruction tasks.
In the embodiment of the application, the computer equipment calculates and obtains the quantization loss function value of the image processing model when performing quantization perception training according to each acquired target sample pair.
Step 204, updating model parameters in the image processing model based on the quantization loss function value.
In the embodiment of the application, the computer equipment updates the image processing model according to the obtained quantization loss function value, so as to obtain the image processing model with the quantization training completed.
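Steps 201 to 204 can be sketched as one pass of a schematic training loop: each target sample image is run through the (fake-quantized) model, the quantization loss is accumulated against the target label images, and the parameters are updated. The callables `model` and `update`, and the L1 loss, are illustrative assumptions; the patent does not fix a concrete network or loss:

```python
import numpy as np

def l1_loss(pred, target):
    # Mean absolute error between a model output and a label image.
    return float(np.mean(np.abs(pred - target)))

def quantized_training_step(model, update, sample_pairs, lr=1e-3):
    # One pass of steps 201-204: forward every target sample image,
    # average the quantization loss over the target label images, then
    # let the caller-supplied `update` adjust the model parameters.
    total = 0.0
    for sample_img, label_img in sample_pairs:
        total += l1_loss(model(sample_img), label_img)
    loss = total / len(sample_pairs)
    update(loss, lr)
    return loss
```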
In summary, in the embodiment of the present application, the computer device performs image preprocessing on each original sample pair in the original training set according to its corresponding preprocessing strategy, so as to obtain target sample pairs each including an image-preprocessed target sample image and target label image, then obtains a quantization loss function value of an image processing model for performing an image reconstruction task based on the target sample pairs, and updates the model using the quantization loss function value. Since each original sample pair has a corresponding preprocessing strategy and the preprocessing strategy is used for enhancing images, the images in each original sample pair can be preprocessed according to different image enhancement modes, thereby expanding the sample images used for quantization-aware training, solving the problem that the small number of original samples leaves insufficient training resources to reduce the quantization error when training a model for performing an image reconstruction task, and improving the quality of the output image obtained when the quantization-trained model performs image processing while improving the effect of quantization training.
Fig. 3 shows a flowchart of an image processing method provided by an exemplary embodiment of the present application. The image processing method may be performed by a computer device or performed by a terminal interacting with a computer device, for example, the computer device may be the terminal device 14 or the computer device 12 in the system shown in fig. 1. The image processing method comprises the following steps:
step 301, an original training set is obtained.
In an embodiment of the application, a computer device obtains an original training set for model training of an image processing model.
Wherein the original training set may comprise at least one original sample pair, which may comprise an original sample image as well as an original label image. That is, the original training set includes one original sample pair or several original sample pairs, one original sample pair may include one original sample image and one original label image, the original sample image in the same original sample pair may be an image sample that needs to perform an image reconstruction task, and the original label image corresponding to the original sample image may be an image in an ideal state after the original sample image performs the reconstruction task.
When an image reconstruction task is performed, the quantity and quality of the original sample pairs used for model training of the image processing model are limited, so the effect of quantization-aware training of the image processing model is limited and the model quantization error is large. Therefore, the present application preprocesses the original sample pairs in the original training set on the basis of the original training set, expanding the samples and labels used for model training in the training set.
Step 302, obtaining a corresponding preprocessing strategy of each original sample pair.
In the embodiment of the application, the computer equipment can sequentially acquire the original sample pairs in the original training set and acquire the respective preprocessing strategies of the original sample pairs.
In one possible implementation manner, the computer device acquires one original sample pair from the original training set as the first original sample pair; after determining the first original sample pair, it determines the preprocessing strategy of the first original sample pair, then acquires another not-yet-acquired original sample pair from the original training set as the new first original sample pair and determines its preprocessing strategy, and stops acquiring preprocessing strategies once every original sample pair in the original training set has had a corresponding preprocessing strategy determined.
Alternatively, after determining the preprocessing strategy of the first primitive sample pair, the first primitive sample pair may be continuously acquired from the training set, and the same primitive sample pair may be repeatedly acquired as the first primitive sample pair multiple times.
That is, each original sample pair in the original training set may be sequentially subjected to one or more rounds of preprocessing strategy determination, or the step of randomly acquiring a preprocessing strategy may be performed on the original sample pairs in the original training set, in which case the same original sample pair may be repeatedly acquired as the first original sample pair multiple times. In this way, the number of sample pairs for image processing model training is greatly expanded.
Wherein the preprocessing strategy may include an image enhancement strategy; the image enhancement policy may be a policy that enhances processing of the image by at least one image processing means. The enhancing of the image may include at least one of removing noise in the image, repairing a missing or damaged portion of the image, and improving resolution of the image.
In one possible implementation, the image processing means includes at least one of image flipping, image rotation, affine transformation, image blending, image cropping blending, and gamut scaling.
Wherein, after various image processing modes are combined, the combined image processing modes can be determined as corresponding preprocessing strategies.
For example, the processing procedure corresponding to each image processing mode may be as follows:
1) Image flipping
The image inversion in the image processing manner may be at least one of horizontally or vertically inverting both the original sample image and the original label image in the original sample pair.
Specifically, the image can be rotated 180 degrees about the horizontal center axis, or rotated 180 degrees about the vertical center axis.
Fig. 4 is a schematic diagram illustrating an image flipping process according to an embodiment of the present application. As shown in fig. 4, the sample image 40 is a sample image that has not been subjected to image flipping processing, the sample image 40 is rotated 180 degrees along the horizontal center axis to obtain a horizontally flipped sample image 41, and the sample image 40 is rotated 180 degrees along the vertical center axis to obtain a vertically flipped sample image 42.
That is, when image flipping is included in the preprocessing strategy, the original sample image and the original tag image in the original sample pair may be flipped horizontally or vertically at the same time.
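The simultaneous flipping of the sample image and label image can be sketched as follows (the function and mode names are illustrative assumptions):

```python
import numpy as np

def flip_pair(sample, label, mode="horizontal"):
    # Apply the same flip to the sample image and the label image so the
    # pair stays aligned. "horizontal" mirrors across the vertical center
    # axis; "vertical" mirrors across the horizontal center axis.
    axis = 1 if mode == "horizontal" else 0
    return np.flip(sample, axis=axis).copy(), np.flip(label, axis=axis).copy()
```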
2) Image rotation
The image rotation in the image processing mode may be to perform rotation processing of the original sample image and the original label image in the same angle and direction on the original sample image and the original label image in the original sample pair.
The rotation angle may be any angle within a preset angle threshold; for example, when the preset angle threshold is [α, β], the rotation angle may be a random number uniformly distributed within the preset angle threshold, that is, the rotation angle θ ~ U(α, β), where U(α, β) denotes the uniform distribution over the interval [α, β]. For example, [α, β] may take the value [-45, 45].
Since the size of the sample image used when the image rotation is performed is unchanged, the blank portion generated due to the rotation can be filled with pure black content to obtain a rotated image.
Fig. 5 is a schematic diagram illustrating an image rotation process according to an embodiment of the present application. As shown in fig. 5, the sample image 50 is a sample image which has not been subjected to image rotation processing, and the sample image 50 is rotated 45 degrees along the center, so that a rotated sample image 51 can be obtained.
That is, when image rotation is included in the preprocessing strategy, the original sample image and the original tag image in the original sample pair can be simultaneously subjected to rotation processing of the same angle.
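A minimal sketch of the rotation described above, with the uncovered area filled with pure black (zero), is given below; it uses nearest-neighbor inverse mapping for brevity, whereas a real pipeline would typically use a library warp with interpolation:

```python
import numpy as np

def rotate_pair(sample, label, theta_deg):
    # Rotate both images of a pair by the same angle about the center.
    # Nearest-neighbor resampling; the image size is kept unchanged and the
    # blank area produced by the rotation is filled with zeros (black).
    h, w = sample.shape[:2]
    t = np.deg2rad(theta_deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, find its source pixel.
    sx = np.cos(t) * (xs - cx) + np.sin(t) * (ys - cy) + cx
    sy = -np.sin(t) * (xs - cx) + np.cos(t) * (ys - cy) + cy
    sxi, syi = np.round(sx).astype(int), np.round(sy).astype(int)
    valid = (sxi >= 0) & (sxi < w) & (syi >= 0) & (syi < h)

    def warp(img):
        out = np.zeros_like(img)
        out[ys[valid], xs[valid]] = img[syi[valid], sxi[valid]]
        return out

    return warp(sample), warp(label)
```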
3) Affine transformation
The affine transformation in the image processing mode can be the integrated operation processing of performing the same rotation, translation and scaling on the original sample image and the original label image in the original sample pair.
An affine transformation can be represented by a 2×3 matrix. An image pixel in the sample image can be represented as the position vector X = [x, y]^T, X′ may be used to represent the transformed two-dimensional vector, and M_aff may represent the affine transformation matrix, of the general form
M_aff = [[a11, a12, b1], [a21, a22, b2]]
so that X′ = M_aff · [x, y, 1]^T. The transformation matrix of an affine transformation of the image space is calculated from the positions of three point coordinates before and after the transformation.
Fig. 6 is a schematic diagram illustrating affine transformation processing of an image according to an embodiment of the present application. As shown in Fig. 6, p1(0, 0), p2(0, H) and p3(W, 0) respectively represent the coordinates of the upper-left, lower-left and upper-right vertices of the sample image 61, and p′1, p′2, p′3 respectively represent the transformed coordinates. According to practical requirements, in order to make the effective content occupy as much of the picture as possible, the positions of the three vertices after the transformation can be limited, that is, the three vertices can be respectively confined to the indicated dotted-line areas, so as to control the range of the image output after the processing. Since the image dimensions before and after the processing are unchanged, after the processing there are both a vacant area and an out-of-range area; the vacant area is zero-filled and the out-of-range area is truncated, so as to obtain the sample image 62 after affine transformation processing.
That is, when affine transformation is included in the preprocessing strategy, the original sample image and the original label image in the original sample pair may be simultaneously subjected to the same affine transformation processing; that is, the processed original sample image and original label image may be calculated by the following formulas:

a = [p1, p2, p3],  a′ = [p′1, p′2, p′3]
M_aff = f(a, a′)
I′_input = M_aff · I_input
I′_gt = M_aff · I_gt

wherein I′_input is the processed original sample image, I′_gt is the processed original label image, I_input is the original sample image that has not undergone this processing, and I_gt is the original label image that has not undergone this processing.
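The function f(a, a′) that recovers the 2×3 affine matrix from three point correspondences can be sketched as a small linear solve; the function names are illustrative assumptions:

```python
import numpy as np

def affine_from_points(src, dst):
    # Solve the 2x3 affine matrix M from three point correspondences,
    # i.e. the f(a, a') of the description: [x', y'] = M @ [x, y, 1].
    A = np.hstack([np.asarray(src, float), np.ones((3, 1))])  # 3x3
    B = np.asarray(dst, float)                                # 3x2
    return np.linalg.solve(A, B).T                            # 2x3

def apply_affine_points(M, pts):
    # Transform Nx2 points with a 2x3 affine matrix.
    P = np.hstack([np.asarray(pts, float), np.ones((len(pts), 1))])
    return P @ M.T
```

In practice the same matrix would then drive an image warp (e.g. a library routine), applied identically to the sample image and the label image.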
4) Image blending
The image mixing in the image processing mode may be to blend two original sample images in random two original sample pairs in the original training set in a weighted manner in a certain proportion on the RGB domain, and blend two original label images in a weighted manner in the same proportion on the RGB domain.
Specifically, a sample pair for training {I_input, I_gt} is selected from the original training set, then a reference pair {R_input, R_gt} is randomly selected from the original training set, and the two are weighted-fused in proportion over the RGB domain as follows:

I′_input = γ·I_input + (1 − γ)·R_input
I′_gt = γ·I_gt + (1 − γ)·R_gt

Here, γ may be expressed as the weight ratio for performing image fusion, and may take a value of 0.9, for example.
For example, fig. 7 is a schematic diagram of an image blending process according to an embodiment of the present application. As shown in fig. 7, the training sample image 71 and the reference sample image 72 are weighted and fused in proportion to each other in the RGB domain, and a processed sample image 73 can be obtained.
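The weighted RGB-domain fusion described above can be sketched directly from its two formulas; the function name is an illustrative assumption:

```python
import numpy as np

def blend_pair(train_pair, ref_pair, gamma=0.9):
    # Weighted RGB-domain fusion of a training pair with a reference pair:
    # I' = gamma * I + (1 - gamma) * R, applied with the same weight to the
    # sample image and to the label image so supervision stays consistent.
    (i_in, i_gt), (r_in, r_gt) = train_pair, ref_pair
    return (gamma * i_in + (1 - gamma) * r_in,
            gamma * i_gt + (1 - gamma) * r_gt)
```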
5) Image cropping and blending
Image cropping and blending, as an image processing mode, may work as follows: after the original sample pair for training is obtained, a group of original sample pairs is randomly selected as the reference sample pair; part of the original sample image serving as the reference image is added into the training original sample image according to a certain algorithm, and part of the original label image serving as the reference image is added into the training original label image according to the same algorithm.
The image cropping and blending operation is an effective processing mode borrowed from image object detection tasks. In addition to the training sample pair {I_input, I_gt} selected each time, a reference sample pair {R_input, R_gt} may additionally be randomly selected from the training dataset; a region is cut from R_input and pasted over the corresponding region of I_input, and the same operation is applied to R_gt and I_gt.
Illustratively, let M ∈ {0, 1}^(H×W) denote the mask matrix, with 0 in the cropped region and 1 elsewhere; let ⊙ denote element-wise multiplication, and let 1 ∈ R^(H×W) be an all-ones matrix. The processed sample image and label image are then computed as:
I′ input =M⊙I input +(1-M)⊙R input
I′ gt =M⊙I gt +(1-M)⊙R gt
For example, fig. 8 is a schematic diagram of an image cropping and blending process according to an embodiment of the present application. As shown in fig. 8, the processed sample image 83 is obtained by pasting a region of the reference sample image 82 onto the corresponding position area of the training sample image 81.
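The mask formula can be sketched as follows. The crop size (half the image in each dimension) and the random placement are assumptions for illustration; the patent does not fix the region size:

```python
import numpy as np

def cutmix_pair(sample, label, ref_sample, ref_label, rng=None):
    """Cut a region from the reference pair and paste it over the training pair via mask M."""
    rng = np.random.default_rng(rng)
    H, W = sample.shape[:2]
    ch, cw = H // 2, W // 2                    # assumed crop size: half the image
    y = int(rng.integers(0, H - ch + 1))
    x = int(rng.integers(0, W - cw + 1))
    M = np.ones((H, W), dtype=float)           # 1 outside the cut region
    M[y:y + ch, x:x + cw] = 0.0                # 0 inside the cut region
    if sample.ndim == 3:                       # broadcast the mask over color channels
        M = M[..., None]
    out_sample = M * sample + (1.0 - M) * ref_sample
    out_label = M * label + (1.0 - M) * ref_label
    return out_sample, out_label
```

The same mask is applied to I_input/R_input and I_gt/R_gt, mirroring the two formulas above.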
6) Color gamut scaling
Unlike the spatial-domain transformations described above, color gamut scaling, as an image processing mode, transforms the image numerically in the color gamut by adjusting pixel values.
For a normalized image pair {I_input, I_gt}, an anchor point c ∈ (0, 1.0) is randomly selected, and a hyper-parameter may be added to define a small range around it; points with pixel values smaller than c in the image are scaled by m times, and points with pixel values greater than or equal to c are scaled by n times, with m, n ∈ (0, 1.0). The transformed image can be represented as:

I'(x) = m·I(x), if I(x) < c

I'(x) = n·I(x), if I(x) ≥ c

where x is a pixel point in the image.
For example, fig. 9 is a schematic diagram of a color gamut scaling process according to an embodiment of the present application. As shown in fig. 9, when the sample image 91 is subjected to the gamut scaling processing, a processed sample image 92 can be obtained. The pixel value distribution before and after scaling of the sample image may be as shown in the pixel value distribution graph 93.
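The piecewise scaling described above can be sketched as follows for a normalized image; applying it with the same c, m, n to both images of the pair keeps the sample and label consistent:

```python
import numpy as np

def gamut_scale(img, c, m, n):
    """Scale pixels below anchor c by m and pixels >= c by n (normalized image, m, n in (0, 1))."""
    return np.where(img < c, m * img, n * img)

def gamut_scale_pair(sample, label, c, m, n):
    """The same anchor and scaling factors are applied to the sample and the label image."""
    return gamut_scale(sample, c, m, n), gamut_scale(label, c, m, n)
```

Since m and n are both below 1, the result stays within the normalized [0, 1] range and no clipping is needed.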
And 303, performing image preprocessing on at least one original sample pair according to the respective preprocessing strategy of the at least one original sample pair to obtain each target sample pair.
In the embodiment of the application, the computer device performs image preprocessing on at least one original sample pair according to the preprocessing strategy determined for each original sample pair, obtaining each target sample pair after the image preprocessing.
Here, the target sample pair may include a target sample image and a target label image.
In one possible implementation, a first original sample pair is obtained from an original training set; acquiring a preprocessing strategy of a first original sample pair; and performing image processing on the first original sample pair according to a preprocessing strategy of the first original sample pair to obtain a first target sample pair in each target sample pair.
Wherein the first original sample pair is any one of the at least one original sample pair.
That is, each original sample pair in the original training set needs to be used as a first original sample pair, and image preprocessing is performed in the above manner to obtain a target sample pair corresponding to each original sample pair.
In one possible implementation, the first original sample pair is assigned a first random number value within a first threshold range; in response to the first random value being greater than the target threshold, the preprocessing strategy of the first original sample pair is determined to be the original strategy; in response to the first random value being less than or equal to the target threshold, the preprocessing strategy of the first original sample pair is determined to be an image enhancement strategy, and the combination of processing modes included in the image enhancement strategy is determined.
The original strategy is a strategy for acquiring a first original sample pair as a target sample pair, and the processing mode combination comprises at least one image processing mode.
For example, a target threshold q may be preset, such as q = 0.8, and x ~ U[0, 1] is defined as a first random number subject to a uniform distribution. Let P = {P1, P2, P3, …} represent the combinations of the processing modes described above, i.e., the candidate preprocessing strategies. The calculation formula for determining the target sample pair is:

f = random_choice(P)

where random_choice(P) indicates that a processing mode combination included in the preprocessing strategy is randomly drawn according to the probability values corresponding to the image processing modes in P, and the images are processed according to that combination.
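A hedged sketch of f = random_choice(P): a uniform random number x first decides between the original policy and an enhancement strategy against the threshold q, and a combination is then drawn from P according to its probability. The `probs` vector is an assumed input; the patent does not specify how the per-combination probabilities are stored:

```python
import numpy as np

def pick_strategy(P, probs, q=0.8, rng=None):
    """Return None for the original policy, else one processing-mode combination from P."""
    rng = np.random.default_rng(rng)
    x = rng.uniform()                  # x ~ U[0, 1]
    if x > q:
        return None                    # original policy: keep the sample pair unchanged
    idx = rng.choice(len(P), p=probs)  # random_choice(P) by per-combination probability
    return P[idx]
```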
In one possible implementation, the image processing means included in the image enhancement policy is determined based on respective selection probabilities of at least one image processing means.
For example, P = {p1, p2, p3, …} may be preset, where p1 is the probability of selecting image flipping, p2 the probability of selecting image blending, p3 the probability of selecting image cropping and blending, and so on. The image processing modes included in the image enhancement strategy are determined according to the respective selection probabilities of the configured image processing modes.
That is, suppose the selection probability of image flipping is preset to 0.2, that of image rotation to 0.5, that of affine transformation to 0.6, that of image blending to 0.4, that of image cropping and blending to 0.6, and that of color gamut scaling to 0.3. The computer device then judges, for the first original sample pair whose preprocessing strategy is currently being determined, whether each image processing mode is selected in sequence. When determining the preprocessing strategy corresponding to sample pair A, the computer device may first generate a random number uniformly distributed between 0 and 1. When the random number is larger than the selection probability of an image processing mode, that mode is determined not to belong to the preprocessing strategy corresponding to sample pair A; when the random number is smaller than or equal to the selection probability, the mode is determined to belong to that preprocessing strategy.
For example, if the random number generated for sample pair A is 0.5, the preprocessing strategy of sample pair A includes image rotation, affine transformation and image cropping and blending; processing sample pair A according to this strategy subjects the sample image in the pair to image rotation, affine transformation and image cropping and blending.
If the preprocessing strategy is determined to include at least two image processing modes, no priority need exist among them; that is, the order in which the image is preprocessed by each mode is randomly selected.
Alternatively, when determining the preprocessing strategy corresponding to sample pair A, the computer device may first generate a random number uniformly distributed between 0 and 1 to judge whether the first image processing mode is selected for preprocessing the image. When the random number is larger than the selection probability of the first mode, that mode is determined not to belong to the preprocessing strategy corresponding to sample pair A; when it is smaller than or equal to the selection probability, the mode is determined to belong to that strategy. A new random number uniformly distributed between 0 and 1 is then generated to judge whether the next image processing mode is selected, following the same judging step; a fresh random number is generated for each mode until all image processing modes have been judged, at which point the preprocessing strategy of sample pair A is determined.
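The per-mode judging loop can be sketched as follows; the probability values mirror the example figures in the text, and a fresh uniform random number is drawn for every mode:

```python
import numpy as np

EXAMPLE_PROBS = {            # example selection probabilities from the text
    "image_flip": 0.2, "image_rotation": 0.5, "affine_transform": 0.6,
    "image_blend": 0.4, "crop_blend": 0.6, "gamut_scale": 0.3,
}

def sample_strategy(mode_probs, rng=None):
    """Judge every mode in turn with its own random number; the selected modes form the strategy."""
    rng = np.random.default_rng(rng)
    chosen = []
    for mode, p in mode_probs.items():
        if rng.uniform() <= p:   # fresh random number per mode, kept iff number <= probability
            chosen.append(mode)
    return chosen
```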
In one possible implementation, the selection probability of each image processing mode is determined according to the number of original sample pairs in the original training set.
The number of the original sample pairs in the original training set can be inversely related to the selection probability of each image processing mode, namely when the number of the original sample pairs in the original training set is large, the selection probability of each image processing mode is low; when the number of the original sample pairs in the original training set is small, the selection probability of each image processing mode is high.
For example, when the number of original sample pairs in the original training set is larger than a specified number threshold, the selection probability of each image processing mode is reduced, saving the computational cost of image preprocessing; when the number is smaller than or equal to the threshold, the selection probability of each image processing mode is increased, so that more augmented samples are generated from the limited data.
That is, by obtaining the number of original sample pairs in the original training set, the selection probability of each image processing mode can be dynamically adjusted, striking a balance between saving the computational cost of image preprocessing and expanding the training samples.
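One minimal way to realize this inverse relation is a simple scaling rule; the threshold and scale factor below are illustrative assumptions, since the text fixes neither:

```python
def adjust_probs(mode_probs, num_pairs, threshold=1000, factor=0.5):
    """Scale selection probabilities down for large training sets and up for small ones."""
    scale = factor if num_pairs > threshold else 1.0 / factor
    return {mode: min(1.0, p * scale) for mode, p in mode_probs.items()}
```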
In another possible implementation, at least one of the image processing means is randomly combined to determine the combination of processing means included in the image enhancement policy.
Illustratively, any number and variety of image processing modes are selected from the respective image processing modes according to a random algorithm to be determined as the combination of the processing modes.
For example, the probability that each image processing mode is selected may be the same, and according to a random algorithm, the computer device may randomly acquire any number and kind of image processing modes to randomly compose a processing mode combination, and may determine the processing mode combination as a preprocessing strategy of the sample pair.
In one possible implementation, the selection probability of each image processing mode is determined based on the task type of the image reconstruction task.
The computer device may pre-store the selection probabilities of the image processing modes corresponding to the image denoising task, the image restoration task and the image super-resolution reconstruction task, and obtain the corresponding selection probabilities according to the task type of the image reconstruction task to which the image processing model currently undergoing model training belongs. Since the selection probabilities can be determined in advance per task type, image preprocessing biased toward different processing modes for different image reconstruction tasks is realized, which improves the efficiency of image preprocessing.
In one possible implementation, the preprocessing strategy corresponding to the parameter tuning requirements is determined according to different parameter tuning requirements.
The parameter tuning requirement may be obtained after testing the image processing model with a test set. A data cleaning operation can be added to assist preprocessing as required, and the cleaning strategy can be set according to different requirements, for example based on the overall brightness of the picture, the pixel value distribution graph of the picture, and the like.
For example, when the requirement on the image processing model is to improve its ability to process pictures in a darker state, pictures in the training set whose overall brightness is greater than a specified threshold may be cleaned out; when the requirement is to improve its ability to process pictures with lower pixel values, pictures whose pixel value distribution graphs fall outside a specified threshold range may be cleaned out. This avoids unnecessary sample training, improves model training precision, and allows the application requirements of the model to be met as far as possible.
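A sketch of the brightness-based cleaning rule; using the mean pixel value as "overall brightness" and the particular threshold are assumptions for illustration:

```python
import numpy as np

def clean_bright_pairs(pairs, brightness_threshold):
    """Keep only pairs whose sample image's overall (mean) brightness is within the threshold."""
    return [(s, g) for (s, g) in pairs if float(np.mean(s)) <= brightness_threshold]
```

For a dark-scene requirement, a threshold of, say, 0.5 on normalized images would discard the bright samples that contribute little to the target capability.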
When restoring or denoising images shot in ultra-dark night scenes, determining whether image denoising and image restoration reach the required standard calls for a comprehensive evaluation of the noise, brightness and hue of the images. By preprocessing the original image sample pairs, various target sample pairs that vary in the spatial domain and color gamut can be obtained, and these expanded target sample pairs allow the various factors to be considered comprehensively, greatly reducing the final quantization error. Since the structure of the model is not changed and only the training samples are preprocessed, the universality of the model training method is also improved.
Step 304, obtaining a quantization loss function value of the image processing model based on each target sample pair.
In an embodiment of the present application, the computer device obtains the quantized loss function value of the image processing model according to the obtained target sample pair.
The image processing model may be a neural network model for performing image reconstruction tasks, among other things.
In one possible implementation, the image reconstruction task includes at least one of an image denoising task, an image restoration task, and an image super-resolution reconstruction task.
That is, when the image processing model is for performing at least one of an image denoising task, an image restoration task, and an image super-resolution task, preprocessing of the original sample pairs in the training set can be performed by the above steps, the number of sample pairs in the original training set can be expanded, and the effect of model training can be improved while ensuring the quality of the samples.
In one possible implementation, the original training set is updated based on the target sample pairs to obtain a target training set, and the quantized loss function value of the image processing model is obtained based on each target sample pair obtained from the target training set.
After image preprocessing is carried out on each original sample pair in the original training set according to the corresponding preprocessing strategy, the quantized loss function value of the image processing model is obtained based on each original sample pair and each target sample pair.
Alternatively, at least one preprocessing strategy may be obtained in turn for each original sample image according to the number of sample enhancement times. After each original sample pair in the original training set obtains a plurality of preprocessing strategies, image preprocessing can be performed on the corresponding original sample pair a plurality of times according to each preprocessing strategy, obtaining multiple groups of target sample pairs, and the quantization loss function value of the image processing model is then obtained based on each original sample pair and each target sample pair.
The sample enhancement times are times when the original sample images in the original training set need to acquire a preprocessing strategy.
In one possible implementation manner, each target sample image is sequentially input into an image processing model, and each image processing result of each target sample image is output; a quantization loss function value of the image processing model is calculated based on the respective image processing results of the respective target sample images and the respective target label images corresponding to the respective target sample images.
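A schematic of step 304 under stated assumptions: `model` is any callable forward pass of the pseudo-quantized network, and L1 is used as the per-image loss purely for illustration; the patent does not fix the loss form:

```python
import numpy as np

def quantization_loss(model, target_pairs):
    """Average per-pair loss of the image processing model over all target sample pairs."""
    losses = []
    for target_sample, target_label in target_pairs:
        prediction = model(target_sample)                 # forward pass on the target sample image
        losses.append(float(np.mean(np.abs(prediction - target_label))))  # assumed L1 loss
    return float(np.mean(losses))
```

The returned value is what step 305 then backpropagates to update the model parameters.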
In step 305, model parameters in the image processing model are updated based on the quantization loss function value.
In the embodiment of the application, the computer equipment can update the model parameters in the image processing model according to the acquired quantization loss function value, thereby realizing the purpose of training and quantizing the image processing model.
That is, taking an image denoising task as an example, the main advantage of the quantization-aware training scheme over the post-training quantization scheme is reduced quantization noise, and in experimental comparisons in actual engineering the preprocessing module showed significant and effective improvement in scenes such as extreme night and extreme darkness. Data preprocessing, already an important auxiliary means in deep learning model training, can likewise serve as an important tuning option in quantization training, especially for tasks with image quality requirements. Since the scheme involves no adjustment of the model structure or the training loss function, the training environments of the original floating-point model and the pseudo-quantized model remain consistent.
In one possible implementation, model training is performed using a learning rate that is less than a specified threshold.
That is, quantization-aware training can be regarded as fine-tuning of the model weights. Compared with post-training quantization, its main advantage is noise reduction, but it has a larger influence on aspects such as the hue and brightness of the image, which becomes a side effect of the noise reduction. This influence is greater in pixel-level image processing such as denoising tasks, especially when processing scenes with darker brightness. Based on the evaluation principle that model performance before and after quantization should stay as close as possible, the correct convergence direction of the model weights must be maintained during training; convergence speed is therefore sacrificed so that the weights change slowly. Fine-tuning in small strides can be ensured by using a small learning rate (LR), for example 0.01 to 0.001 of the original learning rate, so that noise is reduced as much as possible without affecting the overall look of the image. A smaller learning rate ensures a correct convergence direction, whereas an excessive learning rate may cause the convergence direction to diverge. By updating the model weights in smaller steps, the output picture stays close to the original in hue, brightness and the like, with smaller pixel-value deviations in the visual effect.
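The learning-rate choice above can be expressed directly; the numeric base value in the test is an assumed example, while the 0.01-0.001 range comes from the text:

```python
def qat_learning_rate(base_lr, scale=0.01):
    """QAT fine-tuning LR: 0.01-0.001 of the float-training learning rate, per the text."""
    if not (0.001 <= scale <= 0.01):
        raise ValueError("scale outside the 0.01-0.001 range suggested in the text")
    return base_lr * scale
```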
Fig. 10 is a schematic diagram illustrating a quantization-aware training process according to an embodiment of the present application. As shown in fig. 10, a sample noise picture and a label (i.e., the clean picture corresponding to the sample noise picture) are input; the input picture is read and demosaicing is performed (S1001); the same preprocessing strategy is then applied to the input sample noise picture and the label respectively (S1002); the preprocessed sample noise picture is input to a forward inference network (S1003); the corresponding loss function is calculated according to the image processing result determined by forward inference and the label (S1004); and the model weights of the image processing model are updated through the loss function values obtained by iterative training (S1005), completing the quantization-aware training of the model.
Because noise is an important index for measuring the quantization effect, the noise of a fixed-point model is often the superposition of the residual noise left after model denoising and the noise introduced by quantization. Noise in photographs of extreme-night scenes is often found on textured surfaces such as the night sky, foliage, stone slabs and faces. In practical model quantization, the combination of image preprocessing modes yields a significant improvement in image noise after the quantization training runs for a suitable number of epochs. Moreover, because model training is enhanced through the preprocessing strategy, the method is model-agnostic: it is flexible in practical application and does not affect the model structure, the computing units of the model, or subsequent hardware design.
On the other hand, compared with other strategies such as regular terms, the preprocessing strategy is used as an effective measure for compensating image noise, has no side effect of blurring the image, and has more advantages in terms of weighing noise reduction and retaining details.
In summary, in the embodiment of the present application, the computer device performs image preprocessing on the original sample pairs in the original training set according to their respective corresponding preprocessing strategies, obtaining target sample pairs each including a target sample image and a target label image. It then obtains the quantization loss function value of an image processing model for performing an image reconstruction task based on the target sample pairs and updates the model with that value. Since each original sample pair has its own corresponding preprocessing strategy, and the preprocessing strategy is used to enhance the images, the images in the original sample pairs can be preprocessed according to different image enhancement modes, thereby expanding the sample images used for quantization-aware training. This solves the problem that, when training a model for an image reconstruction task, too few original samples leave insufficient training resources to reduce the quantization error, and improves the quality of the output image obtained by performing image processing with the quantization-trained model while improving the effect of quantization training.
Fig. 11 is a block diagram showing the structure of an image processing apparatus according to an exemplary embodiment of the present application. The image processing apparatus is used in a computer device, and includes:
a training set acquisition module 1110, configured to acquire an original training set; the original training set comprises at least one original sample pair, and the original sample pair comprises an original sample image and an original label image;
the target obtaining module 1120 is configured to perform image preprocessing on at least one of the original sample pairs according to a respective preprocessing policy of at least one of the original sample pairs, so as to obtain each target sample pair; the target sample pair comprises a target sample image and a target label image; the preprocessing strategy comprises an image enhancement strategy; the image enhancement strategy is a strategy for enhancing an image by at least one image processing mode;
a quantization loss acquisition module 1130, configured to acquire a quantization loss function value of the image processing model based on each of the target sample pairs; the image processing model is a neural network model for performing an image reconstruction task;
a parameter updating module 1140 is configured to update model parameters in the image processing model based on the quantized loss function value.
In one possible implementation, the image processing manner includes at least one of image flipping, image rotation, affine transformation, image blending, image clipping blending, and color gamut scaling.
In one possible implementation, the image reconstruction task includes at least one of an image denoising task, an image restoration task, and an image super-resolution reconstruction task.
In one possible implementation, the target acquisition module 1120 includes:
the first acquisition submodule is used for acquiring a first original sample pair from the original training set; the first original sample pair is any one of at least one of the original sample pairs;
a policy acquisition sub-module, configured to acquire the preprocessing policy of the first original sample pair;
and the first target acquisition submodule is used for carrying out image processing on the first original sample pair according to the preprocessing strategy of the first original sample pair to obtain a first target sample pair in each target sample pair.
In one possible implementation manner, the policy obtaining sub-module includes:
a random number generation unit for assigning the first original sample pair to a first random number value within a first threshold range;
A first policy obtaining unit, configured to determine, in response to the first random value being greater than a target threshold, that the preprocessing policy of the first original sample pair is an original policy; the original strategy is a strategy for acquiring the first original sample pair as the target sample pair;
a second policy obtaining unit configured to determine, in response to the first random number value being equal to or smaller than the target threshold value, the preprocessing policy of the first original sample pair as the image enhancement policy, and determine a combination of processing manners included in the image enhancement policy; the combination of processing modes includes at least one of the image processing modes.
In a possible implementation manner, the second policy obtaining unit is configured to,
and determining the image processing modes included in the image enhancement strategy based on the respective selection probabilities of at least one image processing mode.
In a possible implementation manner, the second policy obtaining unit is configured to,
and randomly combining at least one image processing mode to determine the processing mode combination included in the image enhancement strategy.
In one possible implementation, the apparatus further includes:
The training set updating module is used for updating the original training set based on the target sample pairs before the quantized loss function values of the image processing model are acquired based on the target sample pairs, so as to obtain a target training set; the target training set comprises each target sample pair;
the quantization loss acquisition module 1130 includes:
a quantization loss acquisition sub-module for acquiring the quantization loss function value of the image processing model based on each of the target sample pairs acquired from the target training set.
In one possible implementation, the quantization loss acquisition module 1130 includes:
the output sub-module is used for sequentially inputting each target sample image into the image processing model and outputting each image processing result of each target sample image;
and a quantization loss calculation sub-module, configured to calculate the quantization loss function value of the image processing model based on the image processing result of each of the target sample images and the target label image corresponding to each of the target sample images.
In summary, in the embodiment of the present application, the computer device performs image preprocessing on the original sample pairs in the original training set according to their respective corresponding preprocessing strategies, obtaining target sample pairs each including a target sample image and a target label image. It then obtains the quantization loss function value of an image processing model for performing an image reconstruction task based on the target sample pairs and updates the model with that value. Since each original sample pair has its own corresponding preprocessing strategy, and the preprocessing strategy is used to enhance the images, the images in the original sample pairs can be preprocessed according to different image enhancement modes, thereby expanding the sample images used for quantization-aware training. This solves the problem that, when training a model for an image reconstruction task, too few original samples leave insufficient training resources to reduce the quantization error, and improves the quality of the output image obtained by performing image processing with the quantization-trained model while improving the effect of quantization training.
Fig. 12 is a block diagram showing the structure of a computer device according to an exemplary embodiment of the present application. The computer device can be an electronic device such as a smart phone, a tablet computer, an electronic book, a portable personal computer, an intelligent wearable device and the like. The terminal of the present application may include one or more of the following components: processor 1210, memory 1220 and screen 1230.
Processor 1210 may include one or more processing cores. The processor 1210 connects various parts within the overall terminal using various interfaces and lines, and performs various functions of the terminal and processes data by executing or running instructions, programs, code sets, or instruction sets stored in the memory 1220 and invoking data stored in the memory 1220. Optionally, the processor 1210 may be implemented in hardware in at least one of digital signal processing (Digital Signal Processing, DSP), field programmable gate array (Field-Programmable Gate Array, FPGA), and programmable logic array (Programmable Logic Array, PLA). The processor 1210 may integrate one or a combination of a central processing unit (Central Processing Unit, CPU), a graphics processor (Graphics Processing Unit, GPU), a modem, and the like. The CPU mainly handles the operating system, user interface, application programs and the like; the GPU is used for rendering and drawing the content to be displayed by the screen 1230; the modem is used to handle wireless communications. It will be appreciated that the modem may not be integrated into the processor 1210 and may instead be implemented by a separate communication chip.
The memory 1220 may include a random access memory (Random Access Memory, RAM) or a read-only memory (Read-Only Memory, ROM). Optionally, the memory 1220 includes a non-transitory computer-readable storage medium. The memory 1220 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 1220 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing the operating system (which may be an Android system, including systems developed in depth based on Android; an iOS system developed by Apple Inc., including systems developed in depth based on iOS; or another system), instructions for implementing at least one function (such as a touch function, a sound playing function, or an image playing function), instructions for implementing the above-described method embodiments, and the like. The data storage area may also store data created by the terminal in use (such as a phonebook, audio and video data, and chat records).
The screen 1230 may be a capacitive touch display screen, which receives touch operations performed by a user on or near it with a finger, a stylus, or any other suitable object, and displays the user interface of each application. The touch display screen is typically provided on the front panel of the terminal. It may be designed as a full screen, a curved screen, or a special-shaped screen; it may also be designed as a combination of a full screen and a curved screen, or of a special-shaped screen and a curved screen, which is not limited in the embodiments of the present application.
In addition, those skilled in the art will appreciate that the structure of the terminal illustrated in the above figures does not constitute a limitation on the terminal; the terminal may include more or fewer components than illustrated, may combine certain components, or may have a different arrangement of components. For example, the terminal may further include components such as a radio frequency circuit, a camera component, a sensor, an audio circuit, a wireless fidelity (Wireless Fidelity, WiFi) component, a power supply, and a Bluetooth component, which are not described here.
Embodiments of the present application also provide a computer readable storage medium having stored therein at least one computer instruction that is loaded and executed by a processor to implement the image processing method described in the above embodiments.
According to one aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the terminal reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions so that the terminal performs the image processing method provided in various alternative implementations of the above aspect.
Those skilled in the art will appreciate that, in one or more of the examples described above, the functions described in the embodiments of the present application may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include both computer storage media and communication media, the latter including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
It should be noted that the information (including but not limited to user equipment information, user personal information, etc.), data (including but not limited to data for analysis, stored data, presented data, etc.), and signals involved in the present application are all authorized by the user or fully authorized by all parties, and the collection, use, and processing of the related data must comply with the relevant laws, regulations, and standards of the relevant countries and regions. For example, the original sample images and original label images referred to in the present application are all acquired with sufficient authorization.
The foregoing is a description of preferred embodiments of the present application and is not intended to limit the application; the scope of protection of the application is defined by the appended claims.

Claims (13)

1. An image processing method, the method comprising:
acquiring an original training set; the original training set comprises at least one original sample pair, and the original sample pair comprises an original sample image and an original label image;
carrying out image preprocessing on the at least one original sample pair according to the respective preprocessing strategy of the at least one original sample pair, to obtain each target sample pair; the target sample pair comprises a target sample image and a target label image; the preprocessing strategy comprises an image enhancement strategy; and the image enhancement strategy is a strategy for enhancing an image in at least one image processing mode;
acquiring a quantization loss function value of an image processing model based on each target sample pair; the image processing model is a neural network model for performing an image reconstruction task;
and updating model parameters in the image processing model based on the quantization loss function value.
2. The method of claim 1, wherein the image processing modes comprise at least one of image flipping, image rotation, affine transformation, image blending, image crop blending, and gamut scaling.
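A few of the modes listed in claim 2 can be sketched with NumPy. These minimal implementations are illustrative assumptions and ignore practical details such as interpolation, channel layout, and value clamping:

```python
import numpy as np

def image_flip(img):
    # Image flipping: horizontal mirror along the vertical axis.
    return np.fliplr(img)

def image_rotate(img):
    # Image rotation: 90-degree counter-clockwise rotation.
    return np.rot90(img)

def image_blend(a, b, alpha=0.5):
    # Image blending: a convex combination of two images (mixup-style).
    return alpha * a + (1 - alpha) * b

img = np.array([[1.0, 2.0],
                [3.0, 4.0]])
```

For paired training data, each of these would be applied identically to the sample image and its label image.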
3. The method of claim 1, wherein the image reconstruction task comprises at least one of an image denoising task, an image restoration task, and an image super-resolution reconstruction task.
4. The method according to claim 1, wherein the carrying out image preprocessing on the at least one original sample pair according to the respective preprocessing strategy of the at least one original sample pair to obtain each target sample pair comprises:
acquiring a first original sample pair from the original training set; the first original sample pair is any one of at least one of the original sample pairs;
acquiring the preprocessing strategy of the first original sample pair;
and carrying out image processing on the first original sample pair according to the preprocessing strategy of the first original sample pair to obtain a first target sample pair in each target sample pair.
5. The method of claim 4, wherein the obtaining the preprocessing strategy for the first original sample pair comprises:
assigning the first original sample pair a first random value within a first threshold range;
determining the preprocessing strategy of the first original sample pair as an original strategy in response to the first random value being greater than a target threshold; the original strategy is a strategy for acquiring the first original sample pair as the target sample pair;
determining the preprocessing strategy of the first original sample pair as the image enhancement strategy and determining a combination of processing modes included in the image enhancement strategy in response to the first random value being less than or equal to the target threshold; the combination of processing modes includes at least one of the image processing modes.
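The threshold decision in claim 5 can be sketched as follows. The function name, the `rng` parameter, and the string return values are illustrative assumptions; the stub RNG exists only to make the behavior reproducible:

```python
import random

def choose_preprocessing_strategy(target_threshold, rng=random):
    """Assign the pair a random value in [0, 1); above the target
    threshold the pair is kept as-is (original strategy), otherwise
    an image enhancement strategy is selected for it."""
    first_random_value = rng.random()
    if first_random_value > target_threshold:
        return "original"
    return "enhance"

class FixedRng:
    """Stub RNG returning a fixed value, for deterministic testing."""
    def __init__(self, value):
        self.value = value
    def random(self):
        return self.value
```

Raising the target threshold increases the fraction of pairs that get enhanced, so it acts as a knob on how aggressively the training set is expanded.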
6. The method of claim 5, wherein the determining the combination of processing modes included in the image enhancement strategy comprises:
determining the image processing modes included in the image enhancement strategy based on the respective selection probabilities of at least one of the image processing modes.
7. The method of claim 5, wherein the determining the combination of processing modes included in the image enhancement strategy comprises:
randomly combining at least one of the image processing modes to determine the combination of processing modes included in the image enhancement strategy.
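The two selection schemes of claims 6 and 7 can be sketched as follows. The function names, the probability values, and the stub RNG are illustrative assumptions:

```python
import random

def sample_by_probability(modes, probabilities, rng=random):
    """Claim-6 style: each image processing mode is included
    independently according to its own selection probability."""
    return [m for m, p in zip(modes, probabilities) if rng.random() < p]

def sample_random_combination(modes, rng=random):
    """Claim-7 style: draw a random non-empty subset of the modes."""
    k = rng.randint(1, len(modes))
    return rng.sample(modes, k)

class SeqRng:
    """Stub RNG replaying a fixed sequence, for deterministic testing."""
    def __init__(self, values):
        self.values = iter(values)
    def random(self):
        return next(self.values)
```

The per-mode probabilities of the first scheme let rare or expensive transforms be applied less often, while the second scheme treats all modes uniformly.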
8. The method of claim 1, wherein before the obtaining the quantization loss function value of the image processing model based on each of the target sample pairs, the method further comprises:
updating the original training set based on the target sample pair to obtain a target training set; the target training set comprises each target sample pair;
the obtaining the quantization loss function value of the image processing model based on each of the target sample pairs comprises:
acquiring the quantization loss function value of the image processing model based on the target sample pairs acquired from the target training set.
9. The method of claim 1, wherein the obtaining the quantization loss function value of the image processing model based on each of the target sample pairs comprises:
sequentially inputting each target sample image into the image processing model, and outputting a respective image processing result for each target sample image;
and calculating the quantization loss function value of the image processing model based on the image processing result of each target sample image and the target label image corresponding to that target sample image.
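The loss computation of claim 9 can be sketched as follows. The `fake_quantize` rounding stands in for the simulated quantization commonly used in quantization-aware training, and the mean-squared-error objective is an illustrative assumption; the application does not fix a particular quantization scheme or loss:

```python
import numpy as np

def fake_quantize(x, scale=0.1):
    """Simulated fixed-point quantization in the forward pass:
    round values to the quantization grid, clamped to int8 range."""
    return np.clip(np.round(x / scale), -128, 127) * scale

def quantization_loss(model, sample_images, label_images):
    """Feed each target sample image through the model, fake-quantize
    the output, and average the per-image MSE against the labels."""
    losses = []
    for x, y in zip(sample_images, label_images):
        out = fake_quantize(model(x))
        losses.append(np.mean((out - y) ** 2))
    return float(np.mean(losses))

# Toy check: an identity "model" whose quantized output matches the label.
identity = lambda x: x
```

Because the rounding is applied inside the forward pass, the loss directly measures the reconstruction error the quantized model would produce, which is what the parameter update then reduces.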
10. An image processing apparatus, characterized in that the apparatus comprises:
the training set acquisition module is used for acquiring an original training set; the original training set comprises at least one original sample pair, and the original sample pair comprises an original sample image and an original label image;
the target acquisition module is used for carrying out image preprocessing on at least one original sample pair according to the respective preprocessing strategy of the at least one original sample pair to obtain each target sample pair; the target sample pair comprises a target sample image and a target label image; the preprocessing strategy comprises an image enhancement strategy; the image enhancement strategy is a strategy for enhancing an image by at least one image processing mode;
the quantization loss acquisition module is used for acquiring a quantization loss function value of the image processing model based on each target sample pair; the image processing model is a neural network model for performing an image reconstruction task;
and the parameter updating module is used for updating the model parameters in the image processing model based on the quantization loss function value.
11. A terminal, the terminal comprising a processor and a memory; the memory has stored therein at least one computer instruction that is loaded and executed by the processor to implement the image processing method of any of claims 1 to 9.
12. A computer readable storage medium having stored therein at least one computer instruction that is loaded and executed by a processor to implement the image processing method of any of claims 1 to 9.
13. A computer program product, characterized in that the computer program product comprises computer instructions that are executed by a processor of a terminal, so that the terminal performs the image processing method according to any of claims 1 to 9.
CN202210297738.3A 2022-03-24 2022-03-24 Image processing method, device, terminal, storage medium and program product Pending CN116863261A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210297738.3A CN116863261A (en) 2022-03-24 2022-03-24 Image processing method, device, terminal, storage medium and program product


Publications (1)

Publication Number Publication Date
CN116863261A true CN116863261A (en) 2023-10-10

Family

ID=88232731

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210297738.3A Pending CN116863261A (en) 2022-03-24 2022-03-24 Image processing method, device, terminal, storage medium and program product

Country Status (1)

Country Link
CN (1) CN116863261A (en)

Similar Documents

Publication Publication Date Title
CN111340711B (en) Super-resolution reconstruction method, device, equipment and storage medium
CN108305271B (en) Video frame image processing method and device
CN110503146B (en) Data enhancement method and device, computing equipment and computer storage medium
CN109754380B (en) Image processing method, image processing device and display device
CN112184585B (en) Image completion method and system based on semantic edge fusion
CN111489322B (en) Method and device for adding sky filter to static picture
CN111951172A (en) Image optimization method, device, equipment and storage medium
CN115937394A (en) Three-dimensional image rendering method and system based on nerve radiation field
US11887218B2 (en) Image optimization method, apparatus, device and storage medium
CN115424088A (en) Image processing model training method and device
JPH01113789A (en) Half-tone display device
US6900810B1 (en) User programmable geometry engine
US6940515B1 (en) User programmable primitive engine
CN114862722B (en) Image brightness enhancement implementation method and processing terminal
CN116205820A (en) Image enhancement method, target identification method, device and medium
US8638330B1 (en) Water surface generation
CN113538271A (en) Image display method, image display device, electronic equipment and computer readable storage medium
CN112766215A (en) Face fusion method and device, electronic equipment and storage medium
CN116310046A (en) Image processing method, device, computer and storage medium
CN115713585B (en) Texture image reconstruction method, apparatus, computer device and storage medium
CN116863261A (en) Image processing method, device, terminal, storage medium and program product
CN115330925A (en) Image rendering method and device, electronic equipment and storage medium
CN115049572A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN115311145A (en) Image processing method and device, electronic device and storage medium
CN114926491A (en) Matting method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination