WO2022247702A1

WO2022247702A1 - Image processing method and apparatus, electronic device, and storage medium

Info

Publication number: WO2022247702A1
Application number: PCT/CN2022/093586
Authority: WO
Inventors: 徐青松; 李青
Original assignee: 杭州睿胜软件有限公司
Priority date: 2021-05-28
Filing date: 2022-05-18
Publication date: 2022-12-01
Also published as: CN113344832A

Abstract

An image processing method, comprising: acquiring an image to be processed (S10), the image to be processed comprising a target region; by means of a first neural network model, performing first clarification processing on the image to be processed to obtain a first intermediate image corresponding to the image to be processed (S20), the clarity of the first intermediate image being greater than the clarity of the image to be processed; by means of a second neural network model, performing second clarification processing on an intermediate target region corresponding to the target region in the first intermediate image to obtain a second intermediate image corresponding to the intermediate target region (S30); and performing synthesis processing on the first intermediate image and the second intermediate image to obtain a composite image corresponding to the image to be processed (S40). The image processing method optimises the target region, and then synthesises the optimised target area and the first intermediate image, increasing the clarity of the synthesised image, and obtaining an image with high clarity and richer detail. Also disclosed are an image processing apparatus, an electronic device, and a storage medium.

Description

Image processing method and device, electronic device and storage medium

technical field

Embodiments of the present disclosure relate to an image processing method, an image processing apparatus, electronic equipment, and a non-transitory computer-readable storage medium.

Background technique

Thanks to the rapid development of hardware, several generations of mobile phones have been updated in just a few years, and the photos taken by old mobile phones become blurred on the large-resolution screen. In addition, due to the limited shooting technology at that time, the old photos with a long history have low image clarity and insufficient image details. In this situation, the original low-resolution image needs to be sharpened to obtain a high-resolution image.

Contents of the invention

At least one embodiment of the present disclosure provides an image processing method, including: acquiring an image to be processed, wherein the image to be processed includes a target area; performing a first sharpening process on the image to be processed by using a first neural network model, To obtain the first intermediate image corresponding to the image to be processed, wherein the definition of the first intermediate image is greater than the definition of the image to be processed; through the second neural network model in the first intermediate image and Performing a second sharpening process on the intermediate target area corresponding to the target area to obtain a second intermediate image corresponding to the intermediate target area; performing synthesis processing on the first intermediate image and the second intermediate image to obtain A composite image corresponding to the image to be processed.

Optionally, in the image processing method provided in at least one embodiment of the present disclosure, after performing a second sharpening process on the intermediate target area corresponding to the target area in the first intermediate image through the second neural network model, Before obtaining the second intermediate image corresponding to the intermediate target area, the image processing method further includes: performing recognition processing on the first intermediate image through a third neural network model, so as to obtain the first intermediate image in the first intermediate image The intermediate target area corresponding to the target area.

Optionally, in the image processing method provided in at least one embodiment of the present disclosure, the definition of the second intermediate image is greater than the definition of the intermediate target area.

Optionally, in the image processing method provided in at least one embodiment of the present disclosure, performing composite processing on the first intermediate image and the second intermediate image to obtain a composite image corresponding to the image to be processed includes : performing tone processing on the second intermediate image based on the tone of the first intermediate image to obtain a third intermediate image, wherein the tone of the third intermediate image tends to the tone of the first intermediate image; performing image combination processing on the first intermediate image and the third intermediate image to obtain the synthesized image.

Optionally, in the image processing method provided in at least one embodiment of the present disclosure, the target area is a human face area.

Optionally, in the image processing method provided in at least one embodiment of the present disclosure, the first neural network model is different from the second neural network model.

Optionally, in the image processing method provided in at least one embodiment of the present disclosure, before the first sharpening process is performed on the image to be processed through the first neural network model, the image processing method further includes: acquiring a sample image ; blurring the sample image to obtain an image to be trained, wherein the definition of the image to be trained is smaller than the definition of the sample image; based on the sample image and the image to be trained, the image to be trained The first neural network model and the second neural network model to be trained are trained to obtain the first neural network model and the second neural network model.

Optionally, in the image processing method provided in at least one embodiment of the present disclosure, performing blurring processing on the sample image to obtain an image to be trained includes: obtaining a texture slice, wherein the size of the texture slice is the same as the The size of the sample images is the same; the first blurring process is performed on the sample images to obtain a first blurred image, wherein the definition of the first blurred image is smaller than the definition of the sample image; the first blurred performing color mixing processing on the image and the texture slice to obtain a second blurred image; performing second blurring processing on the second blurred image to obtain the image to be trained.

Optionally, in the image processing method provided in at least one embodiment of the present disclosure, acquiring a texture slice includes: acquiring at least one preset texture image; randomly selecting a preset texture image from the at least one preset texture image A texture image, as a target texture image; in response to the size of the target texture image being the same as the size of the sample image, using the target texture image as the texture slice; in response to a size larger than the target texture image For the size of the sample image, randomly cut the target texture image based on the size of the sample image to obtain a slice area with the same size as the sample image, and use the slice area as the texture slice.

Optionally, in the image processing method provided in at least one embodiment of the present disclosure, the first blurring includes Gaussian blurring, noise addition, or Gaussian blurring and noise addition in any order and in any number Combination processing composed of processing; the second blur processing includes the Gaussian blur processing, the noise addition processing, or a combination processing based on any order and any number of the Gaussian blur processing and the noise addition processing.

Optionally, in the image processing method provided in at least one embodiment of the present disclosure, performing a first blurring process on the sample image to obtain a first blurred image includes: performing the Gaussian blurring process on the sample image, to obtain the first blurred image; performing a second blurring process on the second blurred image to obtain the image to be trained, including: performing the noise addition process on the second blurred image to obtain an intermediate blur image; performing the Gaussian blur processing on the intermediate blurred image to obtain the image to be trained.

Optionally, in the image processing method provided in at least one embodiment of the present disclosure, performing color mixing processing on the first blurred image and the texture slice to obtain a second blurred image includes: performing a color mixing process on the first blurred image The image and the texture slice are subjected to color filtering processing to obtain the second blurred image.

Optionally, in the image processing method provided in at least one embodiment of the present disclosure, performing color mixing processing on the first blurred image and the texture slice to obtain a second blurred image includes: processing the texture slice and the texture slice The first blurred image is highlighted to obtain the second blurred image.

At least one embodiment of the present disclosure provides an image processing device, including: an image acquisition unit configured to acquire an image to be processed, wherein the image to be processed includes a target area; a first processing unit configured to use a first neural network model performing a first sharpening process on the image to be processed to obtain a first intermediate image corresponding to the image to be processed, wherein the definition of the first intermediate image is greater than that of the image to be processed; second A processing unit configured to perform a second sharpening process on an intermediate target area corresponding to the target area in the first intermediate image through a second neural network model, so as to obtain a second intermediate image corresponding to the intermediate target area; A compositing unit configured to composite the first intermediate image and the second intermediate image to obtain a composite image corresponding to the image to be processed.

Optionally, in the image processing device provided in at least one embodiment of the present disclosure, the composition unit includes a tone processing module and an image combination processing module, and the tone processing module is configured to, based on the tone of the first intermediate image, performing tone processing on the second intermediate image to obtain a third intermediate image, wherein the tone of the third intermediate image tends to the tone of the first intermediate image; the image combining processing module is configured to performing image combination processing on the first intermediate image and the third intermediate image to obtain the synthesized image.

At least one embodiment of the present disclosure provides an electronic device, including: a memory storing computer-executable instructions in a non-transitory manner; a processor configured to run the computer-executable instructions, wherein the computer-executable instructions are executed by the The processor implements the image processing method according to any embodiment of the present disclosure when running.

At least one embodiment of the present disclosure provides a non-transitory computer-readable storage medium, wherein the non-transitory computer-readable storage medium stores computer-executable instructions, and when the computer-executable instructions are executed by a processor, the computer-executable instructions according to The image processing method described in any embodiment of the present disclosure.

Description of drawings

In order to illustrate the technical solutions of the embodiments of the present disclosure more clearly, the accompanying drawings of the embodiments will be briefly introduced below. Obviously, the accompanying drawings in the following description only relate to some embodiments of the present disclosure, rather than limiting the present disclosure .

Fig. 1 is a schematic flowchart of an image processing method provided by at least one embodiment of the present disclosure;

Fig. 2 is a schematic diagram of an image to be processed provided by at least one embodiment of the present disclosure;

Fig. 3 is a schematic diagram of a first intermediate image provided by at least one embodiment of the present disclosure;

FIG. 4A is a schematic diagram of an intermediate target area provided by at least one embodiment of the present disclosure;

Fig. 4B is a schematic diagram of a second intermediate image provided by at least one embodiment of the present disclosure;

Fig. 5A is a schematic diagram of a third intermediate image provided by at least one embodiment of the present disclosure;

FIG. 5B is a schematic diagram of a synthesized image provided by an embodiment of the present disclosure;

Fig. 6A shows a schematic flowchart of obfuscation processing provided by at least one embodiment of the present disclosure;

Fig. 6B is a schematic diagram of a texture slice provided by at least one embodiment of the present disclosure;

Fig. 7A is a sample image provided by at least one embodiment of the present disclosure;

Fig. 7B is an image to be trained provided by at least one embodiment of the present disclosure;

Fig. 8 is a schematic block diagram of an image processing device provided by at least one embodiment of the present disclosure;

Fig. 9 is a schematic diagram of an electronic device provided by at least one embodiment of the present disclosure;

Fig. 10 is a schematic diagram of a non-transitory computer-readable storage medium provided by at least one embodiment of the present disclosure;

Fig. 11 is a schematic diagram of a hardware environment provided by at least one embodiment of the present disclosure.

Detailed ways

In order to make the purpose, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below in conjunction with the drawings of the embodiments of the present disclosure. Apparently, the described embodiments are some of the embodiments of the present disclosure, not all of them. Based on the described embodiments of the present disclosure, all other embodiments obtained by persons of ordinary skill in the art without creative effort fall within the protection scope of the present disclosure.

Unless otherwise defined, the technical terms or scientific terms used in the present disclosure shall have the usual meanings understood by those skilled in the art to which the present disclosure belongs. "First", "second" and similar words used in the present disclosure do not indicate any order, quantity or importance, but are only used to distinguish different components. "Comprising" or "comprising" and similar words mean that the elements or items appearing before the word include the elements or items listed after the word and their equivalents, without excluding other elements or items. Words such as "connected" or "connected" are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Up", "Down", "Left", "Right" and so on are only used to indicate the relative positional relationship. When the absolute position of the described object changes, the relative positional relationship may also change accordingly. In order to keep the following description of the embodiments of the present disclosure clear and concise, the present disclosure omits detailed descriptions of some known functions and known components.

For old photos that have been stored for many years or other images that are not clear enough, they can be processed to make their details vivid. The emergence of deep learning makes it possible to operate on the semantic level of images, so that convolutional neural networks can be used, for example, to The image is manipulated at the semantic level to achieve high-definition detail reproduction of the image.

For face images, unlike other objects such as landscapes and objects, the details of face images are very rich, such as facial texture features, etc. Therefore, in the process of clearing images containing face images, the traditional method The obtained face image is not clear enough, the texture features are not clear enough, and image noise often appears.

At least one embodiment of the present disclosure provides an image processing method, an image processing device, an electronic device, and a non-transitory computer-readable storage medium. The image processing method includes: acquiring an image to be processed, wherein the image to be processed includes a target area; The first neural network model performs the first sharpening process on the image to be processed to obtain the first intermediate image corresponding to the image to be processed, wherein the definition of the first intermediate image is greater than the definition of the image to be processed; through the second neural network model Performing a second sharpening process on the intermediate target area corresponding to the target area in the first intermediate image to obtain a second intermediate image corresponding to the intermediate target area; performing composite processing on the first intermediate image and the second intermediate image to obtain a Composite image corresponding to the image to be processed.

In this image processing method, after the first clearing process is performed on the image to be processed, the second clearing process is performed on the target area, and special optimization is performed on the target area, and then the optimized target area is synthesized with the first intermediate image, thereby It can improve the clarity of the synthesized image, and obtain a high-definition image with richer details.

The image processing method provided by the embodiment of the present disclosure can be applied in a mobile terminal (such as a mobile phone, a tablet computer, etc.), and on the basis of improving the processing speed, the definition of the synthesized image can be improved, and the image collected by the mobile terminal can also be processed. Real-time sharpening.

It should be noted that the image processing method provided in the embodiment of the present disclosure can be applied to the image processing device provided in the embodiment of the present disclosure, and the image processing device can be configured on an electronic device. The electronic device may be a personal computer, a mobile terminal, etc., and the mobile terminal may be a hardware device with various operating systems, such as a mobile phone and a tablet computer.

Embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings, but the present disclosure is not limited to these specific embodiments.

Fig. 1 is a schematic flowchart of an image processing method provided by at least one embodiment of the present disclosure. Fig. 2 is a schematic diagram of an initial image provided by at least one embodiment of the present disclosure.

As shown in FIG. 1 , the image processing method provided by at least one embodiment of the present disclosure includes steps S10 to S40.

Step S10, acquiring an image to be processed.

For example, the image to be processed includes a target area.

In step S20, the first sharpening process is performed on the image to be processed through the first neural network model to obtain a first intermediate image corresponding to the image to be processed.

For example, the resolution of the first intermediate image is greater than the resolution of the image to be processed.

Step S30 , performing a second sharpening process on the intermediate target area corresponding to the target area in the first intermediate image through the second neural network model, so as to obtain a second intermediate image corresponding to the intermediate target area.

Step S40, performing composite processing on the first intermediate image and the second intermediate image to obtain a composite image corresponding to the image to be processed.

For example, the images to be processed acquired in step S10 can be various types of images, such as landscape images, person images, item images, etc., for example, landscape images can include landscape objects such as mountains, rivers, plants, animals, and sky, A person image is an image including a person (for example, a human face, etc.). For example, a person image may include a human face area, and an item image may include items such as vehicles and houses. Of course, in addition to the face area, the person image may also include areas corresponding to landscape objects, object objects, and the like. For example, in some embodiments, the image to be processed may be a person image, such as a person's ID photo. For example, in some other embodiments, the image to be processed may also be a person image with a landscape object or an article object.

For example, the shape of the image to be processed may be a rectangle or the like. The shape and size of the image to be processed can be set by the user according to the actual situation.

For example, the image to be processed can be a blurred image with low definition. lower. For example, the image to be processed may be obtained by scanning or the like, for example, the image to be processed may be an image obtained by scanning or photographing an old photo with a long history. For another example, the image to be processed may be an image obtained by performing image compression on a high-definition image to facilitate transmission.

The image to be processed can be a grayscale image or a color image. For example, in order to avoid the influence of data quality and data imbalance of the image to be processed on the image processing, before processing the image to be processed, the image processing method provided by at least one embodiment of the present disclosure may also include preprocessing the image to be processed operate. Preprocessing may include, for example, cropping, gamma (Gamma) correction, or noise reduction filtering on the image to be processed. Preprocessing can eliminate irrelevant information or noise information in the image to be processed, so as to facilitate subsequent image processing of the image to be processed.

For example, the target area may be an area including the target, and the target may be a human face, so the target area may be a human face area. It should be noted that, according to image processing requirements, other objects can be selected as targets, for example, animals, vehicles, etc. are selected as targets. At this time, the target area is an area including animals (eg, cats) or an area including vehicles. There is no limit to this.

For example, as shown in Figure 2, the image to be processed can be a person image and includes a human face, and the target area is the face area that includes a human face in the image to be processed. It can be seen from Fig. 2 that the resolution of the image to be processed is low, the image details are missing, and there is image noise.

For example, in step S20, the image to be processed is first cleared through the trained first neural network model to obtain a first intermediate image with higher definition, that is, the definition of the first intermediate image is greater than that of the image to be processed clarity.

For example, the first neural network model may adopt a pix2pixHD (pixel to pixel HD) model, which uses a multi-level generator (coarse-to-fine generator) and a multi-scale discriminator (multi-scale discriminator) to treat The first sharpening process is performed on the image to generate a high-resolution, high-definition first intermediate image. The generator of the pix2pixHD model includes a global generator network (global generator network) and a local enhancer network (local enhancer network). The global generator network part adopts the U-Net structure, and the features output by the global generator network part are extracted from the local enhancement network part. The feature fusion of the local enhancement network is used as the input information of the local enhancement network part, and the local enhancement network part outputs high-resolution and high-definition images based on the fused information.

The training process for the first neural network model is described later and will not be repeated here.

Fig. 3 is a schematic diagram of a first intermediate image obtained after performing a first sharpening process on the image to be processed shown in Fig. 2 according to at least one embodiment of the present disclosure. As shown in Figure 3, compared with the image to be processed shown in Figure 2, the definition of the first intermediate image after the first sharpening process has been greatly improved, but this sharpening is aimed at the image to be processed The global sharpening of the target area, such as the face area, cannot be specially optimized, for example, the high-definition details of the target area cannot be provided, and the obtained first intermediate image will also have image noise such as variegated lines.

For example, except for the difference in sharpness, other properties (eg, size, etc.) of the first intermediate image and the image to be processed are completely or substantially the same.

For example, in step S30, the intermediate target area is an area corresponding to the target area in the first intermediate image. The size of the intermediate target area is the same as that of the target area, and the relative position of the intermediate target area in the first intermediate image is completely or substantially the same as the relative position of the target area in the image to be processed.

For example, in step S30, the second clearing process is performed on the intermediate target area corresponding to the target area obtained from the first intermediate image, so as to further enrich the image details of the intermediate target area on the basis of the first clearing process, and improve The definition of the intermediate target area eliminates the image noise existing in the intermediate target area, and obtains a second intermediate image with higher definition and richer image details.

For example, the resolution of the second intermediate image is greater than the resolution of the intermediate target area. For example, there is no image noise such as variegated lines and noise in the second intermediate image, and the texture and lines of the second intermediate image are clearer and richer than the intermediate target area.

For example, in some embodiments, the intermediate target area is extracted through the second neural network model, and the second sharpening process is performed on the intermediate target area to obtain the second intermediate image.

For example, in other embodiments, the position of the target area in the image to be processed is relatively fixed. For example, the image to be processed is a ID photo, and the target area is a face area, and the face area is generally located in the center of the ID photo. The position information of the target area in the image to be processed is used to extract the intermediate target area in the first intermediate image, and the second clearing process is performed on the intermediate target area through the second neural network model to obtain the second intermediate image.

For example, in some other embodiments, before step S20, the image processing method provided by at least one embodiment of the present disclosure may further include: using a third neural network model to identify and process the first intermediate image, so as to obtain The intermediate target region in the image corresponding to the target region.

For example, the target area is a face area, the third neural network model can be a face recognition model, and the third neural network model can be trained to recognize the face area in the first intermediate image to obtain the intermediate target area, that is, the first intermediate The region in the image that includes parts of the face. It should be noted that, when the target area is other objects, such as a vehicle, the third neural network model can be trained to recognize the object (i.e., the vehicle) in the image to be recognized, so that the first intermediate The image is identified and processed to obtain an intermediate target area including the object (ie, the vehicle), which is not limited in the present disclosure.

For example, in some other embodiments, the intermediate target area can also be extracted by means of manual extraction, and the intermediate target area can be subjected to the second sharpening process through the second neural network model to obtain the second intermediate image, which is not discussed in this disclosure. limit.

For example, the first neural network model may be the same as the second neural network model, or the first neural network model may be different from the second neural network model. For example, the second neural network model can be a SPADE (Spatially-Adaptive Normalization) model, and the SPADE model can solve the problem that information in the input semantic image is easily lost in the traditional normalization layer.

The training process for the second neural network model is described later, and will not be repeated here.

For example, one or more of the first neural network model, the second neural network model and the third neural network model may be a convolutional neural network model.

Fig. 4A is a schematic diagram of an intermediate target area provided by at least one embodiment of the present disclosure, and Fig. 4B is a schematic diagram of a second intermediate image provided by at least one embodiment of the present disclosure.

For example, the first intermediate image shown in Figure 3 is identified and processed by the third neural network model to obtain the intermediate target area shown in Figure 4A; then, the intermediate target area shown in Figure 4A is processed by the second neural network model A second sharpening process is performed to obtain a second intermediate image shown in FIG. 4B . As shown in FIG. 4A and FIG. 4B , the second intermediate image after the second sharpening process has richer texture features and higher definition, and removes the black lines from the nose to the mouth of the face in the intermediate target area. For example, as shown in FIG. 4B , in the second intermediate image, details such as wrinkles that originally existed on the human face are reflected, so that the human face is more in line with the characteristics of a real human face.

For example, the hues of the first intermediate image obtained based on the first sharpening process and the second intermediate image obtained based on the second sharpening process may not be uniform. If the first intermediate image and the second intermediate image are directly synthesized, the resulting There may be multiple tones in the resulting composite image. Therefore, it is necessary to perform tone processing on the first intermediate image and the second intermediate image so that the tones of the two tend to be consistent. For example, the tones of the two are unified or consistent. At this time Then image merging is performed to obtain a composite image with uniform tone.

For example, step S40 may include: based on the tone of the first intermediate image, performing tone processing on the second intermediate image to obtain a third intermediate image, for example, the tone of the third intermediate image tends to the tone of the first intermediate image; The first intermediate image and the third intermediate image are combined to obtain a composite image.

For example, any algorithm or tool capable of tone adjustment may be used to perform tone processing on the second intermediate image based on the tone of the first intermediate image, which is not limited in the present disclosure.

It should be noted that, in the above description, the tone of the second intermediate image is adjusted to be consistent with the tone of the first intermediate image, but the disclosure is not limited thereto, as long as the tone of the first intermediate image and the tone of the second intermediate image can be adjusted It only needs to be consistent in tone. For example, in some other embodiments, step S40 may include: performing tone processing on the first intermediate image based on the tone of the second intermediate image to obtain a fourth intermediate image, for example, the fourth intermediate image The tone tends to the tone of the second intermediate image; performing an image combination process on the second intermediate image and the fourth intermediate image to obtain a composite image.

For example, in some embodiments, all pixels in the synthesized image are arranged in n rows and m columns, and in step S40, image merging processing is performed on the first intermediate image and the third intermediate image to obtain the synthesized image, which may include: The pixels in the t1th row and the t2th column in the first intermediate image: in response to the pixel in the t1th row and the t2th column in the first intermediate image is not located in the intermediate target area, the t1th row and the t2th column in the first intermediate image The pixel value of the pixel is used as the pixel value of the pixel in the t1th row and the t2th column in the composite image; in response to the pixel in the t1th row and the t2th column in the first intermediate image is located in the intermediate target area, the third intermediate image is The pixel value of the second intermediate pixel is used as the pixel value of the pixel in the t1th row and the t2th column in the composite image, wherein the second intermediate pixel is the t1th row and the t2th column in the third intermediate image and the t1th row and the t2th column in the first intermediate image The pixel corresponding to the pixel of , where n, m, t1, and t2 are all positive integers, and t1 is less than or equal to n, and t2 is less than or equal to m.

For example, when image combination processing is performed on the second intermediate image and the fourth intermediate image to obtain a composite image, the image combination processing process is the same as the above-mentioned process, and will not be repeated here.

It should be noted that the image merging process may also use other merging methods, which is not limited in the present disclosure.

For example, the composite image can be a color image, for example, the pixel values of the pixels in the color image can include a set of RGB pixel values, or the composite image can also be a monochrome image, for example, the pixel values of the pixels in the monochrome image can be is the pixel value of a color channel.

FIG. 5A is a schematic diagram of a third intermediate image provided by at least one embodiment of the present disclosure, and FIG. 5B is a schematic diagram of a composite image provided by an embodiment of the present disclosure. For example, FIG. A composite image obtained by the image processing method provided by at least one embodiment of the present disclosure.

As shown in FIG. 5A , the tone of the third intermediate image after tone processing is consistent with the tone of the first intermediate image shown in FIG. 3 .

As shown in FIG. 5B , the image details of the synthesized image are richer and clearer than that of the unprocessed image, and there is only one tone in the synthesized image.

For example, before performing the first sharpening process on the image to be processed by using the first neural network model, the image processing method provided by at least one embodiment of the present disclosure further includes: acquiring a sample image; performing blurring processing on the sample image to obtain an image to be trained , for example, the clarity of the image to be trained is smaller than the clarity of the sample image; based on the sample image and the image to be trained, the first neural network model to be trained and the second neural network model to be trained are trained to obtain the first neural network model and the second neural network model.

For example, the sample image may be an image with a sharpness greater than a sharpness threshold, and the sharpness threshold may be set by the user according to actual conditions. For example, the sample image includes a sample target area, for example, the sample target area is a face area. For example, when the first neural network model to be trained and the second neural network model to be trained are trained, the image to be trained can be used as the input of the neural network model, the sample image can be used as the target output of the neural network model, and the image to be trained can be used as the target output of the neural network model. The first neural network model and the second neural network model to be trained are trained.

For example, the training process of the neural network model may include: using the neural network model to be trained to process the training image to obtain the training output image; based on the training output image and the sample image, calculating The loss value of the neural network model to be trained; and modifying the parameters of the neural network model to be trained based on the loss value; when the loss function corresponding to the neural network model to be trained satisfies a predetermined condition, the trained neural network model is obtained, When the loss function corresponding to the neural network model to be trained does not meet the predetermined condition, continue to input the image to be trained to repeat the above training process. Here, the neural network model to be trained may be the first neural network model to be trained or the second neural network model to be trained.

For example, in one example, the predetermined condition corresponds to the minimization of the loss function of the neural network model to be trained under the input of a certain number of images to be trained. In another example, the predetermined condition is that the number of training times or training cycles corresponding to the neural network model to be trained reaches a predetermined number, and the predetermined number may be millions, as long as the number of images to be trained is large enough.

For example, the first neural network model and the second neural network model can be trained separately using the above training process. At this time, the sample image corresponding to the second neural network model needs to include the sample target area, and the sample image corresponding to the first neural network model may not Include the sample target area.

For example, the first neural network model and the second neural network model can be trained simultaneously based on the same sample image and the image to be trained. In this case, the sample image needs to include the sample target area. For example, at this time, the first neural network model and the second neural network model adopt different structures. For example, the first neural network model is a pix2pixHD model, the second neural network model is a SPADE model, and the first neural network model is based on the overall image of the sample image. Training, the second neural network model is only trained based on the sample target area in the sample image, so that the first neural network model can perform the first clearing process on the whole image to be processed, and the second neural network model can perform the second clearing process on the target area 2. Clarification treatment.

Fig. 6A shows a schematic flowchart of obfuscation processing provided by at least one embodiment of the present disclosure. As shown in FIG. 6A, the blurring process may include steps S501-S504.

Step S501, acquiring texture slices.

For example, texture tiles are the same size as the sample image.

Step S502, performing a first blurring process on the sample image to obtain a first blurred image.

For example, the sharpness of the first blurred image is smaller than that of the sample image.

Step S503, performing color mixing processing on the first blurred image and the texture slice to obtain a second blurred image.

Step S504, performing a second blurring process on the second blurred image to obtain an image to be trained.

For example, step S501 may include: acquiring at least one preset texture image; randomly selecting a preset texture image from at least one preset texture image as the target texture image; responding to the size of the target texture image and the sample image The size is the same, and the target texture image is used as a texture slice; in response to the size of the target texture image being larger than the size of the sample image, based on the size of the sample image, the target texture image is randomly cut to obtain a slice area with the same size as the sample image, Treat sliced regions as texture slices.

For example, texture tiles are the same size as the sample image.

Fig. 6B is a schematic diagram of a texture slice provided by at least one embodiment of the present disclosure. As shown in FIG. 6B, the texture slice has noise spots imitating photo noise (eg, film grain) and noise lines imitating scratches. The noise spots and noise lines can be randomly generated or preset. , which is not limited in the present disclosure.

For example, multiple preset texture images can be generated in advance. The preset texture images have randomly distributed mottled spots and mottled lines. The size of the preset texture image can be set larger than the size of the sample image. When slicing the texture, first select one of the preset texture images as the target texture image, and randomly cut the target texture image to obtain a slice area with the same size as the sample image as the texture slice. In this way, the state of a lower-sharp image can be more realistically simulated.

It should be noted that, in some other embodiments, the size of the target texture image may also be smaller than the size of the sample image, and then based on the size of the sample image, the target texture image is enlarged so that the size of the target texture image is the same as that of the sample image. If the dimensions are the same, the expanded target texture image is a texture slice.

For example, the first blur processing includes Gaussian blur processing, noise addition processing, or a combination of Gaussian blur processing and noise addition processing based on any order and any number; the second blur processing includes Gaussian blur processing, noise addition processing, or based on any order And any number of combinations of Gaussian blur processing and noise addition processing.

It should be noted that Gaussian Blur (Gaussian Blur) processing includes Gaussian Blur processing with the same or different blur parameters, noise addition processing includes noise addition processing with the same or different noise parameters, and any number of Gaussian Blur processing blur parameters in combination processing They can be the same or different, and can be set according to actual needs. Similarly, the noise parameters of any number of noise addition processes in the combination process can also be the same or different, which is not limited in the present disclosure.

For example, Gaussian blur processing can adjust the pixel values of pixels according to a Gaussian curve to achieve image blur. Noise addition processing can generate image noise, such as Gaussian white noise, etc., and image noise is synthesized with the image to achieve image blur. It should be noted that the Gaussian blur processing and the noise adding processing may be implemented in any relevant technical means in image processing, which is not limited in the present disclosure.

For example, step S502 may specifically include: performing Gaussian blur processing on the sample image to obtain a first blurred image.

For example, step S504 may specifically include: performing a second blurring process on the second blurred image to obtain an image to be trained, including: performing noise addition processing on the second blurred image to obtain an intermediate blurred image; performing Gaussian blurring on the intermediate blurred image processed to obtain images to be trained.

For example, the color mixing processing includes one or more processings such as Screen processing, Addition processing, and Lighten Only processing.

For example, in some embodiments, step S503 may include: performing color filtering (Screen) processing on the first blurred image and the texture slice to obtain a second blurred image.

For example, the pixels of the first blurred image are arranged in p rows and q columns, the pixels of the texture slice are arranged in p rows and q columns, the pixels of the second blurred image are arranged in p rows and q columns, and both p and q are positive integers. For example, the number of bits of the pixel value of the pixel is 8 bits, that is, the range of the pixel value of each channel of the pixel is (0-255).

For example, when screen processing is performed on the first blurred image and the texture slice, for a pixel located at row t3 and column t4 in the second blurred image, the calculation formula of the pixel value of the pixel is as follows:

Result_pix＝255-[(255-fig1_pix)*(255-slice_pix)]/255 (Formula 1)

Among them, Result_pix is the pixel value of the pixel located in row t3 and column t4 in the second blurred image, fig1_pix is the pixel value of the pixel located in row t3 and column t4 in the first blurred image, and slice_pix is the pixel value in the texture slice The pixel value of the pixel located at row t3 and column t4.

For example, in some other embodiments, step S503 may include: performing lightening (Lighten Only) processing on the texture slice and the first blurred image to obtain the second blurred image.

For example, when the first blurred image and the texture slice are lightened (Lighten Only), for the pixel located in row t3 and column t4 in the second blurred image, the calculation formula of the pixel value of the pixel is as follows:

Result_pix=max(fig1_pix,slice_pix) (Formula 2)

Among them, max(x, y) means to take the maximum value of x and y, and the specific meanings of other parameters are the same as those in formula 1, and will not be repeated here.

For example, in some other embodiments, step S503 may include: performing layer addition (Addition) processing on the texture slice and the first blurred image to obtain the second blurred image.

For example, when layer addition (Addition) processing is performed on the first blurred image and the texture slice, for a pixel located at row t3 and column t4 in the second blurred image, the calculation formula of the pixel value of the pixel is as follows:

Result_pix=fig1_pix+slice_pix (Formula 3)

Wherein, the specific meanings of the parameters are the same as those in Formula 1, and will not be repeated here.

It should be noted that the color mixing process may also adopt other blending modes (Blend Mode) as required, which is not limited in the present disclosure.

FIG. 7A is a sample image provided by at least one embodiment of the present disclosure, and FIG. 7B is an image to be trained provided by at least one embodiment of the present disclosure. For example, the image to be trained shown in FIG. The resulting image after the first blurring, color blending, and second blurring.

As shown in FIG. 7A, the sample image is a high-definition image. The image to be trained corresponding to the sample image obtained after the first blurring process, color mixing process, and second blurring process in the aforementioned steps is shown in FIG. 7B. The sharpness of the training image is smaller than that of the sample image, and there are simulated noises and scratches in the training image.

At least one embodiment of the present disclosure further provides an image processing device, and FIG. 8 is a schematic block diagram of an image processing device provided by at least one embodiment of the present disclosure.

As shown in FIG. 8 , the image processing apparatus 800 may include: an image acquisition unit 801 , a first processing unit 802 , a second processing unit 803 and a synthesis unit 804 .

For example, these modules may be implemented by hardware (such as circuit) modules, software modules, or any combination of the two, and the following embodiments are the same as this, and will not be repeated here. For example, a central processing unit (CPU), a graphics processing unit (GPU), a tensor processing unit (TPU), a field programmable logic gate array (FPGA), or other forms of processors with data processing capabilities and/or instruction execution capabilities The processing units and corresponding computer instructions implement these units.

For example, the image acquiring unit 801 is configured to acquire an image to be processed, wherein the image to be processed includes a target area.

For example, the first processing unit 802 is configured to perform first sharpening processing on the image to be processed through the first neural network model to obtain a first intermediate image corresponding to the image to be processed, wherein the definition of the first intermediate image is greater than that of the image to be processed. Image clarity.

For example, the second processing unit 803 is configured to perform a second sharpening process on the intermediate target area corresponding to the target area in the first intermediate image through the second neural network model, so as to obtain a second intermediate image corresponding to the intermediate target area.

For example, the compositing unit 804 is configured to composite the first intermediate image and the second intermediate image to obtain a composite image corresponding to the image to be processed.

For example, the image acquisition unit 801, the first processing unit 802, the second processing unit 803, and the synthesis unit 804 may include codes and programs stored in memory; the processor may execute the codes and programs to realize the image acquisition unit as described above 801 , some or all of the functions of the first processing unit 802 , the second processing unit 803 and the synthesis unit 804 . For example, the image acquisition unit 801, the first processing unit 802, the second processing unit 803, and the synthesis unit 804 may be dedicated hardware devices, which are used to implement the image acquisition unit 801, the first processing unit 802, and the second processing unit described above. 803 and some or all of the functions of the synthesis unit 804. For example, the image acquiring unit 801 , the first processing unit 802 , the second processing unit 803 and the compositing unit 804 may be a circuit board or a combination of multiple circuit boards for realizing the functions described above. In the embodiment of the present application, the circuit board or a combination of multiple circuit boards may include: (1) one or more processors; (2) one or more non-transitory memories connected to the processors; and (3) Processor-executable firmware stored in memory.

It should be noted that the image acquisition unit 801 can be used to realize the step S10 shown in FIG. 1, the first processing unit 802 can be used to realize the step S20 shown in FIG. In the step S30 shown, the combining unit 804 can be used to realize the step S40 shown in FIG. 1 . Therefore, for a specific description of the functions that can be realized by the image acquisition unit 801, the first processing unit 802, the second processing unit 803, and the synthesis unit 804, reference may be made to the relevant descriptions of steps S10 to S40 in the above-mentioned embodiment of the image processing method, and repeat The place will not be repeated. In addition, the image processing apparatus 800 can achieve technical effects similar to those of the aforementioned image processing method, which will not be repeated here.

It should be noted that, in the embodiment of the present disclosure, the image processing device 800 may include more or fewer circuits or units, and the connection relationship between the various circuits or units is not limited, and may be determined according to actual needs . The specific configuration of each circuit or unit is not limited, and may be composed of analog devices according to circuit principles, or may be composed of digital chips, or in other suitable ways.

At least one embodiment of the present disclosure further provides an electronic device, and FIG. 9 is a schematic diagram of an electronic device provided by at least one embodiment of the present disclosure.

For example, as shown in FIG. 9 , the electronic device includes a processor 901 , a communication interface 902 , a memory 903 and a communication bus 904 . The processor 901, the communication interface 902, and the memory 903 communicate with each other through the communication bus 904, and the processor 901, the communication interface 902, the memory 903 and other components may also communicate with each other through a network connection. The present disclosure does not limit the type and function of the network here. It should be noted that the components of the electronic device shown in FIG. 9 are exemplary rather than limiting, and the electronic device may also have other components according to actual application requirements.

For example, memory 903 is used to store computer readable instructions on a non-transitory basis. When the processor 901 is configured to execute computer-readable instructions, implement the image processing method according to any one of the foregoing embodiments. For the specific implementation of each step of the image processing method and related explanations, reference may be made to the above-mentioned embodiment of the image processing method, and details are not repeated here.

For example, other implementations of the image processing method implemented by the processor 901 executing computer-readable instructions stored in the memory 903 are the same as the implementations mentioned in the foregoing method embodiments, and will not be repeated here.

For example, communication bus 904 may be a Peripheral Component Interconnect Standard (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is used in the figure, but it does not mean that there is only one bus or one type of bus.

For example, the communication interface 902 is used to implement communication between the electronic device and other devices.

For example, the processor 901 and the memory 903 may be set at the server (or cloud).

For example, the processor 901 can control other components in the electronic device to perform desired functions. The processor 901 may be a device with data processing capability and/or program execution capability such as a central processing unit (CPU), a network processor (NP), a tensor processing unit (TPU) or a graphics processing unit (GPU); it may also be Digital Signal Processor (DSP), Application Specific Integrated Circuit (ASIC), Field Programmable Gate Array (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components. The central processing unit (CPU) may be an X86 or ARM architecture or the like.

For example, memory 903 may include any combination of one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include random access memory (RAM) and/or cache memory (cache), etc., for example. Non-volatile memory may include, for example, read only memory (ROM), hard disks, erasable programmable read only memory (EPROM), compact disc read only memory (CD-ROM), USB memory, flash memory, and the like. One or more computer-readable instructions can be stored on the computer-readable storage medium, and the processor 901 can execute the computer-readable instructions to realize various functions of the electronic device. Various application programs, various data, and the like can also be stored in the storage medium.

For example, in some embodiments, the electronic device may further include an image acquisition component. The image acquisition component is used to acquire images. The memory 903 is also used to store acquired images.

For example, the image acquisition component may be a camera of a smartphone, a camera of a tablet computer, a camera of a personal computer, a lens of a digital camera, or even a webcam.

For example, for a detailed description of the process of image processing performed by the electronic device, reference may be made to relevant descriptions in the embodiments of the image processing method, and repeated descriptions will not be repeated.

Fig. 10 is a schematic diagram of a non-transitory computer-readable storage medium provided by at least one embodiment of the present disclosure. For example, as shown in FIG. 10 , the storage medium 1000 may be a non-transitory computer-readable storage medium, and one or more computer-readable instructions 1001 may be non-transitorily stored on the storage medium 1000 . For example, when the computer-readable instructions 1001 are executed by the processor, one or more steps in the image processing method described above may be performed.

For example, the storage medium 1000 may be applied to the above-mentioned electronic device, for example, the storage medium 1000 may include a memory in the electronic device.

For example, the storage medium may include a memory card of a smartphone, a storage unit of a tablet computer, a hard disk of a personal computer, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM), Portable compact disc read-only memory (CD-ROM), flash memory, or any combination of the above-mentioned storage media may also be other applicable storage media.

For example, for the description of the storage medium 1000, reference may be made to the description of the memory in the embodiments of the electronic device, and repeated descriptions will not be repeated.

Fig. 11 shows a schematic diagram of a hardware environment provided for at least one embodiment of the present disclosure. The electronic device provided by the present disclosure can be applied in the Internet system.

The functions of the image processing apparatus and/or electronic equipment involved in the present disclosure can be realized by using the computer system provided in FIG. 11 . Such computer systems can include personal computers, laptops, tablets, mobile phones, personal digital assistants, smart glasses, smart watches, smart rings, smart helmets, and any smart portable or wearable device. The specific system in this embodiment illustrates a hardware platform including a user interface using functional block diagrams. Such computer equipment may be a general purpose computer equipment or a special purpose computer equipment. Both computer devices can be used to realize the image processing device and/or electronic device in this embodiment. The computer system may include any components that implement the presently described information needed to achieve image processing. For example, a computer system can be realized by a computer device through its hardware devices, software programs, firmware, and combinations thereof. For the sake of convenience, only one computer device is drawn in Fig. 11, but the relevant computer functions for realizing the information required for image processing described in this embodiment can be implemented by a group of similar platforms in a distributed manner, Distribute the processing load of a computer system.

As shown in Figure 11, the computer system can include a communication port 250, which is connected to a network for data communication, for example, the computer system can send and receive information and data through the communication port 250, that is, the communication port 250 can realize the communication between the computer system and the computer system. Other electronic devices communicate wirelessly or by wire to exchange data. The computer system may also include a processor group 220 (ie, the processor described above) for executing program instructions. The processor group 220 may consist of at least one processor (eg, CPU). The computer system may include an internal communication bus 210 . A computer system may include different forms of program storage units and data storage units (i.e., memory or storage media described above), such as hard disk 270, read-only memory (ROM) 230, random access memory (RAM) 240, which can be used to store Various data files used by the computer for processing and/or communicating, and possibly program instructions executed by the processor group 220 . The computer system may also include an input/output component 260 for enabling input/output data flow between the computer system and other components (eg, user interface 280, etc.).

Typically, the following devices can be connected to the input/output assembly 260: input devices including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; including, for example, a liquid crystal display (LCD), speaker, vibrator, etc. output devices; storage devices including, for example, magnetic tapes, hard disks, etc.; and communication interfaces.

While FIG. 11 shows a computer system with various devices, it should be understood that the computer system is not required to have all of the devices shown and, instead, the computer system may have more or fewer devices.

For this disclosure, the following points need to be explained:

(1) The drawings of the embodiments of the present disclosure only relate to the structures involved in the embodiments of the present disclosure, and other structures may refer to general designs.

(2) For clarity, in the drawings used to describe the embodiments of the present invention, the thickness and size of layers or structures are exaggerated. It will be understood that when an element such as a layer, film, region, or substrate is referred to as being "on" or "under" another element, it can be "directly on" or "under" the other element, Or intervening elements may be present.

(3) In the case of no conflict, the embodiments of the present disclosure and the features in the embodiments can be combined with each other to obtain new embodiments.

The above description is only a specific implementation manner of the present disclosure, but the protection scope of the present disclosure is not limited thereto, and the protection scope of the present disclosure should be based on the protection scope of the claims.

Claims

An image processing method, characterized in that, comprising:

Acquiring an image to be processed, wherein the image to be processed includes a target area;

Perform a first sharpening process on the image to be processed by using a first neural network model to obtain a first intermediate image corresponding to the image to be processed, wherein the definition of the first intermediate image is greater than that of the image to be processed clarity;

performing a second sharpening process on the intermediate target area corresponding to the target area in the first intermediate image through a second neural network model, so as to obtain a second intermediate image corresponding to the intermediate target area;

Compositing the first intermediate image and the second intermediate image to obtain a composite image corresponding to the image to be processed.
The image processing method according to claim 1, wherein the second sharpening process is performed on the intermediate target area corresponding to the target area in the first intermediate image through the second neural network model, so as to obtain Before the second intermediate image corresponding to the intermediate target area, the image processing method further includes:

Perform recognition processing on the first intermediate image by using a third neural network model to obtain the intermediate target area corresponding to the target area in the first intermediate image.
The image processing method according to claim 1, wherein the definition of the second intermediate image is greater than the definition of the intermediate target area.
The image processing method according to claim 1, wherein the composite processing of the first intermediate image and the second intermediate image to obtain a composite image corresponding to the image to be processed comprises:

performing tone processing on the second intermediate image based on the tone of the first intermediate image to obtain a third intermediate image, wherein the tone of the third intermediate image tends to the tone of the first intermediate image; and

performing image combination processing on the first intermediate image and the third intermediate image to obtain the synthesized image.
The image processing method according to any one of claims 1-4, wherein the target area is a human face area.
The image processing method according to any one of claims 1-4, wherein the first neural network model is different from the second neural network model.
The image processing method according to any one of claims 1-4, wherein, before performing the first sharpening process on the image to be processed through the first neural network model, the image processing method further comprises:

get sample image;

Blurring the sample image to obtain an image to be trained, wherein the definition of the image to be trained is smaller than the definition of the sample image;

Based on the sample image and the image to be trained, a first neural network model to be trained and a second neural network model to be trained are trained to obtain the first neural network model and the second neural network model.
The image processing method according to claim 7, wherein blurring the sample image to obtain an image to be trained comprises:

Acquiring a texture slice, where the size of the texture slice is the same as the size of the sample image;

performing a first blurring process on the sample image to obtain a first blurred image, wherein the definition of the first blurred image is smaller than the definition of the sample image;

performing color mixing processing on the first blurred image and the texture slice to obtain a second blurred image;

performing a second blurring process on the second blurred image to obtain the image to be trained.
The image processing method according to claim 8, wherein obtaining texture slices comprises:

Obtain at least one preset texture image;

Randomly select a preset texture image from the at least one preset texture image as the target texture image;

In response to the size of the target texture image being the same as the size of the sample image, using the target texture image as the texture slice;

In response to the size of the target texture image being larger than the size of the sample image, randomly cutting the target texture image based on the size of the sample image to obtain a slice area with the same size as the sample image, The slice area serves as the texture slice.
The image processing method according to claim 8, characterized in that, the first blurring process includes Gaussian blurring, noise adding, or a combination of Gaussian blurring and noise adding in any order and in any number. combined processing;

The second blurring process includes the Gaussian blurring process, the noise adding process, or a combined process based on an arbitrary order and an arbitrary number of the Gaussian blurring process and the noise adding process.
The image processing method according to claim 10, wherein performing the first blurring process on the sample image to obtain the first blurred image comprises: performing the Gaussian blurring process on the sample image to obtain the The first blurred image;

Performing a second blurring process on the second blurred image to obtain the image to be trained includes: performing the noise addition process on the second blurred image to obtain an intermediate blurred image; performing The Gaussian blur processing to obtain the image to be trained.
The image processing method according to claim 8, wherein color mixing is performed on the first blurred image and the texture slice to obtain a second blurred image, comprising:

performing color filtering processing on the first blurred image and the texture slice to obtain the second blurred image.
The image processing method according to claim 8, wherein color mixing is performed on the first blurred image and the texture slice to obtain a second blurred image, comprising:

Perform highlighting processing on the texture slice and the first blurred image to obtain the second blurred image.
An image processing device, characterized in that it comprises:

an image acquisition unit configured to acquire an image to be processed, wherein the image to be processed includes a target area;

The first processing unit is configured to perform a first sharpening process on the image to be processed by using a first neural network model to obtain a first intermediate image corresponding to the image to be processed, wherein the sharpness of the first intermediate image The degree is greater than the definition of the image to be processed;

The second processing unit is configured to perform a second sharpening process on the intermediate target area corresponding to the target area in the first intermediate image through a second neural network model, so as to obtain a second intermediate image corresponding to the intermediate target area image;

A compositing unit configured to composite the first intermediate image and the second intermediate image to obtain a composite image corresponding to the image to be processed.
The image processing device according to claim 14, wherein the synthesis unit includes a color tone processing module and an image merging processing module,

The tone processing module is configured to perform tone processing on the second intermediate image based on the tone of the first intermediate image to obtain a third intermediate image, wherein the tone of the third intermediate image tends to the the hue of the first intermediate image;

The image combination processing module is configured to perform image combination processing on the first intermediate image and the third intermediate image to obtain the composite image.
An electronic device, characterized in that it comprises:

memory non-transitoryly storing computer-executable instructions;

a processor configured to execute said computer-executable instructions,

Wherein, the computer-executable instructions are executed by the processor to implement the image processing method according to any one of claims 1-13.
A non-transitory computer-readable storage medium, characterized in that the non-transitory computer-readable storage medium stores computer-executable instructions, and when the computer-executable instructions are executed by a processor, the computer-executable instructions according to claims 1-13 are implemented. The image processing method described in any one.