CN111369468B - Image processing method, image processing device, electronic equipment and computer readable medium - Google Patents

Image processing method, image processing device, electronic equipment and computer readable medium

Info

Publication number
CN111369468B
CN111369468B (application CN202010157922.9A)
Authority
CN
China
Prior art keywords
image
network
color
sample
target area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010157922.9A
Other languages
Chinese (zh)
Other versions
CN111369468A (en)
Inventor
李华夏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Douyin Vision Co Ltd
Douyin Vision Beijing Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN202010157922.9A priority Critical patent/CN111369468B/en
Publication of CN111369468A publication Critical patent/CN111369468A/en
Application granted granted Critical
Publication of CN111369468B publication Critical patent/CN111369468B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G06T5/94
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 — Geometric image transformation in the plane of the image
    • G06T3/40 — Scaling the whole image or part thereof
    • G06T3/4007 — Interpolation-based scaling, e.g. bilinear interpolation

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The method adopts a trained target area color-changing special effect network: the image to be processed is down-sampled to obtain a down-sampling result, and the down-sampling result is up-sampled and interpolated according to a first attribute to be interpolated corresponding to the target color. Target color conversion of the image to be processed can thereby be realized, yielding a target area color-changed image. This helps lock the color conversion to a specific area of the image through the target area color-changing special effect network, and provides a brand-new special effect experience for users.

Description

Image processing method, image processing device, electronic equipment and computer readable medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to an image processing method and apparatus, an electronic device, and a computer-readable medium.
Background
With the rapid development of computer technology and communication technology, the use of intelligent terminals is widely popularized, and more application programs are developed to facilitate and enrich the work and life of people.
Currently, many applications are dedicated to providing intelligent terminal users with more personalized visual special effects with better visual perception, such as filter effects, sticker effects, deformation effects, and the like, fully developing the diversity, interactivity and sociality of the visual special effect experience.
Users often have a need to build various online personas using visual special effects, according to their interests or other motivations. For example, a user may display hair or eyes in a changed color on the network through photos or videos, thereby achieving a camouflage effect.
However, in the prior art, a special effect of performing color conversion on a specific region of an image has not been realized.
Disclosure of Invention
In order to overcome, or at least partially solve, the above technical problems, the following technical solutions are proposed:
in a first aspect, the present disclosure provides an image processing method, including:
acquiring an image to be processed, a pre-trained target area color-changing special effect network and a first attribute to be interpolated corresponding to the target area color-changing special effect network;
and performing down-sampling on the image to be processed through the target area color-changing special effect network to obtain a down-sampling result of the image to be processed, and performing up-sampling interpolation on the down-sampling result according to the first attribute to be interpolated to obtain the target area color-changing special effect image.
In a second aspect, the present disclosure provides an image processing apparatus comprising:
the acquisition module is used for acquiring an image to be processed, a pre-trained color-changing special effect network of a target area and a first attribute to be interpolated corresponding to the color-changing special effect network of the target area;
and the special effect processing module is used for carrying out down-sampling on the image to be processed through the target area color-changing special effect network to obtain a down-sampling result of the image to be processed, and carrying out up-sampling interpolation on the down-sampling result according to the first attribute to be interpolated to obtain the target area color-changing special effect image.
In a third aspect, the present disclosure provides an electronic device comprising:
a processor and a memory storing at least one instruction, at least one program, set of codes, or set of instructions, which is loaded and executed by the processor to implement a method as set forth in the first aspect of the disclosure.
In a fourth aspect, the present disclosure provides a computer readable medium for storing a computer instruction, program, code set or instruction set which, when run on a computer, causes the computer to perform the method as set forth in the first aspect of the disclosure.
According to the image processing method, apparatus, electronic device and computer readable medium, the trained target area color-changing special effect network is used to down-sample the image to be processed to obtain a down-sampling result, and to up-sample and interpolate the down-sampling result according to the first attribute to be interpolated corresponding to the target color; target color conversion of the image to be processed can thereby be realized, yielding the target area color-changed image. This helps lock the color conversion to a specific area of the image through the target area color-changing special effect network, and provides a brand-new special effect experience for users.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale.
Fig. 1 is a schematic flowchart of an image processing method according to an embodiment of the present disclosure;
fig. 2 is a schematic diagram of a countermeasure generation network provided by an embodiment of the present disclosure;
fig. 3 is a schematic diagram of another countermeasure generation network provided by an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a training model provided in an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used to distinguish between features, data, elements, devices, modules or units; they neither limit those features, data, elements, devices, modules or units to being specific different ones, nor limit the sequence or interdependence of the functions they perform.
It is noted that references to "a", "an", and "the" modifications in this disclosure are intended to be illustrative rather than limiting, and that those skilled in the art will recognize that "one or more" may be used unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
To make the objects, technical solutions and advantages of the present disclosure more apparent, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings.
An embodiment of the present disclosure provides an image processing method, as shown in fig. 1, the method including:
step S110: acquiring an image to be processed, a pre-trained target area color-changing special effect network and a first attribute to be interpolated corresponding to the target area color-changing special effect network;
the color-changing special effect network of the target area is obtained by training the target area, namely training the probability map of the target area corresponding to the training sample in the training stage, and correspondingly processing the target area of the image to be processed.
Step S120: and performing down-sampling on the image to be processed through the target area color-changing special effect network to obtain a down-sampling result of the image to be processed, and performing up-sampling interpolation on the down-sampling result according to the first attribute to be interpolated to obtain the target area color-changing special effect image.
The image processing method provided by the embodiment of the disclosure adopts the trained target area color-changing special effect network to down-sample the image to be processed, obtaining a down-sampling result, and to up-sample and interpolate the down-sampling result according to the first attribute to be interpolated corresponding to the target color, thereby realizing target color conversion on the image to be processed and obtaining the target area color-changed image.
In the embodiment of the disclosure, the color-changing special effect network of the target area is obtained by training through the following steps:
step S210: acquiring a pre-constructed countermeasure generating network, wherein the countermeasure generating network comprises a generating network, a first judging network, a second judging network and a first classifying network; through a generation network, performing down-sampling on each sample image containing color information of a target area to obtain corresponding image characteristics, and performing up-sampling and target color conversion on the image characteristics of each sample image to obtain a corresponding generated image; determining a probability map of each sample image corresponding to the target area, and mixing the corresponding sample image and the corresponding generated image according to each probability map to obtain a corresponding mixed image; judging the color reality of the image characteristics of each sample image through a first judging network to obtain a corresponding first judging result; judging the authenticity of each sample image and the corresponding mixed image through a second judging network to obtain a corresponding second judging result; obtaining a classification result of the target color according to the generated image corresponding to each sample image through a first classification network;
step S220: and performing countermeasure training on the countermeasure generation network based on the first judgment result, the second judgment result, the classification result and the generated image corresponding to each sample image, and determining the trained generation network as the target area color-changing special effect network.
In the embodiment of the present disclosure, the target region is a region where an object whose color needs to be changed is located in the image, and the sample image used for training is an image including the object whose color needs to be changed. As an example, if the object needing color conversion is hair, the target area is a hair area, and the sample image may be a face image, a human body image, or the like. In other application scenarios, the object needing to change color may also be a vehicle, a tree, an animal, and the like, and the sample image may be a landscape image, and the embodiment of the present disclosure does not limit this.
The countermeasure generation network may be constructed based on various types of generative adversarial networks (GANs); the main structure of a GAN includes a generator G and a discriminator D.
For the embodiment of the disclosure, as shown in fig. 2, a generation network is defined as the generator G. The network uses a down-sampling and up-sampling processing manner, which can better represent the latent features in a sample image. Specifically, in each training, in the down-sampling stage, the generation network extracts image features from the sample image; in practical application, the image features may be expressed in the form of feature vectors. In the up-sampling stage, the generation network performs target color transformation processing in the process of generating an image from the image features, so that the generation network learns the capability of generating an image whose target area has the target color. For example, if the target color of the training is yellow, the generation network performs a yellowing process on the image in the up-sampling stage and outputs a generated image in which the target area is yellow. In practical application, the generation network can be trained to transform into different colors according to the different target colors trained. In the embodiment of the disclosure, training one generation network for only one target color can improve the accuracy of the generation network. In addition, according to the different target regions trained, the generation network may be trained to have the capability of transforming the colors of different regions; a person skilled in the art may select the required target color and target region according to actual conditions for training, and the embodiment of the present disclosure is not limited herein.
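For illustration only, the following Python (PyTorch) sketch shows a generator of this down-sampling/up-sampling shape; the layer sizes, channel counts and the way the target-color attribute enters the up-sampling stage are assumptions made for exposition, not the architecture disclosed by this application:

```python
import torch
import torch.nn as nn

class ColorEffectGenerator(nn.Module):
    def __init__(self, in_ch=3, feat_ch=4):
        super().__init__()
        # Down-sampling stage: extract image features from the input image.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, 32, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, feat_ch, kernel_size=4, stride=2, padding=1),
        )
        # Up-sampling stage: one extra channel carries the attribute to be
        # interpolated (the target-color code); see the channel-insertion
        # example later in the text.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(feat_ch + 1, 32, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(32, in_ch, kernel_size=4, stride=2, padding=1),
            nn.Tanh(),
        )

    def forward(self, image, attr):
        feat = self.encoder(image)  # down-sampling result (image features)
        b, _, h, w = feat.shape
        attr_map = torch.full((b, 1, h, w), float(attr), device=feat.device)
        generated = self.decoder(torch.cat([feat, attr_map], dim=1))
        return generated, feat      # generated image, image features
```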
For the embodiment of the present disclosure, as shown in fig. 2, two discriminators D are also defined: a first discrimination network and a second discrimination network. The first discrimination network is used for discriminating the color authenticity of the image features during each training, where the discrimination result should be associated with the color information of the sample image. For example, if the input to the generation network in training is a sample image whose target area is non-yellow, the first discrimination network discriminates a non-yellow target area of the image features as true and a yellow one as false. The second discrimination network is used for discriminating the authenticity of images.
In the embodiment of the present disclosure, in order to enable the second discrimination network to determine the authenticity of an image while paying attention only to the color transformation of the target area, without interference from the background or other objects, and thereby improve through countermeasure training the ability of the generation network to change the color of the target area, the countermeasure generation network further includes a target area segmentation function, as shown in fig. 2, which can be directly executed by a corresponding method module in the countermeasure generation network. Specifically, a single-channel probability map (mask) needs to be obtained for the target region, to represent the relationship between each pixel in the image and the target region. Before the generated image is fed into the second discrimination network, the corresponding generated image and sample image are mixed according to the obtained probability map to obtain a mixed image. That is, the second discrimination network discriminates the authenticity of the sample image and of the mixed image, i.e., whether each is true (Real) or false (Fake).
For the disclosed embodiment, as shown in fig. 2, a classifier, i.e. a first classification network, is also defined to assist training. During each training, the first classification network classifies the generated images to obtain a classification result of the target color. As an example, if the target color of the training is yellow, the classification result obtained by the first classification network for the generated image should be yellow.
In the embodiment of the present disclosure, the confrontation training in step S220 may specifically adopt the following procedures:
initializing a network parameter of a generated network, a network parameter of a first discrimination network, a network parameter of a second discrimination network and a network parameter of a first classification network.
Countermeasure training is performed based on the m sample images {a1, a2, …, am}, the m image features {z1, z2, …, zm} derived from the generation network, the m generated images (denoted here â1, â2, …, âm), and the m mixed images (denoted ã1, ã2, …, ãm).
Training a first discrimination network to distinguish the real color of the image feature in the target area as accurately as possible; the training of the generation network is to make the image feature far from the color information of the target area as much as possible, which is equivalent to making the first discrimination network erroneously discriminate the color of the target area of the image feature as much as possible. And training a second discrimination network to distinguish as accurately as possible between real samples (sample images) and generated samples (mixed images); training the generating network to blur the difference between the generated sample (mixed image) and the real sample (sample image) as much as possible also means that the second discrimination network discriminates erroneously as much as possible. The first classification network is trained to classify the image as accurately as possible. That is, the generation network can improve the generation capability in the course of the countermeasure training, the first discrimination network and the second discrimination network can improve the discrimination capability, and the first classification network can improve the classification capability.
After multiple updating iterations, the final ideal situation is that the first judging network cannot judge the real color of the image feature in the target area, and the second judging network cannot judge whether the sample is the generated sample or the real sample.
When the generation capability of the generation network reaches an ideal state through countermeasure training, the trained generation network is determined as the target area color-changing special effect network, and a good color-changing special effect on the target area of an image can be realized.
According to the image processing method provided by the embodiment of the disclosure, a pre-constructed countermeasure generation network is adopted when training the target area color-changing special effect network; countermeasure training is performed jointly on the image-feature level and the generated-image level of the sample images, and the target area is segmented through the probability map. The color-changing capability of the generation network in the countermeasure generation network for the target area of the image can thereby be effectively improved, without the color-changing effect affecting regions outside the target area. When the trained generation network is used as the target area special effect network to perform target color conversion on the image to be processed and obtain the target area color-changed image, the color conversion can be locked to the specific area of the image, providing a brand-new special effect experience for the user.
In the embodiment of the present disclosure, a feasible implementation manner is provided for the target color transformation processing in the up-sampling stage of the generation network, which specifically includes the following step: performing up-sampling interpolation on the image features of each sample image according to a preset second attribute to be interpolated of the target color.
In this embodiment, a person skilled in the art may set the second attribute to be interpolated of the target color according to actual conditions; for example, if the target color of the training is yellow, the interpolation attribute of yellow may be set to 1 and the interpolation attribute of non-yellow to 0. Because the generation network automatically learns the association between the interpolation attribute and the color attribute of the target area during countermeasure training, the preset interpolation attribute can be used to characterize the target color after training is completed.
In the embodiment of the present disclosure, the second attribute to be interpolated may be inserted in the upsampling process in a form of a channel.
Specifically, during up-sampling, one channel is added to the original channels of the image features and filled with the second attribute to be interpolated.
Taking an interpolation attribute of 1 for the target color as an example, assuming the image features comprise an 8 × 8 feature map with 4 channels (i.e., of size 4 × 8 × 8), up-sampling inserts one channel filled with 1s, finally giving an 8 × 8 feature map with 5 channels (i.e., of size 5 × 8 × 8).
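A minimal sketch of this channel insertion (a leading batch dimension is added to the sizes quoted above):

```python
import torch

feat = torch.randn(1, 4, 8, 8)   # image features: 4-channel 8 x 8 feature map
attr = torch.ones(1, 1, 8, 8)    # one extra channel filled with the target-color
                                 # interpolation attribute (here 1)
feat_with_attr = torch.cat([feat, attr], dim=1)
print(feat_with_attr.shape)      # torch.Size([1, 5, 8, 8])
```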
Those skilled in the art will understand that the color information of the sample image in the target area is hidden in the original channels (i.e., in the 4-channel 8 × 8 feature map of the above example). Through countermeasure training of the generation network and the first discrimination network, the color information of the target area can be removed from the original channels in the down-sampling stage, yielding image features whose target area carries no color information; the second attribute to be interpolated, which the generation network has learned to characterize the target color, is then inserted in the up-sampling stage, so that the color generated for the target area of the image is determined by the second attribute to be interpolated.
In the embodiment of the present disclosure, a feasible implementation manner is provided for the target region segmentation function in the countermeasure generation network. Specifically, the process of determining the probability map of the target region corresponding to each sample image includes: acquiring each probability map, where each probability map is obtained by performing image segmentation processing on the corresponding sample image through a pre-trained target region segmentation network.
In this embodiment, the target area segmentation network is pre-trained and can identify the target area. For example, if the object to change color is hair, a hair segmentation network can be trained to identify hair regions. Each sample image is input into the trained target region segmentation network for image segmentation processing, and the corresponding probability map is output.
In practical application, the target area segmentation network may be independent of the countermeasure generation network; after it outputs the corresponding probability maps offline or in real time, they are directly acquired and used by the countermeasure generation network. Alternatively, the target area segmentation network may belong to the countermeasure generation network, with the corresponding probability maps used in real time by other modules in the countermeasure generation network after being output.
The output probability map may indicate whether each pixel in the corresponding image (including the sample image and the generated image) belongs to the target region. As an example, the probability map may be a binary image composed of 0s and 1s, with the 1-valued region being the target region and the 0-valued region being the other regions of the image outside the target region. The countermeasure generation network then mixes the corresponding sample image and generated image according to each probability map: for the 1-valued region of the probability map, the corresponding pixels of the generated image are taken, and for the 0-valued region, the corresponding pixels of the sample image are taken, forming a complete mixed image.
Alternatively, the output probability map may represent the probability that each pixel in the corresponding image belongs to the target region. As an example, each pixel in the probability map corresponds to a value between 0 and 1 inclusive, and the closer the value is to 1, the higher the probability that the pixel belongs to the target region. The countermeasure generation network then mixes the corresponding sample image and generated image according to each probability map, i.e., it mixes the values of corresponding pixels in the sample image and the generated image based on the probability value of each pixel in the probability map. In short, for each pixel, the following calculation gives the corresponding pixel of the mixed image:
z=p*x+(1-p)*y
where p represents the probability value for any pixel, x represents the pixel in the generated image, y represents the pixel in the sample image, and z represents the pixel in the blended image.
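As an illustrative sketch, the per-pixel mixing can be written as follows; a probability map containing only 0s and 1s reduces it to the hard pixel selection described earlier:

```python
import torch

def blend(generated, sample, prob_map):
    # z = p*x + (1-p)*y, computed per pixel; the single-channel probability
    # map broadcasts over the color channels.
    return prob_map * generated + (1.0 - prob_map) * sample

generated = torch.rand(1, 3, 8, 8)  # x: pixels of the generated image
sample = torch.rand(1, 3, 8, 8)     # y: pixels of the sample image
prob_map = torch.rand(1, 1, 8, 8)   # p: probability that each pixel is in the target area
mixed = blend(generated, sample, prob_map)
```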
In the embodiment of the present disclosure, another possible implementation manner is provided for the target region segmentation function in the countermeasure generation network. Specifically, as shown in fig. 3, the countermeasure generation network further includes an attention network, and image segmentation processing is performed on each sample image through the attention network to obtain the corresponding probability map.
Unlike the target region segmentation network in the previous embodiment, the attention network in this embodiment is trained in an unsupervised mode; that is, during training of the countermeasure generation network, the attention network automatically learns to perform image segmentation processing on each sample image and outputs the corresponding probability map.
Similarly, the probability map output by the attention network may indicate whether each pixel in the corresponding image belongs to the target region, or may represent the probability that each pixel belongs to the target region. For the specific representation and calculation manners, reference may be made to the description of the target area segmentation network implementation, which is not repeated here.
After the mixed image is obtained through any embodiment, the mixed image can be put into a second judgment network to judge the authenticity of the sample.
In the embodiment of the disclosure, a corresponding loss function is provided for the countermeasure training process, so as to better optimize the countermeasure generation network in the training process.
Specifically, step S220 includes the steps of:
step S221: determining real color loss and false color loss corresponding to corresponding image features according to the first discrimination result corresponding to each sample image;
since the first discrimination network needs to determine the colors of the m image features in the target region as true (the true probability is 1 when the color information of the sample image in the target region is the same), but in the actual training process, the probability that the color of each image feature in the target region is determined as true by the first discrimination network may not be 1, at this time, a countermeasure loss may be determined based on the determination of the true and false probabilities of the colors of the image feature target region, which is defined as the true color loss corresponding to the image feature in the embodiment of the present disclosure, and for convenience of description, the true color loss corresponding to the image feature is hereinafter abbreviated as L3_ loss 1.
Since the generation network needs to make the image features as far from the color information of the target area as possible, which is equivalent to making the first discrimination network erroneously discriminate the color of the target area of the image features as much as possible and determine the colors of the m image features' target areas as false, a countermeasure loss may be determined based on this (erroneous) judgment of the true/false probability of the color of the image features' target region caused by the generation network. It is defined in the embodiment of the present disclosure as the false color loss corresponding to the image features and is hereinafter abbreviated as L3_loss2.
Step S222: determining a true sample loss corresponding to the corresponding sample image, a false sample true loss corresponding to the mixed image and a false sample false loss corresponding to the mixed image according to a second judgment result corresponding to each sample image;
In the embodiment of the present disclosure, because the second discrimination network needs to judge all m sample images as true samples (real samples, with a true probability of 1), but in the actual training process the probability that each sample image is judged as true by the second discrimination network may not be 1, a countermeasure loss may be determined based on this judgment of the true/false probability of the sample images. It is defined as the real sample loss corresponding to the sample image and is hereinafter abbreviated as L2_loss1.
Since the second discrimination network needs to judge all m mixed images as false samples (i.e., generated samples, with a true probability of 0), but in the actual training process the probability that each mixed image is judged as true may not be 0, a countermeasure loss may be determined based on this judgment of the true/false probability of the mixed images. It is defined in the embodiment of the present disclosure as the false sample true loss corresponding to the mixed image and is hereinafter abbreviated as L2_loss2.
Since the generation network needs to blur the difference between the generated samples (mixed images) and the real samples (sample images) as much as possible, i.e., to make the second discrimination network err as much as possible and judge all m mixed images as real samples, a countermeasure loss may be determined based on this (erroneous) judgment of the true/false probability of the mixed images caused by the generation network. It is defined in the embodiment of the present disclosure as the false sample false loss corresponding to the mixed image and is hereinafter abbreviated as L2_loss3.
In practical applications, all three losses can be calculated based on the least squares loss function, but are not limited thereto.
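As a sketch, least-squares discrimination losses of this kind can take the following form; the labels 1 for true and 0 for false, and the discriminator handle D2 in the comments, are illustrative assumptions:

```python
import torch

def ls_loss_true(pred):
    # pushes discriminator outputs toward the "true" label 1
    return ((pred - 1.0) ** 2).mean()

def ls_loss_false(pred):
    # pushes discriminator outputs toward the "false" label 0
    return (pred ** 2).mean()

# e.g. L2_loss1 = ls_loss_true(D2(sample_images))
#      L2_loss2 = ls_loss_false(D2(mixed_images))
#      L2_loss3 = ls_loss_true(D2(mixed_images))  # generator-side loss
```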
Step S223: determining corresponding target color classification loss according to the classification result corresponding to each sample image;
since the first classification network needs to generate image classification as accurately as possible, a classification loss is generated, and is defined as a target color classification loss in the embodiment of the present disclosure, and for convenience of description, the target color classification loss is abbreviated as L4_ loss hereinafter.
In practical applications, the loss can be calculated based on a cross-entropy loss function, but is not limited thereto.
Step S224: determining an image loss between each sample image and the corresponding generated image;
Wherein, as is clear to a person skilled in the art, since the generated image is obtained by down-sampling and up-sampling the corresponding sample image, each sample image and its corresponding generated image have the same pixel size; e.g., a1 and â1 are of the same size. However, in the actual training process, the content of each sample image differs from that of the corresponding generated image. Corresponding pixels of each sample image and its generated image can therefore be compared one by one to determine the difference value of each pixel, and the image loss between the sample image and the generated image is determined according to these difference values.
In one possible implementation, the difference values of each pixel are summed to obtain the image loss between the sample image and the generated image.
Hereinafter, for convenience of description, the image loss between the sample image and the generated image is abbreviated as L1_loss.
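A minimal sketch of such a pixel-wise image loss, summing the per-pixel absolute difference values as described above:

```python
import torch

def image_l1_loss(sample, generated):
    # compare the same pixels of both images one by one and sum the differences;
    # both tensors must have the same shape
    return (sample - generated).abs().sum()
```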
Step S225: and optimizing the countermeasure generation network according to the real color loss, the false color loss, the real sample loss, the false sample true loss, the false sample false loss, the target color classification loss and the image loss corresponding to each sample image.
In a possible implementation manner, the L1_loss, L2_loss1, L2_loss2, L2_loss3, L3_loss1, L3_loss2 and L4_loss of each training round are fused, for example by weighted fusion, addition, averaging or other fusion methods, to obtain the corresponding total loss. In this step, the countermeasure generation network is optimized according to the total loss of each training round, progressively approaching the best training effect.
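A minimal sketch of such a fusion by weighted summation; equal weights are an illustrative default, not values given by this application:

```python
def total_loss(l1_loss, l2_loss1, l2_loss2, l2_loss3,
               l3_loss1, l3_loss2, l4_loss, weights=None):
    losses = [l1_loss, l2_loss1, l2_loss2, l2_loss3,
              l3_loss1, l3_loss2, l4_loss]
    weights = weights or [1.0] * len(losses)  # equal weights by default
    return sum(w * l for w, l in zip(weights, losses))
```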
In addition, the training process provided by the embodiment of the present disclosure may further include the steps of: acquiring a pre-trained second classification network; and labeling the color information of the target area to the image data through a second classification network to obtain each sample image containing the color information of the target area.
The second classification network is trained on an open data set and serves as an evaluation criterion for distinguishing the color information of the image target area. The embodiment of the present disclosure does not limit the specific form of the color information; for example, 1 and 0 as in the above example may label the target color and non-target color, and other types of label information may also be used.
By using the second classification network to label image data from various sources with the color information of the target area, a large number of sample images containing the color information of the target area can be obtained for the generation network to process; they can also be used by the second discrimination network and/or the target area segmentation functional module.
In other embodiments, the second classification network may also be independent of the countermeasure generation network; after a large number of sample images containing the color information of the target area are obtained offline or in real time through the second classification network, they may be called by the countermeasure generation network.
Based on the foregoing embodiments of the present disclosure, an embodiment of the present disclosure further provides a training model, which includes a generation network, a mixing module, a first discrimination network, a second discrimination network and a first classification network. The generation network is used for down-sampling each sample image containing color information of the target area to obtain corresponding image features, and for up-sampling and target color transformation processing of the image features of each sample image to obtain a corresponding generated image; the mixing module is used for determining a probability map of the target area corresponding to each sample image, and mixing the corresponding sample image and generated image according to each probability map to obtain a corresponding mixed image; the first discrimination network is used for discriminating the color authenticity of the image features of each sample image to obtain a corresponding first discrimination result; the second discrimination network is used for discriminating the authenticity of each sample image and the corresponding mixed image to obtain a corresponding second discrimination result; and the first classification network is used for obtaining a classification result of the target color according to the generated image corresponding to each sample image.
as shown in fig. 4, the generation network is connected to the mixing module, the first discrimination network and the first classification network, and the mixing module is connected to the second discrimination network, so that the anti-biotic network is subjected to the countertraining based on the first discrimination result, the second discrimination result, the classification result and the generated image corresponding to each sample image, and the trained generation network is obtained.
Further, a mixing module of the training model may include an attention network, and when the mixing module is configured to determine a probability map of each sample image corresponding to the target region, the attention network is configured to process each sample image to obtain a corresponding probability map.
Alternatively, instead of the attention network, the training model may include a trained target region segmentation network together with the mixing module, where the probability map output by the target region segmentation network is provided to the mixing module for subsequent processing.
Furthermore, the training model may further include a second classification network, connected to the generation network, for labeling the image data with the color information of the target area, and obtaining sample images each containing the color information of the target area. In practical applications, the second classification network may be further connected to the mixing module and/or the second discrimination network.
The implementation principle and the resulting technical effect of the training model provided in the embodiments of the present disclosure are the same as those of the countermeasure generation network in the foregoing method embodiments. For brevity, where this embodiment does not mention a detail, reference may be made to the corresponding contents of the foregoing method embodiments, which are not repeated here.
Based on the above embodiments of the present disclosure, in the embodiments of the present disclosure, the processing instruction of the color-changing special effect of the target area may be issued through an operation of a user on the terminal device. The terminal devices include, but are not limited to, mobile terminals, smart terminals, and the like, such as mobile phones, smart phones, tablet computers, notebook computers, personal digital assistants, portable multimedia players, navigation devices, and the like. It will be understood by those skilled in the art that the configuration according to the embodiments of the present disclosure can be applied to a fixed type terminal such as a digital television, a desktop computer, etc., in addition to elements particularly used for mobile purposes.
In the embodiment of the present disclosure, the execution subject of the method may be the terminal device or an application installed on the terminal device. Specifically, after receiving a processing instruction of the color-changing special effect of the target area, in step S110, a to-be-processed image corresponding to the processing instruction is acquired.
In addition, in step S110, the target area color-changing special effect network obtained by the training steps provided in any of the above embodiments of the present disclosure and the first attribute to be interpolated corresponding to that network are also acquired, in order to apply the target area color-changing special effect to the image to be processed. The target area color-changing special effect network has a corresponding target color; that is, the first attribute to be interpolated is the first attribute to be interpolated of the target color corresponding to the target area color-changing special effect network. For how the attribute to be interpolated is set, reference may be made to the description of the training phase above, which is not repeated here. Specifically, the image to be processed is down-sampled through the target area color-changing special effect network to obtain a down-sampling result; as explained above, this down-sampling result is the image features stripped of the color information. The down-sampling result is then up-sampled and interpolated according to the first attribute to be interpolated corresponding to the target area color-changing special effect network, yielding the target area color-changing special effect image. For the manner of up-sampling and interpolation, reference may be made to the description of the generation network, which is not repeated here.
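Continuing the hypothetical generator sketch from the training section, the application stage reduces to a single forward pass; the image size and attribute value are illustrative:

```python
import torch

effect_net = ColorEffectGenerator()  # the hypothetical generator sketched earlier,
effect_net.eval()                    # standing in for the trained special effect network
image = torch.rand(1, 3, 64, 64)     # image to be processed
with torch.no_grad():
    special_effect_image, _ = effect_net(image, attr=1.0)  # attr: first attribute to be interpolated
```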
In practical applications, the processing instruction may include sequentially changing the target area to a plurality of target colors, for example sequentially changing the hair of a face image to yellow, blue, red, and the like. In this case, the target area color-changing special effect networks trained for the different target colors can be acquired and applied to the image to be processed respectively, obtaining target area color-changed images of the various colors, which are then displayed in sequence.
Further, after obtaining the color-changing image of the target area, the method can further comprise the following steps: and displaying the color-changing special effect image of the target area on a display screen.
Alternatively, the execution subject of the method may be a server. After receiving a processing instruction for the target area color-changing special effect sent by a terminal device, the server, similarly to the terminal device, receives the image to be processed corresponding to the processing instruction, acquires the target area color-changing special effect network obtained by the training steps provided in any embodiment of the present disclosure, down-samples the image to be processed through the target area color-changing special effect network to obtain a down-sampling result, and then up-samples and interpolates the down-sampling result according to the first attribute to be interpolated of the target color to obtain the target area color-changing special effect image. The server then needs to send the target area color-changing special effect image to the terminal device for display.
It should be noted that the terms first attribute to be interpolated and second attribute to be interpolated only distinguish the interpolation attributes of the training phase and the application phase, and are not to be construed as limiting the interpolation attribute. In practical application, the expression forms of the first and second attributes to be interpolated are determined by the setting of the generation network.
In the embodiment of the present disclosure, the type of the image to be processed should be the same as or similar to the type of the sample image, and reference may be specifically made to the description of the sample image, which is not described herein again.
In practical applications, there may be one or more images to be processed. When there are multiple images to be processed, they may also constitute a video to be processed; each frame of the video is processed by the above image processing method to obtain a target area color-changing special effect video.
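A minimal sketch of this per-frame video variant, using OpenCV for decoding and encoding; apply_color_effect is a hypothetical stand-in for running one frame through the target area color-changing special effect network:

```python
import cv2

def process_video(in_path, out_path, apply_color_effect):
    cap = cv2.VideoCapture(in_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        writer.write(apply_color_effect(frame))  # apply the special effect frame by frame
    cap.release()
    writer.release()
```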
It can be understood by those skilled in the art that the obtained color-changing special effect image of the target area and the image to be processed are the same image with different colors of the target area, for example, the same face image with different hair colors. In practical application, the special effect image obtained by the image processing method provided by the embodiment of the disclosure has higher definition and sharpening degree than the original image (to-be-processed image).
The embodiment of the present disclosure also provides an image processing apparatus, as shown in fig. 5, the image processing apparatus 50 may include: an acquisition module 510 and a special effects processing module 520, wherein,
the obtaining module 510 is configured to obtain an image to be processed, a pre-trained color-changing special effect network in a target area, and a first attribute to be interpolated corresponding to the color-changing special effect network in the target area;
the special effect processing module 520 is configured to perform downsampling on the image to be processed through the target area color-changing special effect network to obtain a downsampling result of the image to be processed, and perform upsampling and interpolation on the downsampling result according to the first attribute to be interpolated to obtain the target area color-changing special effect image.
In an alternative implementation manner, the color-changing special effect network of the target area is obtained by training through the following steps:
acquiring a pre-constructed countermeasure generating network, wherein the countermeasure generating network comprises a generating network, a first judging network, a second judging network and a first classifying network;
through a generation network, performing down-sampling on each sample image containing color information of a target area to obtain corresponding image characteristics, and performing up-sampling and target color conversion on the image characteristics of each sample image to obtain a corresponding generated image;
determining a probability map of each sample image corresponding to the target area, and mixing the corresponding sample image and the corresponding generated image according to each probability map to obtain a corresponding mixed image;
judging the color reality of the image characteristics of each sample image through a first judging network to obtain a corresponding first judging result;
judging the authenticity of each sample image and the corresponding mixed image through a second judging network to obtain a corresponding second judging result;
obtaining a classification result of the target color according to the generated image corresponding to each sample image through a first classification network;
and performing countermeasure training on the countermeasure generation network based on the first judgment result, the second judgment result, the classification result and the generated image corresponding to each sample image, and determining the trained generation network as the target area color-changing special effect network.
In an alternative implementation, the process of determining the probability map of the target region corresponding to each sample image includes any one of the following:
obtaining each probability map, wherein each probability map is obtained by performing image segmentation processing on a corresponding sample image through a pre-trained target region segmentation network;
the countermeasure generating network also comprises an attention network, and each sample image is subjected to image segmentation processing through the attention network to obtain a corresponding probability map.
In an alternative implementation, the process of mixing the corresponding sample image and the corresponding generated image according to each probability map includes:
the values of the respective pixels in the respective sample image and the corresponding generated image are blended based on the probability values of the respective pixels in each probability map.
In an alternative implementation, the process of upsampling and transforming the image features of each sample image to a target color includes:
and performing up-sampling interpolation on the image characteristics of each sample image according to a second attribute to be interpolated of the preset target color.
In an optional implementation manner, the process of performing countermeasure training on the countermeasure generation network based on the first determination result, the second determination result, the classification result, and the generated image corresponding to each sample image includes:
determining real color loss and false color loss corresponding to corresponding image features according to the first discrimination result corresponding to each sample image;
determining a true sample loss corresponding to the corresponding sample image, a false sample true loss corresponding to the mixed image and a false sample false loss corresponding to the mixed image according to a second judgment result corresponding to each sample image;
determining corresponding target color classification loss according to the classification result corresponding to each sample image;
determining an image loss between each sample image and the corresponding generated image;
and optimizing the countermeasure generation network according to the real color loss, the false color loss, the real sample loss, the false sample true loss, the false sample false loss, the target color classification loss and the image loss corresponding to each sample image.
In an optional implementation manner, the method further includes:
acquiring a pre-trained second classification network;
and labeling the color information of the target area to the image data through a second classification network to obtain each sample image containing the color information of the target area.
The image processing apparatus provided in the embodiment of the present disclosure may be specific hardware on a device, or software or firmware installed on a device. Its implementation principle and resulting technical effect are the same as those of the foregoing method embodiments; for brevity, where this apparatus embodiment does not mention a detail, reference may be made to the corresponding contents of the foregoing method embodiments, which are not repeated here.
For training of the color-changing special effect network of the target area, the embodiment of the present disclosure further provides a training device, where the training device may include: a network acquisition module and a network training module, wherein,
the network acquisition module is used for acquiring a pre-constructed countermeasure generation network, and the countermeasure generation network comprises a generation network, a mixing module, a first judgment network, a second judgment network and a first classification network;
the generation network is used for carrying out down-sampling on each sample image containing the color information of the target area to obtain corresponding image characteristics, and carrying out up-sampling and target color conversion processing on the image characteristics of each sample image to obtain a corresponding generated image;
the mixing module is used for determining a probability map of each sample image corresponding to the target area, and mixing the corresponding sample image and the corresponding generated image according to each probability map to obtain a corresponding mixed image;
the first discrimination network is used for discriminating the color authenticity of the image feature of each sample image to obtain a corresponding first discrimination result;
the second judging network is used for judging the authenticity of each sample image and the corresponding mixed image to obtain a corresponding second judging result;
the first classification network is used for obtaining a classification result of the target color according to the generated image corresponding to each sample image;
the network training module is used for carrying out countermeasure training on the countermeasure generation network based on the first judgment result, the second judgment result, the classification result and the generated image corresponding to each sample image, and determining the trained generation network as the target area color-changing special effect network.
In an optional implementation manner, the mixing module, when being configured to determine the probability map of the target region corresponding to each sample image, is specifically configured to:
obtaining each probability map, wherein each probability map is obtained by performing image segmentation processing on a corresponding sample image through a pre-trained target region segmentation network;
the mixing module further comprises an attention network, and image segmentation processing is carried out on each sample image through the attention network to obtain a corresponding probability map.
In an optional implementation manner, the mixing module, when configured to mix the corresponding sample image and the corresponding generated image according to each probability map, is specifically configured to:
the values of the respective pixels in the respective sample image and the corresponding generated image are blended based on the probability values of the respective pixels in each probability map.
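A compact sketch of this per-pixel blending, assuming float tensors (or NumPy arrays) and a probability map with values in [0, 1]; the names are illustrative:

def blend(sample, generated, prob):
    # sample, generated: (C, H, W) images; prob: (1, H, W) probability that
    # each pixel belongs to the target area. Target-area pixels take the
    # generated (recolored) value; all other pixels keep the sample's value.
    return prob * generated + (1.0 - prob) * sample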
In an optional implementation manner, the generation network, when configured to perform up-sampling and target color conversion processing on the image features of each sample image, is specifically configured to:
and performing up-sampling interpolation on the image characteristics of each sample image according to a second attribute to be interpolated of the preset target color.
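The disclosure does not spell out how the second attribute to be interpolated enters the up-sampling, so the following sketch is one plausible reading under that assumption: the target-color attribute is broadcast over the feature map and concatenated with it before bilinear up-sampling interpolation:

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttributeUpsampleBlock(nn.Module):
    # One decoder stage: inject the color attribute, up-sample by 2, convolve.
    def __init__(self, in_ch, attr_dim, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch + attr_dim, out_ch, kernel_size=3, padding=1)

    def forward(self, feat, attr):
        # feat: (N, in_ch, H, W) image features; attr: (N, attr_dim) attribute.
        n, _, h, w = feat.shape
        attr_map = attr.view(n, -1, 1, 1).expand(n, attr.shape[1], h, w)
        x = torch.cat([feat, attr_map], dim=1)  # attribute injection
        x = F.interpolate(x, scale_factor=2, mode='bilinear', align_corners=False)
        return F.relu(self.conv(x))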
In an optional implementation manner, the network training module, when configured to perform countermeasure training on the countermeasure generation network based on the first discrimination result, the second discrimination result, the classification result, and the generated image corresponding to each sample image, is specifically configured to (see the sketch after this list):
determining real color loss and false color loss corresponding to corresponding image features according to the first discrimination result corresponding to each sample image;
determining a true sample loss corresponding to the corresponding sample image, a false sample true loss corresponding to the mixed image and a false sample false loss corresponding to the mixed image according to the second discrimination result corresponding to each sample image;
determining corresponding target color classification loss according to the classification result corresponding to each sample image;
determining an image loss between each sample image and the corresponding generated image;
and optimizing the countermeasure generation network according to the real color loss, the false color loss, the real sample loss, the false sample real loss, the false sample false loss, the target color classification loss and the image loss corresponding to each sample image.
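As a hedged illustration of how these seven terms could be combined, the sketch below uses binary cross-entropy for the adversarial terms, cross-entropy for the target color classification loss, and an L1 image loss; the exact loss forms and the weight lambda_img are assumptions, since the disclosure only names the terms:

import torch
import torch.nn.functional as F

def discriminator_objective(real_color, false_color, true_sample, false_sample):
    # Real/false color losses from the first discrimination network, plus the
    # true sample loss and false sample false loss from the second one. All
    # arguments are raw logits.
    return (F.binary_cross_entropy_with_logits(real_color, torch.ones_like(real_color))
            + F.binary_cross_entropy_with_logits(false_color, torch.zeros_like(false_color))
            + F.binary_cross_entropy_with_logits(true_sample, torch.ones_like(true_sample))
            + F.binary_cross_entropy_with_logits(false_sample, torch.zeros_like(false_sample)))

def generator_objective(false_sample, cls_logits, target_color, sample, generated,
                        lambda_img=10.0):
    # False sample true loss (the mixed image should be judged real), target
    # color classification loss, and image loss between sample and generation.
    false_sample_true = F.binary_cross_entropy_with_logits(
        false_sample, torch.ones_like(false_sample))
    color_cls = F.cross_entropy(cls_logits, target_color)
    image_loss = F.l1_loss(generated, sample)
    return false_sample_true + color_cls + lambda_img * image_loss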
In an optional implementation manner, the network obtaining module is further configured to:
acquiring a pre-trained second classification network;
and labeling the color information of the target area to the image data through a second classification network to obtain each sample image containing the color information of the target area.
The training apparatus provided in the embodiments of the present disclosure may be specific hardware on the device, or software or firmware installed on the device, etc., and the implementation principle and the generated technical effect are the same as those of the foregoing method embodiments.
Referring now to FIG. 6, a schematic diagram of an electronic device 60 suitable for use in implementing embodiments of the present disclosure is shown. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., car navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like. The electronic device shown in FIG. 6 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiments of the present disclosure.
The electronic device includes: a memory and a processor, wherein the processor may be referred to as the processing device 601 hereinafter, and the memory may include at least one of a Read Only Memory (ROM) 602, a Random Access Memory (RAM) 603 and a storage device 608 hereinafter, which are specifically shown as follows:
As shown in FIG. 6, the electronic device 60 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic apparatus 60 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 60 to communicate with other devices wirelessly or by wire to exchange data. While FIG. 6 illustrates an electronic device 60 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to perform the image processing method shown in any of the above embodiments of the present disclosure.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules or units described in the embodiments of the present disclosure may be implemented by software or hardware. The designation of a module or unit does not, in some cases, constitute a limitation of the unit itself.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Example 1 provides, according to one or more embodiments of the present disclosure, an image processing method including:
acquiring an image to be processed, a pre-trained target area color-changing special effect network and a first attribute to be interpolated corresponding to the target area color-changing special effect network;
and performing down-sampling on the image to be processed through the target area color-changing special effect network to obtain a down-sampling result of the image to be processed, and performing up-sampling interpolation on the down-sampling result according to the first attribute to be interpolated to obtain the target area color-changing special effect image.
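At inference time the trained network therefore reduces to a single forward pass. A minimal sketch, with hypothetical encode/decode names for the down-sampling and up-sampling halves of the network:

import torch

@torch.no_grad()
def apply_color_special_effect(network, image, first_attr):
    # image: (1, 3, H, W) pre-processed input; first_attr: (1, D) first
    # attribute to be interpolated, encoding the target color.
    features = network.encode(image)             # down-sampling result
    return network.decode(features, first_attr)  # up-sampling interpolation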
In an alternative implementation manner, the color-changing special effect network of the target area is obtained by training through the following steps:
acquiring a pre-constructed countermeasure generation network, wherein the countermeasure generation network comprises a generation network, a first discrimination network, a second discrimination network and a first classification network;
through a generation network, performing down-sampling on each sample image containing color information of a target area to obtain corresponding image characteristics, and performing up-sampling and target color conversion on the image characteristics of each sample image to obtain a corresponding generated image;
determining a probability map of each sample image corresponding to the target area, and mixing the corresponding sample image and the corresponding generated image according to each probability map to obtain a corresponding mixed image;
discriminating the color authenticity of the image features of each sample image through the first discrimination network to obtain a corresponding first discrimination result;
discriminating the authenticity of each sample image and the corresponding mixed image through the second discrimination network to obtain a corresponding second discrimination result;
obtaining a classification result of the target color according to the generated image corresponding to each sample image through a first classification network;
and performing countermeasure training on the countermeasure generation network based on the first discrimination result, the second discrimination result, the classification result and the generated image corresponding to each sample image, and determining the trained generation network as the color-changing special effect network of the target area.
In an alternative implementation, determining a probability map of each sample image corresponding to the target region includes any one of:
obtaining each probability map, wherein each probability map is obtained by performing image segmentation processing on a corresponding sample image through a pre-trained target region segmentation network;
the countermeasure generation network also comprises an attention network, and each sample image is subjected to image segmentation processing through the attention network to obtain a corresponding probability map.
In an alternative implementation, mixing the respective sample image and the corresponding generated image according to each probability map includes:
the values of the respective pixels in the respective sample image and the corresponding generated image are blended based on the probability values of the respective pixels in each probability map.
In an alternative implementation, the up-sampling and target color conversion processing on the image features of each sample image includes:
and performing up-sampling interpolation on the image characteristics of each sample image according to a second attribute to be interpolated of the preset target color.
In an optional implementation manner, performing countermeasure training on the countermeasure generation network based on the first discrimination result, the second discrimination result, the classification result and the generated image corresponding to each sample image includes:
determining real color loss and false color loss corresponding to corresponding image features according to the first discrimination result corresponding to each sample image;
determining a true sample loss corresponding to the corresponding sample image, a false sample true loss corresponding to the mixed image and a false sample false loss corresponding to the mixed image according to the second discrimination result corresponding to each sample image;
determining corresponding target color classification loss according to the classification result corresponding to each sample image;
determining an image loss between each sample image and the corresponding generated image;
and optimizing the countermeasure generation network according to the real color loss, the false color loss, the real sample loss, the false sample real loss, the false sample false loss, the target color classification loss and the image loss corresponding to each sample image.
In an optional implementation, the method further includes:
acquiring a pre-trained second classification network;
and labeling the color information of the target area to the image data through a second classification network to obtain each sample image containing the color information of the target area.
Example 2 provides, according to one or more embodiments of the present disclosure, an image processing apparatus, the apparatus including:
the acquisition module is used for acquiring an image to be processed, a pre-trained color-changing special effect network of a target area and a first attribute to be interpolated corresponding to the color-changing special effect network of the target area;
and the special effect processing module is used for carrying out down-sampling on the image to be processed through the target area color-changing special effect network to obtain a down-sampling result of the image to be processed, and carrying out up-sampling interpolation on the down-sampling result according to the first attribute to be interpolated to obtain the target area color-changing special effect image.
In an alternative implementation manner, the color-changing special effect network of the target area is obtained by training through the following steps:
acquiring a pre-constructed countermeasure generation network, wherein the countermeasure generation network comprises a generation network, a first discrimination network, a second discrimination network and a first classification network;
through a generation network, performing down-sampling on each sample image containing color information of a target area to obtain corresponding image characteristics, and performing up-sampling and target color conversion on the image characteristics of each sample image to obtain a corresponding generated image;
determining a probability map of each sample image corresponding to the target area, and mixing the corresponding sample image and the corresponding generated image according to each probability map to obtain a corresponding mixed image;
discriminating the color authenticity of the image features of each sample image through the first discrimination network to obtain a corresponding first discrimination result;
discriminating the authenticity of each sample image and the corresponding mixed image through the second discrimination network to obtain a corresponding second discrimination result;
obtaining a classification result of the target color according to the generated image corresponding to each sample image through a first classification network;
and performing countermeasure training on the countermeasure generation network based on the first discrimination result, the second discrimination result, the classification result and the generated image corresponding to each sample image, and determining the trained generation network as the color-changing special effect network of the target area.
In an alternative implementation, the process of determining the probability map of the target region corresponding to each sample image includes any one of the following:
obtaining each probability map, wherein each probability map is obtained by performing image segmentation processing on a corresponding sample image through a pre-trained target region segmentation network;
the countermeasure generation network also comprises an attention network, and each sample image is subjected to image segmentation processing through the attention network to obtain a corresponding probability map.
In an alternative implementation, the process of mixing the corresponding sample image and the corresponding generated image according to each probability map includes:
the values of the respective pixels in the respective sample image and the corresponding generated image are blended based on the probability values of the respective pixels in each probability map.
In an alternative implementation, the process of performing up-sampling and target color conversion processing on the image features of each sample image includes:
and performing up-sampling interpolation on the image characteristics of each sample image according to a second attribute to be interpolated of the preset target color.
In an optional implementation manner, the process of performing countermeasure training on the countermeasure generation network based on the first discrimination result, the second discrimination result, the classification result, and the generated image corresponding to each sample image includes:
determining real color loss and false color loss corresponding to corresponding image features according to the first discrimination result corresponding to each sample image;
determining a true sample loss corresponding to the corresponding sample image, a false sample true loss corresponding to the mixed image and a false sample false loss corresponding to the mixed image according to the second discrimination result corresponding to each sample image;
determining corresponding target color classification loss according to the classification result corresponding to each sample image;
determining an image loss between each sample image and the corresponding generated image;
and optimizing the countermeasure generation network according to the real color loss, the false color loss, the real sample loss, the false sample real loss, the false sample false loss, the target color classification loss and the image loss corresponding to each sample image.
In an optional implementation manner, the method further includes:
acquiring a pre-trained second classification network;
and labeling the color information of the target area to the image data through a second classification network to obtain each sample image containing the color information of the target area.
Example 3 provides, in accordance with one or more embodiments of the present disclosure, an electronic device comprising:
a processor and a memory storing at least one instruction, at least one program, set of codes, or set of instructions that is loaded and executed by the processor to implement a method as shown in example 1 or any of the alternative implementations of example 1 of the present disclosure.
Example 4 provides a computer readable medium for storing a computer instruction, program, code set or instruction set which, when run on a computer, causes the computer to perform a method as shown in example 1 or any one of the alternative implementations of example 1 of the present disclosure.
The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to the particular combination of features described above, but also encompasses other embodiments formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure. For example, a technical solution may be formed by replacing the above features with (but not limited to) features having similar functions disclosed in this disclosure.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims (9)

1. An image processing method, comprising:
acquiring an image to be processed, a pre-trained color-changing special effect network of a target area and a first attribute to be interpolated corresponding to the color-changing special effect network of the target area, wherein the pre-training comprises training aiming at a sample image containing the target area;
performing down-sampling on the image to be processed through the target area color-changing special effect network to obtain a down-sampling result of the image to be processed, and performing up-sampling interpolation on the down-sampling result according to the first attribute to be interpolated to obtain the target area color-changing special effect image;
the color-changing special effect network of the target area is obtained by training through the following steps:
acquiring a pre-constructed countermeasure generation network, wherein the countermeasure generation network comprises a generation network, a first discrimination network, a second discrimination network and a first classification network;
performing down-sampling, through the generation network, on the sample images containing the color information of the target area to obtain corresponding image characteristics, and performing up-sampling and target color conversion processing on the image characteristics of each sample image to obtain a corresponding generated image;
determining a probability map of each sample image corresponding to the target area, and mixing the corresponding sample image and the corresponding generated image according to each probability map to obtain a corresponding mixed image;
discriminating the color authenticity of the image characteristics of each sample image through the first discrimination network to obtain a corresponding first discrimination result;
discriminating the authenticity of each sample image and the corresponding mixed image through the second discrimination network to obtain a corresponding second discrimination result;
obtaining a classification result of the target color according to the generated image corresponding to each sample image through the first classification network;
performing countermeasure training on the countermeasure generation network based on the first discrimination result, the second discrimination result, the classification result and the generated image corresponding to each sample image, and determining the trained generation network as the target area color-changing special effect network.
2. The method according to claim 1, wherein the determining the probability map of each sample image corresponding to the target region comprises any one of:
obtaining each probability graph, wherein each probability graph is obtained by performing image segmentation processing on a corresponding sample image through a pre-trained target region segmentation network;
the countermeasure generation network further comprises an attention network, and each sample image is subjected to image segmentation processing through the attention network to obtain a corresponding probability map.
3. The image processing method according to claim 1, wherein the mixing of the respective sample image and the corresponding generated image according to each probability map comprises:
and mixing the values of the corresponding pixels in the corresponding sample image and the corresponding generated image based on the probability values of the pixels in each probability map.
4. The image processing method according to claim 1, wherein the up-sampling and target color conversion processing on the image features of each sample image comprises:
and performing up-sampling interpolation on the image characteristics of each sample image according to a second attribute to be interpolated of a preset target color.
5. The image processing method according to claim 1, wherein the performing countermeasure training on the countermeasure generation network based on the first discrimination result, the second discrimination result, the classification result, and the generated image corresponding to each sample image includes:
determining real color loss and false color loss corresponding to corresponding image features according to the first discrimination result corresponding to each sample image;
determining a true sample loss corresponding to the corresponding sample image, a false sample true loss corresponding to the mixed image and a false sample false loss corresponding to the mixed image according to the second discrimination result corresponding to each sample image;
determining a corresponding target color classification loss according to the classification result corresponding to each sample image;
determining an image loss between each sample image and the corresponding generated image;
optimizing the countermeasure generation network according to the true color loss, the false color loss, the true sample loss, the false sample true loss, the false sample false loss, the target color classification loss and the image loss corresponding to each sample image.
6. The image processing method according to claim 1, further comprising:
acquiring a pre-trained second classification network;
and labeling the color information of the target area to the image data through the second classification network to obtain each sample image containing the color information of the target area.
7. An image processing apparatus characterized by comprising:
the device comprises an acquisition module, a pre-training module and a processing module, wherein the acquisition module is used for acquiring an image to be processed, a pre-trained color-changing special effect network of a target area and a first attribute to be interpolated corresponding to the color-changing special effect network of the target area, and the pre-training comprises training aiming at a sample image containing the target area; the special effect processing module is used for carrying out down-sampling on the image to be processed through the target area color-changing special effect network to obtain a down-sampling result of the image to be processed, and carrying out up-sampling interpolation on the down-sampling result according to the first attribute to be interpolated to obtain the target area color-changing special effect image;
the color-changing special effect network of the target area is obtained by training through the following steps:
acquiring a pre-constructed countermeasure generation network, wherein the countermeasure generation network comprises a generation network, a first discrimination network, a second discrimination network and a first classification network;
performing down-sampling, through the generation network, on the sample images containing the color information of the target area to obtain corresponding image characteristics, and performing up-sampling and target color conversion processing on the image characteristics of each sample image to obtain a corresponding generated image;
determining a probability map of each sample image corresponding to the target area, and mixing the corresponding sample image and the corresponding generated image according to each probability map to obtain a corresponding mixed image;
discriminating the color authenticity of the image characteristics of each sample image through the first discrimination network to obtain a corresponding first discrimination result;
discriminating the authenticity of each sample image and the corresponding mixed image through the second discrimination network to obtain a corresponding second discrimination result;
obtaining a classification result of the target color according to the generated image corresponding to each sample image through the first classification network;
performing countermeasure training on the countermeasure generation network based on the first discrimination result, the second discrimination result, the classification result and the generated image corresponding to each sample image, and determining the trained generation network as the target area color-changing special effect network.
8. An electronic device, comprising:
a processor and a memory storing at least one instruction, at least one program, a set of codes, or a set of instructions that is loaded and executed by the processor to implement the method of any of claims 1-6.
9. A computer readable medium for storing a computer instruction, a program, a set of codes, or a set of instructions, which when run on a computer, causes the computer to perform the method of any one of claims 1-6.
CN202010157922.9A 2020-03-09 2020-03-09 Image processing method, image processing device, electronic equipment and computer readable medium Active CN111369468B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010157922.9A CN111369468B (en) 2020-03-09 2020-03-09 Image processing method, image processing device, electronic equipment and computer readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010157922.9A CN111369468B (en) 2020-03-09 2020-03-09 Image processing method, image processing device, electronic equipment and computer readable medium

Publications (2)

Publication Number Publication Date
CN111369468A CN111369468A (en) 2020-07-03
CN111369468B true CN111369468B (en) 2022-02-01

Family

ID=71207014

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010157922.9A Active CN111369468B (en) 2020-03-09 2020-03-09 Image processing method, image processing device, electronic equipment and computer readable medium

Country Status (1)

Country Link
CN (1) CN111369468B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114445301A (en) * 2022-01-30 2022-05-06 北京字跳网络技术有限公司 Image processing method, image processing device, electronic equipment and storage medium


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10210631B1 (en) * 2017-08-18 2019-02-19 Synapse Technology Corporation Generating synthetic image data

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019125454A1 (en) * 2017-12-21 2019-06-27 Siemens Aktiengesellschaft Color adaptation using adversarial training networks
CN108133201A (en) * 2018-01-17 2018-06-08 百度在线网络技术(北京)有限公司 Face attribute recognition method and device
CN109308725A (en) * 2018-08-29 2019-02-05 华南理工大学 A system for generating amusing expression images on a mobile terminal
CN109800730A (en) * 2019-01-30 2019-05-24 北京字节跳动网络技术有限公司 Method and apparatus for generating an avatar generation model
CN109816589A (en) * 2019-01-30 2019-05-28 北京字节跳动网络技术有限公司 Method and apparatus for generating cartoon style transformation model
CN110021051A (en) * 2019-04-01 2019-07-16 浙江大学 A text-guided object image generation method based on a generative adversarial network
CN110222726A (en) * 2019-05-15 2019-09-10 北京字节跳动网络技术有限公司 Image processing method, device and electronic equipment
CN110288513A (en) * 2019-05-24 2019-09-27 北京百度网讯科技有限公司 Method, apparatus, device and storage medium for changing face attributes
CN110853110A (en) * 2019-09-20 2020-02-28 杭州火烧云科技有限公司 Automatic picture toning method based on generation countermeasure network
CN110619602A (en) * 2019-09-25 2019-12-27 北京字节跳动网络技术有限公司 Image generation method and device, electronic equipment and storage medium
CN110689546A (en) * 2019-09-25 2020-01-14 北京字节跳动网络技术有限公司 Method, device and equipment for generating personalized head portrait and storage medium
CN110689480A (en) * 2019-09-27 2020-01-14 腾讯科技(深圳)有限公司 Image transformation method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Image style transfer based on generative adversarial networks; Xu Zhehao et al.; 《软件导刊》 (Software Guide); 2018-06-30; Vol. 17, No. 6; pp. 207-209, 212 *

Also Published As

Publication number Publication date
CN111369468A (en) 2020-07-03

Similar Documents

Publication Publication Date Title
CN111476309A (en) Image processing method, model training method, device, equipment and readable medium
CN112184738B (en) Image segmentation method, device, equipment and storage medium
CN111402113B (en) Image processing method, image processing device, electronic equipment and computer readable medium
CN111402112A (en) Image processing method, image processing device, electronic equipment and computer readable medium
CN114494298A (en) Object segmentation method, device, equipment and storage medium
CN112381717A (en) Image processing method, model training method, device, medium, and apparatus
CN111402151A (en) Image processing method, image processing device, electronic equipment and computer readable medium
CN114972944B (en) Training method and device for visual question-answering model, question-answering method, medium and equipment
CN112418249A (en) Mask image generation method and device, electronic equipment and computer readable medium
CN112712036A (en) Traffic sign recognition method and device, electronic equipment and computer storage medium
CN115346278A (en) Image detection method, device, readable medium and electronic equipment
CN111369468B (en) Image processing method, image processing device, electronic equipment and computer readable medium
CN111402159B (en) Image processing method, image processing device, electronic equipment and computer readable medium
CN111369429B (en) Image processing method, image processing device, electronic equipment and computer readable medium
GB2623399A (en) System, devices and/or processes for image anti-aliasing
CN110852242A (en) Watermark identification method, device, equipment and storage medium based on multi-scale network
CN114399696A (en) Target detection method and device, storage medium and electronic equipment
GB2620919A (en) System, devices and/or processes for temporal upsampling image frames
CN114511744A (en) Image classification method and device, readable medium and electronic equipment
CN112070034A (en) Image recognition method and device, electronic equipment and computer readable medium
CN111402133A (en) Image processing method, image processing device, electronic equipment and computer readable medium
CN114004229A (en) Text recognition method and device, readable medium and electronic equipment
CN112712070A (en) Question judging method and device for bead calculation questions, electronic equipment and storage medium
CN112488204A (en) Training sample generation method, image segmentation method, device, equipment and medium
CN111353536A (en) Image annotation method and device, readable medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP01 Change in the name or title of a patent holder
CP01 Change in the name or title of a patent holder

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee after: Douyin Vision Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee before: Tiktok vision (Beijing) Co.,Ltd.

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee after: Tiktok vision (Beijing) Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee before: BEIJING BYTEDANCE NETWORK TECHNOLOGY Co.,Ltd.