WO2023116744A1 - Image processing method and apparatus, device, and medium - Google Patents


Info

Publication number
WO2023116744A1
Authority
WO
WIPO (PCT)
Prior art keywords
style
sample image
image
network
generation network
Prior art date
Application number
PCT/CN2022/140574
Other languages
French (fr)
Chinese (zh)
Inventor
黄奇伟
Original Assignee
北京字跳网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字跳网络技术有限公司
Publication of WO2023116744A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00 2D [Two Dimensional] image generation
    • G06T 11/60 Editing figures and text; Combining figures or text
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/0475 Generative networks
    • G06N 3/08 Learning methods
    • G06N 3/094 Adversarial learning
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/82 Arrangements for image or video recognition or understanding using neural networks
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Definitions

  • The present disclosure relates to the technical field of computer vision, and in particular to an image processing method, apparatus, device, and medium.
  • The present disclosure provides an image processing method, apparatus, device, and medium.
  • An embodiment of the present disclosure provides an image processing method, the method comprising: acquiring a first object feature of a first style sample image, and training a first generative adversarial network according to the first object feature and the first style sample image; acquiring a second object feature of a second style sample image, and training a second generative adversarial network according to the second object feature and the second style sample image; and performing fusion processing on the first generative adversarial network and the second generative adversarial network to generate a style conversion network, so as to perform image style conversion processing on images of the first style and the second style according to the style conversion network.
  • An embodiment of the present disclosure also provides an image processing apparatus, the apparatus comprising: a first training module, configured to acquire a first object feature of a first style sample image and train a first generative adversarial network according to the first object feature and the first style sample image; a second training module, configured to acquire a second object feature of a second style sample image and train a second generative adversarial network according to the second object feature and the second style sample image; and a fusion module, configured to perform fusion processing on the first generative adversarial network and the second generative adversarial network to generate a style conversion network, so as to perform image style conversion processing on images of the first style and the second style according to the style conversion network.
  • An embodiment of the present disclosure also provides an electronic device, the electronic device comprising: a processor; and a memory for storing instructions executable by the processor; where the processor is configured to read the executable instructions from the memory and execute the instructions to implement the image processing method provided by the embodiments of the present disclosure.
  • An embodiment of the present disclosure also provides a computer-readable storage medium storing a computer program, where the computer program is used to execute the image processing method provided by the embodiments of the present disclosure.
  • FIG. 1 is a schematic diagram of an image processing scene provided by an embodiment of the present disclosure.
  • FIG. 2 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure
  • FIG. 3 is a schematic flowchart of another image processing method provided by an embodiment of the present disclosure.
  • FIG. 4 is a schematic diagram of another image processing scenario provided by an embodiment of the present disclosure.
  • FIG. 5 is a schematic diagram of another image processing scenario provided by an embodiment of the present disclosure.
  • FIG. 6 is a schematic flowchart of another image processing method provided by an embodiment of the present disclosure.
  • FIG. 7 is a schematic diagram of another image processing scenario provided by an embodiment of the present disclosure.
  • FIG. 8 is a schematic flowchart of another image processing method provided by an embodiment of the present disclosure.
  • FIG. 9 is a schematic diagram of another image processing scenario provided by an embodiment of the present disclosure.
  • FIG. 10 is a schematic diagram of another image processing scenario provided by an embodiment of the present disclosure.
  • FIG. 11 is a schematic flowchart of another image processing method provided by an embodiment of the present disclosure.
  • FIG. 12 is a schematic flowchart of another image processing method provided by an embodiment of the present disclosure.
  • FIG. 13 is a schematic diagram of another image processing scenario provided by an embodiment of the present disclosure.
  • FIG. 14 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present disclosure.
  • FIG. 15 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
  • In the present disclosure, the term “comprise” and its variations are open-ended, i.e., “including but not limited to”.
  • The term “based on” means “based at least in part on”.
  • The term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one further embodiment”; the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • To this end, the present disclosure proposes a network training method that does not need to perform style conversion processing on the original image in advance to obtain training sample images.
  • In this method, as shown in Figure 1, two generative adversarial networks A and B are trained, where network A only processes sample images of the first style so that A can output an image of the first style for an input image, and network B only processes sample images of the second style so that B can output an image of the second style; further, the style conversion network can be obtained by fusing A and B.
  • In this way, the training of the style conversion network can be realized based on the original sample images of the first style and the sample images of the second style, without pre-transforming the sample images from the first style to the second style. A rough illustration of this scheme is sketched below.
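  • As a rough outline of this training scheme (the function names below are hypothetical placeholders, not the patent's implementation), each network sees only samples of its own style, so no pre-converted style-1-to-style-2 training pairs are needed:

```python
# Illustrative outline only: train one GAN per style, then fuse.
gan_a = train_gan(first_style_samples)    # learns to output first-style images
gan_b = train_gan(second_style_samples)   # learns to output second-style images
style_converter = fuse(gan_a, gan_b)      # weighted fusion, detailed below
```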
  • FIG. 2 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure.
  • The method can be executed by an image processing apparatus, where the apparatus can be implemented by software and/or hardware and can generally be integrated in an electronic device. As shown in FIG. 2, the method includes:
  • Step 201: Acquire a first object feature of a first style sample image, and train a first generative adversarial network according to the first object feature and the first style sample image.
  • The first style sample image corresponds to the second style sample image mentioned in subsequent embodiments; the first style sample image and the second style sample image may be sample images of any two different styles. For example, the first style sample image may be a human face image while the second style sample image is an animal face image, or the first style sample image may be a human face image without makeup while the second style sample image is an oil-painting-style sample image.
  • The first style sample image may be an original image with the first style acquired from a database, or a first style sample image obtained after further strengthening the first sub-style of the original image. For example, when the first style sample image is a plain (makeup-free) face image, the plain face image may be acquired directly from the relevant database as the first style sample image, or makeup may be removed from an acquired face image to obtain the first style sample image.
  • In this embodiment, the first object feature of the first style sample image is extracted; the first object feature is any feature reflecting the style of the first style sample image, including but not limited to pixel color features, key pixel position features, key point pixel semantic features, area contour features, and the like. Then, the first generative adversarial network is trained according to the first object feature and the first style sample image, so that the trained first generative adversarial network can extract the first object feature of an input image to obtain a style sample image with the first style.
  • Step 202: Acquire a second object feature of a second style sample image, and train a second generative adversarial network according to the second object feature and the second style sample image.
  • In this embodiment, the second style sample image and the first style sample image are acquired separately in the training stage, and the second style sample image is not obtained by processing the first style sample image, so the consumption of computing power is relatively low, which further helps to improve the training efficiency of the style conversion network.
  • Similarly, the second object feature of the second style sample image is extracted; the second object feature is any feature reflecting the style of the second style sample image, including but not limited to pixel color features, key pixel position features, key point pixel semantic features, area contour features, and the like. Then, the second generative adversarial network is trained according to the second object feature and the second style sample image, so that the trained second generative adversarial network can extract the second object feature of an input image to obtain a style sample image with the second style.
  • Step 203: Perform fusion processing on the first generative adversarial network and the second generative adversarial network to generate a style conversion network, so as to perform image style conversion processing on images of the first style and the second style according to the style conversion network.
  • In this embodiment, the first generative adversarial network and the second generative adversarial network are fused to generate a style conversion network, so that the style conversion network can convert an input image both to the first style and to the second style, thereby realizing image style conversion processing of images of the first style and the second style based on the style conversion network.
  • In some embodiments, fusing the first generative adversarial network and the second generative adversarial network to generate a style conversion network includes:
  • Step 301: Determine a first weight corresponding to the first generative adversarial network and a second weight corresponding to the second generative adversarial network according to the similarity between the first object feature and the second object feature.
  • In this embodiment, the similarity between the first object feature and the second object feature is determined; the similarity reflects how close, in the feature dimension, the image output by the first generative adversarial network is to the image output by the second generative adversarial network. If the similarity is low, the weights of the two networks strongly affect the final style-converted image: if the first weight corresponding to the first generative adversarial network is larger than the second weight corresponding to the second generative adversarial network, the output style-converted image leans toward the first style; conversely, if the first weight is smaller than the second weight, the output style-converted image leans toward the second style.
  • In some embodiments, the first object feature and the second object feature can be input into a pre-trained deep learning model to obtain the similarity between the first object feature and the second object feature.
  • In other embodiments, multiple first key points of the input first style image may be extracted to obtain the first object feature of each first key point, and multiple second key points of the input second style image may be extracted to obtain the second object feature of each second key point, where the first key points and the second key points may include facial landmark points such as the nose, the corners of the eyes, and the lips. Then, for each key point shared by the first key points and the second key points, the key point similarity between its first object feature and its second object feature is calculated, and the mean value of the key point similarities over all key points is used as the similarity between the first object feature and the second object feature.
  • In some embodiments, the correspondence between the first weight and the similarity may be pre-built according to the needs of the scene; after the first weight is obtained by querying this correspondence, the second weight is obtained based on the first weight.
  • In other embodiments, the difference between the similarity and a preset standard similarity can be calculated, a preset correspondence can be queried based on the difference to obtain a weight correction value, the first weight can be obtained as the sum of a standard first weight and the weight correction value, and the second weight can then be obtained based on the first weight. A minimal sketch of this weighting is given below.
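  • The following is a minimal sketch of one way to realize this weighting; the cosine similarity over matched key-point features and the linear weight correction are assumptions, since the patent does not fix these formulas:

```python
import numpy as np

def keypoint_similarity(feats_a, feats_b):
    """Mean cosine similarity over matching key-point features (illustrative)."""
    sims = [np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
            for a, b in zip(feats_a, feats_b)]
    return float(np.mean(sims))

def fusion_weights(similarity, standard_similarity=0.5,
                   standard_first_weight=0.5, gain=0.5):
    """First weight = standard weight + a correction derived from the
    similarity difference; the linear lookup here is a placeholder."""
    correction = gain * (similarity - standard_similarity)
    w1 = float(np.clip(standard_first_weight + correction, 0.0, 1.0))
    return w1, 1.0 - w1  # second weight complements the first
```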
  • Step 302: Obtain a first product result of the output image of the first generative adversarial network and the first weight, obtain a second product result of the output image of the second generative adversarial network and the second weight, and fuse the first product result and the second product result to generate the style conversion network.
  • In this embodiment, the first product result of the output image of the first generative adversarial network and the first weight is obtained, where the first product result corresponds to the first style; the second product result of the output image of the second generative adversarial network and the second weight is obtained, where the second product result corresponds to the second style; and the first product result and the second product result are fused to obtain the fusion result of the first style and the second style, thereby generating the style conversion network. The output image in this embodiment can be regarded as a variable, and the style conversion network is a combination of network parameters that processes this variable. A concrete sketch of this weighted fusion follows.
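  • Concretely, the fused output can be written as a weighted sum of the two generators' outputs; a minimal sketch (the module names are assumptions, treating each trained generator as a callable):

```python
def style_convert(image, gan_a, gan_b, w1, w2):
    """Fuse the two generators' outputs with the weights from Step 301."""
    out_a = gan_a(image)   # first product input: first-style rendering
    out_b = gan_b(image)   # second product input: second-style rendering
    return w1 * out_a + w2 * out_b
```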
  • In some embodiments, the outputs of the first generative adversarial network and the second generative adversarial network can be connected to an alignment network, which is used to align the output objects of the two networks; the alignment processing includes one or more of pose alignment, pixel color alignment, and the like. The alignment algorithm may be matrix alignment, feature point alignment, etc., which will not be described in detail here. In this case, the style conversion network includes the first generative adversarial network, the second generative adversarial network, and the corresponding alignment network.
  • the image processing method of the embodiment of the present disclosure when training the style conversion network, does not need to pre-process the acquisition of the first style sample image and the second style sample image corresponding to the first style sample image, that is, it does not need to consume computing power to image
  • the sample image of the first style and the sample image of the second style can be fused to obtain
  • the style conversion network that performs style conversion processing reduces the training computing power consumption of the style network.
  • In summary, the image processing method of the embodiments of the present disclosure acquires the first object feature of the first style sample image, trains the first generative adversarial network according to the first object feature and the first style sample image, then acquires the second object feature of the second style sample image, trains the second generative adversarial network according to the second object feature and the second style sample image, performs fusion processing on the first generative adversarial network and the second generative adversarial network to generate a style conversion network, and performs image style conversion processing on images of the first style and the second style according to the style conversion network.
  • In some embodiments, acquiring the first object feature of the first style sample image and training the first generative adversarial network according to the first object feature and the first style sample image includes:
  • Step 601: Perform key point segmentation detection on the first object in the first style sample image, and extract key area contour features of the first object.
  • The first object is an entity object whose style is to be converted, including but not limited to a human face, clothing, and the like.
  • In this embodiment, key point segmentation detection is performed on the first object in the first style sample image, and the key area contour features of the first object are extracted; that is to say, different regions of the first object are recognized based on key point detection technology, and the first object is divided into a plurality of key areas, so that subsequent image processing can be performed at the granularity of key areas.
  • The key points used in key point segmentation detection can be pre-defined, or can be learned from experimental data.
  • For example, when the first object is a human face, the corresponding key points can be key points of the nose area, the left eye area, the right eye area, the mouth area, and other face areas; the key area contour features are then extracted based on these key points, and the contour features include but are not limited to the pixel positions corresponding to the outline of each key area and the positional relationships between the pixels.
  • Step 602: Process the key area contour features of the first object with the generation network in the first generative adversarial network to be trained, to generate a first reference sample image.
  • In this embodiment, the key area contour features of the first object are processed by the generation network in the first generative adversarial network to be trained to generate the first reference sample image, where the first reference sample image is a first-style image generated in the feature dimension of the key area contours.
  • Step 603: Determine a first loss function according to the first style sample image and the first reference sample image.
  • In this embodiment, the first generative adversarial network can be trained through the first loss function between the first reference sample image and the first style sample image.
  • In different application scenarios, the first loss function is calculated in different ways; examples are as follows:
  • In some embodiments, the optical flow field from the first reference sample image to the first style sample image can be calculated, that is, the motion optical flow field of the same key points from the first reference sample image to the first style sample image, and the first loss function is determined based on the motion optical flow field. The motion optical flow field identifies the alignment error between the first reference sample image and the first style sample image; the larger the optical flow field, the larger the error between the first reference sample image and the first style sample image. One possible realization is sketched below.
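  • A sketch of an optical-flow-based alignment loss using OpenCV's Farneback estimator; the choice of estimator and the use of mean flow magnitude as the scalar loss are assumptions:

```python
import cv2
import numpy as np

def optical_flow_loss(reference_img, style_img):
    """Mean optical-flow magnitude between the two images as alignment error."""
    ref = cv2.cvtColor(reference_img, cv2.COLOR_BGR2GRAY)
    tgt = cv2.cvtColor(style_img, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(ref, tgt, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    return float(np.linalg.norm(flow, axis=2).mean())  # larger => worse alignment
```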
  • In other embodiments, the first reference sample image is divided into multiple grid blocks, and the first style sample image is also divided into grid blocks according to the same grid division strategy. For each grid block, the pixel mean value of all pixels contained in it is calculated, and the first loss function is determined based on the differences between the pixel mean values of grid blocks at corresponding positions in the first reference sample image and the first style sample image; for example, the mean of the differences of the pixel mean values over all grid blocks is used as the first loss function, as in the sketch below.
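  • The grid-block variant can be sketched as follows; the grid size is an assumption, since the patent only requires that both images use the same division strategy:

```python
import numpy as np

def grid_mean_loss(reference_img, style_img, grid=8):
    """Mean absolute difference of per-block pixel means over a grid."""
    h, w = reference_img.shape[:2]
    bh, bw = h // grid, w // grid
    diffs = []
    for i in range(grid):
        for j in range(grid):
            ref_block = reference_img[i*bh:(i+1)*bh, j*bw:(j+1)*bw]
            sty_block = style_img[i*bh:(i+1)*bh, j*bw:(j+1)*bw]
            diffs.append(abs(ref_block.mean() - sty_block.mean()))
    return float(np.mean(diffs))
```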
  • Step 604: Perform back propagation according to the first loss function to train the first generative adversarial network.
  • In this embodiment, back propagation is performed according to the first loss function to train the first generative adversarial network; that is, the network parameters of the first generative adversarial network to be trained are adjusted so that, after its parameters are adjusted, the network can output related images with a consistent first style.
  • In this embodiment, the training method of the second generative adversarial network can be consistent with that of the first generative adversarial network.
  • In some embodiments, acquiring the second object feature of the second style sample image and training the second generative adversarial network according to the second object feature and the second style sample image includes:
  • Step 801: Perform key point segmentation detection on the second object in the second style sample image, and extract key area contour features of the second object.
  • The second object is an entity object whose style is to be converted, including but not limited to a human face, clothing, and the like.
  • The first object can be consistent with the second object; for example, when the first object is a human face, the second object is also a human face. The first object and the second object can also be inconsistent; for example, the first object is a human face while the second object is a cat face or the like.
  • In this embodiment, key point segmentation detection is performed on the second object in the second style sample image, and the key area contour features of the second object are extracted; that is to say, different regions of the second object are recognized based on key point detection technology, and the second object is divided into a plurality of key areas, so that subsequent image processing can be performed at the granularity of key areas.
  • The key points used in key point segmentation detection can be pre-defined, or can be learned from experimental data.
  • For example, when the second object is a human face, the corresponding key points can be key points of the nose area, the left eye area, the right eye area, the mouth area, and other face areas; the contour features include but are not limited to the pixel positions corresponding to the outline of each key area and the positional relationships between the pixels.
  • Step 802: Process the key area contour features of the second object with the generation network in the second generative adversarial network to be trained, to generate a second reference sample image.
  • In this embodiment, the key area contour features of the second object are processed by the generation network in the second generative adversarial network to be trained to generate the second reference sample image, where the second reference sample image is a second-style image generated in the feature dimension of the key area contours.
  • Step 803: Determine a second loss function according to the second style sample image and the second reference sample image.
  • In this embodiment, the second generative adversarial network can be trained through the second loss function between the second reference sample image and the second style sample image.
  • In different application scenarios, the second loss function is calculated in different ways; examples are as follows:
  • In some embodiments, the optical flow field from the second reference sample image to the second style sample image can be calculated, that is, the motion optical flow field of the same key points from the second reference sample image to the second style sample image, and the second loss function is determined based on the motion optical flow field. The motion optical flow field identifies the alignment error between the second reference sample image and the second style sample image; the larger the optical flow field, the larger the error between the second reference sample image and the second style sample image.
  • In other embodiments, the second reference sample image is divided into multiple grid blocks, and the second style sample image is also divided into grid blocks according to the same grid division strategy. For each grid block, the pixel mean value of all pixels contained in it is calculated, and the second loss function is determined based on the differences between the pixel mean values of grid blocks at corresponding positions in the second reference sample image and the second style sample image; for example, the mean of the differences of the pixel mean values over all grid blocks is used as the second loss function.
  • Step 804: Perform back propagation according to the second loss function to train the second generative adversarial network.
  • In this embodiment, back propagation is performed according to the second loss function to train the second generative adversarial network; that is, the network parameters of the second generative adversarial network to be trained are adjusted so that, after its parameters are adjusted, the network can output related images with a consistent second style.
  • For example, first object key point segmentation detection is performed on the first style sample image (for example, face key point segmentation detection based on face parsing technology) to obtain the key area contour feature mask1 of the first object. After mask1 is obtained, mask1 is encoded to obtain a first encoding result, the first style sample image is encoded to obtain a second encoding result, and a first feature image is obtained by fusing the first encoding result and the second encoding result. The first feature image embodies the contour features of the first object in the key area contour dimension on the one hand, and retains the original first style features by combining the original first style sample image on the other hand.
  • Further, a second feature map is obtained and input into the first generative adversarial network to obtain a corresponding third reference sample image; the loss value between the third reference sample image and the first style sample image is calculated, and if the loss value is greater than a preset threshold, the network parameters of the first generative adversarial network are adjusted until the loss value is less than the preset threshold, at which point the training of the first generative adversarial network is completed. One possible shape of such an update step is sketched below.
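  • One parameter update of this scheme might look as follows; the encoder modules, the additive fusion, and the L1 loss are placeholders, since the patent only specifies encoding, fusion, and a loss threshold:

```python
import torch.nn.functional as F

def training_step(gan, mask_encoder, image_encoder, mask1, style_image,
                  optimizer, threshold=0.05):
    """Encode mask and image, fuse, generate, and update if loss > threshold."""
    fused = mask_encoder(mask1) + image_encoder(style_image)  # feature image
    reference = gan.generator(fused)           # third reference sample image
    loss = F.l1_loss(reference, style_image)   # placeholder loss function
    if loss.item() > threshold:                # keep training above threshold
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return loss.item()
```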
  • Similarly, second object key point segmentation detection is performed on the second style sample image (for example, face key point segmentation detection based on face parsing technology) to obtain the key area contour feature mask2 of the second object. After mask2 is obtained, mask2 is encoded to obtain a third encoding result, the second style sample image is encoded to obtain a fourth encoding result, and a third feature image is obtained by fusing the third encoding result and the fourth encoding result. The third feature image embodies the contour features of the second object in the key area contour dimension on the one hand, and retains the original second style features by combining the original second style sample image on the other hand.
  • Further, a fourth feature map is obtained and input into the second generative adversarial network to obtain a corresponding fourth reference sample image; the loss value between the fourth reference sample image and the second style sample image is calculated, and if the loss value is greater than a preset threshold, the network parameters of the second generative adversarial network are adjusted until the loss value is less than the preset threshold.
  • In summary, the image processing method of the embodiments of the present disclosure can train the generative adversarial networks according to the needs of the scene in combination with key area contour features, improving the training efficiency of the generative adversarial networks while ensuring their training accuracy.
  • In the above embodiments, only the distance between the output of the generative adversarial network and the corresponding positive sample image is considered when calculating the relevant loss function. This calculation method may cause the output image to lack details and be too smooth. Therefore, in some embodiments, the generative adversarial network is trained in combination with negative sample images; that is, negative sample images can also be incorporated when calculating the relevant loss function.
  • In the following, the first style being an oil painting style and the second style being a plain makeup style is taken as an example to illustrate the training process of the generative adversarial networks.
  • In some embodiments, determining the first loss function according to the first style sample image and the first reference sample image includes:
  • Step 1101: Perform fusion and noise addition processing on the first style sample image and the first reference sample image to generate a first negative sample image.
  • In this embodiment, the first style sample image and the first reference sample image are fused, and random noise may be added to the fused image to obtain the first negative sample image.
  • In this way, the first negative sample image not only introduces the error of the first reference sample image, but also introduces a noise error.
  • Step 1102: Extract first high-frequency information of the first style sample image, second high-frequency information of the first reference sample image, and third high-frequency information of the first negative sample image.
  • In this embodiment, the first high-frequency information of the first style sample image, the second high-frequency information of the first reference sample image, and the third high-frequency information of the first negative sample image are extracted, where the high-frequency information of an image can be understood as the pixel information of pixels with large brightness differences and richer details, and the like.
  • Step 1103: Perform discriminative processing on the first high-frequency information, the second high-frequency information, and the third high-frequency information with the discriminant network in the first generative adversarial network to generate corresponding discriminant scores.
  • In this embodiment, the first high-frequency information, the second high-frequency information, and the third high-frequency information are discriminated by the discriminant network in the first generative adversarial network to generate corresponding discriminant scores, which represent the degree to which the first high-frequency information, the second high-frequency information, and the third high-frequency information belong to the first style.
  • Step 1104: Determine the first loss function according to the discriminant scores.
  • In this embodiment, the first loss function is determined according to the discriminant scores. For example, the first squared absolute error value between the first high-frequency information and the second high-frequency information of the first reference sample image, and the second squared absolute error value between the first high-frequency information and the third high-frequency information of the first negative sample image, can be calculated directly, and the ratio of the first squared absolute error value to the second squared absolute error value is calculated as the first loss function. Alternatively, the ratio of a first difference to a second difference between the corresponding discriminant scores may be calculated as the first loss function.
  • In this way, the image output by the trained first generative adversarial network is not only close to the first style sample image at the feature level, but also far from the first negative sample image, thereby reducing the introduction of artifacts and noise and ensuring that the image output by the first generative adversarial network is of the oil painting style. A sketch of such a loss is given below.
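  • A sketch of the positive/negative high-frequency loss described above; the blur-residual extraction of high-frequency information and the use of MSE terms are assumptions:

```python
import torch.nn.functional as F

def high_freq(img, kernel_size=5):
    """High-frequency residual: image minus a local-average blur."""
    blur = F.avg_pool2d(img, kernel_size, stride=1, padding=kernel_size // 2)
    return img - blur

def first_loss(style_img, reference_img, negative_img, eps=1e-8):
    """Ratio of positive-pair to negative-pair squared errors on high-freq info."""
    hf_style = high_freq(style_img)
    hf_ref = high_freq(reference_img)
    hf_neg = high_freq(negative_img)
    pos = F.mse_loss(hf_ref, hf_style)   # pull output toward the style sample
    neg = F.mse_loss(hf_neg, hf_style)   # push output away from the negative
    return pos / (neg + eps)             # small when close to positive, far from negative
```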
  • In some embodiments, determining the second loss function according to the second style sample image and the second reference sample image includes:
  • Step 1201: Perform fusion and noise addition processing on the second style sample image and the second reference sample image to generate a second negative sample image.
  • In this embodiment, the second style sample image and the second reference sample image are fused, and random noise may be added to the fused image to obtain the second negative sample image.
  • In this way, the second negative sample image not only introduces the error of the second reference sample image, but also introduces a noise error.
  • Step 1202: Extract a first texture feature of the second style sample image, a second texture feature of the second reference sample image, and a third texture feature of the second negative sample image.
  • In this embodiment, the first texture feature of the second style sample image, the second texture feature of the second reference sample image, and the third texture feature of the second negative sample image are extracted, where the texture features reflect features such as the color and brightness of the pixels of the corresponding image that belong to the plain makeup style.
  • Step 1203: Determine the second loss function according to the first texture feature, the second texture feature, and the third texture feature.
  • In this embodiment, the discriminant network in the second generative adversarial network can be used to discriminate the first texture feature, the second texture feature, and the third texture feature to generate corresponding discriminant scores, which represent the degree to which the first texture feature, the second texture feature, and the third texture feature belong to the second style.
  • Further, the second loss function can be determined according to the discriminant scores. For example, the third squared absolute error value between the first texture feature and the second texture feature of the second reference sample image, and the fourth squared absolute error value between the first texture feature and the third texture feature of the second negative sample image, can be calculated directly, and the ratio of the third squared absolute error value to the fourth squared absolute error value is calculated as the second loss function.
  • In this way, the image output by the trained second generative adversarial network is not only close to the second style sample image at the feature level, but also far from the second negative sample image, thereby reducing the introduction of artifacts and noise and ensuring that the image output by the second generative adversarial network is of the plain makeup style.
  • During application, the key area contour features of the target object in an original image with the plain makeup style can be extracted first, where the target object includes but is not limited to the various facial parts mentioned above. Then, the original image with the plain makeup style and the key area contour features of the target object are encoded to generate feature data of the target object.
  • Further, the style conversion network performs image fusion processing on the feature data of the target object and the key area contour features of the target object, and the conversion to the oil painting style domain is performed based on the fused image to obtain a target image with the oil painting style.
  • In this embodiment, the pre-trained first generative adversarial network can extract, based on the input original image of the plain makeup style, the corresponding key contour features of the target object that reflect the plain makeup style. Then, the original image of the plain makeup style and the key area contour features of the target object in the plain makeup dimension are encoded to generate feature data of the target object, and image fusion processing is performed on the feature data of the target object and the key area contour features of the target object to obtain a new original image in the key area contour dimension with the plain makeup style.
  • The new original image is input into the pre-trained second generative adversarial network, which extracts the key contour features of the target object in the oil painting style dimension of the new original image. Then, the new original image and the key area contour features of the target object in the oil painting style dimension are encoded to generate new feature data of the target object. Since the second generative adversarial network can obtain an image of the second style based on the features of the input image, the second generative adversarial network obtains the target image of the oil painting style based on the new feature data. The weights by which the style conversion network acts on the first generative adversarial network and the second generative adversarial network are reflected in the products with the output results of each network; for details, refer to the above embodiments, which will not be repeated here.
  • In this way, the output fusion of the first generative adversarial network and the second generative adversarial network can be combined to obtain a corresponding oil-painting-style image whose details are rich and realistic. The inference chain is sketched roughly below.
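  • The inference flow above can be sketched roughly as follows; every method name here (extract_contours, encode, generator) is a hypothetical placeholder for the operations the embodiment describes, not the patent's API:

```python
def plain_to_oil(original, gan_a, gan_b, w1, w2):
    """Chain the two pre-trained generators and fuse their weighted outputs."""
    contours_a = gan_a.extract_contours(original)      # plain-makeup key areas
    feat_a = gan_a.encode(original, contours_a)        # feature data of the target
    new_original = gan_a.generator(feat_a)             # contour-dimension image
    contours_b = gan_b.extract_contours(new_original)  # oil-painting key areas
    feat_b = gan_b.encode(new_original, contours_b)    # new feature data
    target = gan_b.generator(feat_b)                   # oil-painting target image
    return w1 * new_original + w2 * target             # weighted output fusion
```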
  • In summary, the image processing method of the embodiments of the present disclosure trains each generative adversarial network with a feature-level loss that combines the distances of the input style sample image to the positive sample image and to the negative sample image, respectively. This improves the purity of the output image while ensuring the output quality of the generative adversarial network, so that the style conversion effect of the fused target image is consistent with the second style.
  • Based on the above embodiments, the present disclosure also proposes an image processing apparatus.
  • FIG. 14 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present disclosure.
  • The apparatus can be implemented by software and/or hardware, and can generally be integrated into an electronic device for image processing. As shown in FIG. 14, the apparatus includes:
  • a first training module 1610, configured to acquire the first object feature of the first style sample image, and train the first generative adversarial network according to the first object feature and the first style sample image;
  • a second training module 1620, configured to acquire the second object feature of the second style sample image, and train the second generative adversarial network according to the second object feature and the second style sample image; and
  • a fusion module 1630, configured to perform fusion processing on the first generative adversarial network and the second generative adversarial network to generate a style conversion network, so as to perform image style conversion processing on images of the first style and the second style according to the style conversion network.
  • The image processing apparatus provided by the embodiments of the present disclosure can execute the image processing method provided by any embodiment of the present disclosure, and has the corresponding functional modules and beneficial effects for executing the method.
  • The present disclosure also proposes a computer program product, including computer programs/instructions, which implement the image processing method in the above embodiments when executed by a processor.
  • FIG. 15 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
  • Referring to FIG. 15, it shows the schematic structure of an electronic device 1700 suitable for implementing an embodiment of the present disclosure.
  • The electronic device 1700 in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (Personal Digital Assistants), PADs (tablet computers), PMPs (Portable Multimedia Players), and vehicle-mounted terminals (such as car navigation terminals), as well as stationary terminals such as digital TVs and desktop computers.
  • The electronic device shown in FIG. 15 is only an example, and should not impose any limitation on the functions and application scope of the embodiments of the present disclosure.
  • As shown in FIG. 15, the electronic device 1700 may include a processor (such as a central processing unit or a graphics processing unit) 1701, which may execute various appropriate actions and processing according to a program stored in a read-only memory (ROM) 1702 or a program loaded from a memory 1708 into a random access memory (RAM) 1703. The RAM 1703 also stores various programs and data necessary for the operation of the electronic device 1700.
  • The processor 1701, the ROM 1702, and the RAM 1703 are connected to each other through a bus 1704.
  • An input/output (I/O) interface 1705 is also connected to the bus 1704.
  • Generally, the following devices can be connected to the I/O interface 1705: an input device 1706 including, for example, a touch screen, a touchpad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; an output device 1707 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; a memory 1708 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 1709.
  • The communication device 1709 may allow the electronic device 1700 to perform wireless or wired communication with other devices to exchange data. While FIG. 15 shows the electronic device 1700 having various devices, it should be understood that it is not required to implement or have all of the devices shown; more or fewer devices may alternatively be implemented or provided.
  • In particular, the embodiments of the present disclosure include a computer program product, which includes a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for executing the method shown in the flowchart.
  • The computer program may be downloaded and installed from a network via the communication device 1709, or installed from the memory 1708, or from the ROM 1702.
  • When the computer program is executed by the processor 1701, the above-mentioned functions defined in the image processing method of the embodiments of the present disclosure are executed.
  • It should be noted that the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two.
  • A computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable medium may be transmitted by any appropriate medium, including but not limited to wires, optical cables, RF (radio frequency), etc., or any suitable combination of the above.
  • In some embodiments, the client and the server can communicate using any currently known or future-developed network protocol such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication (e.g., a communication network) in any form or medium. Examples of communication networks include local area networks (“LANs”), wide area networks (“WANs”), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed networks.
  • The above-mentioned computer-readable medium may be included in the above-mentioned electronic device, or may exist independently without being incorporated into the electronic device.
  • The above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device is caused to: acquire the first object feature of the first style sample image, and train the first generative adversarial network according to the first object feature and the first style sample image; then acquire the second object feature of the second style sample image, and train the second generative adversarial network according to the second object feature and the second style sample image; and perform fusion processing on the first generative adversarial network and the second generative adversarial network to generate a style conversion network, so as to perform image style conversion processing on images of the first style and the second style according to the style conversion network.
  • In this way, the processing power requirements for sample images during image style conversion are reduced, and the training efficiency of the style conversion network is improved on the premise of ensuring the effect of style conversion.
  • Computer program code for carrying out the operations of the present disclosure may be written in one or more programming languages or combinations thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the “C” language or similar programming languages.
  • The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • In the case involving a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, connected through the Internet using an Internet service provider such as AT&T, MCI, Sprint, EarthLink, MSN, GTE, etc.).
  • Each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • It should also be noted that the functions noted in the blocks may occur out of the order noted in the figures; for example, two blocks shown in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or may be implemented by a combination of dedicated hardware and computer instructions.
  • The units involved in the embodiments described in the present disclosure may be implemented by software or by hardware, where the name of a unit does not, under certain circumstances, constitute a limitation of the unit itself.
  • For example, without limitation, exemplary types of hardware logic components that can be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and so on.
  • In the context of the present disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • A machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • A machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing.
  • More specific examples of machine-readable storage media would include an electrical connection based on one or more wires, a portable computer disk, a hard drive, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, compact disk read-only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
  • The present disclosure provides an image processing method, including: acquiring a first object feature of a first style sample image, and training a first generative adversarial network according to the first object feature and the first style sample image;
  • In the image processing method provided by the present disclosure, acquiring the first object feature of the first style sample image and training the first generative adversarial network according to the first object feature and the first style sample image includes:
  • In the image processing method provided by the present disclosure, acquiring the second object feature of the second style sample image and training the second generative adversarial network according to the second object feature and the second style sample image includes:
  • In the image processing method provided by the present disclosure, performing fusion processing on the first generative adversarial network and the second generative adversarial network to generate a style conversion network includes:
  • In some embodiments, the first style is an oil painting style, and the second style is a plain makeup style.
  • In the image processing method provided by the present disclosure, determining the first loss function according to the first style sample image and the first reference sample image includes:
  • determining the first loss function according to the discriminant scores.
  • In the image processing method provided by the present disclosure, determining the second loss function according to the second style sample image and the second reference sample image includes:
  • determining the second loss function according to the first texture feature, the second texture feature, and the third texture feature.
  • In the image processing method provided by the present disclosure, performing image style conversion processing on images of the first style and the second style according to the style conversion network includes:
  • The present disclosure provides an image processing apparatus, including:
  • a first training module, configured to acquire a first object feature of a first style sample image, and train a first generative adversarial network according to the first object feature and the first style sample image;
  • a second training module, configured to acquire a second object feature of a second style sample image, and train a second generative adversarial network according to the second object feature and the second style sample image;
  • the first training module is specifically used for:
  • the second training module is specifically used for:
  • the fusion module is specifically used for:
  • the first style is an oil painting style
  • the second style is a plain makeup style
  • the fusion module is specifically used for:
  • the first loss function is determined based on the discriminant score.
  • the second training module is specifically used for:
  • the second loss function is determined according to the first texture feature, the second texture feature, and the third texture feature.
  • the second training module is specifically used for:
  • the present disclosure provides an electronic device, including:
  • the processor is configured to read the executable instructions from the memory, and execute the instructions to implement any image processing method provided in the present disclosure.
  • the present disclosure provides a computer-readable storage medium, the storage medium storing a computer program, and the computer program being used to execute any image processing method provided by the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present invention relate to an image processing method and apparatus, a device, and a medium. The method comprises: obtaining a first object feature of a first style sample image, and training a first generative adversarial network according to the first object feature and the first style sample image; obtaining a second object feature of a second style sample image, and training a second generative adversarial network according to the second object feature and the second style sample image; and fusing the first generative adversarial network and the second generative adversarial network to generate a style conversion network, so as to perform image style conversion on images of the first style and the second style according to the style conversion network.

Description

Image processing method and apparatus, device, and medium
Cross-Reference to Related Applications
This application is based on, and claims priority to, Chinese Application No. 202111574622.1 filed on December 21, 2021, the disclosure of which is incorporated herein by reference in its entirety.
Technical Field
The present disclosure relates to the field of computer vision, and in particular to an image processing method and apparatus, a device, and a medium.
Background
With advances in computer vision, techniques that convert images between styles have been widely used in applications such as photo processing, since they can render an image in different styles.
In the related art, to realize image style conversion, sample images of different styles must be obtained in advance for each original image, and a network is trained on these sample images of different styles; the trained network then performs style conversion processing on input images.
Summary
The present disclosure provides an image processing method and apparatus, a device, and a medium.
An embodiment of the present disclosure provides an image processing method, including: acquiring a first object feature of a first style sample image, and training a first adversarial generation network according to the first object feature and the first style sample image; acquiring a second object feature of a second style sample image, and training a second adversarial generation network according to the second object feature and the second style sample image; and performing fusion processing on the first adversarial generation network and the second adversarial generation network to generate a style conversion network, so as to perform image style conversion processing on images of the first style and the second style according to the style conversion network.
An embodiment of the present disclosure further provides an image processing apparatus, including: a first training module configured to acquire a first object feature of a first style sample image, and train a first adversarial generation network according to the first object feature and the first style sample image; a second training module configured to acquire a second object feature of a second style sample image, and train a second adversarial generation network according to the second object feature and the second style sample image; and a fusion module configured to perform fusion processing on the first adversarial generation network and the second adversarial generation network to generate a style conversion network, so as to perform image style conversion processing on images of the first style and the second style according to the style conversion network.
An embodiment of the present disclosure further provides an electronic device, including: a processor; and a memory for storing instructions executable by the processor; the processor being configured to read the executable instructions from the memory and execute the instructions to implement the image processing method provided by the embodiments of the present disclosure.
An embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program, the computer program being used to execute the image processing method provided by the embodiments of the present disclosure.
Brief Description of the Drawings
The above and other features, advantages, and aspects of the embodiments of the present disclosure will become more apparent with reference to the following detailed description taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic and that components and elements are not necessarily drawn to scale.
FIG. 1 is a schematic diagram of an image processing scene provided by an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of another image processing method provided by an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of another image processing scene provided by an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of another image processing scene provided by an embodiment of the present disclosure;
FIG. 6 is a schematic flowchart of another image processing method provided by an embodiment of the present disclosure;
FIG. 7 is a schematic diagram of another image processing scene provided by an embodiment of the present disclosure;
FIG. 8 is a schematic flowchart of another image processing method provided by an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of another image processing scene provided by an embodiment of the present disclosure;
FIG. 10 is a schematic diagram of another image processing scene provided by an embodiment of the present disclosure;
FIG. 11 is a schematic flowchart of another image processing method provided by an embodiment of the present disclosure;
FIG. 12 is a schematic flowchart of another image processing method provided by an embodiment of the present disclosure;
FIG. 13 is a schematic diagram of another image processing scene provided by an embodiment of the present disclosure;
FIG. 14 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present disclosure;
FIG. 15 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for exemplary purposes only and are not intended to limit the protection scope of the present disclosure.
It should be understood that the steps described in the method implementations of the present disclosure may be executed in different orders and/or in parallel. In addition, method implementations may include additional steps and/or omit the illustrated steps. The scope of the present disclosure is not limited in this regard.
As used herein, the term "include" and its variations are open-ended, i.e., "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one further embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms will be given in the description below.
It should be noted that concepts such as "first" and "second" mentioned in the present disclosure are only used to distinguish different apparatuses, modules, or units, and are not used to limit the order of, or interdependence between, the functions performed by these apparatuses, modules, or units.
It should be noted that the modifiers "a/an" and "multiple" mentioned in the present disclosure are illustrative rather than restrictive; those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be understood as "one or more".
The names of messages or information exchanged between multiple apparatuses in the embodiments of the present disclosure are used for illustrative purposes only and are not used to limit the scope of these messages or information.
When a network is trained on sample images of different styles, sample images of different styles obviously need to be produced in advance from a large number of original images; this advance acquisition of sample images consumes considerable computing power, and the network training efficiency is low.
When training the above style conversion network, the original images need to be pre-processed to obtain sample images of different styles. For example, when training a style conversion network from a plain-makeup face to an oil painting style, plain-makeup face images need to be acquired and then processed into oil-painting-style face images. Such processing is not only difficult but also consumes considerable computing power, resulting in low network training efficiency.
To solve the above technical problem, the present disclosure proposes a network training method that does not require style conversion processing of original images in advance to obtain training sample images. In this method, as shown in FIG. 1, two adversarial generation networks A and B are provided, where network A processes only sample images of the first style, so that A can produce a first-style image for an input image, and network B processes only sample images of the second style, so that B can output a second-style image; the style conversion network is then obtained by fusing A and B. Thus, the training of the style conversion network can be realized based on the original first style sample images and second style sample images, without first converting sample images from the first style to the second style.
The image processing method is introduced below with reference to specific embodiments.
FIG. 2 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure. The method may be executed by an image processing apparatus, which may be implemented in software and/or hardware and may generally be integrated in an electronic device. As shown in FIG. 2, the method includes:
Step 201: acquire a first object feature of a first style sample image, and train a first adversarial generation network according to the first object feature and the first style sample image.
The first style sample image corresponds to the second style sample image mentioned in subsequent embodiments; the first style sample image and the second style sample image may be sample images of any two different styles. For example, if the first style sample image is a human face image, the second style sample image may be an animal face image; or, if the first style sample image is a plain-makeup face image, the second style sample image may be an oil-painting-style sample image.
In some possible embodiments, the first style sample image may be an image obtained from a database that originally has the first style, or may be obtained by further enhancing the first style of an original image. For example, when the first style sample image is a plain-makeup face image, a plain-makeup face image may be obtained directly from a relevant database as the first style sample image, or makeup removal may be performed on an acquired face image to obtain the first style sample image.
In this embodiment, after the first style sample image is acquired, a first object feature of the first style sample image is extracted. The first object feature is any feature reflecting the style characteristics of the first style sample image, including but not limited to pixel color features, key pixel position features, keypoint pixel semantic features, and region contour features. The first adversarial generation network is then trained according to the first object feature and the first style sample image, so that the trained first adversarial generation network can extract the first object feature of an input image and produce a style sample image having the first style.
Step 202: acquire a second object feature of a second style sample image, and train a second adversarial generation network according to the second object feature and the second style sample image.
In this embodiment, as described above, the second style sample image corresponds to the first style sample image. The second style sample image may be an image obtained from a database that originally has the second style, or may be obtained by further enhancing the second style of an original image. For example, when the second style sample image is an oil-painting-style face image, an oil painting face image may be obtained directly from a relevant database as the second style sample image, or the oil painting features of an acquired face image from a famous painting may be enhanced to obtain the second style sample image.
It should be noted that the second style sample images and the first style sample images in the training stage are acquired separately; the second style sample images are not obtained by processing the first style sample images. Therefore, the computing power consumption is low, which further helps improve the training efficiency of the style conversion network.
In this embodiment, after the second style sample image is acquired, a second object feature of the second style sample image is extracted. The second object feature is any feature reflecting the style characteristics of the second style sample image, including but not limited to pixel color features, key pixel position features, keypoint pixel semantic features, and region contour features. The second adversarial generation network is then trained according to the second object feature and the second style sample image, so that the trained second adversarial generation network can extract the second object feature of an input image and produce a style sample image having the second style.
Step 203: perform fusion processing on the first adversarial generation network and the second adversarial generation network to generate a style conversion network, so as to perform image style conversion processing on images of the first style and the second style according to the style conversion network.
In this embodiment, the first adversarial generation network and the second adversarial generation network are fused to generate the style conversion network. The style conversion network can therefore convert an input image not only to the first style but also to the second style, realizing image style conversion processing of first-style and second-style images based on the style conversion network.
It should be noted that, in different application scenarios, the first adversarial generation network and the second adversarial generation network may be fused into a style conversion network in different ways, as exemplified below:
In an embodiment of the present disclosure, as shown in FIG. 3, performing fusion processing on the first adversarial generation network and the second adversarial generation network to generate a style conversion network includes:
Step 301: determine, according to the similarity between the first object feature and the second object feature, a first weight corresponding to the first adversarial generation network and a second weight corresponding to the second adversarial generation network.
In this embodiment, the similarity between the first object feature and the second object feature is determined. This similarity reflects, in the feature dimension, how close the image output by the first adversarial generation network is to the image generated by the second adversarial generation network. If the similarity is low, the weights used when fusing the two adversarial generation networks will affect the final style-converted image: if the first weight corresponding to the first adversarial generation network is larger relative to the second weight corresponding to the second adversarial generation network, the output style-converted image leans more toward the first style; conversely, if the first weight is smaller relative to the second weight, the output style-converted image leans more toward the second style.
In this embodiment, the first object feature and the second object feature may be input into a pre-trained deep learning model to obtain the similarity between the first object feature and the second object feature.
In another embodiment of the present disclosure, multiple first keypoints of the input first style image may be extracted and the first object feature of each first keypoint obtained, and multiple second keypoints of the input second style image may be extracted and the second object feature of each second keypoint obtained, where the first keypoints and the second keypoints may include points associated with the contours of facial features, such as the nose, eye corners, and lips. The keypoint similarity between the first object feature and the second object feature is then computed for each keypoint shared by the first and second keypoints, and the mean of the keypoint similarities over all keypoints is taken as the similarity between the first object feature and the second object feature.
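For concreteness, the keypoint-wise similarity described above could be computed as in the following minimal sketch, assuming both feature sets are extracted at the same ordered set of keypoints; the function name and tensor layout are illustrative assumptions, not part of the disclosure.

```python
import torch
import torch.nn.functional as F

def keypoint_similarity(first_feats: torch.Tensor, second_feats: torch.Tensor) -> torch.Tensor:
    """Mean cosine similarity over matching keypoints.

    first_feats / second_feats: (num_keypoints, feature_dim) tensors holding the
    first and second object features extracted at the same ordered keypoints
    (nose, eye corners, lip contour, and so on).
    """
    per_keypoint = F.cosine_similarity(first_feats, second_feats, dim=1)  # (num_keypoints,)
    return per_keypoint.mean()  # scalar similarity used to look up the fusion weights
```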
Further, in some possible embodiments, a correspondence between the first weight and the similarity may be constructed in advance according to scene requirements; after the first weight is obtained by querying this correspondence, the second weight is obtained based on the first weight.
In other possible implementations, the difference between the similarity and a preset standard similarity may be calculated, a preset correspondence may be queried based on this difference to obtain a weight correction value, the first weight may be obtained as the sum of a standard first weight value and the weight correction value, and the second weight is then obtained based on the first weight.
Step 302: obtain a first product of the output image of the first adversarial generation network and the first weight, obtain a second product of the output image of the second adversarial generation network and the second weight, and fuse the first product and the second product to generate the style conversion network.
In this embodiment, as shown in FIG. 4, the first product of the output image of the first adversarial generation network and the first weight is obtained, where the first product corresponds to the first style; the second product of the output image of the second adversarial generation network and the second weight is obtained, where the second product corresponds to the second style; and the first product and the second product are fused to obtain a fusion result of the first style and the second style, thereby generating the style conversion network. The output image in this embodiment can be regarded as a variable, and the style conversion network is a combination of processing network parameters applied to that variable.
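Read as a weighted sum, the fusion in step 302 might look like the sketch below; it assumes both generator outputs share the same shape and that the weights come from the similarity lookup described in step 301.

```python
import torch

def fused_style_output(g1_output: torch.Tensor, g2_output: torch.Tensor,
                       w1: float, w2: float) -> torch.Tensor:
    """Weighted fusion of the two generators' outputs per step 302.

    g1_output: output image of the first adversarial generation network (first style).
    g2_output: output image of the second adversarial generation network (second style).
    w1 / w2: fusion weights derived from the feature similarity, e.g. w2 = 1 - w1.
    """
    return w1 * g1_output + w2 * g2_output  # first product + second product
```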
In another embodiment of the present disclosure, considering that the first adversarial generation network can convert an input image into an output image of the first style and the second adversarial generation network can convert an input image into an output image of the second style, as shown in FIG. 5, the outputs of the first and second adversarial networks may be connected to an alignment network. The alignment network is used to align the output objects of the two adversarial networks, where the alignment processing includes one or more of pose alignment, pixel color alignment, and the like (the alignment algorithm may follow matrix alignment, feature-point alignment, and other approaches, which are not repeated here). Thus, in this embodiment, the style conversion network includes the first adversarial generation network, the second adversarial generation network, and the corresponding alignment network.
Therefore, with the image processing method of the embodiments of the present disclosure, training the style conversion network does not require pre-processing to obtain first style sample images together with second style sample images corresponding to them; that is, no computing power is spent converting images from the first style to the second style. After the first and second adversarial generation networks process the input images, a style conversion network capable of style conversion processing is obtained by fusing the first-style and second-style outputs, which reduces the training computing power consumed by the style network.
In summary, the image processing method of the embodiments of the present disclosure acquires a first object feature of a first style sample image and trains a first adversarial generation network according to the first object feature and the first style sample image; acquires a second object feature of a second style sample image and trains a second adversarial generation network according to the second object feature and the second style sample image; and performs fusion processing on the two adversarial generation networks to generate a style conversion network, so as to perform image style conversion processing on images of the first style and the second style according to the style conversion network. This reduces the computing power required for processing sample images during image style conversion and, while ensuring the style conversion effect, improves the training efficiency of the style conversion network.
It should be noted that, in different application scenarios, the first adversarial generation network and the second adversarial generation network are trained in different ways, as exemplified below:
In an embodiment of the present disclosure, as shown in FIG. 6, acquiring the first object feature of the first style sample image and training the first adversarial generation network according to the first object feature and the first style sample image includes:
Step 601: perform keypoint segmentation detection on the first object in the first style sample image, and extract key-region contour features of the first object.
The first object is the entity object whose style is to be converted, including but not limited to a human face, clothing, and the like.
In this embodiment, to improve processing efficiency, keypoint segmentation detection is performed on the first object in the first style sample image to extract key-region contour features of the first object. That is, different regions of the first object are identified based on keypoint detection technology, and the first object is then segmented into multiple key regions, so that subsequent image processing can be performed at the granularity of key regions.
The keypoints used in keypoint segmentation detection may be predefined or learned from experimental data. Taking a human face as the first object, as shown in FIG. 7, the corresponding keypoints may be keypoints of the nose region, the left eye region, the right eye region, the mouth region, other facial regions, and so on. Key-region contour features are then extracted based on these keypoints; the contour features include but are not limited to the pixel positions corresponding to the key-region contours and the positional relationships between those pixels.
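The key-region contour features described here could be organized as in the following sketch; the region-to-keypoint grouping REGIONS is hypothetical, standing in for whatever a face parsing or landmark model would supply.

```python
import numpy as np

# Hypothetical grouping of landmark indices into key facial regions; a real
# system would obtain the indices from its face parsing / landmark model.
REGIONS = {"nose": [0, 1, 2], "left_eye": [3, 4], "right_eye": [5, 6], "mouth": [7, 8, 9]}

def region_contour_features(keypoints: np.ndarray) -> dict:
    """Build per-region contour features from detected keypoints.

    keypoints: (num_points, 2) array of (x, y) pixel positions. For each key
    region, the feature keeps the contour pixel positions and the pairwise
    offsets between them, matching the 'pixel positions and positional
    relationships' described in the text.
    """
    features = {}
    for name, indices in REGIONS.items():
        points = keypoints[indices]                        # contour pixel positions
        offsets = points[:, None, :] - points[None, :, :]  # positional relations between points
        features[name] = {"points": points, "offsets": offsets}
    return features
```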
Step 602: process the key-region contour features of the first object through the generation network in the first adversarial generation network to be trained, to generate a first reference sample image.
In this embodiment, the key-region contour features of the first object are processed by the generation network in the first adversarial generation network to be trained to generate a first reference sample image, where the first reference sample image is a first-style image in the dimension of the extracted key-region contour features.
Step 603: determine a first loss function according to the first style sample image and the first reference sample image.
It is easy to understand that, since the first adversarial generation network should output images of the first style, the corresponding first adversarial generation network can be trained through a first loss function between the first reference sample image and the first style sample image.
It should be noted that the first loss function is calculated differently in different application scenarios, for example:
In some possible embodiments, the optical flow field from the first reference sample image to the first style sample image may be calculated, i.e., the motion optical flow field of the same keypoints from the first reference sample image to the first style sample image, and the first loss function is determined based on the motion optical flow field. The motion optical flow field indicates the alignment error between the first reference sample image and the first style sample image; the larger the optical flow field, the larger the error between the two images.
In other possible embodiments, to improve the calculation efficiency of the first loss function, the first reference sample image is divided into multiple grid blocks, the first style sample image is divided into grid blocks according to the same grid division strategy, the pixel mean of all pixels contained in each grid block is calculated, and the first loss function is determined based on the differences between the pixel means of grid blocks at corresponding positions in the first reference sample image and the first style sample image. For example, the mean of the differences between the pixel means over all grid blocks is taken as the first loss function.
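A minimal sketch of the grid-block variant of the first loss function follows, assuming image sides divisible by the grid size; average pooling computes the per-block pixel means.

```python
import torch
import torch.nn.functional as F

def grid_mean_loss(reference: torch.Tensor, style: torch.Tensor, grid: int = 8) -> torch.Tensor:
    """Grid-block pixel-mean loss.

    reference / style: (B, C, H, W) images, with H and W assumed divisible by
    `grid`. Each image is divided into grid x grid blocks, the pixel mean of
    each block is computed via average pooling, and the loss is the mean of
    the differences between corresponding block means.
    """
    block_h = reference.shape[-2] // grid
    block_w = reference.shape[-1] // grid
    reference_means = F.avg_pool2d(reference, kernel_size=(block_h, block_w))
    style_means = F.avg_pool2d(style, kernel_size=(block_h, block_w))
    return (reference_means - style_means).abs().mean()
```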
Step 604: perform backpropagation according to the first loss function to train the first adversarial generation network.
In this embodiment, backpropagation is performed according to the first loss function to train the first adversarial generation network; that is, the network parameters of the first adversarial generation network to be trained are adjusted so that, after adjustment, the first adversarial generation network can output images consistent with the first style.
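One plausible shape for the backpropagation step of step 604, with the generator, optimizer, and loss function passed in as assumed components:

```python
import torch

def train_step(generator, optimizer, contour_features, style_image, loss_fn):
    """One backpropagation step for the first adversarial generation network.

    generator: the generation network of the adversarial network (assumed module).
    contour_features: key-region contour features of the first object.
    style_image: the first style sample image serving as the training target.
    loss_fn: the first loss function, e.g. the grid-mean loss sketched above.
    """
    reference_image = generator(contour_features)  # first reference sample image
    loss = loss_fn(reference_image, style_image)
    optimizer.zero_grad()
    loss.backward()      # backpropagate the first loss
    optimizer.step()     # adjust the network parameters
    return loss.item()
```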
To achieve smooth style conversion, the second adversarial generation network may be trained in the same way as the first adversarial generation network.
In this embodiment, as shown in FIG. 8, acquiring the second object feature of the second style sample image and training the second adversarial generation network according to the second object feature and the second style sample image includes:
Step 801: perform keypoint segmentation detection on the second object in the second style sample image, and extract key-region contour features of the second object.
The second object is the entity object whose style is to be converted, including but not limited to a human face, clothing, and the like. The first object may be of the same kind as the second object; for example, if the first object is a human face, the second object is also a human face. Of course, the first object and the second object may also differ; for example, the first object may be a human face while the second object is a cat face.
In this embodiment, to improve processing efficiency, keypoint segmentation detection is performed on the second object in the second style sample image to extract key-region contour features of the second object. That is, different regions of the second object are identified based on keypoint detection technology, and the second object is then segmented into multiple key regions, so that subsequent image processing can be performed at the granularity of key regions.
The keypoints used in keypoint segmentation detection may be predefined or learned from experimental data. Taking a human face as the second object, the corresponding keypoints may be keypoints of the nose region, the left eye region, the right eye region, the mouth region, other facial regions, and so on. Key-region contour features are then extracted based on these keypoints; the contour features include but are not limited to the pixel positions corresponding to the key-region contours and the positional relationships between those pixels.
Step 802: process the key-region contour features of the second object through the generation network in the second adversarial generation network to be trained, to generate a second reference sample image.
In this embodiment, the key-region contour features of the second object are processed by the generation network in the second adversarial generation network to be trained to generate a second reference sample image, where the second reference sample image is a second-style image in the dimension of the extracted key-region contour features.
Step 803: determine a second loss function according to the second style sample image and the second reference sample image.
It is easy to understand that, since the second adversarial generation network should output images of the second style, the corresponding second adversarial generation network can be trained through a second loss function between the second reference sample image and the second style sample image.
It should be noted that the second loss function is calculated differently in different application scenarios, for example:
In some possible embodiments, the optical flow field from the second reference sample image to the second style sample image may be calculated, i.e., the motion optical flow field of the same keypoints from the second reference sample image to the second style sample image, and the second loss function is determined based on the motion optical flow field. The motion optical flow field indicates the alignment error between the second reference sample image and the second style sample image; the larger the optical flow field, the larger the error between the two images.
In other possible embodiments, to improve the calculation efficiency of the second loss function, the second reference sample image is divided into multiple grid blocks, the second style sample image is divided into grid blocks according to the same grid division strategy, the pixel mean of all pixels contained in each grid block is calculated, and the second loss function is determined based on the differences between the pixel means of grid blocks at corresponding positions in the second reference sample image and the second style sample image. For example, the mean of the differences between the pixel means over all grid blocks is taken as the second loss function.
Step 804: perform backpropagation according to the second loss function to train the second adversarial generation network.
In this embodiment, backpropagation is performed according to the second loss function to train the second adversarial generation network; that is, the network parameters of the second adversarial generation network to be trained are adjusted so that, after adjustment, the second adversarial generation network can output images consistent with the second style.
In another embodiment of the present disclosure, referring to FIG. 9, first-object keypoint segmentation detection is performed on the first style sample image, for example, face keypoint segmentation detection based on face parsing technology, to obtain the key-region contour feature mask1 of the first object. After mask1 is obtained, mask1 is encoded to obtain a first encoding result, and the first style sample image is encoded to obtain a second encoding result; the first encoding result and the second encoding result are fused to obtain a first feature image. On the one hand, the first feature image embodies the contour features of the first object in the key-region contours; on the other hand, by incorporating the original first style sample image, it retains the features of the original first style.
Then, a second feature map is obtained by fusing the first feature image with mask1, and the second feature map is input into the first adversarial generation network to obtain a corresponding third reference sample image. The loss value between the third reference sample image and the first style sample image is calculated; if the loss value is greater than a preset threshold, the network parameters of the first adversarial generation network are adjusted until the loss value is less than the preset threshold, at which point the training of the first adversarial generation network is complete.
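The mask1 pipeline with its threshold-based stopping rule might be sketched as follows; the encoders and the additive fusion of encodings are assumptions (the disclosure does not fix the fusion operator, and the mask is re-fused here via its encoding for shape compatibility).

```python
import torch

def train_with_mask(encode_mask, encode_image, generator, optimizer,
                    style_image, mask1, loss_fn, threshold=0.05, max_steps=1000):
    """Threshold-driven training loop sketched from the mask1 pipeline (FIG. 9).

    encode_mask / encode_image / generator are assumed modules; max_steps is
    assumed to be at least 1. Fusion of encodings is modeled as addition.
    """
    for _ in range(max_steps):
        enc_mask = encode_mask(mask1)           # first encoding result
        enc_img = encode_image(style_image)     # second encoding result
        feat1 = enc_mask + enc_img              # first feature image (fusion)
        feat2 = feat1 + enc_mask                # second feature map (re-fused with mask1's encoding)
        reference = generator(feat2)            # third reference sample image
        loss = loss_fn(reference, style_image)
        if loss.item() < threshold:             # training complete below threshold
            break
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()                        # adjust the network parameters
    return loss.item()
```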
Similarly, in this embodiment, referring to FIG. 10, second-object keypoint segmentation detection is performed on the second style sample image, for example, face keypoint segmentation detection based on face parsing technology, to obtain the key-region contour feature mask2 of the second object. After mask2 is obtained, mask2 is encoded to obtain a third encoding result, and the second style sample image is encoded to obtain a fourth encoding result; the third encoding result and the fourth encoding result are fused to obtain a third feature image. On the one hand, the third feature image embodies the contour features of the second object in the key-region contours; on the other hand, by incorporating the original second style sample image, it retains the features of the original second style.
Then, a fourth feature map is obtained by fusing the third feature image with the third encoding result, and the fourth feature map is input into the second adversarial generation network to obtain a corresponding fourth reference sample image. The loss value between the fourth reference sample image and the second style sample image is calculated; if the loss value is greater than a preset threshold, the network parameters of the second adversarial generation network are adjusted until the loss value is less than the preset threshold.
In summary, the image processing method of the embodiments of the present disclosure can train the adversarial generation networks in combination with key-region contour features according to the needs of the scene, improving the training efficiency of the adversarial generation networks while ensuring their training accuracy.
In the above embodiments, the loss functions are all calculated by considering the distance between the output of an adversarial generation network and the corresponding positive sample image; this calculation can cause the output image to lack detail and be overly smooth.
Therefore, in an embodiment of the present disclosure, the adversarial generation networks are trained in combination with negative sample images; that is, negative sample images may also be incorporated when calculating the loss functions. The training process of the adversarial generation networks is described below, taking the oil painting style as the first style and the plain-makeup style as the second style.
In this embodiment, as shown in FIG. 11, determining the first loss function according to the first style sample image and the first reference sample image includes:
Step 1101: perform fusion and noise-addition processing on the first style sample image and the first reference sample image to generate a first negative sample image.
In this embodiment, after the first style sample image and the first reference sample image are fused, random noise may be added to the fused image to obtain the first negative sample image. Relative to the first style sample image, the first negative sample image introduces not only the error of the first reference sample image but also a noise error.
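A minimal sketch of step 1101 follows, assuming an equal-weight blend and Gaussian noise; the text only specifies fusion followed by adding random noise.

```python
import torch

def make_negative_sample(style_image: torch.Tensor, reference_image: torch.Tensor,
                         noise_std: float = 0.1) -> torch.Tensor:
    """Fuse the style sample with the reference sample, then add random noise.

    The 50/50 blend and the Gaussian noise level are assumptions; any fusion
    followed by random noise would match the described procedure.
    """
    fused = 0.5 * style_image + 0.5 * reference_image
    return fused + noise_std * torch.randn_like(fused)  # noisy negative sample
```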
Step 1102: extract first high-frequency information of the first style sample image, second high-frequency information of the first reference sample image, and third high-frequency information of the first negative sample image.
In this embodiment, the first high-frequency information of the first style sample image, the second high-frequency information of the first reference sample image, and the third high-frequency information of the first negative sample image are extracted, where the high-frequency information of an image can be understood as the pixel information of detail-rich pixels with large brightness differences.
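High-frequency information as described here could be approximated by subtracting a low-pass copy of the image, as in this sketch; the box blur is one assumed choice of low-pass filter.

```python
import torch
import torch.nn.functional as F

def high_frequency(image: torch.Tensor, blur_kernel: int = 5) -> torch.Tensor:
    """High-frequency information as the image minus a low-pass (blurred) copy.

    image: (B, C, H, W) tensor. A box blur implemented with avg_pool2d stands
    in for any low-pass filter; only pixels with large local brightness
    differences survive the subtraction.
    """
    pad = blur_kernel // 2
    padded = F.pad(image, (pad, pad, pad, pad), mode="reflect")
    low_pass = F.avg_pool2d(padded, kernel_size=blur_kernel, stride=1)
    return image - low_pass
```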
Step 1103: perform discrimination processing on the first high-frequency information, the second high-frequency information, and the third high-frequency information through the discrimination network in the first adversarial generation network to generate corresponding discrimination scores.
In this embodiment, the discrimination network in the first adversarial generation network performs discrimination processing on the first, second, and third high-frequency information to generate corresponding discrimination scores, which represent the discriminator's scores for the first, second, and third high-frequency information belonging to the first style.
Step 1104: determine the first loss function according to the discrimination scores.
In this embodiment, the first loss function is determined according to the discrimination scores. For example, a first squared absolute error value between the first high-frequency information and the second high-frequency information of the first reference sample image, and a second squared absolute error value between the first high-frequency information and the third high-frequency information of the first negative sample image, may be calculated directly, and the ratio of the first squared absolute error value to the second squared absolute error value taken as the first loss function.
Alternatively, a first difference between the first high-frequency information and the second high-frequency information of the first reference sample image, and a second difference between the first high-frequency information and the third high-frequency information of the first negative sample image, may be calculated directly, and the ratio of the first difference to the second difference taken as the first loss function.
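The ratio-style first loss might be written as below, mirroring the first variant in the text (squared absolute errors anchored on the first high-frequency information); the epsilon guard is an added assumption for numerical safety.

```python
import torch

def ratio_loss_hf(style_hf: torch.Tensor, reference_hf: torch.Tensor,
                  negative_hf: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """First loss as the ratio of the two squared absolute error values.

    style_hf: first high-frequency information (first style sample image).
    reference_hf: second high-frequency information (first reference sample image).
    negative_hf: third high-frequency information (first negative sample image).
    Minimizing the ratio pulls the generated image toward the style sample
    while pushing it away from the noisy negative sample.
    """
    first_error = ((style_hf - reference_hf) ** 2).mean()   # first squared absolute error
    second_error = ((style_hf - negative_hf) ** 2).mean()   # second squared absolute error
    return first_error / (second_error + eps)               # eps guards against division by zero
```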
Thus, in this embodiment, when training the first adversarial generation network, the image output by the trained network is made close to the first style sample image at the feature level while staying far from the first negative sample image, which reduces the introduction of artifacts and noise and ensures that the image output by the first adversarial generation network is in the oil painting style.
Similarly, in this embodiment, as shown in FIG. 12, determining the second loss function according to the second style sample image and the second reference sample image includes:
Step 1201: perform fusion and noise-addition processing on the second style sample image and the second reference sample image to generate a second negative sample image.
In this embodiment, after the second style sample image and the second reference sample image are fused, random noise may be added to the fused image to obtain the second negative sample image. Relative to the second style sample image, the second negative sample image introduces not only the error of the second reference sample image but also a noise error.
Step 1202: extract a first texture feature of the second style sample image, a second texture feature of the second reference sample image, and a third texture feature of the second negative sample image.
In this embodiment, the first texture feature of the second style sample image, the second texture feature of the second reference sample image, and the third texture feature of the second negative sample image are extracted, where the texture features reflect characteristics such as the color and brightness of pixels by which the corresponding image belongs to the plain-makeup style.
Step 1203: determine the second loss function according to the first texture feature, the second texture feature, and the third texture feature.
In this embodiment, the discrimination network in the second adversarial generation network may perform discrimination processing on the first, second, and third texture features to generate corresponding discrimination scores, which represent the discriminator's scores for the first, second, and third texture features belonging to the second style.
The second loss function may then be determined according to the discrimination scores. For example, a third squared absolute error value between the first texture feature and the second texture feature of the second reference sample image, and a fourth squared absolute error value between the first texture feature and the third texture feature of the second negative sample image, may be calculated directly, and the ratio of the third squared absolute error value to the fourth squared absolute error value taken as the second loss function.
Alternatively, a third difference between the first texture feature and the second texture feature, and a fourth difference between the first texture feature and the third texture feature, may be calculated directly, and the ratio of the third difference to the fourth difference taken as the second loss function.
Thus, in this embodiment, when training the second adversarial generation network, the image output by the trained network is made close to the second style sample image at the feature level while staying far from the second negative sample image, which reduces the introduction of artifacts and noise and ensures that the image output by the second adversarial generation network is in the plain-makeup style.
Further, when images of the first style and the second style undergo image style conversion according to the style conversion network, the key-region contour features of the target object in an original image of the plain-makeup style may first be extracted, where the target object includes but is not limited to the facial parts mentioned above; the plain-makeup original image and the key-region contour features of the target object are then encoded to generate feature data of the target object.
Further, after the feature data of the target object is obtained, since the pre-trained style conversion network incorporates the network characteristics of the second adversarial network, the style conversion network can perform image fusion on the feature data of the target object and the key-region contour features of the target object, and then convert the fused image into the oil-painting style domain to obtain a target image in the oil-painting style.
It can also be understood as follows: the pre-trained first adversarial generation network can, from an input plain-makeup original image, extract the key contour features of the target object that reflect the characteristics of the plain-makeup style; it then encodes the plain-makeup original image together with the key-region contour features of the target object in the plain-makeup dimension to generate feature data of the target object, and performs image fusion on that feature data and the key-region contour features to obtain a new original image carrying the key-region contours of the plain-makeup style.
This new original image is input into the pre-trained second adversarial generation network, which extracts the key contour features of its target object in the oil-painting-style dimension and encodes the new original image together with those features to generate new feature data of the target object. Since the second adversarial generation network can produce an image of the second style from the features of an input image, it obtains the oil-painting-style target image from the new feature data. The weights that the style conversion network applies to the first adversarial generation network and the second adversarial generation network are embodied in the product taken with each network's output; the details follow the foregoing embodiments and are not repeated here.
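A sketch of this two-stage inference path under stated assumptions: g1 and g2 stand for the generators of the two adversarial networks, w1 and w2 for the per-network weights, and fusing the two weighted products by summation is one plausible reading of the weighting scheme described above.

```python
import torch

@torch.no_grad()
def convert_style(x: torch.Tensor, g1, g2, w1: float, w2: float) -> torch.Tensor:
    """Plain-makeup input -> contour-level intermediate -> oil-painting output.

    g1: generator of the first adversarial generation network
    g2: generator of the second adversarial generation network
    w1, w2: weights applied to each network's output (the first and second
    product results), fused here by weighted summation.
    """
    intermediate = g1(x)          # new original image in the contour dimension
    stylized = g2(intermediate)   # oil-painting-style output
    return w1 * intermediate + w2 * stylized
```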
Thus, as shown in FIG. 13, when a plain-makeup image is input, the style conversion network of this embodiment can fuse the outputs of the first adversarial generation network and the second adversarial generation network to obtain the corresponding oil-painting-style image, which is rich in detail and highly realistic.
In summary, the image processing method of the embodiments of the present disclosure trains the corresponding adversarial generation networks with feature-level loss values that combine the distances from the input style sample image to the positive sample image and to the negative sample image. On the basis of guaranteeing rich image detail in the networks' outputs, this improves the purity of the output image, so that the style conversion effect of the fused target image is consistent with the second style.
To implement the above embodiments, the present disclosure further provides an image processing apparatus.
FIG. 14 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present disclosure. The apparatus may be implemented in software and/or hardware and may generally be integrated into an electronic device to perform image processing. As shown in FIG. 14, the apparatus includes:
a first training module 1610, configured to obtain a first object feature of a first style sample image and train a first adversarial generation network according to the first object feature and the first style sample image;
a second training module 1620, configured to obtain a second object feature of a second style sample image and train a second adversarial generation network according to the second object feature and the second style sample image; and
a fusion module 1630, configured to fuse the first adversarial generation network and the second adversarial generation network to generate a style conversion network, so that images of the first style and the second style undergo image style conversion according to the style conversion network. The image processing apparatus provided by the embodiments of the present disclosure can execute the image processing method provided by any embodiment of the present disclosure, and has the corresponding functional modules and beneficial effects.
To implement the above embodiments, the present disclosure further provides a computer program product including a computer program/instructions which, when executed by a processor, implement the image processing method of the above embodiments.
FIG. 15 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
Referring now to FIG. 15, it shows a schematic structural diagram of an electronic device 1700 suitable for implementing an embodiment of the present disclosure. The electronic device 1700 in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and vehicle-mounted terminals (e.g., in-vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in FIG. 15 is merely an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.
As shown in FIG. 15, the electronic device 1700 may include a processor 1701 (e.g., a central processing unit, a graphics processing unit, etc.), which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 1702 or a program loaded from a memory 1708 into a random access memory (RAM) 1703. The RAM 1703 also stores various programs and data required for the operation of the electronic device 1700. The processor 1701, the ROM 1702, and the RAM 1703 are connected to one another through a bus 1704. An input/output (I/O) interface 1705 is also connected to the bus 1704.
Generally, the following devices may be connected to the I/O interface 1705: input devices 1706 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, and a gyroscope; output devices 1707 including, for example, a liquid crystal display (LCD), a speaker, and a vibrator; the memory 1708 including, for example, a magnetic tape or a hard disk; and a communication device 1709. The communication device 1709 may allow the electronic device 1700 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 15 shows the electronic device 1700 with various devices, it should be understood that implementing or providing all of the illustrated devices is not required; more or fewer devices may alternatively be implemented or provided.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, the embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 1709, installed from the memory 1708, or installed from the ROM 1702. When the computer program is executed by the processor 1701, the above functions defined in the image processing method of the embodiments of the present disclosure are executed.
It should be noted that the computer-readable medium described above in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and it may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any appropriate medium, including but not limited to an electric wire, an optical cable, RF (radio frequency), or any suitable combination of the above.
In some implementations, the client and the server may communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and a peer-to-peer network (e.g., an ad hoc peer-to-peer network), as well as any currently known or future-developed network.
The computer-readable medium may be included in the electronic device described above, or it may exist separately without being assembled into the electronic device.
The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: obtain a first object feature of a first style sample image and train a first adversarial generation network according to the first object feature and the first style sample image; obtain a second object feature of a second style sample image and train a second adversarial generation network according to the second object feature and the second style sample image; and fuse the first adversarial generation network and the second adversarial generation network to generate a style conversion network, so that images of the first style and the second style undergo image style conversion according to the style conversion network. This reduces the computing power required to process sample images during image style conversion and, on the premise of guaranteeing the style conversion effect, improves the training efficiency of the style conversion network.
Computer program code for executing the operations of the present disclosure may be written in one or more programming languages or combinations thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the architectures, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.
The functions described herein above may be executed, at least in part, by one or more hardware logic components. For example, and without limitation, exemplary types of hardware logic components that may be used include: field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), and complex programmable logic devices (CPLDs).
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in combination with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the above. More specific examples of machine-readable storage media would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
According to one or more embodiments of the present disclosure, the present disclosure provides an image processing method, including: obtaining a first object feature of a first style sample image, and training a first adversarial generation network according to the first object feature and the first style sample image;
obtaining a second object feature of a second style sample image, and training a second adversarial generation network according to the second object feature and the second style sample image; and
fusing the first adversarial generation network and the second adversarial generation network to generate a style conversion network, so that images of the first style and the second style undergo image style conversion according to the style conversion network.
According to one or more embodiments of the present disclosure, in the image processing method provided by the present disclosure, obtaining the first object feature of the first style sample image and training the first adversarial generation network according to the first object feature and the first style sample image includes:
performing key-point segmentation detection on a first object in the first style sample image, and extracting key-region contour features of the first object;
processing the key-region contour features of the first object through the generation network in the first adversarial generation network to be trained to generate a first reference sample image;
determining a first loss function according to the first style sample image and the first reference sample image; and
performing back-propagation according to the first loss function to train the first adversarial generation network. A minimal sketch of one such training step is given below.
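The sketch makes the following assumptions, which are not fixed by this disclosure: gen and disc are the generation and discrimination networks of the first adversarial generation network, opt_g and opt_d are ordinary PyTorch optimizers, and a standard non-saturating GAN objective stands in for the first loss derived from discriminator scores.

```python
import torch
import torch.nn.functional as F

def train_step_first_gan(gen, disc, opt_g, opt_d,
                         contour_feats: torch.Tensor,
                         style_imgs: torch.Tensor):
    """One back-propagation step for the first adversarial generation network.

    contour_feats: key-region contour features of the first object
    style_imgs:    first-style (e.g., oil-painting) sample images
    """
    # Discriminator update: real style samples vs. generated reference samples.
    ref_imgs = gen(contour_feats).detach()  # first reference sample images
    loss_d = (F.softplus(-disc(style_imgs)) + F.softplus(disc(ref_imgs))).mean()
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # Generator update via back-propagation of the (stand-in) first loss.
    loss_g = F.softplus(-disc(gen(contour_feats))).mean()
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()
```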
According to one or more embodiments of the present disclosure, in the image processing method provided by the present disclosure, obtaining the second object feature of the second style sample image and training the second adversarial generation network according to the second object feature and the second style sample image includes:
performing key-point segmentation detection on a second object in the second style sample image, and extracting key-region contour features of the second object;
processing the key-region contour features of the second object through the generation network in the second adversarial generation network to be trained to generate a second reference sample image;
determining a second loss function according to the second style sample image and the second reference sample image; and
performing back-propagation according to the second loss function to train the second adversarial generation network.
According to one or more embodiments of the present disclosure, in the image processing method provided by the present disclosure, fusing the first adversarial generation network and the second adversarial generation network to generate the style conversion network includes:
determining, according to the similarity between the first object feature and the second object feature, a first weight corresponding to the first adversarial generation network and a second weight corresponding to the second adversarial generation network; and
obtaining a first product result of the output image of the first adversarial generation network and the first weight, obtaining a second product result of the output image of the second adversarial generation network and the second weight, and fusing the first product result and the second product result to generate the style conversion network. A sketch of one way to derive such weights follows.
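The sketch assumes flattened object-feature tensors; mapping cosine similarity into a pair of weights that sum to one is an illustrative choice — the disclosure only requires that the weights depend on the feature similarity.

```python
import torch
import torch.nn.functional as F

def fusion_weights(feat1: torch.Tensor, feat2: torch.Tensor):
    """Derive the per-network weights from object-feature similarity.

    feat1, feat2: flattened first/second object features, shape (B, D).
    Cosine similarity s in [-1, 1] is mapped to w1 = (1 + s) / 2 and
    w2 = 1 - w1, so the two product results can be fused directly.
    """
    s = F.cosine_similarity(feat1, feat2, dim=-1).mean()
    w1 = (1.0 + s) / 2.0
    return w1, 1.0 - w1
```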
According to one or more embodiments of the present disclosure, in the image processing method provided by the present disclosure, the first style is an oil-painting style and the second style is a plain-makeup style;
determining the first loss function according to the first style sample image and the first reference sample image includes:
performing fusion and noise-addition processing on the first style sample image and the first reference sample image to generate a first negative sample image;
extracting first high-frequency information of the first style sample image, second high-frequency information of the first reference sample image, and third high-frequency information of the first negative sample image;
performing discrimination processing on the first high-frequency information, the second high-frequency information, and the third high-frequency information through the discrimination network in the first adversarial generation network to generate corresponding discrimination scores; and
determining the first loss function according to the discrimination scores. A sketch of the high-frequency pathway follows this list.
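The sketch takes the high-frequency information as the residual between an image and a blurred copy of itself, one common extraction scheme that the disclosure does not mandate; disc stands for the discrimination network of the first adversarial generation network.

```python
import torch
import torch.nn.functional as F

def high_freq(img: torch.Tensor, k: int = 5) -> torch.Tensor:
    """High-frequency residual: the image minus a box-blurred copy of itself."""
    blur = F.avg_pool2d(img, kernel_size=k, stride=1, padding=k // 2)
    return img - blur

def discrimination_scores(disc, style_img, ref_img, neg_img):
    """Discriminator scores over the three high-frequency components."""
    return (disc(high_freq(style_img)),  # first high-frequency information
            disc(high_freq(ref_img)),    # second high-frequency information
            disc(high_freq(neg_img)))    # third high-frequency information
```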
According to one or more embodiments of the present disclosure, in the image processing method provided by the present disclosure, determining the second loss function according to the second style sample image and the second reference sample image includes:
performing fusion and noise-addition processing on the second style sample image and the second reference sample image to generate a second negative sample image;
extracting a first texture feature of the second style sample image, a second texture feature of the second reference sample image, and a third texture feature of the second negative sample image; and
determining the second loss function according to the first texture feature, the second texture feature, and the third texture feature.
According to one or more embodiments of the present disclosure, in the image processing method provided by the present disclosure, performing image style conversion on the images of the first style and the second style according to the style conversion network includes:
extracting key-region contour features of a target object in an original image having the plain-makeup style;
encoding the plain-makeup original image and the key-region contour features of the target object to generate feature data of the target object; and
performing image fusion on the feature data of the target object and the key-region contour features of the target object through the style conversion network to generate a target image having the oil-painting style.
According to one or more embodiments of the present disclosure, the present disclosure provides an image processing apparatus, including:
a first training module, configured to obtain a first object feature of a first style sample image and train a first adversarial generation network according to the first object feature and the first style sample image;
a second training module, configured to obtain a second object feature of a second style sample image and train a second adversarial generation network according to the second object feature and the second style sample image; and
a fusion module, configured to fuse the first adversarial generation network and the second adversarial generation network to generate a style conversion network, so that images of the first style and the second style undergo image style conversion according to the style conversion network.
According to one or more embodiments of the present disclosure, in the image processing apparatus provided by the present disclosure, the first training module is specifically configured to:
perform key-point segmentation detection on a first object in the first style sample image, and extract key-region contour features of the first object;
process the key-region contour features of the first object through the generation network in the first adversarial generation network to be trained to generate a first reference sample image;
determine a first loss function according to the first style sample image and the first reference sample image; and
perform back-propagation according to the first loss function to train the first adversarial generation network.
According to one or more embodiments of the present disclosure, in the image processing apparatus provided by the present disclosure, the second training module is specifically configured to:
perform key-point segmentation detection on a second object in the second style sample image, and extract key-region contour features of the second object;
process the key-region contour features of the second object through the generation network in the second adversarial generation network to be trained to generate a second reference sample image;
determine a second loss function according to the second style sample image and the second reference sample image; and
perform back-propagation according to the second loss function to train the second adversarial generation network.
According to one or more embodiments of the present disclosure, in the image processing apparatus provided by the present disclosure, the fusion module is specifically configured to:
determine, according to the similarity between the first object feature and the second object feature, a first weight corresponding to the first adversarial generation network and a second weight corresponding to the second adversarial generation network; and
obtain a first product result of the output image of the first adversarial generation network and the first weight, obtain a second product result of the output image of the second adversarial generation network and the second weight, and fuse the first product result and the second product result to generate the style conversion network.
According to one or more embodiments of the present disclosure, in the image processing apparatus provided by the present disclosure, the first style is an oil-painting style and the second style is a plain-makeup style; the first training module is specifically configured to:
perform fusion and noise-addition processing on the first style sample image and the first reference sample image to generate a first negative sample image;
extract first high-frequency information of the first style sample image, second high-frequency information of the first reference sample image, and third high-frequency information of the first negative sample image;
perform discrimination processing on the first high-frequency information, the second high-frequency information, and the third high-frequency information through the discrimination network in the first adversarial generation network to generate corresponding discrimination scores; and
determine the first loss function according to the discrimination scores.
According to one or more embodiments of the present disclosure, in the image processing apparatus provided by the present disclosure, the second training module is specifically configured to:
perform fusion and noise-addition processing on the second style sample image and the second reference sample image to generate a second negative sample image;
extract a first texture feature of the second style sample image, a second texture feature of the second reference sample image, and a third texture feature of the second negative sample image; and
determine the second loss function according to the first texture feature, the second texture feature, and the third texture feature.
According to one or more embodiments of the present disclosure, in the image processing apparatus provided by the present disclosure, the fusion module is specifically configured to:
extract key-region contour features of a target object in an original image having the plain-makeup style;
encode the plain-makeup original image and the key-region contour features of the target object to generate feature data of the target object; and
perform image fusion on the feature data of the target object and the key-region contour features of the target object through the style conversion network to generate a target image having the oil-painting style.
According to one or more embodiments of the present disclosure, the present disclosure provides an electronic device, including:
a processor; and
a memory configured to store instructions executable by the processor;
wherein the processor is configured to read the executable instructions from the memory and execute the instructions to implement any of the image processing methods provided by the present disclosure.
According to one or more embodiments of the present disclosure, the present disclosure provides a computer-readable storage medium storing a computer program, where the computer program is used to execute any of the image processing methods provided by the present disclosure.
The above description is merely a preferred embodiment of the present disclosure and an explanation of the applied technical principles. Those skilled in the art should understand that the scope of disclosure involved in the present disclosure is not limited to technical solutions formed by specific combinations of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above disclosed concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present disclosure.
In addition, although the operations are depicted in a specific order, this should not be understood as requiring that the operations be performed in the specific order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although the above discussion contains several specific implementation details, these should not be construed as limitations on the scope of the present disclosure. Certain features described in the context of separate embodiments may also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment may also be implemented in multiple embodiments, separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological logical actions, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or actions described above. Rather, the specific features and actions described above are merely example forms of implementing the claims.

Claims (11)

  1. An image processing method, comprising:
    obtaining a first object feature of a first style sample image, and training a first adversarial generation network according to the first object feature and the first style sample image;
    obtaining a second object feature of a second style sample image, and training a second adversarial generation network according to the second object feature and the second style sample image; and
    fusing the first adversarial generation network and the second adversarial generation network to generate a style conversion network, so that images of the first style and the second style undergo image style conversion according to the style conversion network.
  2. The method according to claim 1, wherein obtaining the first object feature of the first style sample image and training the first adversarial generation network according to the first object feature and the first style sample image comprises:
    performing key-point segmentation detection on a first object in the first style sample image, and extracting key-region contour features of the first object;
    processing the key-region contour features of the first object through the generation network in the first adversarial generation network to be trained to generate a first reference sample image;
    determining a first loss function according to the first style sample image and the first reference sample image; and
    performing back-propagation according to the first loss function to train the first adversarial generation network.
  3. The method according to claim 2, wherein obtaining the second object feature of the second style sample image and training the second adversarial generation network according to the second object feature and the second style sample image comprises:
    performing key-point segmentation detection on a second object in the second style sample image, and extracting key-region contour features of the second object;
    processing the key-region contour features of the second object through the generation network in the second adversarial generation network to be trained to generate a second reference sample image;
    determining a second loss function according to the second style sample image and the second reference sample image; and
    performing back-propagation according to the second loss function to train the second adversarial generation network.
  4. The method according to any one of claims 1-3, wherein fusing the first adversarial generation network and the second adversarial generation network to generate the style conversion network comprises:
    determining, according to the similarity between the first object feature and the second object feature, a first weight corresponding to the first adversarial generation network and a second weight corresponding to the second adversarial generation network; and
    obtaining a first product result of the output image of the first adversarial generation network and the first weight, obtaining a second product result of the output image of the second adversarial generation network and the second weight, and fusing the first product result and the second product result to generate the style conversion network.
  5. The method according to any one of claims 3-4, wherein the first style is an oil-painting style and the second style is a plain-makeup style;
    determining the first loss function according to the first style sample image and the first reference sample image comprises:
    performing fusion and noise-addition processing on the first style sample image and the first reference sample image to generate a first negative sample image;
    extracting first high-frequency information of the first style sample image, second high-frequency information of the first reference sample image, and third high-frequency information of the first negative sample image;
    performing discrimination processing on the first high-frequency information, the second high-frequency information, and the third high-frequency information through the discrimination network in the first adversarial generation network to generate corresponding discrimination scores; and
    determining the first loss function according to the discrimination scores.
  6. The method according to claim 5, wherein determining the second loss function according to the second style sample image and the second reference sample image comprises:
    performing fusion and noise-addition processing on the second style sample image and the second reference sample image to generate a second negative sample image;
    extracting a first texture feature of the second style sample image, a second texture feature of the second reference sample image, and a third texture feature of the second negative sample image; and
    determining the second loss function according to the first texture feature, the second texture feature, and the third texture feature.
  7. The method according to claim 6, wherein performing image style conversion on the images of the first style and the second style according to the style conversion network comprises:
    extracting key-region contour features of a target object in an original image having the plain-makeup style;
    encoding the plain-makeup original image and the key-region contour features of the target object to generate feature data of the target object; and
    performing image fusion on the feature data of the target object and the key-region contour features of the target object through the style conversion network to generate a target image having the oil-painting style.
  8. An image processing apparatus, comprising:
    a first training module configured to obtain a first object feature of a first style sample image and train a first adversarial generation network according to the first object feature and the first style sample image;
    a second training module configured to obtain a second object feature of a second style sample image and train a second adversarial generation network according to the second object feature and the second style sample image; and
    a fusion module configured to fuse the first adversarial generation network and the second adversarial generation network to generate a style conversion network, so that images of the first style and the second style undergo image style conversion according to the style conversion network.
  9. An electronic device, comprising:
    a processor; and
    a memory configured to store instructions executable by the processor;
    wherein the processor is configured to read the executable instructions from the memory and execute the instructions to implement the image processing method according to any one of claims 1-7.
  10. A computer-readable storage medium storing a computer program, wherein the computer program is configured to execute the image processing method according to any one of claims 1-7.
  11. A computer program product comprising a computer program/instructions which, when executed by a processor, implement the method according to any one of claims 1-7.
PCT/CN2022/140574 2021-12-21 2022-12-21 Image processing method and apparatus, device, and medium WO2023116744A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111574622.1A CN116310615A (en) 2021-12-21 2021-12-21 Image processing method, device, equipment and medium
CN202111574622.1 2021-12-21

Publications (1)

Publication Number Publication Date
WO2023116744A1 true WO2023116744A1 (en) 2023-06-29

Family

ID=86831024

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/140574 WO2023116744A1 (en) 2021-12-21 2022-12-21 Image processing method and apparatus, device, and medium

Country Status (2)

Country Link
CN (1) CN116310615A (en)
WO (1) WO2023116744A1 (en)



Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111402112A (en) * 2020-03-09 2020-07-10 北京字节跳动网络技术有限公司 Image processing method, image processing device, electronic equipment and computer readable medium
CN111402151A (en) * 2020-03-09 2020-07-10 北京字节跳动网络技术有限公司 Image processing method, image processing device, electronic equipment and computer readable medium
US20210303927A1 (en) * 2020-03-25 2021-09-30 Microsoft Technology Licensing, Llc Multi-Task GAN, and Image Translator and Image Classifier Trained Thereby
CN111783603A (en) * 2020-06-24 2020-10-16 有半岛(北京)信息科技有限公司 Training method for generating confrontation network, image face changing method and video face changing method and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116758379A (en) * 2023-08-14 2023-09-15 腾讯科技(深圳)有限公司 Image processing method, device, equipment and storage medium
CN116758379B (en) * 2023-08-14 2024-05-28 腾讯科技(深圳)有限公司 Image processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN116310615A (en) 2023-06-23

Similar Documents

Publication Publication Date Title
WO2022089360A1 (en) Face detection neural network and training method, face detection method, and storage medium
WO2020177673A1 (en) Video sequence selection method, computer device and storage medium
WO2022012179A1 (en) Method and apparatus for generating feature extraction network, and device and computer-readable medium
WO2023125379A1 (en) Character generation method and apparatus, electronic device, and storage medium
WO2023005386A1 (en) Model training method and apparatus
WO2023143178A1 (en) Object segmentation method and apparatus, device and storage medium
WO2024012255A1 (en) Semantic segmentation model training method and apparatus, electronic device, and storage medium
WO2023072015A1 (en) Method and apparatus for generating character style image, device, and storage medium
WO2023050868A1 (en) Method and apparatus for training fusion model, image fusion method and apparatus, and device and medium
WO2023232056A1 (en) Image processing method and apparatus, and storage medium and electronic device
WO2023217117A1 (en) Image assessment method and apparatus, and device, storage medium and program product
WO2023030381A1 (en) Three-dimensional human head reconstruction method and apparatus, and device and medium
CN113393544B (en) Image processing method, device, equipment and medium
WO2023116744A1 (en) Image processing method and apparatus, device, and medium
CN117095006B (en) Image aesthetic evaluation method, device, electronic equipment and storage medium
CN113610034B (en) Method and device for identifying character entities in video, storage medium and electronic equipment
WO2022012178A1 (en) Method for generating objective function, apparatus, electronic device and computer readable medium
CN117456063A (en) Face driving method and device based on voice, electronic equipment and storage medium
WO2023143118A1 (en) Image processing method and apparatus, device, and medium
WO2023130925A1 (en) Font recognition method and apparatus, readable medium, and electronic device
WO2023093481A1 (en) Fourier domain-based super-resolution image processing method and apparatus, device, and medium
WO2023071694A1 (en) Image processing method and apparatus, and electronic device and storage medium
WO2024040870A1 (en) Text image generation, training, and processing methods, and electronic device
WO2023040813A1 (en) Facial image processing method and apparatus, and device and medium
CN114049417B (en) Virtual character image generation method and device, readable medium and electronic equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 22910056
    Country of ref document: EP
    Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE