CN114049290A - Image processing method, apparatus, device and storage medium - Google Patents

Image processing method, apparatus, device and storage medium

Info

Publication number
CN114049290A
Authority
CN
China
Prior art keywords
image
head
feature map
area
target person
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111325365.8A
Other languages
Chinese (zh)
Inventor
束长勇
刘家铭
洪智滨
韩钧宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202111325365.8A
Publication of CN114049290A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
        • G06T5/00 Image enhancement or restoration
            • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
        • G06T7/00 Image analysis
            • G06T7/10 Segmentation; Edge detection
                • G06T7/11 Region-based segmentation
        • G06T2207/00 Indexing scheme for image analysis or image enhancement
            • G06T2207/10 Image acquisition modality
                • G06T2207/10024 Color image
            • G06T2207/20 Special algorithmic details
                • G06T2207/20081 Training; Learning
                • G06T2207/20084 Artificial neural networks [ANN]
                • G06T2207/20212 Image combination
                    • G06T2207/20221 Image fusion; Image merging
            • G06T2207/30 Subject of image; Context of image processing
                • G06T2207/30196 Human being; Person
                    • G06T2207/30201 Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides an image processing method, apparatus, device, and storage medium, relating to the field of artificial intelligence, specifically to deep learning and computer vision, and applicable to scenarios such as face image processing and face recognition. The specific implementation scheme is: acquiring a reference image and a target person head image; replacing the head of the reference person in the reference image with the head of the target person to obtain an image to be synthesized, where the image to be synthesized includes a partial reference background, the target person head, and a region to be filled between the partial reference background and the target person head; performing feature extraction on the reference image and the image to be synthesized to obtain a skin color sample feature map and a filling sample feature map; and generating a composite image based on the skin color sample feature map, the filling sample feature map, and the image to be synthesized. Extracting the skin color sample feature map and the filling sample feature map for image synthesis makes the composite image more natural and realistic.


Description

Image processing method, apparatus, device and storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, specifically to the fields of deep learning and computer vision technologies, which can be applied to scenes such as face image processing and face recognition, and more particularly to an image processing method, apparatus, device, storage medium, and computer program product.
Background
With the development of computing technology and artificial intelligence, fusion networks, which provide functions such as skin color alignment and neck and background filling, are widely applied in scenarios such as face image editing and fusion, for example, fusing the head portrait of one person onto the body of a specific person or into a specific scene or background.
Disclosure of Invention
The present disclosure provides an image processing method, apparatus, device, storage medium, and computer program product.
According to a first aspect of the present disclosure, there is provided an image processing method including: acquiring a reference image and a target person head image, wherein the reference image comprises a reference background and a reference person head; replacing the reference person head in the reference image with the target person head image to obtain an image to be synthesized, wherein the image to be synthesized comprises a partial reference background, the target person head, and a region to be filled between the partial reference background and the target person head; performing feature extraction on the reference image and the image to be synthesized to obtain a skin color sample feature map and a filling sample feature map; and generating a composite image based on the skin color sample feature map, the filling sample feature map, and the image to be synthesized.
According to a second aspect of the present disclosure, there is provided an image processing apparatus comprising: an acquisition module configured to acquire a reference image and a target person head image, wherein the reference image includes a reference background and a reference person head; a replacement module configured to replace the reference person head in the reference image with the target person head image to obtain an image to be synthesized, wherein the image to be synthesized comprises a partial reference background, the target person head, and a region to be filled between the partial reference background and the target person head; a feature extraction module configured to perform feature extraction on the reference image and the image to be synthesized to obtain a skin color sample feature map and a filling sample feature map; and an image generation module configured to generate a composite image based on the skin color sample feature map, the filling sample feature map, and the image to be synthesized.
According to a third aspect of the present disclosure, there is provided an electronic apparatus comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor, the instructions being executable by the at least one processor to enable the at least one processor to implement a method as described in any of the implementations of the first aspect.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method as described in any implementation manner of the first aspect.
According to a fifth aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method as described in any of the implementations of the first aspect.
According to the image processing method, apparatus, device, storage medium, and computer program product of the present disclosure, the reference person head in a reference image is first replaced with the target person head to obtain an image to be synthesized; feature extraction is then performed on the reference image and the image to be synthesized to obtain a skin color sample feature map and a filling sample feature map; finally, a composite image is generated based on the skin color sample feature map, the filling sample feature map, and the image to be synthesized. The composite image generated in this way is more natural and realistic.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is an exemplary system architecture to which the present disclosure may be applied;
FIG. 2 is a flow diagram of one embodiment of an image processing method of the present disclosure;
FIG. 3 is a flow diagram of another embodiment of an image processing method of the present disclosure;
FIG. 4 is a flow chart of yet another embodiment of an image processing method of the present disclosure;
FIG. 5A is a schematic view of an application scenario of the image processing method of the present disclosure;
FIG. 5B is a schematic diagram of extracting a skin tone sample feature map and a fill sample feature map in the scene of FIG. 5A;
FIG. 6 is a schematic configuration diagram of an example of an image processing apparatus of the present disclosure;
FIG. 7 is a block diagram of an electronic device for implementing the image processing method of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the image processing method or image processing apparatus of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user can use the terminal apparatuses 101, 102, 103 to interact with the server 105 through the network 104 to acquire image processing results and the like. Various client applications, such as an image composition application, etc., may be installed on the terminal devices 101, 102, 103.
The terminal apparatuses 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices including, but not limited to, smart phones, tablet computers, laptop portable computers, desktop computers, and the like. When the terminal apparatuses 101, 102, 103 are software, they can be installed in the above-described electronic apparatuses and may be implemented as multiple pieces of software or software modules, or as a single piece of software or software module, which is not specifically limited herein.
The server 105 may provide various image composition based services or applications. For example, the server 105 may process the reference image and the target person head image acquired from the terminal apparatuses 101, 102, 103, and generate a processing result (e.g., generate a composite image).
The server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or as a single server. When the server 105 is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services), or as a single piece of software or software module, which is not specifically limited herein.
It should be noted that the image processing method provided by the embodiment of the present disclosure is generally executed by the server 105, and accordingly, the image processing apparatus is generally disposed in the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring to fig. 2, fig. 2 shows a flowchart of an image processing method provided by an embodiment of the present disclosure, where the flowchart 200 includes the following steps:
Step 201, acquiring a reference image and a target person head image.
In the present embodiment, the execution subject of the image processing method (e.g., the server 105 shown in fig. 1) may acquire the reference image and the target person head image. Wherein the reference image comprises a reference background and a reference person head. The reference image may be acquired by directly using an image sensor, for example, the image sensor may be a camera, or may be acquired from a local file storing a large number of images. For example, the reference image may be an image captured by a camera with the reference person as a target and the environment where the reference person is located as a background. The head image of the target person may be an image obtained by individually dividing a head region of a certain person from an image captured by the camera. Optionally, the reference image further comprises an exposed skin area other than the head of the reference person; illustratively, the reference image may include the neck and/or arms, etc. of the reference person in addition to the head of the reference person.
Step 202, replacing the head of the reference person in the reference image with the head image of the target person to obtain an image to be synthesized.
In this embodiment, after acquiring the reference image and the target person head image, the execution subject may replace the reference person head in the reference image with the target person head image to obtain an image to be synthesized, where the image to be synthesized includes a partial reference background, the target person head, and a region to be filled between the partial reference background and the target person head. In the implementation process, considering that the head sizes and shapes of different people differ, when replacing the reference person head, the reference person head together with the background within a preset distance or shape around it can be removed in advance, and the target person head image is then added at the position of the original reference person head, thereby obtaining the image to be synthesized. Here, the partial reference background refers to a part of the reference background in the reference image, for example, the background region remaining after removing the background within a preset distance or shape around the reference person head from the reference background.
Step 203, performing feature extraction on the reference image and the image to be synthesized to obtain a skin color sample feature map and a filling sample feature map.
In this embodiment, after obtaining the image to be synthesized, the execution subject may perform feature extraction on the reference image and the image to be synthesized to obtain a skin color sample feature map and a filling sample feature map. Feature extraction may be performed in any existing manner, including, but not limited to, the HOG (histogram of oriented gradients) extraction algorithm, scale-invariant feature transform, neural network features, and the like. The skin color sample feature map characterizes the skin color information of the reference person in the reference image, and the filling sample feature map characterizes the filling information, extracted from the reference image, for the region to be filled indicated by the image to be synthesized.
Step 204, generating a composite image based on the skin color sample feature map, the filling sample feature map, and the image to be synthesized.
In this embodiment, after obtaining the skin color sample feature map and the filling sample feature map, the execution subject may generate a composite image based on the skin color sample feature map, the filling sample feature map, and the image to be synthesized. The execution subject may use a fusion network to fuse the image to be synthesized with the acquired skin color sample feature map and filling sample feature map to obtain a composite image, where the composite image includes the target person head and the skin color of the target person is the same as that of the reference person.
In the image processing method provided by this embodiment, the reference person head in the reference image is first replaced with the target person head to obtain an image to be synthesized; feature extraction is then performed on the reference image and the image to be synthesized to obtain a skin color sample feature map and a filling sample feature map; finally, a composite image is generated based on the skin color sample feature map, the filling sample feature map, and the image to be synthesized. The generated composite image is more natural and realistic.
With further continuing reference to fig. 3, fig. 3 illustrates a flow 300 of another embodiment of an image processing method of the present disclosure. The image processing method comprises the following steps:
Step 301, acquiring a reference image and a target person head image.
In this embodiment, the specific operation of step 301 has been described in detail in step 201 in the embodiment shown in fig. 2, and is not described herein again.
Step 302, performing five sense organs segmentation on the head of the reference person in the reference image to obtain a head mask, and taking the region except the head mask in the reference image as a reference background.
In the present embodiment, five sense organs segmentation means dividing an image into regions corresponding to the respective facial parts (eyebrows, eyes, nose, mouth, ears) and then locating the complete head region from the divided regions. The head mask represents the region where the head is located; for example, in a specific implementation, the execution subject may extract the head as a solid region to obtain a binarized image as the head mask. Specifically, the pixel value of the region where the head is located in the reference image may be set to 255 and the pixel value of the region other than the head set to 0, that is, the head is represented in white and the rest is filled in black. Optionally, in a specific implementation, the execution subject may also translate and scale the reference image according to the size of the target person head portrait before performing the five sense organs segmentation, so that the size of the reference person head in the translated and scaled reference image is comparable to the size of the target person head.
Step 303, expanding the head mask to obtain an expansion region, wherein the area of the expansion region is larger than that of the target person head image.
In this embodiment, the expansion manner includes, but is not limited to, expanding outward from the outermost edge of the head by a preset distance with reference to the center of the head mask, or expanding the other side of the head by a preset distance with reference to one side of the head. For example, in the head mask obtained above, the execution subject may set to 255 the pixels whose distance from the outermost edge of the head region is less than a preset value (for example, two millimeters), so as to obtain an expansion region whose area is larger than that of the original head. This embodiment does not limit the expansion size, and the specific distance value above is merely illustrative.
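For illustration, the following is a minimal sketch of how the binarization of step 302 and the expansion of step 303 could be performed with OpenCV; the segmentation map `parsing`, the function names, and the kernel size are hypothetical and not specified by this embodiment.

```python
import cv2
import numpy as np

def make_dilated_head_mask(parsing: np.ndarray, kernel_size: int = 15):
    """parsing: (H, W) label map from a facial-parts segmentation model,
    with non-zero labels on head pixels (hypothetical input format)."""
    # Binarize: head pixels become 255 (white), everything else 0 (black).
    head_mask = np.where(parsing > 0, 255, 0).astype(np.uint8)
    # Dilation grows the white region outward from the head edge by roughly
    # kernel_size / 2 pixels, yielding an expansion region larger than the head.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (kernel_size, kernel_size))
    expansion_region = cv2.dilate(head_mask, kernel)
    return head_mask, expansion_region
```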
Step 304, determining a region of the reference background that does not intersect the expansion region as the partial reference background, adding the target person head image to the expansion region, and determining a region of the expansion region that does not intersect the target person head as the region to be filled.
In this embodiment, after obtaining the expansion region, the execution subject may map the expansion region to the position of the reference person on the reference image by comparing the reference background with the expansion region, aligning the original head region in the expansion region with the head region on the reference image; the region of the reference background that does not overlap the expansion region is then taken as the partial reference background. Further, the execution subject may map the target person head into the expansion region. Since the expansion region is obtained by expanding the reference person head, the mapping may align corresponding facial parts; for example, when adding the target person head portrait, the nose of the target person is aligned with the nose of the original reference person. Finally, the region of the expansion region not covered by the target person head is taken as the region to be filled.
Step 305, integrating the target person head image, the partial reference background, and the region to be filled to obtain the image to be synthesized.
In this embodiment, after determining the region to be filled and the partial reference background, the execution subject integrates the partial reference background, the region to be filled, and the target person head image to obtain the image to be synthesized, in which the partial background surrounds the region to be filled and the region to be filled surrounds the target person head image.
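Continuing the sketch above, the partial reference background, the region to be filled, and the image to be synthesized of steps 304 and 305 can be derived with simple mask logic; all variable names and the mask layout are again assumptions for illustration.

```python
import numpy as np

def build_image_to_synthesize(reference, target_head, target_head_mask, expansion_region):
    """reference: (H, W, 3) reference image; target_head: (H, W, 3) head image
    already aligned to the expansion region; target_head_mask and
    expansion_region: (H, W) uint8 masks with 255 inside the region."""
    # Background pixels outside the expansion region form the partial reference background.
    partial_bg = (expansion_region == 0)
    # Pixels inside the expansion region but not covered by the target head
    # form the region to be filled.
    to_fill = (expansion_region == 255) & (target_head_mask == 0)
    canvas = np.zeros_like(reference)
    canvas[partial_bg] = reference[partial_bg]   # keep the partial reference background
    head = (target_head_mask == 255)
    canvas[head] = target_head[head]             # add the target person head
    # The to_fill region is left blank for the fusion network to complete later.
    return canvas, to_fill
```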
Step 306, performing feature extraction on the reference image and the image to be synthesized to obtain a skin color sample feature map and a filling sample feature map.
Step 307, generating a composite image based on the skin color sample feature map, the filling sample feature map, and the image to be synthesized.
In the present embodiment, the specific operations of steps 306-307 have been described in detail in steps 203 and 204 in the embodiment shown in fig. 2, and are not described herein again.
According to the image processing method provided by this embodiment, the image to be synthesized is obtained by segmenting and expanding the reference image and integrating the result with the target person head image. Constructing the image to be synthesized from the reference image makes the composite image more real and natural.
With further continuing reference to fig. 4, fig. 4 shows a flow chart in yet another embodiment of an image processing method of the present disclosure, the image processing method comprising the steps of:
step 401, acquiring a reference image and a target person head image.
Step 402, replacing the head of the reference person in the reference image with the head image of the target person to obtain an image to be synthesized.
In the present embodiment, the specific operations of steps 401 and 402 have been described in detail in steps 201 and 202 in the embodiment shown in fig. 2, and are not described herein again.
Step 403, extracting the features of the reference image and the features of the image to be synthesized using a feature extraction network.
In this embodiment, the feature extraction network may, without loss of generality, use a classical backbone network such as AlexNet, ZF Net, VGGNet, Inception, or ResNet. The feature extraction network adopts a dual-input, dual-output structure: the reference image and the image to be synthesized serve as the dual inputs, and the features of the reference image and the features of the image to be synthesized serve as the dual outputs.
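As the backbone is left open here, the following sketch uses a shared ResNet-18 trunk from torchvision as one arbitrary choice to show the dual-input, dual-output structure; it is not the network prescribed by this embodiment.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class DualFeatureExtractor(nn.Module):
    """Shared-weight backbone with a dual-input, dual-output structure."""
    def __init__(self):
        super().__init__()
        resnet = models.resnet18(weights=None)  # in practice, pretrained weights would be loaded
        # Drop the average-pooling and classification layers, keeping the
        # convolutional stages that produce a spatial feature map.
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])

    def forward(self, reference: torch.Tensor, to_synthesize: torch.Tensor):
        f_ref = self.backbone(reference)        # (B, 512, H/32, W/32)
        f_syn = self.backbone(to_synthesize)    # (B, 512, H/32, W/32)
        return f_ref, f_syn
```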
Step 404, extracting a skin color sample feature map and a filling sample feature map from the features of the reference image and the features of the image to be synthesized using an attention mechanism.
In this embodiment, after extracting the features of the reference image and the features of the image to be synthesized, the execution subject may combine the two to extract the skin color information of the reference person as the skin color sample feature map and the information of the region to be filled as the filling sample feature map, thereby providing richer fusion information for subsequent image synthesis.
Optionally, step 404 includes determining the head features and the region features to be filled of the target person based on the features of the image to be synthesized; calculating an attention matrix by using the head characteristics of the target person and the characteristics of the reference image to obtain a color attention characteristic diagram; multiplying the color attention feature map and the features of the reference image to obtain a skin color sample feature map; calculating an attention matrix by using the characteristics of the region to be filled and the characteristics of the reference image to obtain a filled region attention characteristic diagram; and multiplying the attention feature map of the filling area with the features of the reference image to obtain a filling sample feature map.
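The optional flow above can be read as a standard cross-attention, with the target person head features (or the region-to-be-filled features) as queries against the reference image features, and the resulting attention matrix multiplied back onto the reference features. The sketch below is one plausible realization under that reading, with hypothetical tensor names; it is not the exact formulation of this embodiment.

```python
import torch
import torch.nn.functional as F

def attention_sample(query_feat: torch.Tensor, ref_feat: torch.Tensor) -> torch.Tensor:
    """query_feat: target head features or region-to-be-filled features;
    ref_feat: reference image features; both (B, C, H, W)."""
    B, C, H, W = query_feat.shape
    q = query_feat.flatten(2).transpose(1, 2)      # (B, HW, C)
    k = ref_feat.flatten(2)                        # (B, C, HW)
    attn = F.softmax(q @ k / C ** 0.5, dim=-1)     # (B, HW, HW) attention matrix
    v = ref_feat.flatten(2).transpose(1, 2)        # (B, HW, C)
    out = attn @ v                                 # gather reference information per query location
    return out.transpose(1, 2).reshape(B, C, H, W)

# skin_color_sample = attention_sample(head_feat, f_ref)     # color attention path
# fill_sample       = attention_sample(to_fill_feat, f_ref)  # fill-region attention path
```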
Step 405, performing color processing on the target person head image to obtain a head grayscale map.
In this embodiment, color processing refers to converting the target person head image, originally composed of three RGB channels, into a single-channel grayscale image, thereby obtaining the head grayscale map.
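A minimal sketch of this color processing with OpenCV follows; the image path is hypothetical.

```python
import cv2

head_rgb = cv2.cvtColor(cv2.imread("target_head.png"), cv2.COLOR_BGR2RGB)  # hypothetical path
# Collapse the three RGB channels into a single luminance channel
# (OpenCV uses the weighted sum 0.299 R + 0.587 G + 0.114 B).
head_gray = cv2.cvtColor(head_rgb, cv2.COLOR_RGB2GRAY)  # shape (H, W)
```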
Step 406, synthesizing the skin color sample feature map, the filling sample feature map, the head mask, the head grayscale map, and the partial reference background to obtain a composite image.
In this embodiment, after the execution subject extracts the skin color sample feature map and the filling sample feature map, the execution subject sends the skin color sample feature map and the filling sample feature map, the head mask, the head grayscale map, and a part of the reference background into a pre-trained fusion network for fusion, so as to obtain a composite image.
Optionally, step 406 includes stitching the skin color sample feature map, the filling sample feature map, the head mask, the head grayscale map, and the partial reference background to obtain a stitched map, and inputting the stitched map into a pre-trained fusion network for fusion to obtain the composite image. For example, the fusion network includes, but is not limited to, a Unet network. Stitching refers to combining the information of the individual feature maps along the channel dimension; for example, if feature map 1 is denoted B × C1 × W × H and feature map 2 is denoted B × C2 × W × H, stitching them along the channel dimension yields a stitched map denoted B × (C1 + C2) × W × H. This embodiment does not limit the number of channels; the channel numbers here are merely illustrative.
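The channel-wise stitching described above corresponds directly to torch.cat along the channel dimension; the shapes below mirror the B × C1 × W × H example in the text and are merely illustrative.

```python
import torch

feat1 = torch.randn(2, 4, 64, 64)            # B x C1 x W x H, with C1 = 4
feat2 = torch.randn(2, 3, 64, 64)            # B x C2 x W x H, with C2 = 3
stitched = torch.cat([feat1, feat2], dim=1)  # B x (C1 + C2) x W x H
assert stitched.shape == (2, 7, 64, 64)
```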
In this embodiment, the pre-trained fusion network generates, through a generator, a human body image with a specified pose, expression, and identity: G(X_Input, X_Ref) = Y, where X_Input is the image to be synthesized, X_Ref is the reference image, and Y is the composite image output after fusion. The loss function used when training the fusion network comprises the following components:
(1) ID preservation loss. The intermediate features extracted by Arcface are aligned in a high-dimensional information space:
L_ID = ||Arcface(Y) - Arcface(X_GT)||_2
where X_GT is the partial background derived from the reference image.
(2) Image feature alignment loss. The intermediate features extracted by VGG19 are aligned in the high-dimensional information space:
L_VGG = ||VGG(Y) - VGG(X_GT)||_2
(3) Discriminator feature alignment loss. The intermediate features extracted by the discriminator D are aligned in the high-dimensional information space:
L_D = ||D(Y) - D(X_GT)||_2
(4) Discriminator loss. Adversarial training with the discriminator reduces artifacts in the generated images:
L_GAN = E(log D(X_GT)) + E(log(1 - D(Y)))
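For concreteness, the four terms could be sketched in PyTorch as follows, assuming that arcface and vgg are pretrained feature extractors and that the discriminator returns both intermediate features and a probability; these interfaces are assumptions, since this embodiment does not specify them, and the sketch uses a mean-squared form of the alignment terms.

```python
import torch
import torch.nn.functional as F

def fusion_losses(Y, X_GT, arcface, vgg, discriminator):
    """Y: composite image; X_GT: target tensor derived from the reference image;
    arcface / vgg: hypothetical feature extractors; discriminator: hypothetical
    callable returning (intermediate features, probability in (0, 1))."""
    feat_y, p_y = discriminator(Y)
    feat_gt, p_gt = discriminator(X_GT)
    l_id = F.mse_loss(arcface(Y), arcface(X_GT))   # L_ID: ID preservation
    l_vgg = F.mse_loss(vgg(Y), vgg(X_GT))          # L_VGG: image feature alignment
    l_d = F.mse_loss(feat_y, feat_gt)              # L_D: discriminator feature alignment
    eps = 1e-8                                     # numerical stability for the logs
    l_gan = (torch.log(p_gt + eps) + torch.log(1 - p_y + eps)).mean()  # L_GAN
    return l_id, l_vgg, l_d, l_gan
```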
in the image processing method provided by the embodiment, the attention mechanism is adopted to extract the skin color sample characteristic diagram and the filling sample characteristic diagram, so that the skin color sample characteristic diagram has more real skin texture compared with the average color information adopted by the fusion network, the reduction of the migration quality of skin color caused by the adoption of the average color information is avoided, meanwhile, the filling sample characteristic diagram avoids the information imagined by the network from being inconsistent with the original information of the region to be filled, more complete and reliable information is provided for the fusion network, and the image synthesis mode is enriched.
In order to facilitate understanding of the technical solution of the present disclosure, a head-swapping application scenario is described in detail as an example. Please refer to fig. 5A and 5B, where fig. 5A illustrates an application scenario of the image processing method of the present disclosure and fig. 5B is a schematic diagram of extracting a skin color sample feature map and a filling sample feature map in the scenario of fig. 5A. In this scenario, the reference image 1 includes the reference person's neck in addition to the reference person's head and the background. In the implementation process, the execution subject inputs the reference image 1 and the image to be synthesized 2 into the feature extraction network 3, and the feature extraction network 3 outputs the features 4 of the reference image and the features 5 of the image to be synthesized. Further, the execution subject applies attention feature extraction 6 to the features 4 of the reference image and the features 5 of the image to be synthesized, obtaining a skin color sample feature map 7 and a filling sample feature map 8. Referring to fig. 5B, the features 5 of the image to be synthesized include the target person head features 5a and the region-to-be-filled features 5b; the execution subject calculates attention matrices using the target person head features 5a and the region-to-be-filled features 5b together with the features 4 of the reference image, and multiplies the attention matrices by the features 4 of the reference image to obtain the skin color sample feature map 7 and the filling sample feature map 8. Referring again to fig. 5A, after obtaining the head mask 9, the partial reference background 10, and the head grayscale map 11 based on the image to be synthesized 2, the execution subject concatenates these three with the skin color sample feature map 7 and the filling sample feature map 8 obtained in the previous step and inputs the result into the pre-trained fusion network 12, which outputs the composite image 13.
In this embodiment, the composite image obtained by the above method shows the target person against the background of the reference image and further includes the neck of the reference person from the reference image. Because the skin color of the target person in the composite image is the same as the skin color of the reference person, there is no visible difference between the target person's head and the neck in the image, and the area where the head meets the background is more real and natural, which improves the attractiveness of the composite image.
Referring further to fig. 6, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an image processing apparatus, which corresponds to the method embodiment shown in fig. 2. The apparatus can be applied to various electronic devices.
As shown in fig. 6, the image processing apparatus 600 of the present embodiment may include: an acquisition module 601, a replacement module 602, a feature extraction module 603, and an image generation module 604. The acquisition module 601 is configured to acquire a reference image and a target person head image, where the reference image includes a reference background and a reference person head; the replacement module 602 is configured to replace the reference person head in the reference image with the target person head image to obtain an image to be synthesized, where the image to be synthesized includes a partial reference background, the target person head, and a region to be filled between the partial reference background and the target person head; the feature extraction module 603 is configured to perform feature extraction on the reference image and the image to be synthesized to obtain a skin color sample feature map and a filling sample feature map; and the image generation module 604 is configured to generate a composite image based on the skin color sample feature map, the filling sample feature map, and the image to be synthesized.
In the present embodiment, for the specific processing of the acquisition module 601, the replacement module 602, the feature extraction module 603, and the image generation module 604 of the image processing apparatus 600 and the technical effects thereof, reference may be made to the related descriptions of steps 201 to 204 in the embodiment corresponding to fig. 2, which are not described herein again.
In some optional implementations of this embodiment, the feature extraction module 603 includes:
a first extraction module configured to extract features of a reference image and features of an image to be synthesized using a feature extraction network;
and the second extraction module is configured to adopt an attention mechanism to extract a skin color sample feature map and a filling sample feature map from the features of the reference image and the features of the image to be synthesized.
In some optional implementations of this embodiment, the second extracting module includes:
the characteristic determination module is configured to determine the head characteristic of the target person and the characteristic of the region to be filled based on the characteristic of the image to be synthesized;
the first calculation module is configured to calculate an attention matrix by using the head characteristics of the target person and the characteristics of the reference image to obtain a color attention characteristic map;
the first multiplication module is configured to multiply the color attention feature map and the features of the reference image to obtain a skin color sample feature map;
the second calculation module is configured to calculate an attention matrix by using the characteristics of the region to be filled and the characteristics of the reference image to obtain a filled region attention characteristic map;
and the second multiplying module is configured to multiply the filled region attention feature map and the features of the reference image to obtain a filled sample feature map.
In some optional implementations of this embodiment, the replacing module 602 includes:
the segmentation module is configured to perform five-sense organ segmentation on the head of a reference person in the reference image to obtain a head mask, and the region except the head mask in the reference image is used as a reference background;
the expansion module is configured to expand the head mask to obtain an expansion area, wherein the area of the expansion area is larger than that of the image of the head of the target person;
the region determining module is configured to determine a region, which does not intersect with the expansion region, in the reference background as a partial reference background, add the head image of the target person to the expansion region, and determine a region, which does not intersect with the head of the target person, in the expansion region as a region to be filled;
and the integration module is configured to integrate the head image of the target person, part of the reference background and the region to be filled to obtain an image to be synthesized.
In some optional implementations of this embodiment, the image generating module 604 includes:
the color processing module is configured to perform color processing on the head image of the target person to obtain a head gray scale image;
and the synthesis module is configured to synthesize the skin color sample feature map, the filling sample feature map, the head mask, the head gray scale map and part of the reference background to obtain a synthesized image.
In some optional implementations of this embodiment, the synthesizing module includes:
the splicing module is configured to splice the skin color sample characteristic graph, the filling sample characteristic graph, the head mask, the head gray-scale graph and part of the reference background to obtain a spliced graph;
and the fusion module is configured to input the mosaic into a fusion network trained in advance for fusion to obtain a composite image.
In some optional implementations of the present embodiment, the reference image further includes an exposed skin area other than the head of the reference person.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and other handling of the personal information of the users involved all comply with the provisions of relevant laws and regulations and do not violate public order or good customs.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 7 illustrates a schematic block diagram of an example electronic device 700 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the device 700 comprises a computing unit 701, which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 can also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
Various components in the device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 701 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 701 executes the respective methods and processes described above, such as the image processing method. For example, in some embodiments, the image processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of the computer program may be loaded onto and/or installed onto the device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the image processing method described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the image processing method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (17)

1.一种图像处理方法,包括:1. An image processing method, comprising: 获取参考图像和目标人员头部图像,其中,所述参考图像包括参考背景和参考人员头部;acquiring a reference image and a head image of the target person, wherein the reference image includes a reference background and a head of the reference person; 利用所述目标人员头部图像替换所述参考图像中参考人员头部,得到待合成图像,其中,所述待合成图像包括部分参考背景、目标人员头部,以及位于所述部分参考背景和所述目标人员头部之间的待填充区域;The head of the reference person in the reference image is replaced by the head image of the target person to obtain an image to be synthesized, wherein the image to be synthesized includes a part of the reference background, the head of the target person, and the part of the reference background and the part of the head of the target person. Describe the area to be filled between the head of the target person; 对所述参考图像和所述待合成图像进行特征提取,得到肤色样例特征图和填充样例特征图;Feature extraction is performed on the reference image and the to-be-synthesized image to obtain a skin color sample feature map and a filling sample feature map; 基于所述肤色样例特征图、所述填充样例特征图和所述待合成图像生成合成图像。A composite image is generated based on the skin color sample feature map, the fill sample feature map, and the to-be-combined image. 2.根据权利要求1所述的方法,其中,所述对所述参考图像和所述待合成图像进行特征提取,得到肤色样例特征图和填充样例特征图,包括:2. The method according to claim 1, wherein the feature extraction is performed on the reference image and the to-be-synthesized image to obtain a skin color sample feature map and a filling sample feature map, comprising: 利用特征提取网络提取所述参考图像的特征和所述待合成图像的特征;Extract the feature of the reference image and the feature of the to-be-synthesized image by using a feature extraction network; 采用注意力机制从所述参考图像的特征和所述待合成图像的特征中提取肤色样例特征图和填充样例特征图。An attention mechanism is used to extract a skin color sample feature map and a filling sample feature map from the features of the reference image and the features of the to-be-synthesized image. 3.根据权利要求2所述的方法,其中,所述采用注意力机制从所提取的所述参考图像和所述待合成图像的特征中提取肤色样例特征图和填充样例特征图,包括:3. The method according to claim 2, wherein the use of an attention mechanism to extract a skin color sample feature map and a fill sample feature map from the extracted features of the reference image and the to-be-synthesized image, comprising: : 基于所述待合成图像的特征确定目标人员头部特征和待填充区域特征;Determine the head feature of the target person and the feature of the area to be filled based on the feature of the to-be-synthesized image; 利用所述目标人员头部特征和所述参考图像的特征计算注意力矩阵,得到颜色注意力特征图;Calculate the attention matrix by utilizing the head feature of the target person and the feature of the reference image to obtain a color attention feature map; 将所述颜色注意力特征图与所述参考图像的特征相乘,得到所述肤色样例特征图;Multiplying the color attention feature map and the feature of the reference image to obtain the skin color sample feature map; 利用所述待填充区域特征和所述参考图像的特征计算注意力矩阵,得到填充区域注意力特征图;The attention matrix is calculated by utilizing the feature of the to-be-filled area and the feature of the reference image to obtain an attention feature map of the filled area; 将所述填充区域注意力特征图与所述参考图像的特征相乘,得到所述填充样例特征图。Multiplying the filled region attention feature map with the feature of the reference image to obtain the filled sample feature map. 4.根据权利要求1-3任一项所述的方法,所述利用所述目标人员头部图像替换所述参考图像中参考人员头部,得到待合成图像,包括:4. 
The method according to any one of claims 1-3, wherein replacing the head of the reference person in the reference image with the head image of the target person to obtain an image to be synthesized, comprising: 对所述参考图像中的所述参考人员头部进行五官分割,得到头部掩膜,并将所述参考图像中除所述头部掩膜以外的区域作为所述参考背景;Perform facial features segmentation on the head of the reference person in the reference image to obtain a head mask, and use the area of the reference image other than the head mask as the reference background; 对所述头部掩膜进行膨胀,得到膨胀区域,其中,所述膨胀区域的面积大于所述目标人员头部图像的面积;Expanding the head mask to obtain an expanded area, wherein the area of the expanded area is larger than the area of the target person's head image; 将所述参考背景中与所述膨胀区域不相交的区域确定为所述部分参考背景,将所述目标人员头部图像添加至所述膨胀区域,以及将所述膨胀区域中与所述目标人员头部不相交的区域确定为所述待填充区域;determining an area of the reference background that does not intersect the inflated area as the partial reference background, adding the target person head image to the inflated area, and combining the inflated area with the target person The area where the heads do not intersect is determined as the area to be filled; 对所述目标人员头部图像、所述部分参考背景、所述待填充区域进行整合,得到所述待合成图像。The to-be-combined image is obtained by integrating the target person's head image, the partial reference background, and the to-be-filled area. 5.根据权利要求4所述的方法,其中,所述基于肤色样例特征图、填充样例特征图和所述待合成图像生成合成图像,包括:5. The method according to claim 4, wherein the generating a composite image based on the skin color sample feature map, the filling sample feature map and the to-be-synthesized image comprises: 对所述目标人员头部图像进行颜色处理,得到头部灰度图;Perform color processing on the head image of the target person to obtain a grayscale image of the head; 对所述肤色样例特征图、所述填充样例特征图、所述头部掩膜、所述头部灰度图以及所述部分参考背景进行合成处理,得到所述合成图像。The composite image is obtained by synthesizing the skin color sample feature map, the filling sample feature map, the head mask, the head grayscale map, and the partial reference background. 6.根据权利要求5所述的方法,其中,所述对所述肤色样例特征图、所述填充样例特征图、所述头部掩膜、所述头部灰度图以及所述部分参考背景进行合成处理,得到所述合成图像,包括:6. The method of claim 5, wherein the comparison of the skin color sample feature map, the fill sample feature map, the head mask, the head grayscale map, and the portion Perform synthesis processing with reference to the background to obtain the synthesized image, including: 对所述肤色样例特征图、所述填充样例特征图、所述头部掩膜、所述头部灰度图以及所述部分参考背景进行拼接,得到拼接图;splicing the skin color sample feature map, the filling sample feature map, the head mask, the head grayscale map and the partial reference background to obtain a mosaic map; 将所述拼接图输入至预先训练的融合网络中进行融合,得到所述合成图像。The mosaic image is input into a pre-trained fusion network for fusion to obtain the composite image. 7.根据权利要求1所述的方法,其中,所述参考图像还包括除参考人员头部以外的皮肤裸露区域。7. The method of claim 1 , wherein the reference image further includes exposed areas of skin other than the head of the reference person. 8.一种图像处理装置,包括:8. An image processing device, comprising: 获取模块,被配置为获取参考图像和目标人员头部图像,其中,所述参考图像包括参考背景和参考人员头部;an acquisition module configured to acquire a reference image and a head image of the target person, wherein the reference image includes a reference background and a reference person's head; 替换模块,被配置为利用所述目标人员头部图像替换所述参考图像中参考人员头部,得到待合成图像,其中,所述待合成图像包括部分参考背景、目标人员头部,以及位于所述部分参考背景和所述目标人员头部之间的待填充区域;The replacement module is configured to replace the head of the reference person in the reference image with the head image of the target person to obtain an image to be synthesized, wherein the image to be synthesized includes a part of the reference background, the head of the target person, and the image to be synthesized. 
the area to be filled between the part of the reference background and the head of the target person; 特征提取模块,被配置为对所述参考图像和所述待合成图像进行特征提取,得到肤色样例特征图和填充样例特征图;a feature extraction module, configured to perform feature extraction on the reference image and the to-be-synthesized image to obtain a skin color sample feature map and a filling sample feature map; 图像生成模块,被配置为基于所述肤色样例特征图、所述填充样例特征图和所述待合成图像生成合成图像。An image generation module configured to generate a composite image based on the skin color sample feature map, the fill sample feature map, and the to-be-synthesized image. 9.根据权利要求8所述的装置,其中,所述特征提取模块,包括:9. The apparatus according to claim 8, wherein the feature extraction module comprises: 第一提取模块,被配置为利用特征提取网络提取所述参考图像的特征和所述待合成图像的特征;a first extraction module, configured to extract the feature of the reference image and the feature of the to-be-synthesized image by using a feature extraction network; 第二提取模块,被配置为采用注意力机制从所述参考图像的特征和所述待合成图像的特征中提取肤色样例特征图和填充样例特征图。The second extraction module is configured to use an attention mechanism to extract a skin color sample feature map and a filling sample feature map from the features of the reference image and the features of the to-be-synthesized image. 10.根据权利要求9所述的装置,其中,所述第二提取模块,包括:10. The apparatus of claim 9, wherein the second extraction module comprises: 特征确定模块,被配置为基于所述待合成图像的特征确定目标人员头部特征和待填充区域特征;a feature determination module, configured to determine the head feature of the target person and the feature of the area to be filled based on the feature of the to-be-synthesized image; 第一计算模块,被配置为利用所述目标人员头部特征和所述参考图像的特征计算注意力矩阵,得到颜色注意力特征图;a first calculation module, configured to calculate an attention matrix using the head feature of the target person and the feature of the reference image to obtain a color attention feature map; 第一相乘模块,被配置为将所述颜色注意力特征图与所述参考图像的特征相乘,得到所述肤色样例特征图;a first multiplication module, configured to multiply the features of the color attention feature map and the reference image to obtain the skin color sample feature map; 第二计算模块,被配置为利用所述待填充区域特征和所述参考图像的特征计算注意力矩阵,得到填充区域注意力特征图;A second computing module, configured to calculate an attention matrix using the feature of the to-be-filled area and the feature of the reference image to obtain an attention feature map of the filled area; 第二相乘模块,被配置为将所述填充区域注意力特征图与所述参考图像的特征相乘,得到所述填充样例特征图。The second multiplication module is configured to multiply the filled region attention feature map with the feature of the reference image to obtain the filled sample feature map. 11.根据权利要求8-10任一项所述的装置,所述替换模块,包括:11. 
11. The apparatus according to any one of claims 8-10, wherein the replacement module comprises:

a segmentation module, configured to perform facial-feature segmentation on the head of the reference person in the reference image to obtain a head mask, and to take the area of the reference image other than the head mask as the reference background;

a dilation module, configured to dilate the head mask to obtain a dilated area, wherein the area of the dilated area is larger than the area of the target person head image;

an area determination module, configured to determine the area of the reference background that does not intersect the dilated area as the partial reference background, to add the target person head image to the dilated area, and to determine the area of the dilated area that does not intersect the target person's head as the area to be filled; and

an integration module, configured to integrate the target person head image, the partial reference background, and the area to be filled to obtain the image to be synthesized.

12. The apparatus according to claim 11, wherein the image generation module comprises:

a color processing module, configured to perform color processing on the target person head image to obtain a head grayscale map; and

a synthesis module, configured to synthesize the skin color sample feature map, the filling sample feature map, the head mask, the head grayscale map, and the partial reference background to obtain the composite image.

13. The apparatus according to claim 12, wherein the synthesis module comprises:

a splicing module, configured to splice the skin color sample feature map, the filling sample feature map, the head mask, the head grayscale map, and the partial reference background to obtain a spliced map; and

a fusion module, configured to input the spliced map into a pre-trained fusion network for fusion to obtain the composite image.

14. The apparatus according to claim 8, wherein the reference image further includes exposed skin areas other than the head of the reference person.

15. An electronic device, comprising:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.

16. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to perform the method of any one of claims 1-7.
17. A computer program product comprising a computer program which, when executed by a processor, implements the method of any one of claims 1-7.
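(Illustration only, not part of the claims: a minimal sketch of the splicing-and-fusion path of claims 5-6 and 12-13. FusionNet is a hypothetical stand-in; the application only states that a pre-trained fusion network is used, not its architecture.)

import torch
import torch.nn as nn

class FusionNet(nn.Module):
    """Stand-in for the pre-trained fusion network (architecture assumed)."""
    def __init__(self, in_channels, out_channels=3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, out_channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return self.body(x)

def synthesize(skin_color_sample, filling_sample, head_mask, head_gray, partial_bg, fusion_net):
    """All inputs are (B, C_i, H, W) tensors at a common resolution.
    Channel-wise concatenation implements the 'splicing' of claims 6 and 13."""
    spliced = torch.cat([skin_color_sample, filling_sample, head_mask, head_gray, partial_bg], dim=1)
    return fusion_net(spliced)  # the composite image

# The head grayscale map of claims 5 and 12 can be approximated from an RGB
# head tensor as: head_gray = head_rgb.mean(dim=1, keepdim=True), and
# fusion_net = FusionNet(in_channels=<sum of the five inputs' channel counts>).

Feeding the network a grayscale head plus a skin color sample, rather than the RGB head itself, forces it to recolor the target head from the reference person's skin tone, which is presumably why the claims separate shape cues (grayscale map, mask) from color cues (sample feature maps).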
CN202111325365.8A 2021-11-10 2021-11-10 Image processing method, device, device and storage medium Pending CN114049290A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111325365.8A CN114049290A (en) 2021-11-10 2021-11-10 Image processing method, device, device and storage medium


Publications (1)

Publication Number Publication Date
CN114049290A (en) 2022-02-15

Family

ID=80207957

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111325365.8A Pending CN114049290A (en) 2021-11-10 2021-11-10 Image processing method, device, device and storage medium

Country Status (1)

Country Link
CN (1) CN114049290A (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050286786A1 (en) * 2004-06-17 2005-12-29 Reiko Noda Apparatus and method for coding image based on level of visual attention and level of perceivable image quality distortion, and computer program product therefor
CN101263721A (en) * 2005-07-13 2008-09-10 NEC Corporation Color correction method and color correction device
US20070031032A1 (en) * 2005-08-05 2007-02-08 Samsung Electronics Co., Ltd. Method and apparatus for performing conversion of skin color into preference color by applying face detection and skin area detection
CN109784301A (en) * 2019-01-28 2019-05-21 Guangzhou Kugou Computer Technology Co., Ltd. Image processing method, device, computer equipment and storage medium
CN111027382A (en) * 2019-11-06 2020-04-17 Central China Normal University Attention mechanism-based lightweight face detection method and model
CN111063008A (en) * 2019-12-23 2020-04-24 Beijing Dajia Internet Information Technology Co., Ltd. Image processing method, device, equipment and storage medium
CN112967355A (en) * 2021-03-05 2021-06-15 Beijing Baidu Netcom Science and Technology Co., Ltd. Image filling method and device, electronic device and medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
RENWANG CHEN ET AL.: "SimSwap: An Efficient Framework For High Fidelity Face Swapping", MM '20: Proceedings of the 28th ACM International Conference on Multimedia, 16 October 2020 *
LIU XIRONG; TIAN QICHUAN: "Research on face detection based on skin color model and template matching", Journal of Taiyuan University of Science and Technology, no. 05, 15 October 2010 *
DENG BO; WU WEI; TENG QIZHI: "Face region pre-detection based on visual attention mechanism", Video Engineering, no. 07, 17 July 2010 *
CHEN LING; WANG YAMING: "Face segmentation based on skin color smoothness", Journal of Test and Measurement Technology, no. 01, 15 March 2004 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115131202A (en) * 2022-05-25 2022-09-30 Tencent Technology (Shenzhen) Co., Ltd. An image generation method, device, electronic device and storage medium
CN115131260A (en) * 2022-07-22 2022-09-30 Beijing Zitiao Network Technology Co., Ltd. Image processing method, apparatus, device, computer-readable storage medium and product
CN116385641A (en) * 2023-03-29 2023-07-04 Beijing Baidu Netcom Science and Technology Co., Ltd. Image processing method and device, electronic equipment and storage medium
CN116385641B (en) * 2023-03-29 2024-03-19 Beijing Baidu Netcom Science and Technology Co., Ltd. Image processing method and device, electronic equipment and storage medium
CN117291979A (en) * 2023-09-26 2023-12-26 Beijing Eagle Eye Intelligent Health Technology Co., Ltd. Ear hole positioning method, electronic equipment and storage medium
CN117291979B (en) * 2023-09-26 2024-04-26 Beijing Eagle Eye Intelligent Health Technology Co., Ltd. Ear hole positioning method, electronic device and storage medium

Similar Documents

Publication Publication Date Title
CN113327278B (en) Three-dimensional face reconstruction method, device, equipment and storage medium
EP3992919B1 (en) Three-dimensional facial model generation method and apparatus, device, and medium
WO2020103700A1 (en) Image recognition method based on micro facial expressions, apparatus and related device
CN106682632B (en) Method and device for processing face image
CN114049290A (en) Image processing method, device, device and storage medium
US12260492B2 (en) Method and apparatus for training a three-dimensional face reconstruction model and method and apparatus for generating a three-dimensional face image
CN114187624B (en) Image generation method, device, electronic equipment and storage medium
CN108388889B (en) Method and device for analyzing face image
CN113379877B (en) Face video generation method and device, electronic equipment and storage medium
US20230036338A1 (en) Method and apparatus for generating image restoration model, medium and program product
US20240404018A1 (en) Image processing method and apparatus, device, storage medium and program product
US20220198828A1 (en) Method and apparatus for generating image
CN116634242A (en) Speech-driven speaking video generation method, system, equipment and storage medium
CN110874575A (en) A face image processing method and related equipment
CN114120413A (en) Model training method, image synthesis method, apparatus, equipment and program product
WO2024174422A1 (en) Model generation method and apparatus, electronic device, and storage medium
CN114862716A (en) Image enhancement method, device and equipment for face image and storage medium
CN116309983B (en) Training method and generating method and device of virtual character model and electronic equipment
CN112884889B (en) Model training method, model training device, human head reconstruction method, human head reconstruction device, human head reconstruction equipment and storage medium
CN113380269B (en) Video image generation method, apparatus, device, medium, and computer program product
CN115147261A (en) Image processing method, device, storage medium, equipment and product
KR102358145B1 (en) Method for transforming child's face using standard image generation
CN116563432B (en) Three-dimensional digital person generating method and device, electronic equipment and storage medium
CN116402914B (en) Method, device and product for determining stylized image generation model
CN113239867B (en) A Face Recognition Method Based on Mask Area Adaptive Enhancement for Illumination Changes

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20250207