CN111967397A - Face image processing method and device, storage medium and electronic equipment

Publication number
CN111967397A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010832741.1A
Other languages
Chinese (zh)
Inventor
李润祥
李啸
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification

Abstract

The present disclosure relates to a method and an apparatus for processing a face image, a storage medium, and an electronic device. The method comprises: acquiring image information uploaded or shot by a user; acquiring target face information from the image information, wherein the target face information is either the geometric features of the five sense organs of a target face image or the visual features of the five sense organs of the target face image; fusing the target face information with face template information to be fused to obtain a new face image; and displaying image information including the new face image. The method and the apparatus improve the visual effect of face fusion and make the generated new face image more natural and vivid.

Description

Face image processing method and device, storage medium and electronic equipment
Technical Field
The present disclosure relates to the field of image processing, and in particular, to a method and an apparatus for processing a human face image, a storage medium, and an electronic device.
Background
Face fusion technology can replace or adjust the face portion across different photos, producing either the effect of one photo's background combined with another person's face, or the effect of one photo's background and facial contour combined with another person's five sense organs. On the one hand, the technology makes shooting images more entertaining for users; on the other hand, it provides training material for techniques that use images of people as training samples.
However, the results of current face fusion technology are relatively stiff and not natural enough.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
In a first aspect, the present disclosure provides a method for processing a face image, the method including: acquiring image information uploaded or shot by a user; acquiring target face information from the image information, wherein the target face information is either the geometric features of the five sense organs of a target face image or the visual features of the five sense organs of the target face image; fusing the target face information with face template information to be fused to obtain a new face image; and displaying image information including the new face image. When the target face information is the geometric features of the five sense organs of the target face image, the face template information comprises a non-face area of a face template and the visual features of the five sense organs of the face template; when the target face information is the visual features of the five sense organs of the target face image, the face template information comprises the non-face area of the face template and the geometric features of the five sense organs of the face template.
In a second aspect, the present disclosure provides a face image processing apparatus, the apparatus comprising: an acquisition module for acquiring image information uploaded or shot by a user; an extraction module for acquiring target face information from the image information, wherein the target face information is either the geometric features of the five sense organs of a target face image or the visual features of the five sense organs of the target face image; a fusion module for fusing the target face information with the face template information to be fused to obtain a new face image; and a display module for displaying image information including the new face image. When the target face information is the geometric features of the five sense organs of the target face image, the face template information comprises a non-face area of a face template and the visual features of the five sense organs of the face template; when the target face information is the visual features of the five sense organs of the target face image, the face template information comprises the non-face area of the face template and the geometric features of the five sense organs of the face template.
In a third aspect, the present disclosure provides a computer readable medium having stored thereon a computer program which, when executed by a processing apparatus, performs the steps of the method of the first aspect of the present disclosure.
In a fourth aspect, the present disclosure provides an electronic device, including a storage device having a computer program stored thereon, and a processing device configured to execute the computer program stored in the storage device, so as to implement the steps of the method of the first aspect.
With the above technical solution, the geometric features or the visual features of the five sense organs of the target face can be obtained from the image information and fused with the non-face area in the face template information and the visual features or the geometric features of the five sense organs in the face template information. This yields a new image that has either the visual features of the five sense organs of the face template and the geometric features of the five sense organs of the target face, or the geometric features of the five sense organs of the face template and the visual features of the five sense organs of the target face, so that the visual effect of face fusion is improved and the generated new face image is more natural and vivid.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale. In the drawings:
Fig. 1 is a flow chart illustrating a method for processing a face image according to an exemplary disclosed embodiment.
Fig. 2 is a flow chart illustrating a method for processing a face image according to an exemplary disclosed embodiment.
Fig. 3 is a schematic diagram illustrating generation of a face image by a real-time reconstruction model according to an exemplary disclosed embodiment.
Fig. 4 is a block diagram illustrating a face image processing apparatus according to an exemplary disclosed embodiment.
Fig. 5 is a block diagram illustrating an electronic device according to an exemplary disclosed embodiment.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that the modifiers "a", "an", and "the" in this disclosure are illustrative rather than limiting, and those skilled in the art should understand them to mean "one or more" unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
Fig. 1 is a flowchart illustrating a method for processing a face image according to an exemplary disclosed embodiment. As shown in Fig. 1, the method includes the following steps:
S11: Acquire the image information uploaded or shot by the user.
The image information may be a video or a still image. The user may upload stored image information or shoot it with a terminal that has a shooting function. The shot image information may be image information saved after the user presses a shutter key or a save key, or image information that the user views in real time on a preview page during shooting.
For example, after the user starts the camera function, the user can see the image captured by the camera in real time on the screen and, by enabling the face fusion function, can have the real-time image fused with the face template information and the fused result displayed in real time on the screen.
S12: Acquire target face information from the image information, wherein the target face information is either the geometric features of the five sense organs of the target face image or the visual features of the five sense organs of the target face image.
The model used for face fusion can extract the geometric features or the visual features of the five sense organs (that is, the facial features, such as the eyes, nose, and mouth) from the image. The geometric features of the five sense organs represent features such as position and shape that are related to the person's action and unrelated to the person's appearance, for example whether the mouth is open, the angle of the mouth corners, and the distance between the lips. The visual features of the five sense organs represent features that are unrelated to the action and related to the individual's appearance, for example eyelid shape, pupils, and wrinkles.
In one possible implementation mode, in response to the operation of selecting the first fusion mode by the user, acquiring geometric features of five sense organs of the face image from the image information; and fusing the geometric features of the facial image with the non-facial area of the facial template and the visual features of the facial template.
In one possible implementation mode, in response to the operation of selecting the second fusion mode by the user, acquiring the visual features of the five sense organs of the face image from the image information; and fusing the visual features of the facial image with the non-facial area of the facial template and the geometric features of the facial template.
In a possible implementation manner, the visual features and the geometric features of the five sense organs in the target face image can be extracted through the model, and based on the selection of the first fusion manner or the second fusion manner by the user, the visual features of the five sense organs are determined to be target face information, or the geometric features of the five sense organs are determined to be target face information.
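The choice between the two fusion modes can be sketched as follows. This is a minimal illustration only: the names `FusionMode`, `FaceFeatures`, `select_target_info`, and `template_parts_needed`, and the representation of features as tuples of floats, are assumptions made for this example and do not come from the disclosure.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class FusionMode(Enum):
    FIRST = "first"    # target face contributes geometric features
    SECOND = "second"  # target face contributes visual features


@dataclass(frozen=True)
class FaceFeatures:
    geometric: Tuple[float, ...]  # hypothetical encoding of position/shape
    visual: Tuple[float, ...]     # hypothetical encoding of appearance


def select_target_info(features: FaceFeatures, mode: FusionMode) -> Tuple[float, ...]:
    """Pick which extracted features serve as the target face information."""
    if mode is FusionMode.FIRST:
        return features.geometric
    return features.visual


def template_parts_needed(mode: FusionMode) -> Tuple[str, str]:
    """The template always supplies the non-face area plus the complementary features."""
    if mode is FusionMode.FIRST:
        return ("non_face_region", "visual")
    return ("non_face_region", "geometric")
```

In this sketch both feature sets are extracted once and the user's selection only decides which one is kept, mirroring the implementation described above.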
It should be noted that more than one face region may be included in the target image, and therefore, in one possible implementation, the face image included in the region selected by the user in the target image may be determined as the target face based on the selection operation of the user, and the target face information of the target face may be extracted. When the target image is a video image or a real-time video image, after a user determines a target face in one frame, the target faces in other image frames in the video image can be determined through a target tracking technology, and target face information in the other image frames is extracted respectively.
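As a rough sketch of this selection and tracking step, the example below picks the face whose bounding box contains the point the user selected and then follows it into the next frame by bounding-box overlap. The disclosure only refers to "a target tracking technology"; the intersection-over-union matching used here is a simple stand-in chosen for illustration, and the box format, function names, and the 0.3 threshold are all assumptions.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); hypothetical format


def pick_face(boxes: List[Box], tap: Tuple[int, int]) -> Optional[Box]:
    """Return the first detected face box that contains the user's selection point."""
    x, y = tap
    for box in boxes:
        left, top, right, bottom = box
        if left <= x <= right and top <= y <= bottom:
            return box
    return None


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


def track(prev: Box, candidates: List[Box], min_iou: float = 0.3) -> Optional[Box]:
    """Follow the target into the next frame by choosing the most overlapping box."""
    best = max(candidates, key=lambda c: iou(prev, c), default=None)
    if best is None or iou(prev, best) < min_iou:
        return None  # target lost in this frame
    return best
```

Once a box is chosen for each frame, the target face information would be extracted from that box frame by frame.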
S13: Fuse the target face information with the face template information to be fused to obtain a new face image.
When the target face information is the geometric features of the five sense organs of the target face image, the face template information comprises a non-face area of a face template and the visual features of the five sense organs of the face template; when the target face information is the visual features of the five sense organs of the target face image, the face template information comprises the non-face area of the face template and the geometric features of the five sense organs of the face template.
The face template information may be pre-stored in a database, or may be extracted from a face image template uploaded or photographed by a user.
For example, the face template information may be non-face regions and geometric features or visual features of the five sense organs extracted in advance from works such as films, animations, and artworks, where a non-face region is the image region other than the face region, including features such as the background, hairstyle, and clothing. The face template information extracted in advance can be stored on a server, and the terminal device can download it from the server for use; it can also be stored in the terminal device as an initial data packet.
The face template information may also be extracted from a face image template uploaded or photographed by a user, for example, the user may take a picture and upload the picture as the face image template, and extract a non-face region and geometric features or visual features of five sense organs from the face image template through a model. After the fusion, the face template information uploaded by the user can be added into a database after the consent of the user is obtained, and the face template information is used as the pre-stored face template information.
In a possible implementation manner, the face template information may also be extracted from image information uploaded or captured by a user, for example, the user uploads a picture including two face regions, and the user may select a face image included in one of the face regions as a target face image and select another face region as a template face image, and extract the target face information from the target face image based on a model and extract the face template information from the template face image.
In one possible implementation manner, in response to a fusion operation of a user, a plurality of pieces of face template information are presented, and in response to a selection operation of the user, the face template information to be fused is determined from the plurality of pieces of face template information.
When the face template information is displayed, the pieces of face template information may be sorted by the number of times each has been selected and recommended to the user in that order. The face template information may also be classified, for example into movies, television shows, animations, cartoons, portraits, and the like, and the face template information under a category is displayed when the user selects that category.
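A minimal sketch of this ordering and grouping is given below. The `Template` record and its fields (`name`, `category`, `times_selected`) are hypothetical and only serve to illustrate sorting by selection count and grouping by category.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Template:
    name: str
    category: str        # e.g. "movie", "animation", "portrait"
    times_selected: int  # how often users have chosen this template


def rank_by_popularity(templates: List[Template]) -> List[Template]:
    """Most frequently selected templates first."""
    return sorted(templates, key=lambda t: t.times_selected, reverse=True)


def group_by_category(templates: List[Template]) -> Dict[str, List[Template]]:
    """Group templates by category, keeping the popularity order inside each group."""
    groups: Dict[str, List[Template]] = defaultdict(list)
    for template in rank_by_popularity(templates):
        groups[template.category].append(template)
    return dict(groups)
```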
In a possible implementation manner, the step of obtaining the target face information from the image information and the step of fusing the target face information and the face template information to be fused are executed through a real-time reconstruction model.
The real-time reconstruction model comprises an encoding and decoding network and a generating network, a face appearance code and a face geometric code which correspond to the target face image are generated through the encoding and decoding network, the face appearance code is decoded to obtain the visual characteristics of five sense organs, and the face geometric code is decoded to obtain the geometric characteristics of five sense organs; taking the visual features of the five sense organs or the geometric features of the five sense organs as the target face information; and fusing the target face information and the face template information through the generation network to obtain the new face image.
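The data flow through such a model can be sketched as below. This is only a structural illustration under stated assumptions: the networks are represented by plain callables, and the names `Codes`, `ReconstructionModel`, and `fuse` are invented for this example rather than taken from the disclosure.

```python
from typing import Any, Callable, NamedTuple


class Codes(NamedTuple):
    appearance: Any  # face appearance code
    geometry: Any    # face geometric code


class ReconstructionModel(NamedTuple):
    encode: Callable[[Any], Codes]           # encoding side of the codec network
    decode_appearance: Callable[[Any], Any]  # appearance code -> visual features
    decode_geometry: Callable[[Any], Any]    # geometric code -> geometric features
    generate: Callable[[Any, Any], Any]      # generation network


def fuse(model: ReconstructionModel, target_image: Any,
         template_info: Any, use_geometry: bool) -> Any:
    """Encode the target, decode the chosen features, then generate the new image."""
    codes = model.encode(target_image)
    if use_geometry:
        target_info = model.decode_geometry(codes.geometry)
    else:
        target_info = model.decode_appearance(codes.appearance)
    return model.generate(target_info, template_info)
```

A real implementation would plug trained networks into the four callables; here they are left abstract so that only the order of operations is shown.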
It should be noted that the face template information may also be generated by the real-time reconstruction model: the encoding and decoding network generates the face appearance code and the face geometric code of the face template; the face appearance code is decoded to obtain the visual features of the five sense organs and the face region; the face geometric code is decoded to obtain the geometric features of the five sense organs; and the face region is subtracted from the face template image to obtain the non-face region.
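The subtraction that yields the non-face region can be illustrated on a tiny grayscale image stored as nested lists. The sketch assumes, purely for illustration, that the decoded face region has the same size as the template image, carries the template's pixel values inside the face, and is zero elsewhere; the function name is hypothetical.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixel values


def subtract_face_region(template_image: Image, face_region: Image) -> Image:
    """Pixel-wise subtraction: what remains is the non-face region."""
    return [
        [pixel - face for pixel, face in zip(image_row, face_row)]
        for image_row, face_row in zip(template_image, face_region)
    ]


template = [[5, 6, 7],
            [8, 9, 10]]
face = [[0, 6, 0],
        [0, 9, 0]]  # the face occupies the middle column
background = subtract_face_region(template, face)
# background == [[5, 0, 7], [8, 0, 10]]
```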
The generating network can be set to use the decoded non-face area, the face geometric characteristic and the face appearance characteristic (wherein, the face geometric characteristic and the face appearance characteristic are respectively from one of the face template information and the target face information) to fuse to obtain a new face image, and the generating network can also directly generate a new face image based on the face appearance coding and the face geometric coding of the face template and the face appearance coding and the face geometric coding of the target face image.
The generation network may have two branches, which are respectively used for processing face fusion in two different fusion modes, for example, the first branch of the generation network is used for generating a new face image based on a face geometric code of a target face image and a face appearance code of a face template, and corresponds to a first fusion mode that a user can select; the second branch of the generation network is used for generating a new face image based on the face appearance coding of the target face image, the face appearance coding (used for obtaining the non-face region) of the face template and the face geometric coding, and corresponds to a second fusion mode.
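The inputs consumed by the two branches can be laid out as in the following sketch. It models only which codes each branch combines, not the generation itself; `FaceCodes`, `GeneratorInput`, `first_branch`, and `second_branch` are names assumed for this example.

```python
from dataclasses import dataclass
from typing import Tuple

Code = Tuple[float, ...]  # hypothetical representation of a code


@dataclass(frozen=True)
class FaceCodes:
    appearance: Code
    geometry: Code


@dataclass(frozen=True)
class GeneratorInput:
    appearance: Code         # source of the visual features
    geometry: Code           # source of the geometric features
    background_source: Code  # template appearance code, used for the non-face region


def first_branch(target: FaceCodes, template: FaceCodes) -> GeneratorInput:
    """First fusion mode: target geometry with template appearance."""
    return GeneratorInput(template.appearance, target.geometry, template.appearance)


def second_branch(target: FaceCodes, template: FaceCodes) -> GeneratorInput:
    """Second fusion mode: target appearance with template geometry."""
    return GeneratorInput(target.appearance, template.geometry, template.appearance)
```

In both branches the non-face region comes from the template, which is why the template's appearance code is passed along even when the target supplies the visual features.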
In terms of effect, the face image generated from the face geometric code of the target face image and the face appearance code of the face template has the geometric features of the five sense organs of the target face together with the non-face background and the visual features of the five sense organs of the face template. For example, when the target face image shows squinted eyes and mouth, the generated face image keeps the appearance and background of the face template but shows the same action as the target face image, namely the squinted eyes and mouth. The face image generated from the face appearance code of the target face image and the face appearance code and face geometric code of the face template has the visual features of the five sense organs of the target face together with the geometric features of the five sense organs and the non-face background of the face template. For example, if the face template is a photo of an expressionless white sculpture and the target face is a self-portrait of the user, the generated face image is a photo of an expressionless white sculpture whose five sense organs look like the user's.
S14: Display image information including the new face image.
The content of the image information including the new face image is consistent with the image information uploaded or shot by the user, that is, when the image information uploaded or shot by the user is an image, the image information including the new face image is also an image, and when the image information uploaded or shot by the user is a video, the image information including the new face image is also a video. When the image information uploaded or shot by the user is a picture acquired by the camera displayed in real time on the display interface, the image information including the new face image can also be displayed in real time on the display interface.
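The rule that the output takes the same form as the input can be sketched as follows. The `Media` record, its `kind` values, and the `fuse_frame` callback are hypothetical stand-ins; a still image is simply treated as a single frame.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Media:
    kind: str          # "image", "video", or "live_preview" (illustrative labels)
    frames: List[str]  # placeholder frame payloads


def fuse_media(media: Media, fuse_frame: Callable[[str], str]) -> Media:
    """Apply face fusion frame by frame and keep the input's media kind."""
    return Media(kind=media.kind, frames=[fuse_frame(f) for f in media.frames])
```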
Based on the technical scheme, at least the following technical effects can be realized:
The geometric features or the visual features of the five sense organs of the target face can be obtained from the image information and fused with the non-face region in the face template information and the visual features or the geometric features of the five sense organs in the face template information. This yields a new image that has either the visual features of the five sense organs of the face template and the geometric features of the five sense organs of the target face, or the geometric features of the five sense organs of the face template and the visual features of the five sense organs of the target face, so that the visual effect of face fusion is improved and the generated new face image is more natural and vivid.
Fig. 2 is a flowchart illustrating a method for processing a face image according to an exemplary disclosed embodiment. As shown in Fig. 2, the method includes the following steps:
S21: Acquire the image information uploaded or shot by the user.
The image information may be a video or a still image. The user may upload stored image information or shoot it with a terminal that has a shooting function. The shot image information may be image information saved after the user presses a shutter key or a save key, or image information that the user views in real time on a preview page during shooting.
For example, after the user starts the camera function, the user can see the image captured by the camera in real time on the screen and, by enabling the face fusion function, can have the real-time image fused with the face template information and the fused result displayed in real time on the screen.
S22: Acquire target face information from the image information, wherein the target face information comprises the geometric features of the five sense organs of the face image and the visual features of the five sense organs of the face image.
The geometric features of the five sense organs represent features such as position and shape that are related to the person's action and unrelated to the person's appearance, for example whether the mouth is open, the angle of the mouth corners, and the distance between the lips. The visual features of the five sense organs represent features that are unrelated to the action and related to the individual's appearance, for example eyelid shape, pupils, and wrinkles.
The target face information may be obtained through the real-time reconstruction model.
The real-time reconstruction model comprises an encoding and decoding network and a generating network, wherein the encoding and decoding network is used for extracting a face appearance code and a face geometric code from an image, and the generating network is used for generating a face image based on the face appearance code and the face geometric code.
The face appearance code can be decoded to obtain the visual characteristics of the five sense organs, and the face geometric code can be decoded to obtain the geometric characteristics of the five sense organs.
S23: Determine the fusion mode selected by the user and the face template information selected by the user.
When the user selects the first fusion manner, step S24 is performed, and when the user selects the second fusion manner, step S25 is performed.
The face template information may be pre-stored in a database, or may be extracted from a face image template uploaded or photographed by a user.
For example, the face template information may be non-face regions as well as geometric features and visual features of the five sense organs extracted in advance from works such as films, animations, and artworks, where a non-face region is the image region other than the face region, including features such as the background, hairstyle, and clothing. The face template information extracted in advance can be stored on a server, and the terminal device can download it from the server for use; it can also be stored in the terminal device as an initial data packet.
The face template information may also be extracted from a face image template uploaded or photographed by a user, for example, the user may take a picture and upload the picture as the face image template, and extract a non-face region and geometric features or visual features of five sense organs from the face image template through a model. After the fusion, the face template information uploaded by the user can be added into a database after the consent of the user is obtained, and the face template information is used as the pre-stored face template information.
In a possible implementation manner, the face template information may also be extracted from image information uploaded or captured by a user. For example, the user uploads a picture including two face regions, selects the face image in one of the face regions as the target face image and the other face region as the template face image, and the model then extracts the target face information from the target face image and the face template information from the template face image. In one possible implementation, the face template information may be extracted by the encoding and decoding network of the real-time reconstruction model.
In one possible implementation manner, in response to a fusion operation of a user, a plurality of pieces of face template information are presented, and in response to a selection operation of the user, the face template information to be fused is determined from the plurality of pieces of face template information.
When the face template information is displayed, the pieces of face template information may be sorted by the number of times each has been selected and recommended to the user in that order. The face template information may also be classified, for example into movies, television shows, animations, cartoons, portraits, and the like, and the face template information under a category is displayed when the user selects that category.
In a possible implementation manner, a user may select more than one face template information to perform fusion, so as to obtain a plurality of face images.
S24: Fuse the geometric features of the five sense organs of the face image with the non-face area of the face template and the visual features of the five sense organs of the face template.
The generation network in the real-time reconstruction model may obtain a new face image by fusing the face geometric code, which comprises the geometric features of the five sense organs, the face appearance code, which comprises the visual features of the five sense organs, and the non-face area, which is generated from the face template image and the face area in the face appearance code of the face template. The face image generated from the face geometric code of the target face image and the face appearance code of the face template has the geometric features of the five sense organs of the target face together with the non-face background and the visual features of the five sense organs of the face template. For example, when the target face image shows squinted eyes and mouth, the generated face image keeps the appearance and background of the face template but shows the same action as the target face image, namely the squinted eyes and mouth.
S25, fusing the visual features of the five sense organs of the face image with the non-face area of the face template and the geometric features of the five sense organs of the face template.
A face image generated from the face appearance code of the target face image together with the face appearance code and the face geometric code of the face template has the visual features of the five sense organs of the target face together with the geometric features of the five sense organs and the non-face background of the face template. For example, when the face template is a photo of a white sculpture with an expressionless face and the target face is a self-portrait of the user, the generated face image is a photo of a white sculpture that has the same facial features as the user and an expressionless face.
S26, displaying image information including the new face image.
The type of the image information including the new face image is consistent with that of the image information uploaded or shot by the user. That is, when the image information uploaded or shot by the user is an image, the image information including the new face image is also an image, and when the image information uploaded or shot by the user is a video, the image information including the new face image is also a video. When the image information uploaded or shot by the user is a picture acquired by the camera and displayed in real time on the display interface, the image information including the new face image can also be displayed in real time on the display interface.
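The rule that the output keeps the type of the input can be sketched as a small dispatch function. The dictionary layout (`"kind"`, `"frame"`, `"frames"`) and the `fuse_frame` callable are assumptions made for illustration; the disclosure does not prescribe a data format, and the stand-in fusion below is not a real face-fusion step.

```python
def process_media(media, fuse_frame):
    """Apply fuse_frame to one image or to every frame of a video.

    `media` is a hypothetical dict, either
    {"kind": "image", "frame": ...} or {"kind": "video", "frames": [...]}.
    The returned dict always has the same kind as the input.
    """
    if media["kind"] == "image":
        return {"kind": "image", "frame": fuse_frame(media["frame"])}
    if media["kind"] == "video":
        fused = [fuse_frame(frame) for frame in media["frames"]]
        return {"kind": "video", "frames": fused}
    raise ValueError(f"unsupported media kind: {media['kind']!r}")


# Stand-in fusion step that only tags each frame; not a real model.
def tag(frame):
    return f"fused:{frame}"


image_out = process_media({"kind": "image", "frame": "f0"}, tag)
video_out = process_media({"kind": "video", "frames": ["f0", "f1"]}, tag)
```

A real-time camera preview would follow the video branch one frame at a time, which is one reason a per-frame function is used in this sketch.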
When the user selects a plurality of pieces of face template information, the face images generated from the respective pieces of face template information can be displayed to the user one after another, or the display area can be divided into a plurality of display positions and the face images generated from the respective pieces of face template information displayed in those display positions in sequence.
Fig. 3 is a schematic diagram of generating a face image with the real-time reconstruction model. As shown in the figure, the target face image passes through the coding and decoding network to produce a face appearance code x (shown as a black bar) and a face geometric code y (shown as a white bar) of the target face image, and the template image passes through the coding and decoding network to produce a face appearance code x' (shown as a black bar) and a face geometric code y' (shown as a white bar) of the face template. The generation network fuses the combined face code xy' or x'y (in Fig. 3, the face geometric code y' of the face template and the face appearance code x of the target face image are combined into the face code xy') with a non-face region z, which is generated from the template image and the face appearance code of the template image, to obtain a new face image.
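The code recombination in Fig. 3 can be sketched as below. This is an illustration under stated assumptions: the `FaceCodes` container, the tuple representation of the codes, and the mode names `"first"` and `"second"` are hypothetical, and the coding and decoding network and the generation network themselves are not modeled. Only the choice of which appearance code and which geometric code are paired is shown.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FaceCodes:
    # Hypothetical container; the codes are shown as plain tuples of floats.
    appearance: Tuple[float, ...]  # x (target) or x' (template) in Fig. 3
    geometry: Tuple[float, ...]    # y (target) or y' (template) in Fig. 3


def combine_codes(target, template, mode):
    """Return the (appearance, geometry) pair to hand to a generation network.

    mode "first":  template appearance x' with target geometry y   (x'y)
    mode "second": target appearance x with template geometry y'   (xy')
    """
    if mode == "first":
        return template.appearance, target.geometry
    if mode == "second":
        return target.appearance, template.geometry
    raise ValueError(f"unknown fusion mode: {mode!r}")


# Made-up code values for illustration.
target = FaceCodes(appearance=(0.1, 0.2), geometry=(0.3, 0.4))
template = FaceCodes(appearance=(0.9, 0.8), geometry=(0.7, 0.6))

xy_prime = combine_codes(target, template, "second")
x_prime_y = combine_codes(target, template, "first")
```

In either mode the non-face region z would still come from the template, so it is left out of the function and would be supplied to the generation network separately.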
Based on the technical scheme, at least the following technical effects can be realized:
The geometric features and the visual features of the five sense organs of the target face can be obtained from the image information and, based on the fusion mode selected by the user, fused with the non-face area in the face template information and the visual features or the geometric features of the five sense organs in the face template information. The result is a new image that has either the visual features of the five sense organs of the face template and the geometric features of the five sense organs of the target face, or the geometric features of the five sense organs of the face template and the visual features of the five sense organs of the target face. This improves the visual effect of face fusion, makes the generated new face image more natural and vivid, provides the user with more fusion choices, and improves the use experience of the user.
Fig. 4 is a block diagram illustrating a face image processing apparatus 400 according to an exemplary disclosed embodiment.
As shown in fig. 4, the face image processing apparatus 400 includes:
An obtaining module 410, configured to obtain image information uploaded or captured by a user.
An extracting module 420, configured to obtain target face information from the image information, where the target face information is either the geometric features of the five sense organs of a target face image or the visual features of the five sense organs of the target face image.
A fusion module 430, configured to fuse the target face information and the face template information to be fused to obtain a new face image.
A display module 440, configured to display image information including the new face image.
In the case that the target face information is the geometric features of the five sense organs of the target face image, the face template information comprises the non-face area of the face template and the visual features of the five sense organs of the face template; in the case that the target face information is the visual features of the five sense organs of the target face image, the face template information comprises the non-face area of the face template and the geometric features of the five sense organs of the face template.
In a possible implementation manner, the face template information is pre-stored in a database, or is extracted from a face image template uploaded or shot by a user.
In a possible implementation manner, the apparatus further comprises a selection module, configured to display a plurality of pieces of face template information in response to a fusion operation of the user, and to determine the face template information to be fused from the plurality of pieces of face template information in response to a selection operation of the user.
In a possible implementation manner, the obtaining module 410 is configured to obtain geometric features of five sense organs of a face image from the image information in response to a user operation of selecting a first fusion manner; the fusion module 430 is configured to fuse the geometric features of the facial image with the non-facial region of the facial template and the visual features of the facial template.
In a possible implementation manner, the obtaining module 410 is configured to obtain, from the image information, a five sense organs visual feature of a human face image in response to a user operation of selecting the second fusion manner; the fusion module 430 is configured to fuse the visual features of the facial image with the non-facial region of the facial template and the geometric features of the facial template.
In a possible implementation, the extracting module 420 is configured to obtain target face information from the image information based on a real-time reconstruction model; the fusion module 430 is configured to fuse the target face information and the face template information to be fused based on a real-time reconstruction model.
In a possible implementation manner, the real-time reconstruction model includes an encoding and decoding network and a generating network. The extracting module 420 is configured to generate a face appearance code and a face geometric code corresponding to the target face image through the encoding and decoding network, decode the face appearance code to obtain the visual features of the five sense organs, decode the face geometric code to obtain the geometric features of the five sense organs, and take the visual features of the five sense organs or the geometric features of the five sense organs as the target face information. The fusion module 430 is configured to fuse the target face information and the face template information through the generating network to obtain the new face image.
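The division of work among the obtaining, extracting, fusion and display modules can be sketched as four callables chained in order. The class name `FaceImagePipeline` and the stub callables are hypothetical and only trace the data flow; they perform no image processing and do not represent the networks of the real-time reconstruction model.

```python
class FaceImagePipeline:
    """Chains four callables in the order obtain -> extract -> fuse -> display."""

    def __init__(self, obtain, extract, fuse, display):
        self.obtain = obtain      # returns image information
        self.extract = extract    # image information -> target face information
        self.fuse = fuse          # (target face info, template info) -> new face image
        self.display = display    # new face image -> whatever is shown to the user

    def run(self, template_info):
        image_info = self.obtain()
        target_face_info = self.extract(image_info)
        new_face_image = self.fuse(target_face_info, template_info)
        return self.display(new_face_image)


# Stub callables that only record the data flow as strings.
pipeline = FaceImagePipeline(
    obtain=lambda: "selfie",
    extract=lambda image: f"geometry({image})",
    fuse=lambda target, template: f"fused[{target} + {template}]",
    display=lambda image: image,
)
result = pipeline.run("statue_template")
```

Passing the modules in as callables keeps the sketch independent of any particular model, so each stub could be swapped for a real implementation without changing `run`.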
Based on the technical scheme, at least the following technical effects can be realized:
The geometric features or the visual features of the five sense organs of the target face can be obtained from the image information and fused with the non-face region in the face template information and the visual features or the geometric features of the five sense organs in the face template information. The result is a new image that has either the visual features of the five sense organs of the face template and the geometric features of the five sense organs of the target face, or the geometric features of the five sense organs of the face template and the visual features of the five sense organs of the target face, so that the visual effect of face fusion is improved and the generated new face image is more natural and vivid.
Referring now to FIG. 5, a block diagram of an electronic device 500 suitable for use in implementing embodiments of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), and the like, and a stationary terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 5, electronic device 500 may include a processing means (e.g., central processing unit, graphics processor, etc.) 501 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)502 or a program loaded from a storage means 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data necessary for the operation of the electronic apparatus 500 are also stored. The processing device 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
Generally, the following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 507 including, for example, a Liquid Crystal Display (LCD), speakers, vibrators, and the like; storage devices 508 including, for example, magnetic tape, hard disk, etc.; and a communication device 509. The communication means 509 may allow the electronic device 500 to communicate with other devices wirelessly or by wire to exchange data. While fig. 5 illustrates an electronic device 500 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 509, or installed from the storage means 508, or installed from the ROM 502. The computer program performs the above-described functions defined in the methods of the embodiments of the present disclosure when executed by the processing device 501.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communications network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring at least two internet protocol addresses; sending a node evaluation request comprising the at least two internet protocol addresses to node evaluation equipment, wherein the node evaluation equipment selects the internet protocol addresses from the at least two internet protocol addresses and returns the internet protocol addresses; receiving an internet protocol address returned by the node evaluation equipment; wherein the obtained internet protocol address indicates an edge node in the content distribution network.
Alternatively, the computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receiving a node evaluation request comprising at least two internet protocol addresses; selecting an internet protocol address from the at least two internet protocol addresses; returning the selected internet protocol address; wherein the received internet protocol address indicates an edge node in the content distribution network.
Computer program code for carrying out operations for the present disclosure may be written in any combination of one or more programming languages, including but not limited to an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present disclosure may be implemented by software or hardware. The name of the module does not in some cases form a limitation of the module itself, for example, the first obtaining module may also be described as a "module for obtaining at least two internet protocol addresses".
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Example 1 provides a face image processing method according to one or more embodiments of the present disclosure, the method including: acquiring image information uploaded or shot by a user; acquiring target face information from the image information, wherein the target face information is either the geometric features of the five sense organs of a target face image or the visual features of the five sense organs of the target face image; fusing the target face information and face template information to be fused to obtain a new face image; displaying image information including the new face image; wherein, in the case that the target face information is the geometric features of the five sense organs of the target face image, the face template information comprises the non-face area of the face template and the visual features of the five sense organs of the face template, and in the case that the target face information is the visual features of the five sense organs of the target face image, the face template information comprises the non-face area of the face template and the geometric features of the five sense organs of the face template.
Example 2 provides the method of example 1, wherein the face template information is pre-stored in a database or extracted from a face image template uploaded or photographed by a user.
Example 3 provides the method of example 1, before the fusing the target face information with the face template information to be fused, according to one or more embodiments of the present disclosure, the method further including: responding to the fusion operation of a user, and displaying a plurality of face template information; and responding to the selection operation of the user, and determining the face template information to be fused from the plurality of face template information.
Example 4 provides the method of examples 1-3, wherein the obtaining target face information from the image information, according to one or more embodiments of the present disclosure, includes: responding to the operation of selecting a first fusion mode of a user, and acquiring the geometric features of the five sense organs of the face image from the image information; the fusing the target face information and the face template information to be fused comprises the following steps: and fusing the geometric features of the facial image with the non-facial area of the facial template and the visual features of the facial template.
Example 5 provides the method of examples 1-3, wherein the obtaining target face information from the image information, according to one or more embodiments of the present disclosure, includes: responding to the operation of selecting a second fusion mode of a user, and acquiring the visual characteristics of the five sense organs of the face image from the image information; the fusing the target face information and the face template information to be fused comprises the following steps: and fusing the visual features of the facial image with the non-facial area of the facial template and the geometric features of the facial template.
Example 6 provides the method of examples 1-3, the image information being video image information taken in real-time, the method further comprising: and executing the step of acquiring the target face information from the image information and the step of fusing the target face information and the face template information to be fused through a real-time reconstruction model.
Example 7 provides the method of example 6, wherein the real-time reconstruction model includes a codec network and a generation network, and the obtaining target face information from the image information includes: generating a face appearance code and a face geometric code corresponding to the target face image through the coding and decoding network, decoding the face appearance code to obtain the five sense organs visual characteristic, and decoding the face geometric code to obtain the five sense organs geometric characteristic; taking the visual features of the five sense organs or the geometric features of the five sense organs as the target face information; the fusing the target face information and the face template information to be fused comprises the following steps: and fusing the target face information and the face template information through the generation network to obtain the new face image.
Example 8 provides, in accordance with one or more embodiments of the present disclosure, an apparatus for face image processing, the apparatus comprising: an acquisition module, used for acquiring image information uploaded or shot by a user; an extraction module, used for acquiring target face information from the image information, wherein the target face information is either the geometric features of the five sense organs of a target face image or the visual features of the five sense organs of the target face image; a fusion module, used for fusing the target face information and the face template information to be fused to obtain a new face image; and a display module, used for displaying image information comprising the new face image; wherein, in the case that the target face information is the geometric features of the five sense organs of the target face image, the face template information comprises the non-face area of the face template and the visual features of the five sense organs of the face template, and in the case that the target face information is the visual features of the five sense organs of the target face image, the face template information comprises the non-face area of the face template and the geometric features of the five sense organs of the face template.
Example 9 provides the apparatus of example 8, wherein the face template information is pre-stored in a database or extracted from a face image template uploaded or photographed by a user.
Example 10 provides the apparatus of example 8, further comprising a selection module to present a plurality of face template information in response to a user's fusion operation, in accordance with one or more embodiments of the present disclosure; and responding to the selection operation of the user, and determining the face template information to be fused from the plurality of face template information.
Example 11 provides the apparatus of examples 8-10, wherein the obtaining module is configured to obtain geometric features of five sense organs of the face image from the image information in response to a user's operation of selecting the first fusion mode; and the fusion module is used for fusing the geometric features of the five sense organs of the face image, the non-face area of the face template and the visual features of the five sense organs of the face template.
Example 12 provides the apparatus of examples 8-10, wherein the obtaining module is configured to obtain the visual features of the five sense organs of the facial image from the image information in response to a user selecting the second fusion mode; and the fusion module is used for fusing the visual features of the five sense organs of the face image with the non-face area of the face template and the geometric features of the five sense organs of the face template.
Example 13 provides the apparatus of examples 8-10, the extraction module to obtain target face information from the image information based on a real-time reconstruction model, according to one or more embodiments of the present disclosure; and the fusion module is used for fusing the target face information and the face template information to be fused based on the real-time reconstruction model.
According to one or more embodiments of the present disclosure, example 14 provides the apparatus of example 13, where the real-time reconstruction model includes an encoding and decoding network and a generating network, and the extracting module is configured to generate a face appearance code and a face geometric code corresponding to the target face image through the encoding and decoding network, decode the face appearance code to obtain the visual features of five sense organs, and decode the face geometric code to obtain the geometric features of five sense organs; taking the visual features of the five sense organs or the geometric features of the five sense organs as the target face information; and the fusion module is used for fusing the target face information and the face template information through the generation network to obtain the new face image.
The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to the particular combination of features described above, but also encompasses other embodiments in which any combination of the features described above or their equivalents does not depart from the spirit of the disclosure. One example is a technical solution formed by replacing the above features with (but not limited to) features disclosed in this disclosure that have similar functions.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims. With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

Claims (10)

1. A method for processing a face image, the method comprising:
acquiring image information uploaded or shot by a user;
acquiring target face information from the image information, wherein the target face information is any one of geometric features of five sense organs of a target face image and visual features of the five sense organs of the face image;
fusing the target face information and face template information to be fused to obtain a new face image;
displaying image information including the new face image;
wherein, in the case that the target face information is the geometric features of the five sense organs of the target face image, the face template information comprises the non-face area of the face template and the visual features of the five sense organs of the face template, and in the case that the target face information is the visual features of the five sense organs of the target face image, the face template information comprises the non-face area of the face template and the geometric features of the five sense organs of the face template.
2. The method according to claim 1, wherein the face template information is pre-stored in a database or extracted from a face image template uploaded or photographed by a user.
3. The method according to claim 1, wherein before the fusing the target face information with the face template information to be fused, the method further comprises:
responding to the fusion operation of a user, and displaying a plurality of face template information;
and responding to the selection operation of the user, and determining the face template information to be fused from the plurality of face template information.
4. The method according to any one of claims 1-3, wherein the obtaining target face information from the image information comprises:
responding to the operation of selecting a first fusion mode of a user, and acquiring the geometric features of the five sense organs of the face image from the image information;
the fusing the target face information and the face template information to be fused comprises the following steps:
and fusing the geometric features of the facial image with the non-facial area of the facial template and the visual features of the facial template.
5. The method according to any one of claims 1-3, wherein the obtaining target face information from the image information comprises:
in response to a user operation selecting a second fusion mode, acquiring the visual features of the five sense organs of the target face image from the image information;
the fusing the target face information with the face template information to be fused comprises:
fusing the visual features of the five sense organs of the target face image with the non-face area of the face template and the geometric features of the five sense organs of the face template.
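The two fusion modes of claims 4 and 5 are complementary: whichever feature type is taken from the target face, the other type and the non-face area come from the template. The sketch below shows only that selection logic. The names `FusionMode`, `FaceFeatures`, `FusionInputs` and `select_fusion_inputs` are assumptions introduced for illustration, and the features are treated as opaque values.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class FusionMode(Enum):
    FIRST = "first"    # geometric features come from the target face (claim 4)
    SECOND = "second"  # visual features come from the target face (claim 5)


@dataclass
class FaceFeatures:
    """Hypothetical feature bundle for a target face or a face template."""
    geometry: Any         # geometric features of the five sense organs
    appearance: Any       # visual features of the five sense organs
    non_face_region: Any  # everything outside the face (only used for templates)


@dataclass
class FusionInputs:
    """What the fusion step receives once the mode has been applied."""
    geometry: Any
    appearance: Any
    non_face_region: Any


def select_fusion_inputs(mode: FusionMode,
                         target: FaceFeatures,
                         template: FaceFeatures) -> FusionInputs:
    """Pick complementary features: one type from the target, the rest from the template."""
    if mode is FusionMode.FIRST:
        return FusionInputs(target.geometry, template.appearance, template.non_face_region)
    if mode is FusionMode.SECOND:
        return FusionInputs(template.geometry, target.appearance, template.non_face_region)
    raise ValueError(f"unsupported fusion mode: {mode!r}")
```

In both branches the non-face area is taken from the template, which matches the wording of both claims.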
6. The method according to any one of claims 1-3, wherein the image information is video image information taken in real-time, the method further comprising:
performing, by a real-time reconstruction model, the step of acquiring the target face information from the image information and the step of fusing the target face information with the face template information to be fused.
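For video captured in real time, claim 6 implies that acquisition and fusion run on each frame as it arrives. A minimal sketch of such a per-frame loop follows; `process_stream`, `extract` and `fuse` are hypothetical names, and the two callables merely stand in for the real-time reconstruction model, whose internals are not shown.

```python
from typing import Any, Callable, Iterable, Iterator


def process_stream(frames: Iterable[Any],
                   extract: Callable[[Any], Any],
                   fuse: Callable[[Any], Any]) -> Iterator[Any]:
    """Lazily run extraction and then fusion on each captured frame."""
    for frame in frames:
        # Step 1: acquire the target face information from this frame.
        target_info = extract(frame)
        # Step 2: fuse it with the (already chosen) template information.
        yield fuse(target_info)
```

Because the function is a generator, a frame is only processed when the consumer asks for the next result, so the loop can follow a live camera feed without buffering the whole video.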
7. The method of claim 6, wherein the real-time reconstruction model comprises a codec network and a generation network, and the obtaining the target face information from the image information comprises:
generating, through the codec network, a face appearance code and a face geometric code corresponding to the target face image, decoding the face appearance code to obtain the visual features of the five sense organs, and decoding the face geometric code to obtain the geometric features of the five sense organs;
taking the visual features of the five sense organs or the geometric features of the five sense organs as the target face information;
the fusing the target face information with the face template information to be fused comprises:
fusing the target face information with the face template information through the generation network to obtain the new face image.
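The data flow of claim 7 can be sketched as follows: two codes are produced and decoded by the codec network, one decoded feature is chosen as the target face information, and the generation network fuses it with the template information. This is a toy illustration only. `CodecNetwork`, `reconstruct` and the `use_geometry` flag are assumed names, and plain Python callables stand in for trained networks.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Vector = List[float]


@dataclass
class CodecNetwork:
    """Toy stand-in for the codec network: encoders emit codes, decoders emit features."""
    encode_appearance: Callable[[Vector], Vector]
    encode_geometry: Callable[[Vector], Vector]
    decode_appearance: Callable[[Vector], Vector]
    decode_geometry: Callable[[Vector], Vector]

    def extract(self, face: Vector) -> Tuple[Vector, Vector]:
        """Return (visual features, geometric features) for one face."""
        appearance_code = self.encode_appearance(face)
        geometry_code = self.encode_geometry(face)
        return (self.decode_appearance(appearance_code),
                self.decode_geometry(geometry_code))


def reconstruct(codec: CodecNetwork,
                generator: Callable[[Vector, Any], Any],
                face: Vector,
                template_info: Any,
                use_geometry: bool) -> Any:
    """Choose one decoded feature as the target face information, then fuse it."""
    visual, geometric = codec.extract(face)
    target_info = geometric if use_geometry else visual
    # The generation network combines the target info with the template info.
    return generator(target_info, template_info)
```

In a real system the four callables and the generator would be learned models; here they are injected so the control flow can be exercised in isolation.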
8. A human face image processing apparatus, the apparatus comprising:
an acquisition module configured to acquire image information uploaded or shot by a user;
an extraction module configured to acquire target face information from the image information, wherein the target face information is either the geometric features of the five sense organs of a target face image or the visual features of the five sense organs of the target face image;
a fusion module configured to fuse the target face information with face template information to be fused to obtain a new face image;
a display module configured to display image information including the new face image;
wherein, when the target face information is the geometric features of the five sense organs of the target face image, the face template information comprises a non-face area of a face template and visual features of the five sense organs of the face template; and when the target face information is the visual features of the five sense organs of the target face image, the face template information comprises the non-face area of the face template and geometric features of the five sense organs of the face template.
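The four modules of the apparatus in claim 8 form a straight pipeline, which the sketch below wires together. The class name `FaceImageProcessor` and its `run` method are assumptions made for illustration; each module is represented by an injected callable rather than a concrete implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FaceImageProcessor:
    """Hypothetical composition of the four modules named in the apparatus claim."""
    acquire: Callable[[], Any]        # acquisition module
    extract: Callable[[Any], Any]     # extraction module
    fuse: Callable[[Any, Any], Any]   # fusion module
    display: Callable[[Any], None]    # display module

    def run(self, template_info: Any) -> Any:
        """Acquire an image, extract target info, fuse with the template, display the result."""
        image_info = self.acquire()
        target_info = self.extract(image_info)
        new_face = self.fuse(target_info, template_info)
        self.display(new_face)
        return new_face
```

Keeping the modules as separate callables mirrors the claim's module boundaries and lets each stage be replaced or tested independently.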
9. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processing device, carries out the steps of the method of any one of claims 1 to 7.
10. An electronic device, comprising:
a storage device having a computer program stored thereon;
a processing device for executing the computer program in the storage device to carry out the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010832741.1A CN111967397A (en) 2020-08-18 2020-08-18 Face image processing method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010832741.1A CN111967397A (en) 2020-08-18 2020-08-18 Face image processing method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN111967397A (en) 2020-11-20

Family

ID=73387828

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010832741.1A Pending CN111967397A (en) 2020-08-18 2020-08-18 Face image processing method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN111967397A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112488085A (en) * 2020-12-28 2021-03-12 深圳市慧鲤科技有限公司 Face fusion method, device, equipment and storage medium
CN113361471A (en) * 2021-06-30 2021-09-07 平安普惠企业管理有限公司 Image data processing method, image data processing device, computer equipment and storage medium
CN116071804A (en) * 2023-01-18 2023-05-05 北京六律科技有限责任公司 Face recognition method and device and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109712080A (en) * 2018-10-12 2019-05-03 迈格威科技有限公司 Image processing method, image processing apparatus and storage medium
CN110503703A (en) * 2019-08-27 2019-11-26 北京百度网讯科技有限公司 Method and apparatus for generating image
CN111008927A (en) * 2019-08-07 2020-04-14 深圳华侨城文化旅游科技集团有限公司 Face replacement method, storage medium and terminal equipment
CN111368796A (en) * 2020-03-20 2020-07-03 北京达佳互联信息技术有限公司 Face image processing method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
US11670015B2 (en) Method and apparatus for generating video
CN112967212A (en) Virtual character synthesis method, device, equipment and storage medium
CN114331820A (en) Image processing method, image processing device, electronic equipment and storage medium
CN111967397A (en) Face image processing method and device, storage medium and electronic equipment
CN111669502B (en) Target object display method and device and electronic equipment
CN112637517B (en) Video processing method and device, electronic equipment and storage medium
US20240013359A1 (en) Image processing method, model training method, apparatus, medium and device
CN112839223B (en) Image compression method, image compression device, storage medium and electronic equipment
CN112182299A (en) Method, device, equipment and medium for acquiring highlight segments in video
US11893770B2 (en) Method for converting a picture into a video, device, and storage medium
CN113870133A (en) Multimedia display and matching method, device, equipment and medium
CN115311178A (en) Image splicing method, device, equipment and medium
CN113012082A (en) Image display method, apparatus, device and medium
CN114937192A (en) Image processing method, image processing device, electronic equipment and storage medium
CN112906553B (en) Image processing method, apparatus, device and medium
CN112785669A (en) Virtual image synthesis method, device, equipment and storage medium
CN112714263A (en) Video generation method, device, equipment and storage medium
CN115757933A (en) Recommendation information generation method, device, equipment, medium and program product
CN110619602A (en) Image generation method and device, electronic equipment and storage medium
CN115272151A (en) Image processing method, device, equipment and storage medium
CN115953597A (en) Image processing method, apparatus, device and medium
CN114422698A (en) Video generation method, device, equipment and storage medium
CN114913061A (en) Image processing method and device, storage medium and electronic equipment
CN114418835A (en) Image processing method, apparatus, device and medium
CN115937356A (en) Image processing method, apparatus, device and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination