CN108961149B - Image processing method, device and system and storage medium - Google Patents

Info

Publication number
CN108961149B
Authority
CN
China
Prior art keywords
dimensional
face
contour
head model
key points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710390286.2A
Other languages
Chinese (zh)
Other versions
CN108961149A (en)
Inventor
周而进
黄志翱
刘研绎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kuangshi Technology Co Ltd
Beijing Megvii Technology Co Ltd
Original Assignee
Beijing Kuangshi Technology Co Ltd
Beijing Megvii Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kuangshi Technology Co Ltd and Beijing Megvii Technology Co Ltd
Priority to CN201710390286.2A
Publication of CN108961149A
Application granted
Publication of CN108961149B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/20 Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T3/06
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G06T7/564 Depth or shape recovery from multiple images from contours

Abstract

The embodiment of the invention provides an image processing method, an image processing device, an image processing system and a storage medium. The method comprises the following steps: acquiring an initial face image; performing three-dimensional reconstruction based on the initial face image to obtain a three-dimensional head model; predefining face key point information on the three-dimensional head model; transforming the three-dimensional head model; rendering the transformed three-dimensional head model into a two-dimensional face image; and determining two-dimensional face key points in the two-dimensional face image according to the projection relationship between the transformed three-dimensional head model and the two-dimensional face image and the face key point information, and obtaining the position information of the two-dimensional face key points as face key point annotation data. With the method, device, system and storage medium, a large number of face images, together with the corresponding face key point annotation data, can be generated simply and rapidly for a wide variety of scenes.

Description

Image processing method, device and system and storage medium
Technical Field
The present invention relates to the field of Artificial Intelligence (AI), and more particularly, to an image processing method, apparatus and system, and a storage medium.
Background
Face key point positioning has important applications in face recognition, expression recognition and pose recognition. At present, mainstream face key point positioning methods mainly use machine learning algorithms that learn from face key point annotation data. Such annotation data currently depends chiefly on manual labeling; that is, a large amount of face key point annotation data covering different scenes (for example, various poses, illumination conditions and expressions) must be obtained by hand. Manual annotation is inefficient and costly.
Disclosure of Invention
The present invention has been made in view of the above problems. The invention provides an image processing method, an image processing device, an image processing system and a storage medium.
According to an aspect of the present invention, there is provided an image processing method. The method comprises the following steps: acquiring an initial face image; performing three-dimensional reconstruction based on the initial face image to obtain a three-dimensional head model; predefining face key point information on the three-dimensional head model; transforming the three-dimensional human head model; rendering the transformed three-dimensional human head model into a two-dimensional human face image; and determining two-dimensional face key points in the two-dimensional face image according to the projection relation between the transformed three-dimensional head model and the two-dimensional face image and the face key point information, and obtaining the position information of the two-dimensional face key points as face key point annotation data.
Illustratively, before rendering the transformed three-dimensional head model into a two-dimensional face image, the image processing method further comprises: adding additional information to the transformed three-dimensional head model, wherein the additional information comprises one or more of background, hair and illumination.
Illustratively, adding additional information to the transformed three-dimensional head model includes: randomly adding additional information to the transformed three-dimensional head model.
Illustratively, the face keypoint information comprises interior point information related to face interior keypoints, and the two-dimensional face keypoints comprise two-dimensional interior keypoints.
Illustratively, the interior point information includes position information of three-dimensional interior key points in the three-dimensional head model, and determining two-dimensional face key points in the two-dimensional face image and obtaining position information of the two-dimensional face key points as face key point annotation data according to the projection relationship between the transformed three-dimensional head model and the two-dimensional face image and the face key point information includes: determining, according to the projection relationship and the position information of the three-dimensional interior key points, the two-dimensional interior key points obtained by projecting the three-dimensional interior key points onto the imaging plane in which the two-dimensional face image lies, and obtaining the position information of the two-dimensional interior key points.
Illustratively, the face keypoint information comprises contour point information related to face contour keypoints, and the two-dimensional face keypoints comprise two-dimensional contour keypoints.
Illustratively, the contour point information includes position information, in the three-dimensional head model, of a three-dimensional contour point set whose points may become face contour key points, and determining two-dimensional face key points in the two-dimensional face image and obtaining position information of the two-dimensional face key points as face key point annotation data according to the projection relationship between the transformed three-dimensional head model and the two-dimensional face image and the face key point information includes: determining the intersection of the three-dimensional contour point set and a predefined starting point plane in the three-dimensional head model as a three-dimensional starting point candidate set; determining, according to the projection relationship and the position information of the three-dimensional contour point set, the two-dimensional contour point set and the two-dimensional starting point candidate set obtained by projecting the three-dimensional contour point set and the three-dimensional starting point candidate set onto the imaging plane in which the two-dimensional face image lies, together with their positions; selecting a contour starting point and a contour end point from the two-dimensional starting point candidate set; determining a face contour line based on the contour starting point, the contour end point and the outer contour line of the two-dimensional contour point set; and determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining the position information of the two-dimensional contour key points.
Exemplarily, determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining position information of the two-dimensional contour key points comprises: removing points that belong to a specific face part from the face contour line according to a predefined face part removal condition to obtain a new face contour line; equally dividing the new face contour line; and determining the resulting bisector points as two-dimensional contour key points in the two-dimensional face image and obtaining the position information of the two-dimensional contour key points.
Exemplarily, determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining position information of the two-dimensional contour key points comprises: equally dividing the face contour line; and determining the resulting bisector points as two-dimensional contour key points in the two-dimensional face image and obtaining the position information of the two-dimensional contour key points.
Illustratively, transforming the three-dimensional human head model includes: carrying out rigid body transformation on the three-dimensional human head model; and/or performing non-rigid body transformation on the three-dimensional human head model.
Illustratively, the three-dimensional head model includes a mesh model and a texture model.
Illustratively, the image processing method further includes: training a regressor for a face key point positioning algorithm using the two-dimensional face image and the face key point annotation data.
According to another aspect of the present invention, there is provided an image processing apparatus. The apparatus includes: an initial image acquisition module for acquiring an initial face image; a three-dimensional reconstruction module for performing three-dimensional reconstruction based on the initial face image to obtain a three-dimensional head model; a pre-defining module for predefining face key point information on the three-dimensional head model; a transformation module for transforming the three-dimensional head model; a rendering module for rendering the transformed three-dimensional head model into a two-dimensional face image; and a key point determining module for determining two-dimensional face key points in the two-dimensional face image and obtaining position information of the two-dimensional face key points as face key point annotation data according to the projection relationship between the transformed three-dimensional head model and the two-dimensional face image and the face key point information.
Illustratively, the image processing apparatus further includes: an adding module for adding additional information to the transformed three-dimensional head model before the rendering module renders the transformed three-dimensional head model into the two-dimensional face image, wherein the additional information comprises one or more of background, hair and illumination.
Illustratively, the adding module includes: a random adding submodule for randomly adding additional information to the transformed three-dimensional head model.
Illustratively, the face keypoint information comprises interior point information related to face interior keypoints, and the two-dimensional face keypoints comprise two-dimensional interior keypoints.
Illustratively, the interior point information includes position information of three-dimensional interior key points in the three-dimensional head model, and the key point determining module includes: an interior key point determining submodule for determining, according to the projection relationship and the position information of the three-dimensional interior key points, the two-dimensional interior key points obtained by projecting the three-dimensional interior key points onto the imaging plane in which the two-dimensional face image lies, and obtaining the position information of the two-dimensional interior key points.
Illustratively, the face keypoint information comprises contour point information related to face contour keypoints, and the two-dimensional face keypoints comprise two-dimensional contour keypoints.
Illustratively, the contour point information includes position information, in the three-dimensional head model, of a three-dimensional contour point set whose points may become face contour key points, and the key point determining module includes: a three-dimensional determining submodule for determining the intersection of the three-dimensional contour point set and a predefined starting point plane in the three-dimensional head model as a three-dimensional starting point candidate set; a two-dimensional determining submodule for determining, according to the projection relationship and the position information of the three-dimensional contour point set, the two-dimensional contour point set and the two-dimensional starting point candidate set obtained by projecting the three-dimensional contour point set and the three-dimensional starting point candidate set onto the imaging plane in which the two-dimensional face image lies, together with their positions; a selection submodule for selecting a contour starting point and a contour end point from the two-dimensional starting point candidate set; a contour line determining submodule for determining a face contour line based on the contour starting point, the contour end point and the outer contour line of the two-dimensional contour point set; and a contour key point determining submodule for determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining the position information of the two-dimensional contour key points.
Illustratively, the contour key point determining submodule includes: a face contour line removing unit for removing points that belong to a specific face part from the face contour line according to a predefined face part removal condition to obtain a new face contour line; a first equally dividing unit for equally dividing the new face contour line; and a first contour key point determining unit for determining the resulting bisector points as two-dimensional contour key points in the two-dimensional face image and obtaining the position information of the two-dimensional contour key points.
Illustratively, the contour key point determining submodule includes: a second equally dividing unit for equally dividing the face contour line; and a second contour key point determining unit for determining the resulting bisector points as two-dimensional contour key points in the two-dimensional face image and obtaining the position information of the two-dimensional contour key points.
Illustratively, the transformation module includes: a rigid body transformation submodule for performing rigid body transformation on the three-dimensional head model; and/or a non-rigid body transformation submodule for performing non-rigid body transformation on the three-dimensional head model.
Illustratively, the three-dimensional head model includes a mesh model and a texture model.
Illustratively, the image processing apparatus further includes: a training module for training a regressor for a face key point positioning algorithm using the two-dimensional face image and the face key point annotation data.
According to another aspect of the present invention, there is provided an image processing system comprising a processor and a memory, wherein the memory has stored therein computer program instructions for execution by the processor to perform the steps of: acquiring an initial face image; performing three-dimensional reconstruction based on the initial face image to obtain a three-dimensional head model; predefining face key point information on the three-dimensional head model; transforming the three-dimensional human head model; rendering the transformed three-dimensional human head model into a two-dimensional human face image; and determining two-dimensional face key points in the two-dimensional face image according to the projection relation between the transformed three-dimensional head model and the two-dimensional face image and the face key point information, and obtaining the position information of the two-dimensional face key points as face key point annotation data.
Illustratively, the computer program instructions, when executed by the processor, are further operable to perform, before the step of rendering the transformed three-dimensional head model into the two-dimensional face image, the step of: adding additional information to the transformed three-dimensional head model, wherein the additional information comprises one or more of background, hair and illumination.
Illustratively, the step of adding additional information to the transformed three-dimensional head model, performed by the computer program instructions when executed by the processor, comprises: randomly adding additional information to the transformed three-dimensional head model.
Illustratively, the face keypoint information comprises interior point information related to face interior keypoints, and the two-dimensional face keypoints comprise two-dimensional interior keypoints.
Illustratively, the interior point information includes position information of three-dimensional interior key points in the three-dimensional head model, and the step, performed by the computer program instructions when executed by the processor, of determining two-dimensional face key points in the two-dimensional face image and obtaining position information of the two-dimensional face key points as face key point annotation data according to the projection relationship between the transformed three-dimensional head model and the two-dimensional face image and the face key point information includes: determining, according to the projection relationship and the position information of the three-dimensional interior key points, the two-dimensional interior key points obtained by projecting the three-dimensional interior key points onto the imaging plane in which the two-dimensional face image lies, and obtaining the position information of the two-dimensional interior key points.
Illustratively, the face keypoint information comprises contour point information related to face contour keypoints, and the two-dimensional face keypoints comprise two-dimensional contour keypoints.
Illustratively, the contour point information includes position information, in the three-dimensional head model, of a three-dimensional contour point set whose points may become face contour key points, and the step, performed by the computer program instructions when executed by the processor, of determining two-dimensional face key points in the two-dimensional face image and obtaining position information of the two-dimensional face key points as face key point annotation data according to the projection relationship between the transformed three-dimensional head model and the two-dimensional face image and the face key point information includes: determining the intersection of the three-dimensional contour point set and a predefined starting point plane in the three-dimensional head model as a three-dimensional starting point candidate set; determining, according to the projection relationship and the position information of the three-dimensional contour point set, the two-dimensional contour point set and the two-dimensional starting point candidate set obtained by projecting the three-dimensional contour point set and the three-dimensional starting point candidate set onto the imaging plane in which the two-dimensional face image lies, together with their positions; selecting a contour starting point and a contour end point from the two-dimensional starting point candidate set; determining a face contour line based on the contour starting point, the contour end point and the outer contour line of the two-dimensional contour point set; and determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining the position information of the two-dimensional contour key points.
Illustratively, the step, performed by the computer program instructions when executed by the processor, of determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining position information of the two-dimensional contour key points comprises: removing points that belong to a specific face part from the face contour line according to a predefined face part removal condition to obtain a new face contour line; equally dividing the new face contour line; and determining the resulting bisector points as two-dimensional contour key points in the two-dimensional face image and obtaining the position information of the two-dimensional contour key points.
Illustratively, the step, performed by the computer program instructions when executed by the processor, of determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining position information of the two-dimensional contour key points comprises: equally dividing the face contour line; and determining the resulting bisector points as two-dimensional contour key points in the two-dimensional face image and obtaining the position information of the two-dimensional contour key points.
Illustratively, the step of transforming the three-dimensional head model, performed by the computer program instructions when executed by the processor, comprises: performing rigid body transformation on the three-dimensional head model; and/or performing non-rigid body transformation on the three-dimensional head model.
Illustratively, the three-dimensional head model includes a mesh model and a texture model.
Illustratively, the computer program instructions, when executed by the processor, are further operable to perform the step of: training a regressor for a face key point positioning algorithm using the two-dimensional face image and the face key point annotation data.
According to another aspect of the present invention, there is provided a storage medium having stored thereon program instructions operable when executed to perform the steps of: acquiring an initial face image; performing three-dimensional reconstruction based on the initial face image to obtain a three-dimensional head model; predefining face key point information on the three-dimensional head model; transforming the three-dimensional human head model; rendering the transformed three-dimensional human head model into a two-dimensional human face image; and determining two-dimensional face key points in the two-dimensional face image according to the projection relation between the transformed three-dimensional head model and the two-dimensional face image and the face key point information, and obtaining the position information of the two-dimensional face key points as face key point annotation data.
Illustratively, the program instructions, when executed, are further operable to perform, before the step of rendering the transformed three-dimensional head model into the two-dimensional face image, the step of: adding additional information to the transformed three-dimensional head model, wherein the additional information comprises one or more of background, hair and illumination.
Illustratively, the step of adding additional information to the transformed three-dimensional head model, performed by the program instructions when executed, comprises: randomly adding additional information to the transformed three-dimensional head model.
Illustratively, the face keypoint information comprises interior point information related to face interior keypoints, and the two-dimensional face keypoints comprise two-dimensional interior keypoints.
Illustratively, the interior point information includes position information of three-dimensional interior key points in the three-dimensional head model, and the step, performed by the program instructions when executed, of determining two-dimensional face key points in the two-dimensional face image and obtaining position information of the two-dimensional face key points as face key point annotation data according to the projection relationship between the transformed three-dimensional head model and the two-dimensional face image and the face key point information includes: determining, according to the projection relationship and the position information of the three-dimensional interior key points, the two-dimensional interior key points obtained by projecting the three-dimensional interior key points onto the imaging plane in which the two-dimensional face image lies, and obtaining the position information of the two-dimensional interior key points.
Illustratively, the face keypoint information comprises contour point information related to face contour keypoints, and the two-dimensional face keypoints comprise two-dimensional contour keypoints.
Illustratively, the contour point information includes position information, in the three-dimensional head model, of a three-dimensional contour point set whose points may become face contour key points, and the step, performed by the program instructions when executed, of determining two-dimensional face key points in the two-dimensional face image and obtaining position information of the two-dimensional face key points as face key point annotation data according to the projection relationship between the transformed three-dimensional head model and the two-dimensional face image and the face key point information includes: determining the intersection of the three-dimensional contour point set and a predefined starting point plane in the three-dimensional head model as a three-dimensional starting point candidate set; determining, according to the projection relationship and the position information of the three-dimensional contour point set, the two-dimensional contour point set and the two-dimensional starting point candidate set obtained by projecting the three-dimensional contour point set and the three-dimensional starting point candidate set onto the imaging plane in which the two-dimensional face image lies, together with their positions; selecting a contour starting point and a contour end point from the two-dimensional starting point candidate set; determining a face contour line based on the contour starting point, the contour end point and the outer contour line of the two-dimensional contour point set; and determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining the position information of the two-dimensional contour key points.
Illustratively, the step, performed by the program instructions when executed, of determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining position information of the two-dimensional contour key points comprises: removing points that belong to a specific face part from the face contour line according to a predefined face part removal condition to obtain a new face contour line; equally dividing the new face contour line; and determining the resulting bisector points as two-dimensional contour key points in the two-dimensional face image and obtaining the position information of the two-dimensional contour key points.
Illustratively, the step, performed by the program instructions when executed, of determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining position information of the two-dimensional contour key points comprises: equally dividing the face contour line; and determining the resulting bisector points as two-dimensional contour key points in the two-dimensional face image and obtaining the position information of the two-dimensional contour key points.
Illustratively, the step of transforming the three-dimensional head model, performed by the program instructions when executed, comprises: performing rigid body transformation on the three-dimensional head model; and/or performing non-rigid body transformation on the three-dimensional head model.
Illustratively, the three-dimensional head model includes a mesh model and a texture model.
Illustratively, the program instructions are further operable, when executed, to perform the step of: training a regressor for a face key point positioning algorithm using the two-dimensional face image and the face key point annotation data.
With the image processing method, device and system and the storage medium according to embodiments of the present invention, a large number of face images and the corresponding face key point annotation data can be generated simply and rapidly for a wide variety of scenes. Compared with manual annotation, the method, device, system and storage medium are efficient and low-cost, and when the face images and face key point annotation data obtained with them are used as training data, a more robust face key point detector can be trained.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent by describing in more detail embodiments of the present invention with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings, like reference numbers generally represent like parts or steps.
FIG. 1 shows a schematic block diagram of an example electronic device for implementing an image processing method and apparatus in accordance with embodiments of the present invention;
FIG. 2 shows a schematic flow diagram of an image processing method according to an embodiment of the invention;
FIGS. 3a and 3b show schematic diagrams of a texture model and a mesh model, respectively, obtained based on the same initial face image, according to an embodiment of the invention;
FIGS. 3c and 3d show schematic diagrams of the transformed texture model and mesh model corresponding to FIGS. 3a and 3b, respectively;
FIGS. 4a-4d are schematic diagrams of a mesh model according to an embodiment of the invention;
FIG. 5 shows a schematic block diagram of an image processing apparatus according to an embodiment of the present invention; and
FIG. 6 shows a schematic block diagram of an image processing system according to one embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, exemplary embodiments according to the present invention will be described in detail below with reference to the accompanying drawings. The described embodiments are merely a subset of the embodiments of the invention, not all of them, and the invention is not limited to the example embodiments described herein. All other embodiments that a person skilled in the art can derive from the embodiments described herein without inventive effort shall fall within the scope of protection of the invention.
In order to solve the above-mentioned problems, embodiments of the present invention provide an image processing method, apparatus and system, and a storage medium, which obtain face images in different scenes (for example, under different poses, illumination conditions and expressions) through transformation and rendering of a three-dimensional head model, together with the corresponding face key point annotation data. With the image processing method provided by the embodiments of the present invention, a large number of face images and face key point annotation data covering a wide variety of scenes can be generated simply and rapidly.
First, an exemplary electronic device 100 for implementing an image processing method and apparatus according to an embodiment of the present invention is described with reference to FIG. 1.
As shown in FIG. 1, the electronic device 100 includes one or more processors 102, one or more storage devices 104, an input device 106, an output device 108, and an image capture device 110, which are interconnected via a bus system 112 and/or another form of connection mechanism (not shown). It should be noted that the components and structure of the electronic device 100 shown in FIG. 1 are exemplary only, not limiting; the electronic device may have other components and structures as desired.
The processor 102 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 100 to perform desired functions.
The storage device 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, or flash memory. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 102 to implement the client-side functionality (implemented by the processor) and/or other desired functionality in the embodiments of the invention described below. Various applications and various data, such as data used and/or generated by the applications, may also be stored in the computer-readable storage medium.
The input device 106 may be a device used by a user to input instructions and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
The output device 108 may output various information (e.g., images and/or sounds) to the outside (e.g., to a user), and may include one or more of a display, a speaker, etc.
The image capture device 110 may capture face images (including video frames) and store the captured images in the storage device 104 for use by other components. The image capture device 110 may be a conventional camera. It should be understood that the image capture device 110 is merely an example; the electronic device 100 may omit it, in which case another image capture device may be used to capture face images and transmit them to the electronic device 100.
Exemplary electronic devices for implementing the image processing method and apparatus according to embodiments of the present invention may be implemented on devices such as personal computers or remote servers, for example.
Next, an image processing method according to an embodiment of the present invention will be described with reference to FIG. 2. FIG. 2 shows a schematic flow diagram of an image processing method 200 according to an embodiment of the invention. As shown in FIG. 2, the image processing method 200 includes the following steps.
In step S210, an initial face image is acquired.
The initial face image may be any suitable image containing a face. The initial face image may be an original image acquired by an image acquisition device such as a camera, or may be an image obtained after preprocessing the original image.
The initial face image may be sent to the electronic device 100 by an external device (e.g., a cloud server) to be processed by the processor 102 of the electronic device 100, may be acquired by the image acquisition device 110 (e.g., a camera) included in the electronic device 100 and transmitted to the processor 102 for processing, or may be retrieved from the storage device 104 included in the electronic device 100 and transmitted to the processor 102 for processing.
In step S220, three-dimensional reconstruction is performed based on the initial face image to obtain a three-dimensional head model.
By way of example and not limitation, step S220 may be implemented using any suitable existing or future three-dimensional face reconstruction technique. Illustratively, the three-dimensional head model obtained at step S220 may include a mesh model and a texture model. The data structure and meaning of the mesh model and the texture model will be understood by those skilled in the art and are not described in detail herein.
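Illustratively, the reconstructed model can be held in a simple composite structure. The following Python sketch shows one possible representation; the field layout is an assumption for illustration, not a data structure mandated by the method.

```python
# A minimal sketch of one possible representation of the three-dimensional head
# model (mesh model + texture model); all field names here are assumptions.
from dataclasses import dataclass
import numpy as np

@dataclass
class HeadModel:
    vertices: np.ndarray  # (N, 3) mesh vertex coordinates (mesh model)
    faces: np.ndarray     # (M, 3) vertex indices of each triangle (mesh model)
    texture: np.ndarray   # (H, W, 3) texture image (texture model)
    uv: np.ndarray        # (N, 2) per-vertex texture coordinates (texture model)
```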
In step S230, face keypoint information is predefined on the three-dimensional head model.
Illustratively, the face keypoints may include face interior keypoints and/or face contour keypoints. Accordingly, the face keypoint information may include interior point information related to face interior keypoints and/or contour point information related to face contour keypoints. Illustratively, the key points inside the human face are points on the eyebrows, eyes, nose, mouth, and the like.
Illustratively, face key points or points that may become face key points may be predefined on the mesh model. For example, if the face keypoint annotation data that is expected to be finally obtained includes the position information of the face internal keypoints, which points are the face internal keypoints may be predefined on the mesh model. For another example, if the face key point annotation data finally obtained is expected to include the position information of the face contour key points, it may be predefined on the mesh model which points may become the face contour key points.
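Illustratively, one simple way to predefine face key point information on the mesh model is to record vertex indices, since an index keeps referring to the same vertex no matter how the model is later transformed. The sketch below assumes the HeadModel structure above; the index values are hypothetical placeholders.

```python
# Hypothetical sketch: key points predefined as mesh vertex indices. The index
# values below are placeholders, not indices of any real mesh.
import numpy as np

INTERIOR_KEYPOINT_IDS = np.array([101, 250, 305, 412])  # e.g. eye corners, nose tip, mouth corners
CONTOUR_CANDIDATE_IDS = np.array([500, 501, 502, 503])  # points that may become contour key points

def interior_keypoints_3d(vertices: np.ndarray) -> np.ndarray:
    """Look up the 3-D positions of the predefined face interior key points.

    vertices: (N, 3) array of mesh vertex coordinates.
    """
    return vertices[INTERIOR_KEYPOINT_IDS]
```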
In step S240, the three-dimensional head model is transformed.
By way of example and not limitation, step S240 may include: carrying out rigid body transformation on the three-dimensional human head model; and/or performing non-rigid body transformation on the three-dimensional human head model. Illustratively, the rigid body transformation may include one or more of rotation, scaling, and translation. The posture, the position and the like of the human face can be changed by carrying out rigid body transformation on the three-dimensional human head model. Illustratively, the non-rigid body transformations may include expression transformations. The expression of the human face can be changed by carrying out non-rigid body transformation on the three-dimensional human head model.
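Illustratively, a rigid body transformation can be applied to all mesh vertices at once, as in the Python sketch below; a non-rigid expression transformation would instead displace individual vertices (for example via blendshapes, which is an assumption here rather than a technique fixed by the method).

```python
# A minimal sketch of the rigid body transformation (rotation, scaling and
# translation) applied to every mesh vertex.
import numpy as np

def rigid_transform(vertices: np.ndarray, R: np.ndarray,
                    scale: float, t: np.ndarray) -> np.ndarray:
    """vertices: (N, 3); R: (3, 3) rotation matrix; t: (3,) translation."""
    return scale * vertices @ R.T + t

# Example: rotate the head 30 degrees about the vertical (y) axis.
theta = np.deg2rad(30.0)
R_y = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
                [ 0.0,           1.0, 0.0          ],
                [-np.sin(theta), 0.0, np.cos(theta)]])
```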
In step S250, the transformed three-dimensional head model is rendered into a two-dimensional face image.
Step S250 may be implemented using conventional rendering techniques. For example, the mesh model and the texture model may be used as input of a rendering algorithm, and the rendering algorithm may output a two-dimensional face image obtained by rendering.
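The method does not fix a particular renderer, so the sketch below only illustrates the interface contract: the mesh and texture models plus camera parameters go in, and a two-dimensional face image comes out. The commented call is a placeholder, not a real library API.

```python
# Interface sketch only; plug in any conventional rasterizer.
import numpy as np

def render_head(model, camera_matrix: np.ndarray, image_size: tuple) -> np.ndarray:
    """Rasterize the textured mesh into an (H, W, 3) two-dimensional face image."""
    # In practice: delegate to an OpenGL-based or software rasterizer, e.g.
    #   image = some_rasterizer(model.vertices, model.faces, model.uv,
    #                           model.texture, camera_matrix, image_size)
    raise NotImplementedError("plug in the rendering backend of your choice")
```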
In step S260, according to the projection relationship between the transformed three-dimensional human head model and the two-dimensional human face image and the human face key point information, two-dimensional human face key points in the two-dimensional human face image are determined and the positions of the two-dimensional human face key points are obtained as human face key point annotation data.
As described above, the three-dimensional head model has predefined face keypoint information. For example, face key points or points that may become face key points may be predefined on the three-dimensional head model. The information of which points on the three-dimensional head model are the key points of the face or the points which can become the key points of the face is not changed along with the transformation of the three-dimensional head model. Thus, the transformed three-dimensional head model may maintain predefined face keypoint information.
The transformed three-dimensional head model has a projection relationship with the two-dimensional face image. Based on the projection relationship, the position information (e.g., coordinates) of each point on the transformed three-dimensional human head model after being projected to the imaging plane where the two-dimensional human face image is located can be determined. For example, if location information of face interior key points is predefined on the three-dimensional head model, location information of corresponding face interior key points on the two-dimensional face image may be determined. In the following description, for convenience of distinction, the face interior key points on the three-dimensional face model are referred to as three-dimensional interior key points, and the face interior key points in the two-dimensional face image are referred to as two-dimensional interior key points. For the face contour key points, if the position information of points which can become the face contour key points is predefined on the three-dimensional human head model, the position information of the face contour key points in the two-dimensional face image can be determined according to the projection relation.
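Illustratively, under a simple pinhole-camera assumption the projection used in step S260 reduces to a matrix product and a perspective divide, as in the sketch below; here K stands for the intrinsic matrix of the virtual camera used for rendering, and the key points are assumed to be given in camera coordinates.

```python
# A sketch of the projection of step S260 under a pinhole-camera assumption.
import numpy as np

def project_points(points_3d: np.ndarray, K: np.ndarray) -> np.ndarray:
    """points_3d: (N, 3) in camera coordinates -> (N, 2) pixel positions."""
    p = points_3d @ K.T            # homogeneous image coordinates, (N, 3)
    return p[:, :2] / p[:, 2:3]    # perspective divide

# The two-dimensional interior key points, i.e. the annotation data, are then
# simply the projections of the transformed three-dimensional interior key points.
```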
Those skilled in the art will appreciate that the execution order of the steps of the image processing method 200 shown in fig. 2 is merely an example and not a limitation, for example, step S250 may be executed after or simultaneously with step S260.
According to the image processing method provided by the embodiments of the present invention, a three-dimensional head model is generated based on an initial face image, and a two-dimensional face image together with the corresponding face key point annotation data is obtained through transformation and rendering of the three-dimensional head model. The method can generate a large number of face images and face key point annotation data covering a wide variety of scenes simply and quickly. Compared with manual annotation, the method is efficient and low-cost, and using the face images and face key point annotation data obtained by the method as training data allows a more robust face key point detector to be trained.
Illustratively, the image processing method according to the embodiment of the present invention may be implemented in a device, apparatus, or system having a memory and a processor.
The image processing method according to the embodiment of the present invention may be deployed at a stand-alone device, for example, at a personal computer or a server. Alternatively, the image processing method according to the embodiment of the present invention may also be distributively deployed at the server side (or the cloud side) and the client side. For example, an initial face image may be acquired or stored at a client, and the client transmits the acquired or stored initial face image to a server (or a cloud), and the server (or the cloud) performs image processing.
According to an embodiment of the present invention, before step S250, the image processing method 200 may further include: additional information is added to the transformed three-dimensional head model.
Illustratively, the additional information may include one or more of background, hair and lighting. After the additional information is added, the rendered two-dimensional face image contains information other than the face itself, so that face images in various scenes, such as different backgrounds and lighting conditions, can be conveniently obtained. Training the face key point detector with two-dimensional face images containing such varied additional information improves the training effect and yields a more accurate face key point detector.
In one example, adding additional information on the transformed three-dimensional human head model may include: and randomly adding additional information on the transformed three-dimensional human head model. The two-dimensional face image and the face key point annotation data which are richer and have larger variability can be obtained by adding the additional information in a random mode, so that the training effect of the face key point detector which adopts the two-dimensional face image and the face key point annotation data as the training data is better. Of course, preset additional information may also be added to the transformed three-dimensional head model.
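Illustratively, random addition of additional information can be as simple as sampling a background, a hair style and lighting parameters for each rendered image, as in the sketch below; the parameter names and value ranges are illustrative assumptions only.

```python
# A minimal sketch of randomly sampling additional information before rendering;
# all names and ranges here are assumptions for illustration.
import random

def sample_additional_info(background_paths, hair_styles):
    return {
        "background": random.choice(background_paths),
        "hair": random.choice(hair_styles),
        "light_direction": [random.uniform(-1.0, 1.0) for _ in range(3)],
        "light_intensity": random.uniform(0.5, 1.5),
    }
```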
According to the embodiment of the invention, the face key point information can comprise internal point information related to internal key points of the face, and the two-dimensional face key points can comprise two-dimensional internal key points. If the position information of the key points inside the face is expected to be included in the finally obtained face key point marking data, the information of the internal points related to the key points inside the face can be predefined on the three-dimensional head model. As described below, the interior point information may include location information of three-dimensional interior key points in the three-dimensional head model.
According to the embodiment of the present invention, the interior point information may include position information of a three-dimensional interior key point in the three-dimensional head model, and step S260 may include: and determining two-dimensional internal key points obtained by projecting the three-dimensional internal key points in an imaging plane where the two-dimensional face image is located according to the projection relation and the position information of the three-dimensional internal key points, and obtaining the position information of the two-dimensional internal key points. The face keypoint annotation data comprises position information of two-dimensional internal keypoints.
FIGS. 3a and 3b show schematic diagrams of a texture model and a mesh model, respectively, obtained based on the same initial face image, according to an embodiment of the invention; FIGS. 3c and 3d show schematic diagrams of the transformed texture model and mesh model corresponding to FIGS. 3a and 3b, respectively. As shown in FIGS. 3a and 3b, the three-dimensional head model has predefined face interior key points, i.e. three-dimensional interior key points. It should be understood that the definition of the face interior key points on the three-dimensional head model is relative to the model itself and does not change with the pose or expression of the face. For example, assuming the pupil center is defined as one of the face interior key points, even if the three-dimensional head model is rotated so that the face faces another direction, as shown in FIGS. 3c and 3d, the pupil center remains a face interior key point. Therefore, after the rotation of the three-dimensional head model is finished, the coordinates of the pupil center in the two-dimensional face image (i.e. its position information) can be calculated from the orientation of the three-dimensional head model and the projection relationship between the three-dimensional head model and the two-dimensional face image, thereby determining the position of this face interior key point. The same applies to the other face interior key points.
According to the embodiment of the invention, the face key point information may include contour point information related to face contour key points, and the two-dimensional face key points may include two-dimensional contour key points. If the position information of the face contour key points is expected to be included in the finally obtained face key point labeling data, the contour point information related to the face contour key points can be predefined on the three-dimensional human head model. As described below, the contour point information may include position information of a three-dimensional contour point set that may become a key point of the face contour in the three-dimensional head model.
According to the embodiment of the present invention, the contour point information may include position information of a three-dimensional contour point set that may become a key point of a face contour in the three-dimensional head model, and step S260 may include: determining the intersection of the three-dimensional contour point set and a predefined starting point plane in the three-dimensional head model as a three-dimensional starting point candidate set; determining the positions of a two-dimensional contour point set, a two-dimensional starting point candidate set, a two-dimensional contour point set and a two-dimensional starting point candidate set which are obtained by projecting the three-dimensional contour point set and the three-dimensional starting point candidate set in an imaging plane where the two-dimensional face image is located according to the projection relation and the position information of the three-dimensional contour point set; selecting a contour starting point and a contour end point from the two-dimensional starting point candidate set; determining a face contour line based on a contour starting point, a contour end point and an outer contour line of the two-dimensional contour point set; and determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining the position information of the two-dimensional contour key points. The face key point annotation data includes position information of two-dimensional contour key points.
The face contour keypoints may change as the face pose changes. If the orientation of the face in the rendered two-dimensional face image is different from that in the original face image, the key points of the face contour in the two-dimensional face image may change. Therefore, on the three-dimensional human head model, a three-dimensional contour point set which can become a key point of the human face contour can be defined in advance. Corresponding relations exist between the key points of the face contour (namely the key points of the two-dimensional contour) in the finally obtained two-dimensional face image and some points in the three-dimensional contour point set.
FIGS. 4a-4d show schematic diagrams of a mesh model according to an embodiment of the invention. Referring to FIG. 4a, a three-dimensional contour point set S whose points may become face contour key points is predefined on the three-dimensional head model; that is, the position information of the three-dimensional contour point set S in the three-dimensional head model is predefined. Furthermore, a "starting point plane" may also be defined in advance on the three-dimensional head model. In one example, the starting point plane is defined as the horizontal plane passing through the line connecting the centers of the two eyes in the front view of the three-dimensional head model, as shown in FIG. 4b. The starting point plane may be used to define the left and right starting points of the face contour (i.e., the contour starting point and the contour end point). The intersection of the starting point plane and the three-dimensional contour point set S is referred to as a three-dimensional starting point candidate set T. It will be appreciated that the above definition of the starting point plane is merely exemplary and not limiting; the starting point plane may have other suitable definitions.
After the three-dimensional head model is rotated by a certain angle (as shown in FIG. 4c or 4d), the currently obtained three-dimensional head model (i.e. the transformed three-dimensional head model) is projected onto an imaging plane to obtain a two-dimensional face image. The imaging plane is the plane in which the two-dimensional face image lies. The three-dimensional contour point set S and the three-dimensional starting point candidate set T correspond to a two-dimensional contour point set S' and a two-dimensional starting point candidate set T' in the imaging plane, respectively. Exemplarily, as shown in FIG. 4c, the leftmost point A and the rightmost point B of the two-dimensional starting point candidate set T' may be taken as the contour starting point and the contour end point, respectively. The face contour line L (in FIG. 4c, the curve with A and B as end points and passing through C) can then be determined based on the contour starting point A, the contour end point B and the outer contour line of the two-dimensional contour point set S'. Finally, some points on the face contour line L can be taken as face contour key points, giving the required two-dimensional contour key points and their position information.
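Illustratively, the selection of the contour starting point A and contour end point B can be sketched as follows; here the starting point plane is approximated as the set of candidate points whose model-space height lies within a small tolerance of the eye-line height, which is a simplifying assumption for illustration.

```python
# A sketch of selecting the contour start/end points: intersect the contour
# candidate set S with the starting point plane to get T, project to get T',
# then take the leftmost and rightmost projections as A and B. The tolerance
# and the plane test are assumptions.
import numpy as np

def _project(points_3d: np.ndarray, K: np.ndarray) -> np.ndarray:
    p = points_3d @ K.T            # pinhole projection, as sketched earlier
    return p[:, :2] / p[:, 2:3]

def contour_endpoints(contour_set_3d: np.ndarray, eye_line_y: float,
                      R: np.ndarray, t: np.ndarray, K: np.ndarray,
                      tol: float = 1e-2):
    """contour_set_3d: (N, 3) candidate set S in (untransformed) model coordinates."""
    in_T = np.abs(contour_set_3d[:, 1] - eye_line_y) < tol  # 3-D start candidate set T
    T_2d = _project(contour_set_3d[in_T] @ R.T + t, K)      # projected set T'
    a = T_2d[np.argmin(T_2d[:, 0])]                         # leftmost point -> contour start A
    b = T_2d[np.argmax(T_2d[:, 0])]                         # rightmost point -> contour end B
    return a, b
```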
According to an embodiment of the present invention, determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining position information of the two-dimensional contour key points may include: removing points that belong to a specific face part from the face contour line according to a predefined face part removal condition to obtain a new face contour line; equally dividing the new face contour line; and determining the resulting bisector points as two-dimensional contour key points in the two-dimensional face image and obtaining the position information of the two-dimensional contour key points.
The specific face part may be predefined by the user and may be any suitable part. By way of example and not limitation, the specific face part may include the nose; that is, the points of the nose portion are removed from the face contour line. FIG. 4d shows the new face contour line obtained after the points of the nose portion are removed. The new face contour line shown in FIG. 4d can then be equally divided, and the bisector points taken as the required two-dimensional contour key points. Removing points that belong to a specific face part (such as the nose) yields face contour key points that better match real face contours, reflecting the contour information of the face more truthfully and accurately.
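Illustratively, once the (optionally filtered) face contour line is available as an ordered polyline, the equal division can be carried out by arc length, as in the sketch below; it assumes consecutive polyline points are distinct.

```python
# A sketch of equally dividing the face contour line by arc length: resample
# the polyline at evenly spaced arc-length positions and take the interior
# samples as the two-dimensional contour key points.
import numpy as np

def bisect_contour(polyline: np.ndarray, k: int) -> np.ndarray:
    """polyline: (N, 2) ordered contour points; returns (k, 2) bisector points."""
    seg = np.linalg.norm(np.diff(polyline, axis=0), axis=1)  # segment lengths
    s = np.concatenate([[0.0], np.cumsum(seg)])              # cumulative arc length
    targets = np.linspace(0.0, s[-1], k + 2)[1:-1]           # interior bisector positions
    x = np.interp(targets, s, polyline[:, 0])
    y = np.interp(targets, s, polyline[:, 1])
    return np.stack([x, y], axis=1)
```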
According to an embodiment of the present invention, determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining position information of the two-dimensional contour key points may include: equally dividing the face contour line; and determining the resulting bisector points as two-dimensional contour key points in the two-dimensional face image and obtaining the position information of the two-dimensional contour key points.
Illustratively, the two-dimensional contour key points can be obtained by directly equally dividing the face contour line without removing points belonging to a specific face part.
According to an embodiment of the present invention, the image processing method 200 may further include: training a regressor for the face key point positioning algorithm by using the two-dimensional face image and the face key point annotation data.
Illustratively, the regressor (which may be used to form the face key point detector described above) may comprise a random forest or a neural network. Training methods for such regressors are well understood by those skilled in the art and are not detailed here. A regressor trained on the two-dimensional face images and the face key point annotation data locates face key points with high precision and at high speed.
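Illustratively, a minimal training sketch with a random forest regressor follows, using scikit-learn and random stand-ins for the synthetic images and annotations; raw pixels as features, the image size, and the 68-point layout are toy choices introduced only for this example, not the method prescribed here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Random stand-ins for the synthetic training set produced by the pipeline:
# 1000 rendered 64x64 grayscale faces with 68 annotated (x, y) key points.
rng = np.random.default_rng(0)
images = rng.random((1000, 64, 64), dtype=np.float32)
landmarks = rng.random((1000, 68, 2), dtype=np.float32)

X = images.reshape(len(images), -1)         # raw pixels as features (toy choice)
y = landmarks.reshape(len(landmarks), -1)   # flattened key point targets

reg = RandomForestRegressor(n_estimators=50, max_depth=10, n_jobs=-1)
reg.fit(X, y)                               # multi-output regression
pred = reg.predict(X[:1]).reshape(68, 2)    # predicted key point positions
```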
According to another aspect of the present invention, there is provided an image processing apparatus. Fig. 5 shows a schematic block diagram of an image processing apparatus 500 according to an embodiment of the present invention.
As shown in Fig. 5, the image processing apparatus 500 according to an embodiment of the present invention includes an initial image acquisition module 510, a three-dimensional reconstruction module 520, a pre-definition module 530, a transformation module 540, a rendering module 550, and a keypoint determination module 560. The various modules may respectively perform the steps/functions of the image processing method described above in connection with Figs. 2-4. Only the main functions of the respective components of the image processing apparatus 500 are described below; details already given above are omitted.
The initial image obtaining module 510 is used for obtaining an initial face image. The initial image acquisition module 510 may be implemented by the processor 102 in the electronic device shown in fig. 1 executing program instructions stored in the storage 104.
The three-dimensional reconstruction module 520 is configured to perform three-dimensional reconstruction based on the initial face image to obtain a three-dimensional head model. The three-dimensional reconstruction module 520 may be implemented by the processor 102 in the electronic device shown in fig. 1 executing program instructions stored in the storage 104.
The pre-defining module 530 is configured to pre-define face keypoint information on the three-dimensional head model. The pre-defining module 530 may be implemented by the processor 102 in the electronic device shown in fig. 1 executing program instructions stored in the storage 104.
The transformation module 540 is used for transforming the three-dimensional human head model. The transformation module 540 may be implemented by the processor 102 in the electronic device shown in fig. 1 executing program instructions stored in the storage 104.
The rendering module 550 is configured to render the transformed three-dimensional head model into a two-dimensional face image. The rendering module 550 may be implemented by the processor 102 in the electronic device shown in fig. 1 executing program instructions stored in the storage 104.
The key point determining module 560 is configured to determine two-dimensional face key points in the two-dimensional face image according to the projection relationship between the transformed three-dimensional head model and the two-dimensional face image and the face key point information, and obtain position information of the two-dimensional face key points as face key point annotation data. The keypoint determination module 560 may be implemented by the processor 102 in the electronic device shown in fig. 1 executing program instructions stored in the storage 104.
According to an embodiment of the present invention, the image processing apparatus 500 further includes: an adding module (not shown) for adding additional information on the transformed three-dimensional head model before the rendering module renders the transformed three-dimensional head model into the two-dimensional face image, wherein the additional information comprises one or more of background, hair and illumination.
According to an embodiment of the present invention, the adding module includes: and the random adding submodule is used for randomly adding additional information on the transformed three-dimensional human head model.
According to the embodiment of the invention, the face key point information comprises internal point information related to internal key points of the face, and the two-dimensional face key points comprise two-dimensional internal key points.
According to the embodiment of the present invention, the interior point information includes position information of a three-dimensional interior key point in the three-dimensional head model, and the key point determining module 560 includes: and the internal key point determining submodule is used for determining two-dimensional internal key points obtained by projecting the three-dimensional internal key points in an imaging plane where the two-dimensional face image is located according to the projection relation and the position information of the three-dimensional internal key points and obtaining the position information of the two-dimensional internal key points.
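Illustratively, projecting the three-dimensional internal key points reuses the same pinhole projection sketched above for the contour point set; the helper name project() and the index array interior_idx are hypothetical.

```python
# Reusing the hypothetical project() helper and camera (K, R, t) from the
# earlier sketch; interior_idx indexes the predefined 3D internal key points
# (eye corners, nose tip, mouth corners, ...) among the mesh vertices.
interior_2d = project(vertices[interior_idx], K, R, t)   # (M, 2) positions
```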
According to the embodiment of the invention, the face key point information comprises contour point information related to face contour key points, and the two-dimensional face key points comprise two-dimensional contour key points.
According to the embodiment of the present invention, the contour point information includes position information of a three-dimensional contour point set in the three-dimensional head model whose points may become face contour key points, and the key point determining module 560 includes: a three-dimensional determining submodule, configured to determine the intersection of the three-dimensional contour point set and a predefined starting point plane in the three-dimensional head model as a three-dimensional starting point candidate set; a two-dimensional determining submodule, configured to determine, according to the projection relationship and the position information of the three-dimensional contour point set, the two-dimensional contour point set and the two-dimensional starting point candidate set obtained by projecting the three-dimensional contour point set and the three-dimensional starting point candidate set onto the imaging plane where the two-dimensional face image is located, together with their positions; a selection submodule, configured to select a contour starting point and a contour end point from the two-dimensional starting point candidate set; a contour line determining submodule, configured to determine a face contour line based on the contour starting point, the contour end point, and the outer contour line of the two-dimensional contour point set; and a contour key point determining submodule, configured to determine two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtain the position information of the two-dimensional contour key points.
According to the embodiment of the invention, the contour key point determining submodule comprises: a face contour line removing unit, configured to remove points that belong to a specific face part from the face contour line according to a predefined face part removal condition to obtain a new face contour line; a first equally dividing unit, configured to equally divide the new face contour line; and a first contour key point determining unit, configured to determine the obtained bisector points as two-dimensional contour key points in the two-dimensional face image and obtain the position information of the two-dimensional contour key points.
According to the embodiment of the invention, the contour key point determining submodule comprises: a second equally dividing unit, configured to equally divide the face contour line; and a second contour key point determining unit, configured to determine the obtained bisector points as two-dimensional contour key points in the two-dimensional face image and obtain the position information of the two-dimensional contour key points.
According to an embodiment of the present invention, the transformation module 540 includes: the rigid body transformation submodule is used for carrying out rigid body transformation on the three-dimensional head model; and/or the non-rigid body transformation submodule is used for carrying out non-rigid body transformation on the three-dimensional human head model.
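Illustratively, the rigid body transformation can be realised as an Euler-angle rotation plus a translation, and the non-rigid transformation as, for example, a weighted sum of blendshape offsets; the deformation model below is one assumed realisation, not prescribed by this description.

```python
import numpy as np

def rigid_transform(vertices, yaw=0.0, pitch=0.0, roll=0.0, t=(0.0, 0.0, 0.0)):
    """Rigid body transform: rotate the head model by Euler angles, then translate."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])   # yaw about y
    Rx = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])   # pitch about x
    Rz = np.array([[cr, -sr, 0], [sr, cr, 0], [0, 0, 1]])   # roll about z
    return vertices @ (Rz @ Rx @ Ry).T + np.asarray(t)

def nonrigid_transform(vertices, blendshapes, weights):
    """Non-rigid transform as a weighted sum of blendshape offsets.

    blendshapes : (K, N, 3) per-vertex offsets; weights : (K,) coefficients.
    """
    return vertices + np.tensordot(weights, blendshapes, axes=1)
```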
According to the embodiment of the invention, the three-dimensional human head model comprises a grid model and a texture model.
According to an embodiment of the present invention, the image processing apparatus 500 further includes: a training module (not shown) for training a regressor for the face key point positioning algorithm using the two-dimensional face image and the face key point annotation data.
FIG. 6 shows a schematic block diagram of an image processing system 600 according to one embodiment of the present invention. Image processing system 600 includes an image capture device 610, a memory 620, and a processor 630.
The image capture device 610 is used for capturing an initial face image. The image capture device 610 is optional, and the image processing system 600 may omit it. In that case, another image capture device may be used to capture the desired initial face image and send it to the image processing system 600. Alternatively, the initial face image may be stored in the memory 620 or transmitted to the image processing system 600 from an external storage device.
The memory 620 stores program code for implementing the respective steps of the image processing method according to the embodiment of the present invention.
The processor 630 is configured to run the program code (i.e., computer program instructions) stored in the memory 620 to perform the corresponding steps of the image processing method according to the embodiment of the present invention, and to implement the initial image acquisition module 510, the three-dimensional reconstruction module 520, the pre-definition module 530, the transformation module 540, the rendering module 550, and the keypoint determination module 560 of the image processing apparatus 500 according to the embodiment of the present invention.
In one embodiment, the program code when executed by the processor 630 is configured to perform the steps of: acquiring an initial face image; performing three-dimensional reconstruction based on the initial face image to obtain a three-dimensional head model; predefining face key point information on the three-dimensional head model; transforming the three-dimensional human head model; rendering the transformed three-dimensional human head model into a two-dimensional human face image; and determining two-dimensional face key points in the two-dimensional face image according to the projection relation between the transformed three-dimensional head model and the two-dimensional face image and the face key point information, and obtaining the position information of the two-dimensional face key points as face key point annotation data.
In one embodiment, before the step of rendering the transformed three-dimensional head model into a two-dimensional face image is performed, the program code when executed by the processor 630 is further configured to perform: adding additional information on the transformed three-dimensional human head model, wherein the additional information comprises one or more of background, hair and illumination.
In one embodiment, the step of adding additional information on the transformed three-dimensional head model, performed by the program code when executed by the processor 630, comprises: and randomly adding additional information on the transformed three-dimensional human head model.
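Illustratively, the random addition of additional information might be driven by a small sampling routine such as the one below; the asset lists and parameter ranges are assumptions introduced only for this example.

```python
import random

def random_extras(backgrounds, hairstyles):
    """Sample the additional information (background, hair, illumination)
    to apply to the transformed head model before rendering."""
    return {
        "background": random.choice(backgrounds),
        "hair": random.choice(hairstyles),
        "light_direction": [random.uniform(-1.0, 1.0) for _ in range(3)],
        "light_intensity": random.uniform(0.5, 1.5),
    }

extras = random_extras(["indoor.png", "street.png"], ["short", "long", "none"])
```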
In one embodiment, the face keypoint information comprises interior point information related to face interior keypoints, and the two-dimensional face keypoints comprise two-dimensional interior keypoints.
In one embodiment, the interior point information includes position information of a three-dimensional interior key point in a three-dimensional head model, and the step of determining a two-dimensional face key point in a two-dimensional face image and obtaining position information of the two-dimensional face key point as face key point annotation data according to a projection relationship between the transformed three-dimensional head model and the two-dimensional face image and the face key point information, which is executed by the processor 630, includes: and determining two-dimensional internal key points obtained by projecting the three-dimensional internal key points in an imaging plane where the two-dimensional face image is located according to the projection relation and the position information of the three-dimensional internal key points, and obtaining the position information of the two-dimensional internal key points.
In one embodiment, the face keypoint information comprises contour point information relating to face contour keypoints, and the two-dimensional face keypoints comprise two-dimensional contour keypoints.
In one embodiment, the contour point information includes position information of a three-dimensional contour point set in the three-dimensional head model whose points may become face contour key points, and the step, executed by the processor 630, of determining two-dimensional face key points in the two-dimensional face image and obtaining position information of the two-dimensional face key points as face key point annotation data according to the projection relationship between the transformed three-dimensional head model and the two-dimensional face image and the face key point information includes: determining the intersection of the three-dimensional contour point set and a predefined starting point plane in the three-dimensional head model as a three-dimensional starting point candidate set; determining, according to the projection relationship and the position information of the three-dimensional contour point set, the two-dimensional contour point set and the two-dimensional starting point candidate set obtained by projecting the three-dimensional contour point set and the three-dimensional starting point candidate set onto the imaging plane where the two-dimensional face image is located, together with their positions; selecting a contour starting point and a contour end point from the two-dimensional starting point candidate set; determining a face contour line based on the contour starting point, the contour end point, and the outer contour line of the two-dimensional contour point set; and determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining the position information of the two-dimensional contour key points.
In one embodiment, the step, executed by the processor 630, of determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining position information of the two-dimensional contour key points includes: removing points that belong to a specific face part from the face contour line according to a predefined face part removal condition to obtain a new face contour line; equally dividing the new face contour line; and determining the obtained bisector points as two-dimensional contour key points in the two-dimensional face image and obtaining the position information of the two-dimensional contour key points.
In one embodiment, the step, executed by the processor 630, of determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining position information of the two-dimensional contour key points includes: equally dividing the face contour line; and determining the obtained bisector points as two-dimensional contour key points in the two-dimensional face image and obtaining the position information of the two-dimensional contour key points.
In one embodiment, the step of transforming a three-dimensional head model performed by the program code when executed by the processor 630 comprises: carrying out rigid body transformation on the three-dimensional human head model; and/or performing non-rigid body transformation on the three-dimensional human head model.
In one embodiment, the three-dimensional head model includes a mesh model and a texture model.
In one embodiment, the program code when executed by the processor 630 is further configured to perform: training a regressor for the face key point positioning algorithm by using the two-dimensional face image and the face key point annotation data.
Further, according to an embodiment of the present invention, there is also provided a storage medium on which program instructions are stored, which when executed by a computer or a processor, are used to perform the respective steps of the image processing method according to an embodiment of the present invention, and to implement the respective modules in the image processing apparatus according to an embodiment of the present invention. The storage medium may include, for example, a memory card of a smart phone, a storage component of a tablet computer, a hard disk of a personal computer, a Read Only Memory (ROM), an Erasable Programmable Read Only Memory (EPROM), a portable compact disc read only memory (CD-ROM), a USB memory, or any combination of the above storage media.
In one embodiment, the computer program instructions, when executed by a computer or a processor, may cause the computer or the processor to implement the respective functional modules of the image processing apparatus according to the embodiment of the present invention and/or may perform the image processing method according to the embodiment of the present invention.
In one embodiment, the computer program instructions are operable to perform the steps of: acquiring an initial face image; performing three-dimensional reconstruction based on the initial face image to obtain a three-dimensional head model; predefining face key point information on the three-dimensional head model; transforming the three-dimensional human head model; rendering the transformed three-dimensional human head model into a two-dimensional human face image; and determining two-dimensional face key points in the two-dimensional face image according to the projection relation between the transformed three-dimensional head model and the two-dimensional face image and the face key point information, and obtaining the position information of the two-dimensional face key points as face key point annotation data.
In one embodiment, prior to the step of rendering the transformed three-dimensional head model into a two-dimensional face image, the computer program instructions are further operable when executed to perform: adding additional information on the transformed three-dimensional human head model, wherein the additional information comprises one or more of background, hair and illumination.
In one embodiment, the step of adding additional information on the transformed three-dimensional head model for execution by the computer program instructions when executed comprises: and randomly adding additional information on the transformed three-dimensional human head model.
In one embodiment, the face keypoint information comprises interior point information related to face interior keypoints, and the two-dimensional face keypoints comprise two-dimensional interior keypoints.
In one embodiment, the interior point information comprises position information of a three-dimensional interior key point in a three-dimensional head model, and the step of determining two-dimensional face key points in a two-dimensional face image and obtaining position information of the two-dimensional face key points as face key point annotation data according to a projection relationship between the transformed three-dimensional head model and the two-dimensional face image and the face key point information, which is executed when the computer program instructions are executed, comprises: and determining two-dimensional internal key points obtained by projecting the three-dimensional internal key points in an imaging plane where the two-dimensional face image is located according to the projection relation and the position information of the three-dimensional internal key points, and obtaining the position information of the two-dimensional internal key points.
In one embodiment, the face keypoint information comprises contour point information relating to face contour keypoints, and the two-dimensional face keypoints comprise two-dimensional contour keypoints.
In one embodiment, the contour point information includes position information of a three-dimensional contour point set in the three-dimensional head model whose points may become face contour key points, and the step, executed when the computer program instructions are run, of determining two-dimensional face key points in the two-dimensional face image and obtaining position information of the two-dimensional face key points as face key point annotation data according to the projection relationship between the transformed three-dimensional head model and the two-dimensional face image and the face key point information includes: determining the intersection of the three-dimensional contour point set and a predefined starting point plane in the three-dimensional head model as a three-dimensional starting point candidate set; determining, according to the projection relationship and the position information of the three-dimensional contour point set, the two-dimensional contour point set and the two-dimensional starting point candidate set obtained by projecting the three-dimensional contour point set and the three-dimensional starting point candidate set onto the imaging plane where the two-dimensional face image is located, together with their positions; selecting a contour starting point and a contour end point from the two-dimensional starting point candidate set; determining a face contour line based on the contour starting point, the contour end point, and the outer contour line of the two-dimensional contour point set; and determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining the position information of the two-dimensional contour key points.
In one embodiment, the step, executed when the computer program instructions are run, of determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining position information of the two-dimensional contour key points includes: removing points that belong to a specific face part from the face contour line according to a predefined face part removal condition to obtain a new face contour line; equally dividing the new face contour line; and determining the obtained bisector points as two-dimensional contour key points in the two-dimensional face image and obtaining the position information of the two-dimensional contour key points.
In one embodiment, the step, executed when the computer program instructions are run, of determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining position information of the two-dimensional contour key points includes: equally dividing the face contour line; and determining the obtained bisector points as two-dimensional contour key points in the two-dimensional face image and obtaining the position information of the two-dimensional contour key points.
In one embodiment, the step of transforming the three-dimensional head model, performed when the computer program instructions are run, includes: carrying out rigid body transformation on the three-dimensional head model; and/or performing non-rigid body transformation on the three-dimensional head model.
In one embodiment, the three-dimensional head model includes a mesh model and a texture model.
In one embodiment, the computer program instructions when executed are further operable to perform: training a regressor for the face key point positioning algorithm by using the two-dimensional face image and the face key point annotation data.
The modules in the image processing system according to the embodiment of the present invention may be implemented by a processor of an electronic device that performs image processing according to the embodiment of the present invention running computer program instructions stored in a memory, or when computer instructions stored in the computer-readable storage medium of a computer program product according to the embodiment of the present invention are run by a computer.
By means of the image processing method and the image processing apparatus according to the embodiments of the present invention, a large number of face images, together with face key point annotation data, can be generated very simply and rapidly for a wide variety of scenes. Compared with manual labeling, this approach is efficient and low-cost, and using the face images and the face key point annotation data obtained in this way as training data makes it possible to train a more robust face key point detector.
Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the foregoing illustrative embodiments are merely exemplary and are not intended to limit the scope of the invention thereto. Various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the present invention. All such changes and modifications are intended to be included within the scope of the present invention as set forth in the appended claims.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the units is only one logical functional division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another device, or some features may be omitted, or not executed.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, this method of disclosure should not be construed to reflect an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
It will be understood by those skilled in the art that all of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where such features are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some of the blocks in an image processing apparatus according to embodiments of the present invention. The present invention may also be embodied as apparatus programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any ordering; these words may be interpreted as names.
The above description is only for the specific embodiment of the present invention or the description thereof, and the protection scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and the changes or substitutions should be covered within the protection scope of the present invention. The protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (13)

1. An image processing method comprising:
acquiring an initial face image;
performing three-dimensional reconstruction based on the initial face image to obtain a three-dimensional head model, wherein the three-dimensional head model comprises a grid model and a texture model;
predefining face key point information on the three-dimensional head model;
transforming the three-dimensional human head model;
rendering the transformed three-dimensional human head model into a two-dimensional human face image; and
determining two-dimensional face key points in the two-dimensional face image and obtaining position information of the two-dimensional face key points as face key point annotation data according to the projection relation between the transformed three-dimensional head model and the two-dimensional face image and the face key point information;
wherein transforming the three-dimensional head model comprises:
carrying out rigid body transformation on the three-dimensional human head model; and/or
carrying out non-rigid body transformation on the three-dimensional human head model.
2. The image processing method of claim 1, wherein prior to said rendering the transformed three-dimensional head model as a two-dimensional face image, the image processing method further comprises:
adding additional information on the transformed three-dimensional human head model, wherein the additional information comprises one or more of background, hair, and lighting.
3. The image processing method of claim 2, wherein said adding additional information on said transformed three-dimensional head model comprises:
randomly adding the additional information on the transformed three-dimensional head model.
4. The image processing method of claim 1, wherein the face keypoint information comprises interior point information related to face interior keypoints, and the two-dimensional face keypoints comprise two-dimensional interior keypoints.
5. The image processing method of claim 4, wherein the interior point information includes position information of a three-dimensional interior keypoint in the three-dimensional head model,
determining two-dimensional face key points in the two-dimensional face image and obtaining position information of the two-dimensional face key points as face key point annotation data according to the projection relationship between the transformed three-dimensional head model and the two-dimensional face image and the face key point information, wherein the step of determining the two-dimensional face key points in the two-dimensional face image and the position information of the two-dimensional face key points comprises the following steps:
and determining two-dimensional internal key points obtained by projecting the three-dimensional internal key points in an imaging plane where the two-dimensional face image is located according to the projection relation and the position information of the three-dimensional internal key points, and obtaining the position information of the two-dimensional internal key points.
6. The image processing method of any of claims 1 to 5, wherein the face keypoint information comprises contour point information related to face contour keypoints, and the two-dimensional face keypoints comprise two-dimensional contour keypoints.
7. The image processing method according to claim 6, wherein the contour point information includes position information, in the three-dimensional head model, of a three-dimensional contour point set whose points may become face contour key points,
determining two-dimensional face key points in the two-dimensional face image and obtaining position information of the two-dimensional face key points as face key point annotation data according to the projection relationship between the transformed three-dimensional head model and the two-dimensional face image and the face key point information, wherein the step of determining the two-dimensional face key points in the two-dimensional face image and the position information of the two-dimensional face key points comprises the following steps:
determining the intersection of the three-dimensional contour point set and a predefined starting point plane in the three-dimensional head model as a three-dimensional starting point candidate set;
determining a two-dimensional contour point set and a two-dimensional starting point candidate set which are obtained by projecting the three-dimensional contour point set and the three-dimensional starting point candidate set in an imaging plane where the two-dimensional face image is located and positions of the two-dimensional contour point set and the two-dimensional starting point candidate set according to the projection relation and the position information of the three-dimensional contour point set;
selecting a contour starting point and a contour ending point from the two-dimensional starting point candidate set;
determining a face contour line based on the contour starting point, the contour end point and an outer contour line of the two-dimensional contour point set; and
and determining two-dimensional contour key points in the two-dimensional face image based on the face contour line and obtaining the position information of the two-dimensional contour key points.
8. The image processing method of claim 7, wherein the determining two-dimensional contour keypoints in the two-dimensional face image based on the face contour lines and obtaining position information of the two-dimensional contour keypoints comprises:
removing points that belong to a specific face part from the face contour line according to a predefined face part removal condition to obtain a new face contour line;
equally dividing the new face contour line; and
and determining the obtained bisector points as two-dimensional contour key points in the two-dimensional face image and obtaining the position information of the two-dimensional contour key points.
9. The image processing method of claim 7, wherein the determining two-dimensional contour keypoints in the two-dimensional face image based on the face contour lines and obtaining position information of the two-dimensional contour keypoints comprises:
equally dividing the face contour line; and
and determining the obtained bisector points as two-dimensional contour key points in the two-dimensional face image and obtaining the position information of the two-dimensional contour key points.
10. The image processing method of claim 1, wherein the image processing method further comprises:
and training a regressor for a face key point positioning algorithm by using the two-dimensional face image and the face key point annotation data.
11. An image processing apparatus comprising:
the initial image acquisition module is used for acquiring an initial face image;
the three-dimensional reconstruction module is used for performing three-dimensional reconstruction on the basis of the initial face image to obtain a three-dimensional human head model, wherein the three-dimensional human head model comprises a grid model and a texture model;
the pre-defining module is used for pre-defining the key point information of the human face on the three-dimensional human head model;
the transformation module is used for transforming the three-dimensional human head model;
the rendering module is used for rendering the transformed three-dimensional human head model into a two-dimensional human face image; and
a key point determining module, configured to determine two-dimensional face key points in the two-dimensional face image according to the projection relationship between the transformed three-dimensional head model and the two-dimensional face image and the face key point information, and obtain position information of the two-dimensional face key points as face key point annotation data;
wherein the transformation module comprises:
the rigid body transformation submodule is used for carrying out rigid body transformation on the three-dimensional head model; and/or
the non-rigid body transformation submodule is used for carrying out non-rigid body transformation on the three-dimensional head model.
12. An image processing system comprising a processor and a memory, wherein the memory has stored therein computer program instructions which, when executed by the processor, are operable to perform the steps of:
acquiring an initial face image;
performing three-dimensional reconstruction based on the initial face image to obtain a three-dimensional head model, wherein the three-dimensional head model comprises a grid model and a texture model;
predefining face key point information on the three-dimensional head model;
transforming the three-dimensional human head model;
rendering the transformed three-dimensional human head model into a two-dimensional human face image; and
determining two-dimensional face key points in the two-dimensional face image and obtaining position information of the two-dimensional face key points as face key point annotation data according to the projection relation between the transformed three-dimensional head model and the two-dimensional face image and the face key point information;
wherein the step of transforming the three-dimensional head model performed by the computer program instructions when executed by the processor comprises:
carrying out rigid body transformation on the three-dimensional human head model; and/or
carrying out non-rigid body transformation on the three-dimensional human head model.
13. A storage medium having stored thereon program instructions which when executed are for performing the steps of:
acquiring an initial face image;
performing three-dimensional reconstruction based on the initial face image to obtain a three-dimensional head model, wherein the three-dimensional head model comprises a grid model and a texture model;
predefining face key point information on the three-dimensional head model;
transforming the three-dimensional human head model;
rendering the transformed three-dimensional human head model into a two-dimensional human face image; and
determining two-dimensional face key points in the two-dimensional face image and obtaining position information of the two-dimensional face key points as face key point annotation data according to the projection relation between the transformed three-dimensional head model and the two-dimensional face image and the face key point information;
wherein the step of transforming the three-dimensional head model for execution by the program instructions when executed comprises:
carrying out rigid body transformation on the three-dimensional human head model; and/or
carrying out non-rigid body transformation on the three-dimensional human head model.
CN201710390286.2A 2017-05-27 2017-05-27 Image processing method, device and system and storage medium Active CN108961149B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710390286.2A CN108961149B (en) 2017-05-27 2017-05-27 Image processing method, device and system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710390286.2A CN108961149B (en) 2017-05-27 2017-05-27 Image processing method, device and system and storage medium

Publications (2)

Publication Number Publication Date
CN108961149A CN108961149A (en) 2018-12-07
CN108961149B true CN108961149B (en) 2022-01-07

Family

ID=64495071

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710390286.2A Active CN108961149B (en) 2017-05-27 2017-05-27 Image processing method, device and system and storage medium

Country Status (1)

Country Link
CN (1) CN108961149B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111754385A (en) * 2019-03-26 2020-10-09 深圳中科飞测科技有限公司 Data point model processing method and system, detection method and system and readable medium
CN110189406B (en) * 2019-05-31 2023-11-28 创新先进技术有限公司 Image data labeling method and device
CN111652978B (en) * 2019-06-26 2024-03-05 广州虎牙科技有限公司 Grid generation method and device, electronic equipment and storage medium
CN110363175A (en) * 2019-07-23 2019-10-22 厦门美图之家科技有限公司 Image processing method, device and electronic equipment
CN111754415B (en) 2019-08-28 2022-09-27 北京市商汤科技开发有限公司 Face image processing method and device, image equipment and storage medium
CN111563959B (en) * 2020-05-06 2023-04-28 厦门美图之家科技有限公司 Updating method, device, equipment and medium of three-dimensional deformable model of human face
CN113744384B (en) * 2020-05-29 2023-11-28 北京达佳互联信息技术有限公司 Three-dimensional face reconstruction method and device, electronic equipment and storage medium
CN111695628B (en) * 2020-06-11 2023-05-05 北京百度网讯科技有限公司 Key point labeling method and device, electronic equipment and storage medium
CN111832648B (en) * 2020-07-10 2024-02-09 北京百度网讯科技有限公司 Key point labeling method and device, electronic equipment and storage medium
CN112163509A (en) * 2020-09-25 2021-01-01 咪咕文化科技有限公司 Image processing method, image processing device, network equipment and storage medium
CN113763440A (en) * 2021-04-26 2021-12-07 腾讯科技(深圳)有限公司 Image processing method, device, equipment and storage medium
CN113421182B (en) * 2021-05-20 2023-11-28 北京达佳互联信息技术有限公司 Three-dimensional reconstruction method, three-dimensional reconstruction device, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103210424A (en) * 2010-09-30 2013-07-17 皇家飞利浦电子股份有限公司 Image and annotation display
CN103390282A (en) * 2013-07-30 2013-11-13 百度在线网络技术(北京)有限公司 Image tagging method and device
EP2672423A1 (en) * 2012-06-08 2013-12-11 Realeyes OÜ Method and apparatus for locating features of an object using deformable models
CN104899563A (en) * 2015-05-29 2015-09-09 深圳大学 Two-dimensional face key feature point positioning method and system
CN105354531A (en) * 2015-09-22 2016-02-24 成都通甲优博科技有限责任公司 Marking method for facial key points
CN105374055A (en) * 2014-08-20 2016-03-02 腾讯科技(深圳)有限公司 Image processing method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB9324638D0 (en) * 1993-12-01 1994-01-19 Philips Electronics Uk Ltd Image processing
US20100315424A1 (en) * 2009-06-15 2010-12-16 Tao Cai Computer graphic generation and display method and system
US9851877B2 (en) * 2012-02-29 2017-12-26 JVC Kenwood Corporation Image processing apparatus, image processing method, and computer program product
CN103632129A (en) * 2012-08-28 2014-03-12 腾讯科技(深圳)有限公司 Facial feature point positioning method and device
CN104157010B (en) * 2014-08-29 2017-04-12 厦门幻世网络科技有限公司 3D human face reconstruction method and device
CN104598936B (en) * 2015-02-28 2018-07-27 北京畅景立达软件技术有限公司 The localization method of facial image face key point

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103210424A (en) * 2010-09-30 2013-07-17 皇家飞利浦电子股份有限公司 Image and annotation display
EP2672423A1 (en) * 2012-06-08 2013-12-11 Realeyes OÜ Method and apparatus for locating features of an object using deformable models
CN103390282A (en) * 2013-07-30 2013-11-13 百度在线网络技术(北京)有限公司 Image tagging method and device
CN105374055A (en) * 2014-08-20 2016-03-02 腾讯科技(深圳)有限公司 Image processing method and device
CN104899563A (en) * 2015-05-29 2015-09-09 深圳大学 Two-dimensional face key feature point positioning method and system
CN105354531A (en) * 2015-09-22 2016-02-24 成都通甲优博科技有限责任公司 Marking method for facial key points

Also Published As

Publication number Publication date
CN108961149A (en) 2018-12-07

Similar Documents

Publication Publication Date Title
CN108961149B (en) Image processing method, device and system and storage medium
CN108875524B (en) Sight estimation method, device, system and storage medium
US10529137B1 (en) Machine learning systems and methods for augmenting images
JP7249390B2 (en) Method and system for real-time 3D capture and live feedback using a monocular camera
CN108875523B (en) Human body joint point detection method, device, system and storage medium
CN108875633B (en) Expression detection and expression driving method, device and system and storage medium
CN111243093B (en) Three-dimensional face grid generation method, device, equipment and storage medium
US11816926B2 (en) Interactive augmented reality content including facial synthesis
CN114816041A (en) Providing contextual augmented reality photo gesture guidance
US20220319127A1 (en) Facial synthesis in augmented reality content for third party applications
US20220319231A1 (en) Facial synthesis for head turns in augmented reality content
CN106650743B (en) Image strong reflection detection method and device
CN111008935A (en) Face image enhancement method, device, system and storage medium
US11875600B2 (en) Facial synthesis in augmented reality content for online communities
US20220319060A1 (en) Facial synthesis in augmented reality content for advertisements
CN108268863B (en) Image processing method and device and computer storage medium
US20220321804A1 (en) Facial synthesis in overlaid augmented reality content
US20160110909A1 (en) Method and apparatus for creating texture map and method of creating database
CN110728172B (en) Point cloud-based face key point detection method, device and system and storage medium
CN109829380B (en) Method, device and system for detecting dog face characteristic points and storage medium
KR20220156062A (en) Joint rotation inferences based on inverse kinematics
US20240135581A1 (en) Three dimensional hand pose estimator
US20230401777A1 (en) Method and apparatus for creating avatar based on body shape
AU2021455408A1 (en) Three dimensional hand pose estimator
CN108875528B (en) Face shape point positioning method and device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant