WO2021174939A1 - 人脸图像的获取方法与系统 - Google Patents

人脸图像的获取方法与系统 Download PDF

Info

Publication number
WO2021174939A1
WO2021174939A1 (PCT/CN2020/135077)
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional
depth information
face structure
processed
target
Prior art date
Application number
PCT/CN2020/135077
Other languages
English (en)
French (fr)
Inventor
陈卓均
陆进
陈斌
宋晨
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2021174939A1 publication Critical patent/WO2021174939A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/24Aligning, centring, orientation detection or correction of the image
    • G06V10/242Aligning, centring, orientation detection or correction of the image by image rotation, e.g. by 90 degrees
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification

Definitions

  • the embodiments of the present application relate to the field of image processing, and in particular, to a method and system for acquiring a face image.
  • Model-based 3D face reconstruction is currently the more popular approach to 3D face reconstruction. 3D models are mainly represented by triangular meshes or point clouds; popular models include CANDIDE-3 and the 3D Morphable Model (3DMM) and its variants, and the 3D face reconstruction algorithms built on them include both traditional algorithms and deep learning algorithms.
  • The three-dimensional morphable model essentially uses principal component analysis to construct a statistical model, and principal component analysis is essentially a low-pass filter. As a result, this type of method remains unsatisfactory at restoring the detailed features of a face. More specifically, the inventors realized that, for example, to present complex facial expressions, countless small folds and wrinkles and subtle changes in color and texture cannot be ignored, yet the three-dimensional morphable model, being a low-pass filtering approach, cannot accurately capture and restore such tiny details, so its ability to present facial expressions is relatively weak. Moreover, if the three-dimensional morphable model is used to rotate the reconstructed three-dimensional face, the result is also unsatisfactory and insufficiently accurate.
  • the purpose of the embodiments of the present application is to provide a method and system for acquiring a face image, which improves the accuracy of acquiring a face image and image rotation.
  • an embodiment of the present application provides a method for acquiring a face image, including:
  • the embodiment of the present application also provides a face image acquisition system, including:
  • the first obtaining module is configured to obtain a picture to be processed, where the picture to be processed includes the face image of the user to be processed;
  • the second acquisition module is configured to input the face image into a key point detection model to obtain key points of the face and key point coordinates corresponding to the key points;
  • the third acquisition module is configured to input the face image and face key points into a depth prediction model to obtain depth information of the face key points;
  • a reconstruction module configured to reconstruct the three-dimensional face structure of the user to be processed according to the depth information and the key point coordinates;
  • a calculation module configured to calculate the target face structure according to the three-dimensional face structure and a preset rotation angle; and
  • the projection module is used to project the target face structure to obtain a target image.
  • the embodiments of the present application also provide a computer device, the computer device including a memory and a processor, the memory storing a computer program that can run on the processor, and when the computer program is executed by the processor, the following method is implemented:
  • the embodiments of the present application also provide a computer-readable storage medium in which a computer program is stored, and the computer program can be executed by at least one processor to cause the at least one processor to execute the following method:
  • This application obtains the depth information of a two-dimensional picture through the depth prediction model, reconstructs the three-dimensional face structure from the depth information and the key point coordinates, and rotates the three-dimensional face structure by the preset rotation angle to obtain the target picture, which improves the accuracy of face image acquisition and image rotation.
  • FIG. 1 is a flowchart of Embodiment 1 of the face image acquisition method of the present application.
  • Fig. 2 is a flowchart of training the depth prediction network in the first embodiment of the application.
  • FIG. 3 is a flowchart of step S106 in the first embodiment of this application.
  • Fig. 4 is a flowchart of step S106A in the first embodiment of the application.
  • Fig. 5 is a flowchart of step S106B in the first embodiment of the application.
  • Fig. 6 is a flowchart of step S108 in the first embodiment of the application.
  • FIG. 7 is a flowchart of step S110 in Embodiment 1 of this application.
  • FIG. 8 is a schematic diagram of the program modules of Embodiment 2 of the face image acquisition system of the present application.
  • FIG. 9 is a schematic diagram of the hardware structure of the third embodiment of the computer equipment of this application.
  • the technical solution of this application can be applied to the fields of artificial intelligence, smart city, blockchain and/or big data technology, such as deep learning technology.
  • Optionally, the data involved in this application, such as face images, sample information, and/or face structures, may be stored in a database, or may be stored in a blockchain, for example via distributed blockchain storage, which is not limited in this application.
  • Referring to FIG. 1, there is shown a flowchart of the steps of the method for acquiring a face image according to the first embodiment of the present application. It can be understood that the flowchart in this method embodiment is not intended to limit the order in which the steps are executed. The following exemplary description takes the server as the execution subject. The details are as follows.
  • Step S100 Obtain a picture to be processed, where the picture to be processed includes a face image of a user to be processed.
  • a picture to be processed taken by a user to be processed through a camera or mobile phone camera software is acquired, and the picture to be processed includes a front face image of the user to be processed.
  • Step S102 Input the face image to a key point detection model to obtain a face image including key points of the face and key point coordinates corresponding to the key points of the face.
  • the key point detection model divides the face key points into internal key points and contour key points according to the face image.
  • the internal key points include a total of 51 key points for the eyebrows, eyes, nose, and mouth, and the contour key points include 17 key points (the embodiment of the application uses a CNN algorithm for training; other key point values can also be obtained by training with other algorithms).
  • For the 51 internal key points, a four-level cascade network is used for detection.
  • The main function of Level-1 is to obtain the bounding boxes of the facial organs;
  • the output of Level-2 is the predicted positions of the 51 key points, which serves as coarse localization whose purpose is to initialize Level-3;
  • Level-3 performs coarse-to-fine localization separately for each organ;
  • the input of Level-4 is the output of Level-3 rotated by a certain amount, and finally the positions of the 51 key points are output.
  • For the 17 contour key points, only a two-level cascade network is used for detection.
  • Level-1 serves the same purpose as in internal key point detection, mainly obtaining the bounding box of the contour; Level-2 directly predicts the 17 key points without a coarse-to-fine localization stage, because the region covered by the contour key points is larger and adding Level-3 and Level-4 would take more time.
  • The final 68 face key points are obtained by superimposing the outputs of the two cascaded CNNs. The face image is then registered against this reference to obtain the key point coordinates corresponding to the face key points (a minimal landmark-detection sketch follows below).
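  • As an illustrative stand-in for the cascaded-CNN key point detection model described above (whose architecture and weights are not published here), a pre-trained 68-point landmark detector such as dlib's shape predictor yields the same kind of face key points and key point coordinates; the model file path below is an assumption.

```python
# Illustrative only: a pre-trained 68-point landmark detector standing in for the
# application's cascaded-CNN key point detection model.
import dlib
import cv2

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # hypothetical local path

def detect_keypoints(image_path):
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = detector(gray, 1)                      # face bounding boxes
    if not faces:
        return None
    shape = predictor(gray, faces[0])              # 68 landmarks for the first face
    # In the common 68-point convention the 17 contour points come first; the other 51 are internal.
    return [(shape.part(i).x, shape.part(i).y) for i in range(shape.num_parts)]
```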
  • Step S104 Input the picture to be processed into a depth prediction model to obtain depth information of the picture to be processed.
  • the depth prediction model has the characteristic of outputting corresponding depth information according to the input picture to be processed, and is obtained by pre-training.
  • training the depth prediction network includes:
  • Step S104A Acquire sample depth information and sample pictures of multiple sample users through the depth camera.
  • the sample picture obtained by the depth camera carries depth information.
  • A depth camera typically obtains such pictures by one of three routes: monocular structured light, TOF (time of flight), and binocular vision.
  • The principle of TOF is that a sensor emits modulated near-infrared light, which is reflected after hitting an object, and the distance of the photographed object is computed from the time difference or phase difference between emission and reflection.
  • Structured Light technology is relatively more complicated: encoded gratings or line light sources are projected onto the measured object, and the three-dimensional information of the object is demodulated from the distortions they produce.
  • Binocular vision, like the human eye, uses two ordinary cameras to compute the distance of the measured object from parallax (see the distance relations sketched below).
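  • The distance relations behind these routes are well known and can be sketched as follows; the constants and sample values are illustrative only and are not taken from the application.

```python
# Rough illustrations of the two distance relations mentioned above.
C = 299_792_458.0  # speed of light, m/s

def tof_distance(round_trip_time_s: float) -> float:
    # TOF: the light travels to the object and back, so the distance is half the path.
    return C * round_trip_time_s / 2.0

def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    # Binocular vision: depth is inversely proportional to the pixel disparity between the two views.
    return focal_px * baseline_m / disparity_px

print(tof_distance(4e-9))            # ~0.6 m for a 4 ns round trip
print(stereo_depth(1000, 0.06, 25))  # 2.4 m for a 6 cm baseline and 25 px disparity
```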
  • Step S104B: each of the sample pictures, together with the multiple pieces of sample depth information, is used as the input of the deep learning network model, and for each sample picture the target depth information with the maximum confidence is output.
  • Specifically, the sample pictures are input to the deep learning network model, and the depth information of each sample picture is input into one of the cells of the first network layer of the deep learning network model, until every face key point has been input into its corresponding cell. The decoder of the deep learning network model is initialized so that the values computed for the previous sample picture do not affect the sample picture currently being tested, and the decoder is set to output a single target parameter for each input sample picture, the parameter being the depth information. Before the target depth information is output, the confidence of each piece of depth information with respect to the sample picture is computed through a softmax function, and the depth information with the highest confidence is taken as the target depth information.
  • Step S104C Determine whether the target depth information is sample depth information corresponding to each sample picture.
  • Specifically, whether the target depth information is the sample depth information corresponding to each sample picture is judged, in preparation for subsequently converging the model.
  • Step S104D: if yes, the depth prediction model has been trained successfully; if not, the confidence is recalculated through a loss function so that the target depth information with the maximum confidence becomes the sample depth information.
  • Specifically, if the target depth information is the sample depth information corresponding to the sample picture, the depth prediction model has been trained successfully. If it is not, an L2 loss function is further used to maximize the confidence between each sample picture and its corresponding depth information, so that every sample picture obtains its corresponding depth information and the depth prediction network is obtained (a simplified training sketch follows below).
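  • A minimal, simplified sketch of such training is shown below, assuming a small fully connected network that maps 2D key points to per-point depth and is fitted with an L2 loss against depth-camera labels; the application's actual network architecture, softmax confidence selection, and decoder initialization are not specified, so those details are assumptions or omitted here.

```python
# Simplified depth-prediction training sketch with an L2 loss (architecture is an assumption).
import torch
import torch.nn as nn

class KeypointDepthNet(nn.Module):
    def __init__(self, num_keypoints=68):
        super().__init__()
        # input: flattened (x, y) key point coordinates; output: one depth value per key point
        self.net = nn.Sequential(
            nn.Linear(num_keypoints * 2, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, num_keypoints),
        )

    def forward(self, keypoints_2d):                 # (B, 68, 2)
        return self.net(keypoints_2d.flatten(1))     # (B, 68)

model = KeypointDepthNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
l2_loss = nn.MSELoss()                               # L2 loss between predicted and sample depth

def train_step(keypoints_2d, sample_depth):
    optimizer.zero_grad()
    predicted_depth = model(keypoints_2d)
    loss = l2_loss(predicted_depth, sample_depth)    # pull predictions toward depth-camera labels
    loss.backward()
    optimizer.step()
    return loss.item()
```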
  • Step S106 Reconstruct the three-dimensional face structure of the user to be processed according to the depth information and the key point coordinates.
  • the key point coordinates are transformed into the three-dimensional model according to the depth information, and the correction is performed to obtain the three-dimensional face structure.
  • step S106 further includes:
  • Step S106A input the depth information and the key point coordinates into a three-dimensional model to obtain a rough three-dimensional face structure of the user to be processed.
  • the coarse three-dimensional face structure obtained from the depth information and the key point coordinates has not had its edges processed, so the picture obtained after rotation would not be accurate, and further processing is therefore required.
  • step S106A further includes:
  • step S106A1 a two-dimensional face model is established in the three-dimensional model according to the coordinates of the key points.
  • the coordinates of the key points are input into the three-dimensional model, and the coordinates correspond to the x-plane and the y-plane of the three-dimensional model to establish a two-dimensional face model.
  • Step S106A2 input the depth information to the three-dimensional model to obtain the coarse three-dimensional face structure according to the two-dimensional face model and the depth information.
  • Specifically, the depth information is used as the z-plane of the three-dimensional model, and the two-dimensional face model is processed to obtain the coarse three-dimensional face structure. That is, the two-dimensional key point coordinates, with the depth information added, are converted into three-dimensional coordinates and displayed in the three-dimensional model (see the sketch below).
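  • A minimal sketch of this lifting step, assuming the key point coordinates and the predicted depths are available as NumPy arrays; names are illustrative.

```python
# Step S106A as a point-set operation: x/y from the key points, z from the depth prediction model.
import numpy as np

def lift_to_coarse_3d(keypoints_2d: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """keypoints_2d: (N, 2) pixel coordinates; depth: (N,) predicted depth per key point."""
    return np.column_stack([keypoints_2d[:, 0],   # x plane of the 3D model
                            keypoints_2d[:, 1],   # y plane of the 3D model
                            depth])               # z plane from the depth prediction model
```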
  • Step S106B Perform affine transformation on the key point coordinates and the depth information according to the three-dimensional model to obtain reconstructed point coordinates.
  • the key point coordinates and depth information are linearly transformed in the three-dimensional model to make the coarse three-dimensional face structure more three-dimensional.
  • the affine transformation maps the key point coordinates and depth information from the original face image to the three-dimensional model to obtain the reconstructed point coordinates.
  • step S106B further includes:
  • Step S106B1 Determine the vertex coordinates of the key point coordinates.
  • Specifically, the vertex coordinates are the maximum values of the key point coordinates; when the affine transformation is applied to the face key points, the face image is transformed relative to this vertex to obtain the reconstructed point coordinates.
  • The affine transformation can transform the key point coordinates at multiple angles.
  • Step S106B2 based on the vertex coordinates, perform affine transformation on the key point coordinates to obtain reconstruction point coordinates corresponding to the key point coordinates.
  • affine transformation is performed on each key point coordinate, and it is mapped to the three-dimensional model to obtain the reconstructed point coordinate corresponding to the key point coordinate.
  • Affine transformation is an existing technique and is not described in detail here.
  • Step S106B3 Perform affine transformation on the depth information to obtain the reconstruction point coordinates corresponding to the depth information.
  • Specifically, the depth information is mapped to a vector along the Z axis, and the affine transformation is applied to this vector to obtain the reconstructed point coordinates corresponding to the depth information in the three-dimensional model (a sketch of this vertex-based affine step follows below).
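  • A hedged sketch of this vertex-based affine step is shown below; the actual affine matrix used by the application is not specified, so A and t are placeholders.

```python
# Steps S106B1-S106B3 as a sketch: pick the vertex (per-axis maximum of the key point
# coordinates), then apply an affine transform relative to that vertex. A and t are placeholders.
import numpy as np

def affine_reconstruct(points_3d: np.ndarray, A: np.ndarray, t: np.ndarray) -> np.ndarray:
    """points_3d: (N, 3) coarse structure; A: (3, 3) affine matrix; t: (3,) translation."""
    vertex = points_3d.max(axis=0)                  # vertex coordinates (per-axis maximum)
    centered = points_3d - vertex                   # transform relative to the vertex
    reconstructed = centered @ A.T + t + vertex     # reconstructed point coordinates in the 3D model
    return reconstructed
```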
  • Step S106C Input the reconstruction point coordinates into a three-dimensional model to correct the coarse three-dimensional face structure to obtain the three-dimensional face structure of the user to be processed.
  • the reconstructed point coordinates are used to correct the coarse three-dimensional face structure to obtain the correspondingly changed three-dimensional face structure of the user to be processed in the three-dimensional model.
  • the affine transformation is to map the face image to the three-dimensional face structure of the three-dimensional model for correction.
  • Step S108 According to the three-dimensional face structure and the preset rotation angle, the target face structure is calculated.
  • the preset rotation angle is set according to user needs, for example 10 degrees or 20 degrees, preferably within 30 degrees; in that range the face image obtained after rotating the three-dimensional face structure has high accuracy.
  • The preset rotation angle is then computed through the rotation matrix.
  • step S108 further includes:
  • Step S108A Determine the Euler angle of the three-dimensional face structure according to the preset rotation angle.
  • Specifically, three angles (Euler angles), namely yaw, pitch, and roll, are determined from the preset rotation angle; their values represent the rotation angles of the three-dimensional face structure around the three axes (x, y, z) of the coordinate system.
  • For example, when the preset rotation angle is a rotation of 30 degrees about the x-axis only, the corresponding Euler angles are (30, 0, 0).
  • Step S108B Calculate the Euler angles according to the rotation matrix to obtain the rotation center of gravity value of the three-dimensional face structure.
  • where the rotation is composed as R = R_z(α)·R_y(β)·R_x(γ); R represents the center value, θ is the preset rotation angle, and R_z(α), R_y(β), and R_x(γ) are the per-axis Euler-angle rotation components, corresponding to R_x(θ), R_y(θ), and R_z(θ) respectively.
  • Step S108C Rotate the three-dimensional face structure by the preset rotation angle by the center of gravity value to obtain the target face structure.
  • Specifically, the rotation by the preset rotation angle is performed about the point corresponding to the center-of-gravity value of the three-dimensional face structure, obtaining the target face structure (a rotation sketch follows below).
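  • A sketch of this rotation under the usual convention (per-axis rotation matrices composed as R = R_z(α)·R_y(β)·R_x(γ), applied about the structure's centroid) is shown below; the exact definition of the center-of-gravity value in the application is not given, so the centroid here is an assumption.

```python
# Step S108 sketch: build per-axis rotation matrices, compose them, rotate about the centroid.
import numpy as np

def rot_x(g):
    c, s = np.cos(g), np.sin(g)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(b):
    c, s = np.cos(b), np.sin(b)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def rotate_face(points_3d, angle_x=0.0, angle_y=0.0, angle_z=0.0):
    """points_3d: (N, 3) 3D face structure; angles in radians."""
    R = rot_z(angle_z) @ rot_y(angle_y) @ rot_x(angle_x)   # composed rotation matrix
    center = points_3d.mean(axis=0)                        # rotation center ("center of gravity" value)
    return (points_3d - center) @ R.T + center             # rotate about the center, then restore

# Example: Euler angles (30, 0, 0), i.e. a 30-degree rotation about the x-axis only.
# target_structure = rotate_face(face_3d, angle_x=np.deg2rad(30))
```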
  • Step S110 Project the target face structure to obtain a target image.
  • the rotated target face structure is two-dimensionally projected to obtain a two-dimensional picture, that is, the target picture.
  • the target image obtained after the picture to be processed is rotated does not change the color of the image.
  • step S110 further includes:
  • Step S110A Obtain the two-dimensional RGB information of the picture to be processed.
  • the two-dimensional RGB information of the picture to be processed is acquired, and the two-dimensional RGB information is the gray value.
  • Step S110B filling the two-dimensional RGB information into the target face structure to obtain a three-dimensional face image.
  • Specifically, the two-dimensional RGB information is filled onto the target face structure so that the pixels corresponding to the target face structure are filled in.
  • When filling the two-dimensional RGB information, the corresponding key point coordinates after rotation are found and the pixels are filled in, yielding the three-dimensional face image.
  • Step S110C: the three-dimensional face image is projected into two dimensions and corrected by interpolation to obtain the target image.
  • Specifically, during the projection transformation, when the coordinates of the two-dimensional RGB pixels are transformed to the corresponding points on the target image, the transformed coordinates are fractional; interpolation corrects for this and makes the contours and the light and dark regions of the image show up more clearly.
  • The interpolation methods that can be used include nearest-neighbor interpolation, bilinear interpolation, and cubic polynomial interpolation (a projection and interpolation sketch follows below).
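  • A simplified sketch of this projection and interpolation correction is shown below; a real implementation would rasterize the full face mesh, whereas this only projects the colored points orthographically and samples the source image bilinearly at fractional coordinates. All names and the output size are illustrative.

```python
# Step S110 sketch: orthographic projection plus bilinear sampling at fractional coordinates.
import numpy as np

def bilinear_sample(image, x, y):
    """Sample image at fractional (x, y) with bilinear interpolation."""
    h, w = image.shape[:2]
    x0 = int(np.clip(np.floor(x), 0, w - 1))
    y0 = int(np.clip(np.floor(y), 0, h - 1))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    top = (1 - dx) * image[y0, x0] + dx * image[y0, x1]
    bottom = (1 - dx) * image[y1, x0] + dx * image[y1, x1]
    return (1 - dy) * top + dy * bottom              # weighted blend of the four neighbouring pixels

def project_to_image(points_3d, source_img, source_uv, out_shape=(256, 256, 3)):
    """points_3d: (N, 3) rotated structure; source_uv: (N, 2) original (possibly fractional) coords."""
    target = np.zeros(out_shape, dtype=source_img.dtype)
    for (x, y, _z), (u, v) in zip(points_3d, source_uv):
        xi, yi = int(round(x)), int(round(y))        # orthographic projection: drop z
        if 0 <= yi < out_shape[0] and 0 <= xi < out_shape[1]:
            target[yi, xi] = bilinear_sample(source_img, u, v)
    return target
```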
  • FIG. 8 shows a schematic diagram of the program modules of the second embodiment of the face image acquisition system of the present application.
  • the face image acquisition system 20 may include or be divided into one or more program modules, and the one or more program modules are stored in a storage medium and executed by one or more processors.
  • the program module referred to in the embodiments of the present application refers to a series of computer program instruction segments capable of completing specific functions, and is more suitable for describing the execution process of the facial image acquisition system 20 in the storage medium than the program itself. The following description will specifically introduce the functions of each program module in this embodiment:
  • the first obtaining module 200 is configured to obtain a picture to be processed, and the picture to be processed includes a face image of a user to be processed.
  • a picture to be processed taken by a user to be processed through a camera or mobile phone camera software is acquired, and the picture to be processed includes a front face image of the user to be processed.
  • the second acquisition module 202 is configured to input the face image into a key point detection model to obtain key points of the face and key point coordinates corresponding to the key points.
  • the key point detection model divides the face key points into internal key points and contour key points according to the face image.
  • the internal key points include a total of 51 key points for the eyebrows, eyes, nose, and mouth, and the contour key points include 17 key points (the embodiment of the application uses a CNN algorithm for training; other key point values can also be obtained by training with other algorithms).
  • For the 51 internal key points, a four-level cascade network is used for detection.
  • The main function of Level-1 is to obtain the bounding boxes of the facial organs;
  • the output of Level-2 is the predicted positions of the 51 key points, which serves as coarse localization whose purpose is to initialize Level-3;
  • Level-3 performs coarse-to-fine localization separately for each organ;
  • the input of Level-4 is the output of Level-3 rotated by a certain amount, and finally the positions of the 51 key points are output.
  • For the 17 contour key points, only a two-level cascade network is used for detection.
  • Level-1 serves the same purpose as in internal key point detection, mainly obtaining the bounding box of the contour; Level-2 directly predicts the 17 key points without a coarse-to-fine localization stage, because the region covered by the contour key points is larger and adding Level-3 and Level-4 would take more time.
  • The final 68 face key points are obtained by superimposing the outputs of the two cascaded CNNs. The face image is then registered against this reference to obtain the key point coordinates corresponding to the face key points.
  • the third obtaining module 204 is configured to input the face image and the face key points into the depth prediction model to obtain the depth information of the face key points.
  • the depth prediction model has the characteristic of outputting corresponding depth information according to the input picture to be processed, and is obtained by pre-training.
  • the third acquisition module 204 is also used to train the deep prediction network:
  • the sample picture obtained by the depth camera carries depth information.
  • A depth camera typically obtains such pictures by one of three routes: monocular structured light, TOF (time of flight), and binocular vision.
  • The principle of TOF is that a sensor emits modulated near-infrared light, which is reflected after hitting an object, and the distance of the photographed object is computed from the time difference or phase difference between emission and reflection.
  • Structured Light technology is relatively more complicated: encoded gratings or line light sources are projected onto the measured object, and the three-dimensional information of the object is demodulated from the distortions they produce.
  • Binocular vision, like the human eye, uses two ordinary cameras to compute the distance of the measured object from parallax.
  • Each of the sample pictures, together with the multiple pieces of sample depth information, is used as the input of the deep learning network model, and for each sample picture the target depth information with the maximum confidence is output.
  • Specifically, the sample pictures are input to the deep learning network model, and the depth information of each sample picture is input into one of the cells of the first network layer of the deep learning network model, until every face key point has been input into its corresponding cell. The decoder of the deep learning network model is initialized so that the values computed for the previous sample picture do not affect the sample picture currently being tested, and the decoder is set to output a single target parameter for each input sample picture, the parameter being the depth information. Before the target depth information is output, the confidence of each piece of depth information with respect to the sample picture is computed through a softmax function, and the depth information with the highest confidence is taken as the target depth information.
  • Whether the target depth information is the sample depth information corresponding to each sample picture is judged, in preparation for subsequently converging the model.
  • If not, the confidence is recalculated through a loss function so that the target depth information with the maximum confidence becomes the sample depth information.
  • Specifically, if the target depth information is the sample depth information corresponding to the sample picture, the depth prediction model has been trained successfully. If it is not, an L2 loss function is further used to maximize the confidence between each sample picture and its corresponding depth information, so that every sample picture obtains its corresponding depth information and the depth prediction network is obtained.
  • the reconstruction module 206 is configured to reconstruct the three-dimensional face structure of the user to be processed according to the depth information and the key point coordinates.
  • the key point coordinates are transformed into the three-dimensional model according to the depth information, and the correction is performed to obtain the three-dimensional face structure.
  • the reconstruction module 206 is further used for:
  • the depth information and the key point coordinates are input into a three-dimensional model to obtain the rough three-dimensional face structure of the user to be processed.
  • the coarse three-dimensional face structure obtained from the depth information and the key point coordinates has not had its edges processed, so the picture obtained after rotation would not be accurate, and further processing is therefore required.
  • the key point coordinates and depth information are linearly transformed in the three-dimensional model to make the coarse three-dimensional face structure more three-dimensional.
  • the affine transformation maps the key point coordinates and depth information from the original face image to the three-dimensional model to obtain the reconstructed point coordinates.
  • the reconstructed point coordinates are used to correct the coarse three-dimensional face structure to obtain the correspondingly changed three-dimensional face structure of the user to be processed in the three-dimensional model.
  • the affine transformation is to map the face image to the three-dimensional face structure of the three-dimensional model for correction.
  • the calculation module 208 is configured to calculate the target face structure according to the three-dimensional face structure and the preset rotation angle.
  • the preset rotation angle is set according to user needs, for example 10 degrees or 20 degrees, preferably within 30 degrees; in that range the face image obtained after rotating the three-dimensional face structure has high accuracy.
  • The preset rotation angle is then computed through the rotation matrix.
  • calculation module 208 is further used for:
  • the Euler angle of the three-dimensional face structure is determined according to the preset rotation angle.
  • three angles (Euler angles), namely yaw, pitch, and roll, are determined from the preset rotation angle; their values represent the rotation angles of the three-dimensional face structure around the three axes (x, y, z) of the coordinate system.
  • For example, when the preset rotation angle is a rotation of 30 degrees about the x-axis only, the corresponding Euler angles are (30, 0, 0).
  • the Euler angle is calculated according to the rotation matrix to obtain the rotation center of gravity value of the three-dimensional face structure.
  • where the rotation is composed as R = R_z(α)·R_y(β)·R_x(γ); R represents the center value, θ is the preset rotation angle, and R_z(α), R_y(β), and R_x(γ) are the per-axis Euler-angle rotation components, corresponding to R_x(θ), R_y(θ), and R_z(θ) respectively.
  • the rotation by the preset rotation angle is performed about the point corresponding to the center-of-gravity value of the three-dimensional face structure, obtaining the target face structure.
  • the projection module 210 is used to project the target face structure to obtain a target image.
  • the rotated target face structure is two-dimensionally projected to obtain a two-dimensional picture, that is, the target picture.
  • the target image obtained after the picture to be processed is rotated does not change the color of the image.
  • the projection module 210 is further used for:
  • the two-dimensional RGB information of the picture to be processed is acquired, and the two-dimensional RGB information is the gray value.
  • the two-dimensional RGB information is filled into the target face structure to obtain a three-dimensional face image.
  • the two-dimensional RGB information is filled onto the target face structure so that the pixels corresponding to the target face structure are filled in.
  • When filling the two-dimensional RGB information, the corresponding key point coordinates after rotation are found and the pixels are filled in, yielding the three-dimensional face image.
  • the three-dimensional face image is projected into two dimensions and corrected by interpolation to obtain the target image.
  • Specifically, during the projection transformation, when the coordinates of the two-dimensional RGB pixels are transformed to the corresponding points on the target image, the transformed coordinates are fractional; interpolation corrects for this and makes the contours and the light and dark regions of the image show up more clearly.
  • The interpolation methods that can be used include nearest-neighbor interpolation, bilinear interpolation, and cubic polynomial interpolation.
  • the computer device 2 is a device that can automatically perform numerical calculation and/or information processing in accordance with pre-set or stored instructions.
  • the computer device 2 may be a rack server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster composed of multiple servers).
  • the computer device 2 at least includes, but is not limited to, a memory and a processor.
  • the memory stores a computer program that can run on the processor, and when the computer program is executed by the processor, part or all of the steps of the above method are implemented.
  • the computer device may also include a network interface and/or a facial image acquisition system.
  • the computer device may include a memory 21, a processor 22, a network interface 23, and a facial image acquisition system 20.
  • the memory 21, the processor 22, the network interface 23, and the facial image acquisition system 20 may be communicatively connected to one another through a system bus, wherein:
  • the memory 21 includes at least one type of computer-readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory ( RAM), static random access memory (SRAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), programmable read only memory (PROM), magnetic memory, magnetic disks, optical disks, etc.
  • the memory 21 may be an internal storage unit of the computer device 2, for example, the hard disk or memory of the computer device 2.
  • the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card (Flash Card) equipped on the computer device 2.
  • the memory 21 may also include both the internal storage unit of the computer device 2 and its external storage device.
  • the memory 21 is generally used to store an operating system and various application software installed in the computer device 2, for example, the program code of the facial image acquisition system 20 in the second embodiment.
  • the memory 21 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 22 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips.
  • the processor 22 is generally used to control the overall operation of the computer device 2.
  • the processor 22 is used to run the program code or process data stored in the memory 21, for example, to run the face image acquisition system 20, so as to implement the face image acquisition method of the first embodiment.
  • the network interface 23 may include a wireless network interface or a wired network interface, and the network interface 23 is generally used to establish a communication connection between the server 2 and other electronic devices.
  • the network interface 23 is used to connect the server 2 to an external terminal through a network, and to establish a data transmission channel and a communication connection between the server 2 and the external terminal.
  • the network may be Intranet, Internet, Global System of Mobile Communication (GSM), Wideband Code Division Multiple Access (WCDMA), 4G network, 5G Network, Bluetooth (Bluetooth), Wi-Fi and other wireless or wired networks.
  • FIG. 9 only shows the computer device 2 with components 20-23, but it should be understood that it is not required to implement all the components shown, and more or fewer components may be implemented instead.
  • the face image acquisition system 20 stored in the memory 21 can also be divided into one or more program modules, and the one or more program modules are stored in the memory 21 and are One or more processors (the processor 22 in this embodiment) are executed to complete the application.
  • FIG. 8 shows a schematic diagram of the program modules of the second embodiment of the face image acquisition system 20.
  • the face image acquisition system 20 can be divided into a first acquisition module 200, a second acquisition module 202, a third acquisition module 204, a reconstruction module 206, a calculation module 208, and a projection module 210.
  • the program module referred to in the present application refers to a series of computer program instruction segments capable of completing specific functions, and is more suitable than a program to describe the execution process of the facial image acquisition system 20 in the computer device 2.
  • the specific functions of the program modules 200-210 have been described in detail in the second embodiment, and will not be repeated here.
  • This embodiment also provides a computer-readable storage medium, such as flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), only Readable memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, servers, App application malls, etc., on which computer programs are stored, The corresponding function is realized when the program is executed by the processor.
  • the computer-readable storage medium of this embodiment is used to store the facial image acquisition system 20, and when executed by a processor, it implements the facial image acquisition method of the first embodiment.
  • the storage medium involved in this application such as a computer-readable storage medium, may be non-volatile or volatile.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)
  • Processing Or Creating Images (AREA)

Abstract

一种人脸图像的获取方法,包括:获取待处理图片,所述待处理图片包括待处理用户的人脸图像(S100);将所述人脸图像输入至关键点检测模型,以得到包含人脸关键点及所述人脸关键点对应的关键点坐标的人脸图像(S102);将所述待处理图片输入至深度预测模型,以得到所述待处理图片的深度信息(S104);根据所述深度信息与所述关键点坐标重建所述待处理用户的三维人脸结构(S106);根据所述三维人脸结构及预设旋转角度,计算得到目标人脸结构(S108);对所述目标人脸结构进行投影,得到目标图像(S110)。所述方法提高了人脸图像的获取及图片旋转的精确度。

Description

人脸图像的获取方法与系统
本申请要求于2020年3月3日提交中国专利局、申请号为202010141606.2,发明名称为“人脸图像的获取方法与系统”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请实施例涉及图像处理领域,尤其涉及一种人脸图像的获取方法与系统。
背景技术
传统3D人脸重建方法,大多是立足于图像信息,如基于图像亮度、边缘信息、线性透视、颜色、相对高度、视差等等一种或多种信息建模技术进行3D人脸重建。基于模型的3D人脸重建方法,是目前较为流行的3D人脸重建方法;3D模型主要用三角网格或点云来表示,现下流行的模型有通用人脸模型(CANDIDE-3)和三维变形模型(3DMM)及其变种模型,基于它们的3D人脸重建算法既有传统算法也有深度学习算法。
本领域的技术人员应知晓，三维形变模型本质上采用主成分分析方法构建统计模型，而主成分分析方法本质上是一种低通滤波。因而，这类方法在恢复人脸的细节特征方面效果仍然不理想。更具体地说，发明人意识到，例如，为了呈现人脸复杂的表情，数不胜数的微小褶皱和皱纹以及色彩和条纹的微小的变化皆不可忽略，而三维形变模型采用低通滤波的方法，无法精确捕捉并还原微小的细节，导致人脸表情的呈现能力相对较弱。且如果要使用三维形变模型对建立的三维人脸进行旋转，得到的效果也不理想，精确度不够。
发明内容
有鉴于此,本申请实施例的目的是提供一种人脸图像的获取方法与系统,提高了人脸图像的获取及图片旋转的精确度。
为实现上述目的,本申请实施例提供了一种人脸图像的获取方法,包括:
获取待处理图片,所述待处理图片包括待处理用户的人脸图像;
将所述人脸图像输入至关键点检测模型,以得到包含人脸关键点及所述人脸关键点对应的关键点坐标的人脸图像;
将所述待处理图片输入至深度预测模型,以得到所述待处理图片的深度信息;
根据所述深度信息与所述关键点坐标重建所述待处理用户的三维人脸结构;
根据所述三维人脸结构及预设旋转角度,计算得到目标人脸结构;
对所述目标人脸结构进行投影,得到目标图像。
为实现上述目的，本申请实施例还提供了一种人脸图像的获取系统，包括：
第一获取模块,用于获取待处理图片,所述待处理图片包括待处理用户的人脸图像;
第二获取模块,用于将所述人脸图像输入至关键点检测模型,以得到人脸关键点及所述关键点对应的关键点坐标;
第三获取模块,用于将所述人脸图像与人脸关键点输入至深度预测模型,以获取所述人脸关键点的深度信息;
重建模块,用于根据所述深度信息与所述关键点坐标重建所述待处理用户的三维人脸结构;
计算模块,用于根据所述三维人脸结构及预设旋转角度,计算得到目标人脸结构;
投影模块,用于对所述目标人脸结构进行投影,得到目标图像。
为实现上述目的，本申请实施例还提供了一种计算机设备，所述计算机设备包括存储器、处理器，所述存储器上存储有可在所述处理器上运行的计算机程序，所述计算机程序被所述处理器执行时实现以下方法：
获取待处理图片,所述待处理图片包括待处理用户的人脸图像;
将所述人脸图像输入至关键点检测模型,以得到包含人脸关键点及所述人脸关键点对 应的关键点坐标的人脸图像;
将所述待处理图片输入至深度预测模型,以得到所述待处理图片的深度信息;
根据所述深度信息与所述关键点坐标重建所述待处理用户的三维人脸结构;
根据所述三维人脸结构及预设旋转角度,计算得到目标人脸结构;
对所述目标人脸结构进行投影,得到目标图像。
为实现上述目的，本申请实施例还提供了一种计算机可读存储介质，所述计算机可读存储介质内存储有计算机程序，所述计算机程序可被至少一个处理器所执行，以使所述至少一个处理器执行以下方法：
获取待处理图片,所述待处理图片包括待处理用户的人脸图像;
将所述人脸图像输入至关键点检测模型,以得到包含人脸关键点及所述人脸关键点对应的关键点坐标的人脸图像;
将所述待处理图片输入至深度预测模型,以得到所述待处理图片的深度信息;
根据所述深度信息与所述关键点坐标重建所述待处理用户的三维人脸结构;
根据所述三维人脸结构及预设旋转角度,计算得到目标人脸结构;
对所述目标人脸结构进行投影,得到目标图像。
本申请通过深度预测模型得到二维图片的深度信息,再根据深度信息及关键点坐标重建三维人脸结构,根据预设旋转角度将三维人脸结构进行旋转,得到目标图片,提高了人脸图像的获取及图片旋转的精确度。
附图说明
图1为本申请人脸图像的获取方法实施例一的流程图。
图2为本申请实施例一中训练所述深度预测网络的流程图。
图3为本申请实施例一中步骤S106的流程图。
图4为本申请实施例一中步骤S106A的流程图。
图5为本申请实施例一中步骤S106B的流程图。
图6为本申请实施例一中步骤S108的流程图。
图7为本申请实施例一中步骤S110的流程图。
图8为本申请人脸图像的获取系统实施例二的程序模块示意图。
图9为本申请计算机设备实施例三的硬件结构示意图。
具体实施方式
为了使本申请的目的、技术方案及优点更加清楚明白,以下结合附图及实施例,对本申请进行进一步详细说明。应当理解,此处所描述的具体实施例仅用以解释本申请,并不用于限定本申请。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
本申请的技术方案可应用于人工智能、智慧城市、区块链和/或大数据技术领域,如可涉及深度学习技术。可选的,本申请涉及的数据如人脸图像、样本信息和/或人脸结构等可存储于数据库中,或者可以存储于区块链中,如通过区块链分布式存储,本申请不做限定。
实施例一
参阅图1,示出了本申请实施例一之人脸图像的获取方法的步骤流程图。可以理解,本方法实施例中的流程图不用于对执行步骤的顺序进行限定。下面以服务器为执行主体进行示例性描述。具体如下。
步骤S100,获取待处理图片,所述待处理图片包括待处理用户的人脸图像。
具体地,获取待处理用户通过相机或者手机照相软件拍摄的待处理图片,待处理图片包括有待处理用户的人脸正面图像。
步骤S102,将所述人脸图像输入至关键点检测模型,以得到包含人脸关键点及所述人 脸关键点对应的关键点坐标的人脸图像。
具体地,关键点检测模型根据人脸图像,将人脸关键点分为内部关键点和轮廓关键点,内部关键点包含眉毛、眼睛、鼻子、嘴巴共计51个关键点,轮廓关键点包含17个关键点(本申请实施例采用CNN算法进行训练,也可使用其他算法训练得到其他的关键点值)。根针对内部51个关键点,采用四个层级的级联网络进行检测。其中,Level-1主要作用是获得面部器官的边界框;Level-2的输出是51个关键点预测位置,这里起到一个粗定位作用,目的是为了给Level-3进行初始化;Level-3会依据不同器官进行从粗到精的定位;Level-4的输入是将Level-3的输出进行一定的旋转,最终将51个关键点的位置进行输出。针对外部17个关键点,仅采用两个层级的级联网络进行检测。Level-1与内部关键点检测的作用一样,主要是获得轮廓的bounding box;Level-2直接预测17个关键点,没有从粗到精定位的过程,因为轮廓关键点的区域较大,若加上Level-3和Level-4,会比较耗时间。最终面部68个关键点由两个级联CNN的输出进行叠加得到。将人脸图像进行基准定位,得到人脸关键点对应的关键点坐标。
步骤S104,将所述待处理图片输入至深度预测模型,以得到所述待处理图片的深度信息。
具体地,深度预测模型具有根据输入的待处理图片输出对应的深度信息的特性,预先进行训练得到。
示例性地,参阅图2,训练所述深度预测网络包括:
步骤S104A,通过深度相机获取多个样本用户的样本深度信息以及样本图片。
具体地,通过深度相机获取的样本图片带有深度信息,深度相机的深度摄像头有三个路线获取样本图片:单目结构光、TOF(飞行时间)和双目视觉。TOF原理是传感器发出经调制的近红外光,遇物体后反射,通过计算光线发射和反射时间差或相位差来换算被拍摄物体的距离。结构光(Structured Light)技术则要相对复杂一些,该技术将编码的光栅或线光源等投射到被测物上,根据它们产生的畸变来解调出被测物的三维信息。双目视觉则是和人眼一样用两个普通摄像头以视差的方式来计算被测物距离。
步骤S104B,将各个所述样本图片与多个样本深度信息作为深度学习网络模型的输入,输出各个所述样本图片的置性度最大对应的目标深度信息。
具体地,将样本图片输入到深度学习网络模型,每个样本图片的深度信息输入到深度学习网络模型的第一网络层的其中一个细胞中,直至将每个人脸关键点一一对应地输入到细胞中;对深度学习网络模型的解码器进行初始化处理,以使前一幅样本图片的计算值不影响当前进行测试的样本图片;将深度学习网络模型的解码器的设置为对每个输入的样本图片输出一个单一的目标参数,参数为深度信息;在输出目标深度信息之前,经过softmax函数计算各个深度信息与样本图片对应的置信度,置信度最大的即为目标深度信息。
步骤S104C,判断所述目标深度信息是否为各个所述样本图片对应的样本深度信息。
具体地,对目标深度信息是否为各个样本图片对应的样本深度信息进行判断,为后续收敛模型做准备。
步骤S104D,若是,则表示所述深度预测模型训练成功;若否,则通过损失函数重新计算置信度,以使置性度最大对应的目标深度信息为样本深度信息。
具体地,若目标深度信息是样本图片对应的样本深度信息,则表示深度预测模型训练成功。若目标深度信息不是样本图片对应的样本深度信息,则进一步使用L2损失函数将样本图片与对应的深度信息的置信度最大化,以使每个样本图片得到其对应的深度信息,得到深度预测网络。
步骤S106,根据所述深度信息与所述关键点坐标重建所述待处理用户的三维人脸结构。
具体地,根据深度信息将关键点坐标转化到三维模型中,并进行修正以得到三维人脸 结构。
示例性地,参阅图3,步骤S106进一步包括:
步骤S106A,将所述深度信息与所述关键点坐标输入至三维模型,得到所述待处理用户的粗三维人脸结构。
具体地,根据深度信息与关键点坐标得到的粗三维人脸结构,并没有对边缘进行处理,在旋转时,得到的图片就不会准确,因此需要进一步处理。
示例性地,参阅图4,步骤S106A进一步包括:
步骤S106A1,根据所述关键点坐标在所述三维模型中建立二维人脸模型。
具体地,将关键点坐标输入至三维模型中,坐标对应三维模型的x平面与y平面,建立二维人脸模型。
步骤S106A2,将所述深度信息输入至所述三维模型,以根据所述二维人脸模型及所述深度信息得到所述粗三维人脸结构。
具体地,将深度信息作为三维模型的z平面,对二维人脸模型进行处理,得到粗三维人脸结构。即将关键点坐标的二维坐标,加上深度信息,转化为三维坐标,在三维模型中进行显示。
步骤S106B,根据所述三维模型对所述关键点坐标与所述深度信息进行仿射变换,得到重建点坐标。
具体地,将关键点坐标与深度信息在三维模型中,进行线性变换,使得粗三维人脸结构更加立体。仿射变换将关键点坐标与深度信息,由原来的人脸图像,映射到三维模型上,得到重建点坐标。
示例性地,参阅图5,步骤S106B进一步包括:
步骤S106B1,确定所述关键点坐标的顶点坐标。
具体地，顶点坐标为关键点坐标的最大值，顶点坐标用于将人脸关键点做仿射变换时，使得人脸图像基于该顶点进行变换，得到重建点坐标，仿射变换可以将关键点坐标进行多角度变换。
步骤S106B2,基于所述顶点坐标,对所述关键点坐标进行仿射变换,以得所述关键点坐标对应的重建点坐标。
具体地,基于顶点坐标,对每个关键点坐标进行仿射变换,将其映射到三维模型中,得到关键点坐标对应的重建点坐标。仿射变换为现有技术,在此不做赘述。
步骤S106B3,对所述深度信息进行仿射变换,以得所述深度信息对应的重建点坐标。
具体地,将深度信息映射为Z轴上的向量,并且对该向量进行仿射变换,得到三维模型中的深度信息对应的重建点坐标。
步骤S106C,将所述重建点坐标输入至三维模型中,以修正所述粗三维人脸结构得到所述待处理用户的三维人脸结构。
具体地,重建点坐标对粗三维人脸结构进行修正,得到三维模型中待处理用户对应变化的三维人脸结构。仿射变换即将人脸图像映射到三维模型的三维人脸结构进行修正。
步骤S108,根据所述三维人脸结构及预设旋转角度,计算得到目标人脸结构。
具体地,预设旋转角度为根据用户需求进行设置的,例如10度、20度等角度,优选为30度以内,此时三维人脸结构旋转后得到的人脸图像的精确度高,再通过旋转矩阵对预设旋转角度进行计算。
示例性地,参阅图6,步骤S108进一步包括:
步骤S108A,根据所述预设旋转角度确定所述三维人脸结构的欧拉角。
具体地,根据预设旋转角度确定yaw,pitch,roll三个角度(欧拉角),对应的值分别代表三维人脸结构绕坐标系三个轴(x,y,z轴)的旋转角度,当预设旋转角度是只在x轴 旋转30度时,对应的欧拉角为(30,0,0)。
步骤S108B,根据旋转矩阵计算所述欧拉角,以得到所述三维人脸结构的旋转重心值。
具体地,通过以下公式进行计算,得到三维人脸结构进行旋转的重心值:
$$R_x(\theta)=\begin{pmatrix}1&0&0\\0&\cos\theta&-\sin\theta\\0&\sin\theta&\cos\theta\end{pmatrix},\qquad R_y(\theta)=\begin{pmatrix}\cos\theta&0&\sin\theta\\0&1&0\\-\sin\theta&0&\cos\theta\end{pmatrix},\qquad R_z(\theta)=\begin{pmatrix}\cos\theta&-\sin\theta&0\\\sin\theta&\cos\theta&0\\0&0&1\end{pmatrix}$$
$$R=R_z(\alpha)\,R_y(\beta)\,R_x(\gamma)$$
其中，R表示中心值，θ为预设旋转角度，R_z(α)、R_y(β)、R_x(γ)分别表示欧拉角度的值，与R_x(θ)、R_y(θ)、R_z(θ)对应。
步骤S108C,将所述三维人脸结构以所述重心值旋转所述预设旋转角度,得到目标人脸结构。
具体地,根据三维人脸结构的重心值对应的点,进行预设旋转角度的旋转,得到目标人脸结构。
步骤S110,对所述目标人脸结构进行投影,得到目标图像。
具体地,将旋转后的目标人脸结构进行二维投影,得到二维图片,即目标图片,待处理图片经旋转后得到的目标图像,不改变图像的色彩。
示例性地,参阅图7,步骤S110进一步包括:
步骤S110A,获取所述待处理图片的二维RGB信息。
具体地,获取待处理图片的二维RGB信息,二维RGB信息即灰度值。
步骤S110B,将所述二维RGB信息填充至所述目标人脸结构,得到三维人脸图像。
具体地,将二维RGB信息填充至目标人脸结构上,将目标人脸结构对应的像素填充好,二维RGB信息进行填充时,找到旋转后对应的关键点坐标,将像素进行填充,得到三维人脸图像。
步骤S110C,将所述三维人脸图像进行二维投影,并通过差值运算矫正,得到所述目标图像。
具体地,做投影转换时,对二维RGB信息的像素进行坐标变换到目标图像上对应的点时,变换出来的对应的坐标是一个小数,通过差值运算进行矫正,图像的轮廓及明暗区域更明显的显示出来。可以采用的差值运算方法为最近邻插值法、双线性插值法与三次多项式插值法。
实施例二
请继续参阅图8,示出了本申请人脸图像的获取系统实施例二的程序模块示意图。在本实施例中,人脸图像的获取系统20可以包括或被分割成一个或多个程序模块,一个或者多个程序模块被存储于存储介质中,并由一个或多个处理器所执行,以完成本申请,并可实现上述人脸图像的获取方法。本申请实施例所称的程序模块是指能够完成特定功能的一系列计算机程序指令段,比程序本身更适合于描述人脸图像的获取系统20在存储介质中的 执行过程。以下描述将具体介绍本实施例各程序模块的功能:
第一获取模块200,用于获取待处理图片,所述待处理图片包括待处理用户的人脸图像。
具体地,获取待处理用户通过相机或者手机照相软件拍摄的待处理图片,待处理图片包括有待处理用户的人脸正面图像。
第二获取模块202,用于将所述人脸图像输入至关键点检测模型,以得到人脸关键点及所述关键点对应的关键点坐标。
具体地,关键点检测模型根据人脸图像,将人脸关键点分为内部关键点和轮廓关键点,内部关键点包含眉毛、眼睛、鼻子、嘴巴共计51个关键点,轮廓关键点包含17个关键点(本申请实施例采用CNN算法进行训练,也可使用其他算法训练得到其他的关键点值)。根针对内部51个关键点,采用四个层级的级联网络进行检测。其中,Level-1主要作用是获得面部器官的边界框;Level-2的输出是51个关键点预测位置,这里起到一个粗定位作用,目的是为了给Level-3进行初始化;Level-3会依据不同器官进行从粗到精的定位;Level-4的输入是将Level-3的输出进行一定的旋转,最终将51个关键点的位置进行输出。针对外部17个关键点,仅采用两个层级的级联网络进行检测。Level-1与内部关键点检测的作用一样,主要是获得轮廓的bounding box;Level-2直接预测17个关键点,没有从粗到精定位的过程,因为轮廓关键点的区域较大,若加上Level-3和Level-4,会比较耗时间。最终面部68个关键点由两个级联CNN的输出进行叠加得到。将人脸图像进行基准定位,得到人脸关键点对应的关键点坐标。
第三获取模块204,用于将所述人脸图像与人脸关键点输入至深度预测模型,以获取所述人脸关键点的深度信息。
具体地,深度预测模型具有根据输入的待处理图片输出对应的深度信息的特性,预先进行训练得到。
示例性地,第三获取模块204还用于训练所述深度预测网络:
通过深度相机获取多个样本用户的样本深度信息以及样本图片。
具体地,通过深度相机获取的样本图片带有深度信息,深度相机的深度摄像头有三个路线获取样本图片:单目结构光、TOF(飞行时间)和双目视觉。TOF原理是传感器发出经调制的近红外光,遇物体后反射,通过计算光线发射和反射时间差或相位差来换算被拍摄物体的距离。结构光(Structured Light)技术则要相对复杂一些,该技术将编码的光栅或线光源等投射到被测物上,根据它们产生的畸变来解调出被测物的三维信息。双目视觉则是和人眼一样用两个普通摄像头以视差的方式来计算被测物距离。
将各个所述样本图片与多个样本深度信息作为深度学习网络模型的输入,输出各个所述样本图片的置性度最大对应的目标深度信息。
具体地,将样本图片输入到深度学习网络模型,每个样本图片的深度信息输入到深度学习网络模型的第一网络层的其中一个细胞中,直至将每个人脸关键点一一对应地输入到细胞中;对深度学习网络模型的解码器进行初始化处理,以使前一幅样本图片的计算值不影响当前进行测试的样本图片;将深度学习网络模型的解码器的设置为对每个输入的样本图片输出一个单一的目标参数,参数为深度信息;在输出目标深度信息之前,经过softmax函数计算各个深度信息与样本图片对应的置信度,置信度最大的即为目标深度信息。
判断所述目标深度信息是否为各个所述样本图片对应的样本深度信息。
具体地,对目标深度信息是否为各个样本图片对应的样本深度信息进行判断,为后续收敛模型做准备。
若是,则表示所述深度预测模型训练成功;若否,则通过损失函数重新计算置信度,以使置性度最大对应的目标深度信息为样本深度信息。
具体地,若目标深度信息是样本图片对应的样本深度信息,则表示深度预测模型训练成功。若目标深度信息不是样本图片对应的样本深度信息,则进一步使用L2损失函数将样本图片与对应的深度信息的置信度最大化,以使每个样本图片得到其对应的深度信息,得到深度预测网络。
重建模块206,用于根据所述深度信息与所述关键点坐标重建所述待处理用户的三维人脸结构。
具体地,根据深度信息将关键点坐标转化到三维模型中,并进行修正以得到三维人脸结构。
示例性地,所述重建模块206还用于:
将所述深度信息与所述关键点坐标输入至三维模型,得到所述待处理用户的粗三维人脸结构。
具体地,根据深度信息与关键点坐标得到的粗三维人脸结构,并没有对边缘进行处理,在旋转时,得到的图片就不会准确,因此需要进一步处理。
根据所述三维模型对所述关键点坐标与所述深度信息进行仿射变换,得到重建点坐标。
具体地,将关键点坐标与深度信息在三维模型中,进行线性变换,使得粗三维人脸结构更加立体。仿射变换将关键点坐标与深度信息,由原来的人脸图像,映射到三维模型上,得到重建点坐标。
将所述重建点坐标输入至三维模型中,以修正所述粗三维人脸结构得到所述待处理用户的三维人脸结构。
具体地,重建点坐标对粗三维人脸结构进行修正,得到三维模型中待处理用户对应变化的三维人脸结构。仿射变换即将人脸图像映射到三维模型的三维人脸结构进行修正。
计算模块208,用于根据所述三维人脸结构及预设旋转角度,计算得到目标人脸结构。
具体地,预设旋转角度为根据用户需求进行设置的,例如10度、20度等角度,优选为30度以内,此时三维人脸结构旋转后得到的人脸图像的精确度高,再通过旋转矩阵对预设旋转角度进行计算。
示例性地,所述计算模块208还用于:
根据所述预设旋转角度确定所述三维人脸结构的欧拉角。
具体地,根据预设旋转角度确定yaw,pitch,roll三个角度(欧拉角),对应的值分别代表三维人脸结构绕坐标系三个轴(x,y,z轴)的旋转角度,当预设旋转角度是只在x轴旋转30度时,对应的欧拉角为(30,0,0)。
根据旋转矩阵计算所述欧拉角,以得到所述三维人脸结构的旋转重心值。
具体地,通过以下公式进行计算,得到三维人脸结构进行旋转的重心值:
$$R_x(\theta)=\begin{pmatrix}1&0&0\\0&\cos\theta&-\sin\theta\\0&\sin\theta&\cos\theta\end{pmatrix},\qquad R_y(\theta)=\begin{pmatrix}\cos\theta&0&\sin\theta\\0&1&0\\-\sin\theta&0&\cos\theta\end{pmatrix},\qquad R_z(\theta)=\begin{pmatrix}\cos\theta&-\sin\theta&0\\\sin\theta&\cos\theta&0\\0&0&1\end{pmatrix}$$
$$R=R_z(\alpha)\,R_y(\beta)\,R_x(\gamma)$$
其中，R表示中心值，θ为预设旋转角度，R_z(α)、R_y(β)、R_x(γ)分别表示欧拉角度的值，与R_x(θ)、R_y(θ)、R_z(θ)对应。
将所述三维人脸结构以所述重心值旋转所述预设旋转角度,得到目标人脸结构。
具体地,根据三维人脸结构的重心值对应的点,进行预设旋转角度的旋转,得到目标人脸结构。
投影模块210,用于对所述目标人脸结构进行投影,得到目标图像。
具体地,将旋转后的目标人脸结构进行二维投影,得到二维图片,即目标图片,待处理图片经旋转后得到的目标图像,不改变图像的色彩。
示例性地,所述投影模块210还用于:
获取所述待处理图片的二维RGB信息。
具体地,获取待处理图片的二维RGB信息,二维RGB信息即灰度值。
将所述二维RGB信息填充至所述目标人脸结构,得到三维人脸图像。
具体地,将二维RGB信息填充至目标人脸结构上,将目标人脸结构对应的像素填充好,二维RGB信息进行填充时,找到旋转后对应的关键点坐标,将像素进行填充,得到三维人脸图像。
将所述三维人脸图像进行二维投影,并通过差值运算矫正,得到所述目标图像。
具体地,做投影转换时,对二维RGB信息的像素进行坐标变换到目标图像上对应的点时,变换出来的对应的坐标是一个小数,通过差值运算进行矫正,图像的轮廓及明暗区域更明显的显示出来。可以采用的差值运算方法为最近邻插值法、双线性插值法与三次多项式插值法。
实施例三
参阅图9,是本申请实施例三之计算机设备的硬件架构示意图。本实施例中,所述计算机设备2是一种能够按照事先设定或者存储的指令,自动进行数值计算和/或信息处理的设备。该计算机设备2可以是机架式服务器、刀片式服务器、塔式服务器或机柜式服务器(包括独立的服务器,或者多个服务器所组成的服务器集群)等。如图9所示,所述计算机设备2至少包括,但不限于,存储器、处理器,所述存储器上存储有可在所述处理器上运行的计算机程序,所述计算机程序被所述处理器执行时实现上述方法中的部分或全部步骤。可选的,该计算机设备还可包括网络接口和/或人脸图像的获取系统。例如,该计算机设备可包括存储器21、处理器22、网络接口23以及人脸图像的获取系统20,如可通过系统总线相互通信连接存储器21、处理器22、网络接口23、以及人脸图像的获取系统20。其中:
本实施例中,存储器21至少包括一种类型的计算机可读存储介质,所述可读存储介质包括闪存、硬盘、多媒体卡、卡型存储器(例如,SD或DX存储器等)、随机访问存储器(RAM)、静态随机访问存储器(SRAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、可编程只读存储器(PROM)、磁性存储器、磁盘、光盘等。在一些实施例中,存储器21可以是计算机设备2的内部存储单元,例如该计算机设备2的硬盘或内存。在另一些实施例中,存储器21也可以是计算机设备2的外部存储设备,例如该计算机设备2上配备的插接式硬盘,智能存储卡(Smart Media Card,SMC),安全数字(Secure Digital,SD)卡,闪存卡(Flash Card)等。当然,存储器21还可以既包括计算机设备2的内部存储单元也包括其外部存储设备。本实施例中,存储器21通常用于存储安装于计算机设备2的操作系统和各类应用软件,例如实施例二的人脸图像的获取系统20的程序代码等。此外,存储器21还可以用于暂时地存储已经输出或者将要输出的各类数据。
处理器22在一些实施例中可以是中央处理器(Central Processing Unit,CPU)、控制器、微控制器、微处理器、或其他数据处理芯片。该处理器22通常用于控制计算机设备2的总 体操作。本实施例中,处理器22用于运行存储器21中存储的程序代码或者处理数据,例如运行人脸图像的获取系统20,以实现实施例一的人脸图像的获取方法。
所述网络接口23可包括无线网络接口或有线网络接口,该网络接口23通常用于在所述服务器2与其他电子装置之间建立通信连接。例如,所述网络接口23用于通过网络将所述服务器2与外部终端相连,在所述服务器2与外部终端之间的建立数据传输通道和通信连接等。所述网络可以是企业内部网(Intranet)、互联网(Internet)、全球移动通讯系统(Global System of Mobile communication,GSM)、宽带码分多址(Wideband Code Division Multiple Access,WCDMA)、4G网络、5G网络、蓝牙(Bluetooth)、Wi-Fi等无线或有线网络。
需要指出的是,图9仅示出了具有部件20-23的计算机设备2,但是应理解的是,并不要求实施所有示出的部件,可以替代的实施更多或者更少的部件。
在本实施例中,存储于存储器21中的所述人脸图像的获取系统20还可以被分割为一个或者多个程序模块,所述一个或者多个程序模块被存储于存储器21中,并由一个或多个处理器(本实施例为处理器22)所执行,以完成本申请。
例如,图8示出了所述实现人脸图像的获取系统20实施例二的程序模块示意图,该实施例中,所述人脸图像的获取系统20可以被划分为第一获取模块200、第二获取模块202、第三获取模块204、重建模块206、计算模块208以及投影模块210。其中,本申请所称的程序模块是指能够完成特定功能的一系列计算机程序指令段,比程序更适合于描述所述人脸图像的获取系统20在所述计算机设备2中的执行过程。所述程序模块200-210的具体功能在实施例二中已有详细描述,在此不再赘述。
实施例四
本实施例还提供一种计算机可读存储介质,如闪存、硬盘、多媒体卡、卡型存储器(例如,SD或DX存储器等)、随机访问存储器(RAM)、静态随机访问存储器(SRAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、可编程只读存储器(PROM)、磁性存储器、磁盘、光盘、服务器、App应用商城等等,其上存储有计算机程序,程序被处理器执行时实现相应功能。本实施例的计算机可读存储介质用于存储人脸图像的获取系统20,被处理器执行时实现实施例一的人脸图像的获取方法。
可选的,本申请涉及的存储介质如计算机可读存储介质可以是非易失性的,也可以是易失性的。
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到上述实施例方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。
以上仅为本申请的优选实施例,并非因此限制本申请的专利范围,凡是利用本申请说明书及附图内容所作的等效结构或等效流程变换,或直接或间接运用在其他相关的技术领域,均同理包括在本申请的专利保护范围内。

Claims (20)

  1. 一种人脸图像的获取方法,其中,包括:
    获取待处理图片,所述待处理图片包括待处理用户的人脸图像;
    将所述人脸图像输入至关键点检测模型,以得到包含人脸关键点及所述人脸关键点对应的关键点坐标的人脸图像;
    将所述待处理图片输入至深度预测模型,以得到所述待处理图片的深度信息;
    根据所述深度信息与所述关键点坐标重建所述待处理用户的三维人脸结构;
    根据所述三维人脸结构及预设旋转角度,计算得到目标人脸结构;
    对所述目标人脸结构进行投影,得到目标图像。
  2. 根据权利要求1所述的方法,其中,训练所述深度预测网络包括:
    通过深度相机获取多个样本用户的样本深度信息以及样本图片;
    将各个所述样本图片与多个样本深度信息作为深度学习网络模型的输入,输出各个所述样本图片的置性度最大对应的目标深度信息;
    判断所述目标深度信息是否为各个所述样本图片对应的样本深度信息;
    若是,则表示所述深度预测模型训练成功;若否,则通过损失函数重新计算置信度,以使置性度最大对应的目标深度信息为样本深度信息。
  3. 根据权利要求1所述的方法,其中,根据所述深度信息与所述关键点坐标重建所述待处理用户的三维人脸结构包括:
    将所述深度信息与所述关键点坐标输入至三维模型,得到所述待处理用户的粗三维人脸结构;
    根据所述三维模型对所述关键点坐标与所述深度信息进行仿射变换,得到重建点坐标;
    将所述重建点坐标输入至三维模型中,以修正所述粗三维人脸结构得到所述待处理用户的三维人脸结构。
  4. 根据权利要求3所述的方法,其中,将所述深度信息与所述关键点坐标输入至三维模型,得到所述待处理用户的粗三维人脸结构包括:
    根据所述关键点坐标在所述三维模型中建立二维人脸模型;
    将所述深度信息输入至所述三维模型,以根据所述二维人脸模型及所述深度信息得到所述粗三维人脸结构。
  5. 根据权利要求3所述的方法,其中,根据所述三维模型对所述关键点坐标与所述深度信息进行仿射变换,得到重建点坐标包括:
    确定所述关键点坐标的顶点坐标;
    基于所述顶点坐标,对所述关键点坐标进行仿射变换,以得所述关键点坐标对应的重建点坐标;
    对所述深度信息进行仿射变换,以得所述深度信息对应的重建点坐标。
  6. 根据权利要求1所述的方法,其中,根据所述三维人脸结构及预设旋转角度,计算得到目标人脸结构包括:
    根据所述预设旋转角度确定所述三维人脸结构的欧拉角;
    根据旋转矩阵计算所述欧拉角,以得到所述三维人脸结构的旋转重心值;
    将所述三维人脸结构以所述重心值旋转所述预设旋转角度,得到目标人脸结构。
  7. 根据权利要求1所述的方法,其中,对所述目标人脸结构进行投影,得到目标图像包括:
    获取所述待处理图片的二维RGB信息;
    将所述二维RGB信息填充至所述目标人脸结构,得到三维人脸图像;
    将所述三维人脸图像进行二维投影,并通过差值运算矫正,得到所述目标图像。
  8. 一种人脸图像的获取系统,其中,包括:
    第一获取模块,用于获取待处理图片,所述待处理图片包括待处理用户的人脸图像;
    第二获取模块,用于将所述人脸图像输入至关键点检测模型,以得到人脸关键点及所述关键点对应的关键点坐标;
    第三获取模块,用于将所述人脸图像与人脸关键点输入至深度预测模型,以获取所述人脸关键点的深度信息;
    重建模块,用于根据所述深度信息与所述关键点坐标重建所述待处理用户的三维人脸结构;
    计算模块,用于根据所述三维人脸结构及预设旋转角度,计算得到目标人脸结构;
    投影模块,用于对所述目标人脸结构进行投影,得到目标图像。
  9. 一种计算机设备,其中,所述计算机设备包括存储器、处理器,所述存储器上存储有可在所述处理器上运行的计算机程序,所述计算机程序被所述处理器执行时实现以下方法:
    获取待处理图片,所述待处理图片包括待处理用户的人脸图像;
    将所述人脸图像输入至关键点检测模型,以得到包含人脸关键点及所述人脸关键点对应的关键点坐标的人脸图像;
    将所述待处理图片输入至深度预测模型,以得到所述待处理图片的深度信息;
    根据所述深度信息与所述关键点坐标重建所述待处理用户的三维人脸结构;
    根据所述三维人脸结构及预设旋转角度,计算得到目标人脸结构;
    对所述目标人脸结构进行投影,得到目标图像。
  10. 根据权利要求9所述的计算机设备,其中,所述计算机程序被所述处理器执行时还用于实现:训练所述深度预测网络;其中,训练所述深度预测网络包括:
    通过深度相机获取多个样本用户的样本深度信息以及样本图片;
    将各个所述样本图片与多个样本深度信息作为深度学习网络模型的输入,输出各个所述样本图片的置性度最大对应的目标深度信息;
    判断所述目标深度信息是否为各个所述样本图片对应的样本深度信息;
    若是,则表示所述深度预测模型训练成功;若否,则通过损失函数重新计算置信度,以使置性度最大对应的目标深度信息为样本深度信息。
  11. 根据权利要求9所述的计算机设备,其中,根据所述深度信息与所述关键点坐标重建所述待处理用户的三维人脸结构时,具体实现:
    将所述深度信息与所述关键点坐标输入至三维模型,得到所述待处理用户的粗三维人脸结构;
    根据所述三维模型对所述关键点坐标与所述深度信息进行仿射变换,得到重建点坐标;
    将所述重建点坐标输入至三维模型中,以修正所述粗三维人脸结构得到所述待处理用户的三维人脸结构。
  12. 根据权利要求11所述的计算机设备,其中,将所述深度信息与所述关键点坐标输入至三维模型,得到所述待处理用户的粗三维人脸结构时,具体实现:
    根据所述关键点坐标在所述三维模型中建立二维人脸模型;
    将所述深度信息输入至所述三维模型,以根据所述二维人脸模型及所述深度信息得到所述粗三维人脸结构。
  13. 根据权利要求9所述的计算机设备,其中,根据所述三维人脸结构及预设旋转角度,计算得到目标人脸结构时,具体实现:
    根据所述预设旋转角度确定所述三维人脸结构的欧拉角;
    根据旋转矩阵计算所述欧拉角,以得到所述三维人脸结构的旋转重心值;
    将所述三维人脸结构以所述重心值旋转所述预设旋转角度,得到目标人脸结构。
  14. 根据权利要求9所述的计算机设备,其中,对所述目标人脸结构进行投影,得到目标图像时,具体实现:
    获取所述待处理图片的二维RGB信息;
    将所述二维RGB信息填充至所述目标人脸结构,得到三维人脸图像;
    将所述三维人脸图像进行二维投影,并通过差值运算矫正,得到所述目标图像。
  15. 一种计算机可读存储介质,其中,所述计算机可读存储介质内存储有计算机程序,所述计算机程序可被至少一个处理器所执行,以使所述至少一个处理器执行以下方法:
    获取待处理图片,所述待处理图片包括待处理用户的人脸图像;
    将所述人脸图像输入至关键点检测模型,以得到包含人脸关键点及所述人脸关键点对应的关键点坐标的人脸图像;
    将所述待处理图片输入至深度预测模型,以得到所述待处理图片的深度信息;
    根据所述深度信息与所述关键点坐标重建所述待处理用户的三维人脸结构;
    根据所述三维人脸结构及预设旋转角度,计算得到目标人脸结构;
    对所述目标人脸结构进行投影,得到目标图像。
  16. 根据权利要求15所述的计算机可读存储介质,其中,所述计算机程序被至少一个处理器所执行时还用于:训练所述深度预测网络;其中,训练所述深度预测网络包括:
    通过深度相机获取多个样本用户的样本深度信息以及样本图片;
    将各个所述样本图片与多个样本深度信息作为深度学习网络模型的输入,输出各个所述样本图片的置性度最大对应的目标深度信息;
    判断所述目标深度信息是否为各个所述样本图片对应的样本深度信息;
    若是,则表示所述深度预测模型训练成功;若否,则通过损失函数重新计算置信度,以使置性度最大对应的目标深度信息为样本深度信息。
  17. 根据权利要求15所述的计算机可读存储介质,其中,根据所述深度信息与所述关键点坐标重建所述待处理用户的三维人脸结构时,具体执行:
    将所述深度信息与所述关键点坐标输入至三维模型,得到所述待处理用户的粗三维人脸结构;
    根据所述三维模型对所述关键点坐标与所述深度信息进行仿射变换,得到重建点坐标;
    将所述重建点坐标输入至三维模型中,以修正所述粗三维人脸结构得到所述待处理用户的三维人脸结构。
  18. 根据权利要求17所述的计算机可读存储介质,其中,将所述深度信息与所述关键点坐标输入至三维模型,得到所述待处理用户的粗三维人脸结构时,具体执行:
    根据所述关键点坐标在所述三维模型中建立二维人脸模型;
    将所述深度信息输入至所述三维模型,以根据所述二维人脸模型及所述深度信息得到所述粗三维人脸结构。
  19. 根据权利要求15所述的计算机可读存储介质,其中,根据所述三维人脸结构及预设旋转角度,计算得到目标人脸结构时,具体执行:
    根据所述预设旋转角度确定所述三维人脸结构的欧拉角;
    根据旋转矩阵计算所述欧拉角,以得到所述三维人脸结构的旋转重心值;
    将所述三维人脸结构以所述重心值旋转所述预设旋转角度,得到目标人脸结构。
  20. 根据权利要求15所述的计算机可读存储介质,其中,对所述目标人脸结构进行投影,得到目标图像时,具体执行:
    获取所述待处理图片的二维RGB信息;
    将所述二维RGB信息填充至所述目标人脸结构,得到三维人脸图像;
    将所述三维人脸图像进行二维投影,并通过差值运算矫正,得到所述目标图像。
PCT/CN2020/135077 2020-03-03 2020-12-10 人脸图像的获取方法与系统 WO2021174939A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010141606.2 2020-03-03
CN202010141606.2A CN111428579A (zh) 2020-03-03 2020-03-03 人脸图像的获取方法与系统

Publications (1)

Publication Number Publication Date
WO2021174939A1 true WO2021174939A1 (zh) 2021-09-10

Family

ID=71547535

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/135077 WO2021174939A1 (zh) 2020-03-03 2020-12-10 人脸图像的获取方法与系统

Country Status (2)

Country Link
CN (1) CN111428579A (zh)
WO (1) WO2021174939A1 (zh)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113920282A (zh) * 2021-11-15 2022-01-11 广州博冠信息科技有限公司 图像处理方法和装置、计算机可读存储介质、电子设备
CN114266860A (zh) * 2021-12-22 2022-04-01 西交利物浦大学 三维人脸模型建立方法、装置、电子设备及存储介质
CN114373056A (zh) * 2021-12-17 2022-04-19 云南联合视觉科技有限公司 一种三维重建方法、装置、终端设备及存储介质
CN114373043A (zh) * 2021-12-16 2022-04-19 聚好看科技股份有限公司 一种头部三维重建方法及设备
CN114581627A (zh) * 2022-03-04 2022-06-03 合众新能源汽车有限公司 基于arhud的成像方法和系统
CN114758076A (zh) * 2022-04-22 2022-07-15 北京百度网讯科技有限公司 一种用于建立三维模型的深度学习模型的训练方法及装置
CN115620094A (zh) * 2022-12-19 2023-01-17 南昌虚拟现实研究院股份有限公司 关键点的标注方法、装置、电子设备及存储介质
CN116503524A (zh) * 2023-04-11 2023-07-28 广州赛灵力科技有限公司 一种虚拟形象的生成方法、系统、装置及存储介质
CN116758124A (zh) * 2023-06-16 2023-09-15 北京代码空间科技有限公司 一种3d模型修正方法及终端设备
CN114373056B (zh) * 2021-12-17 2024-08-02 云南联合视觉科技有限公司 一种三维重建方法、装置、终端设备及存储介质

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111428579A (zh) * 2020-03-03 2020-07-17 平安科技(深圳)有限公司 人脸图像的获取方法与系统
CN111985384A (zh) * 2020-08-14 2020-11-24 深圳地平线机器人科技有限公司 获取脸部关键点的3d坐标及3d脸部模型的方法和装置
CN112163509B (zh) * 2020-09-25 2024-05-07 咪咕文化科技有限公司 图像处理方法、装置、网络设备及存储介质
CN112233161B (zh) * 2020-10-15 2024-05-17 北京达佳互联信息技术有限公司 手部图像深度确定方法、装置、电子设备及存储介质
CN112487923A (zh) * 2020-11-25 2021-03-12 奥比中光科技集团股份有限公司 一种人脸头部姿态训练数据的获取方法及系统
CN112613357B (zh) * 2020-12-08 2024-04-09 深圳数联天下智能科技有限公司 人脸测量方法、装置、电子设备和介质
CN112541484B (zh) * 2020-12-28 2024-03-19 平安银行股份有限公司 人脸抠图方法、系统、电子装置及存储介质
CN113435342B (zh) * 2021-06-29 2022-08-12 平安科技(深圳)有限公司 活体检测方法、装置、设备及存储介质
CN113627394B (zh) * 2021-09-17 2023-11-17 平安银行股份有限公司 人脸提取方法、装置、电子设备及可读存储介质
CN113961734B (zh) * 2021-12-22 2022-04-01 松立控股集团股份有限公司 基于停车数据和app操作日志的用户和车辆画像构建方法

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102054291A (zh) * 2009-11-04 2011-05-11 厦门市美亚柏科信息股份有限公司 一种基于单幅人脸图像实现三维人脸重建的方法及其装置
CN108197587A (zh) * 2018-01-18 2018-06-22 中科视拓(北京)科技有限公司 一种通过人脸深度预测进行多模态人脸识别的方法
CN108376421A (zh) * 2018-02-28 2018-08-07 浙江神造科技有限公司 一种基于阴影恢复形状法生成人脸三维模型的方法
CN109697688A (zh) * 2017-10-20 2019-04-30 虹软科技股份有限公司 一种用于图像处理的方法和装置
US20190164341A1 (en) * 2017-11-27 2019-05-30 Fotonation Limited Systems and Methods for 3D Facial Modeling
CN109978930A (zh) * 2019-03-27 2019-07-05 杭州相芯科技有限公司 一种基于单幅图像的风格化人脸三维模型自动生成方法
CN111428579A (zh) * 2020-03-03 2020-07-17 平安科技(深圳)有限公司 人脸图像的获取方法与系统

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105005755B (zh) * 2014-04-25 2019-03-29 北京邮电大学 三维人脸识别方法和系统
CN108549873B (zh) * 2018-04-19 2019-12-24 北京华捷艾米科技有限公司 三维人脸识别方法和三维人脸识别系统
WO2020037676A1 (zh) * 2018-08-24 2020-02-27 太平洋未来科技(深圳)有限公司 三维人脸图像生成方法、装置及电子设备
CN109508678B (zh) * 2018-11-16 2021-03-30 广州市百果园信息技术有限公司 人脸检测模型的训练方法、人脸关键点的检测方法和装置
CN109377556B (zh) * 2018-11-22 2022-11-01 厦门美图之家科技有限公司 人脸图像特征处理方法及装置

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102054291A (zh) * 2009-11-04 2011-05-11 厦门市美亚柏科信息股份有限公司 一种基于单幅人脸图像实现三维人脸重建的方法及其装置
CN109697688A (zh) * 2017-10-20 2019-04-30 虹软科技股份有限公司 一种用于图像处理的方法和装置
US20190164341A1 (en) * 2017-11-27 2019-05-30 Fotonation Limited Systems and Methods for 3D Facial Modeling
CN108197587A (zh) * 2018-01-18 2018-06-22 中科视拓(北京)科技有限公司 一种通过人脸深度预测进行多模态人脸识别的方法
CN108376421A (zh) * 2018-02-28 2018-08-07 浙江神造科技有限公司 一种基于阴影恢复形状法生成人脸三维模型的方法
CN109978930A (zh) * 2019-03-27 2019-07-05 杭州相芯科技有限公司 一种基于单幅图像的风格化人脸三维模型自动生成方法
CN111428579A (zh) * 2020-03-03 2020-07-17 平安科技(深圳)有限公司 人脸图像的获取方法与系统

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113920282A (zh) * 2021-11-15 2022-01-11 广州博冠信息科技有限公司 图像处理方法和装置、计算机可读存储介质、电子设备
CN114373043A (zh) * 2021-12-16 2022-04-19 聚好看科技股份有限公司 一种头部三维重建方法及设备
CN114373056A (zh) * 2021-12-17 2022-04-19 云南联合视觉科技有限公司 一种三维重建方法、装置、终端设备及存储介质
CN114373056B (zh) * 2021-12-17 2024-08-02 云南联合视觉科技有限公司 一种三维重建方法、装置、终端设备及存储介质
CN114266860A (zh) * 2021-12-22 2022-04-01 西交利物浦大学 三维人脸模型建立方法、装置、电子设备及存储介质
CN114581627A (zh) * 2022-03-04 2022-06-03 合众新能源汽车有限公司 基于arhud的成像方法和系统
CN114581627B (zh) * 2022-03-04 2024-04-16 合众新能源汽车股份有限公司 基于arhud的成像方法和系统
CN114758076A (zh) * 2022-04-22 2022-07-15 北京百度网讯科技有限公司 一种用于建立三维模型的深度学习模型的训练方法及装置
CN115620094A (zh) * 2022-12-19 2023-01-17 南昌虚拟现实研究院股份有限公司 关键点的标注方法、装置、电子设备及存储介质
CN116503524A (zh) * 2023-04-11 2023-07-28 广州赛灵力科技有限公司 一种虚拟形象的生成方法、系统、装置及存储介质
CN116503524B (zh) * 2023-04-11 2024-04-12 广州赛灵力科技有限公司 一种虚拟形象的生成方法、系统、装置及存储介质
CN116758124A (zh) * 2023-06-16 2023-09-15 北京代码空间科技有限公司 一种3d模型修正方法及终端设备

Also Published As

Publication number Publication date
CN111428579A (zh) 2020-07-17

Similar Documents

Publication Publication Date Title
WO2021174939A1 (zh) 人脸图像的获取方法与系统
CN110910486B (zh) 室内场景光照估计模型、方法、装置、存储介质以及渲染方法
US11514593B2 (en) Method and device for image processing
US11302064B2 (en) Method and apparatus for reconstructing three-dimensional model of human body, and storage medium
CN109859296B (zh) Smpl参数预测模型的训练方法、服务器及存储介质
CN106940704B (zh) 一种基于栅格地图的定位方法及装置
CN111243093B (zh) 三维人脸网格的生成方法、装置、设备及存储介质
CN108305312B (zh) 3d虚拟形象的生成方法和装置
US9679192B2 (en) 3-dimensional portrait reconstruction from a single photo
CN113269862B (zh) 场景自适应的精细三维人脸重建方法、系统、电子设备
WO2023284713A1 (zh) 一种三维动态跟踪方法、装置、电子设备和存储介质
WO2024007478A1 (zh) 基于单手机的人体三维建模数据采集与重建方法及系统
CN111382618B (zh) 一种人脸图像的光照检测方法、装置、设备和存储介质
CN113689578A (zh) 一种人体数据集生成方法及装置
CN116563493A (zh) 基于三维重建的模型训练方法、三维重建方法及装置
CN113223137B (zh) 透视投影人脸点云图的生成方法、装置及电子设备
CN117115358B (zh) 数字人自动建模方法及装置
CN113435367A (zh) 社交距离评估方法、装置及存储介质
CN109166176B (zh) 三维人脸图像的生成方法与装置
CN111597963A (zh) 用于图像中人脸的补光方法、系统、介质以及电子设备
US10861174B2 (en) Selective 3D registration
CN108921908B (zh) 表面光场的采集方法、装置及电子设备
TWI819639B (zh) 深度估計模型之訓練方法、裝置、電子設備及存儲介質
CN116958449B (zh) 城市场景三维建模方法、装置及电子设备
CN117876609B (zh) 一种多特征三维人脸重建方法、系统、设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20922787

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20922787

Country of ref document: EP

Kind code of ref document: A1