WO2021232690A1 - Video generation method and apparatus, electronic device, and storage medium - Google Patents

Video generation method and apparatus, electronic device, and storage medium

Info

Publication number
WO2021232690A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
frame
dimensional
face image
processed
Prior art date
Application number
PCT/CN2020/126223
Other languages
English (en)
French (fr)
Inventor
刘晓强
张国鑫
马里千
金博
张博宁
孙佳佳
Original Assignee
Beijing Dajia Internet Information Technology Co., Ltd. (北京达佳互联信息技术有限公司)
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co., Ltd. (北京达佳互联信息技术有限公司)
Publication of WO2021232690A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00: Three-dimensional [3D] modelling, e.g. data description of 3D objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis

Definitions

  • the embodiments of the present application relate to the field of computer technology, and in particular, to a video generation method, device, electronic device, and storage medium.
  • In the related art, a face video with expression changes is generated based on a two-dimensional face image: the two-dimensional face image is adjusted manually, or a designer uses an animation tool to create multiple frames of facial expression images, and a face video with expression changes is then generated from those frames.
  • The inventor realized that this generation process is complicated and consumes a lot of manpower, making it impossible to generate face videos with expression changes at scale; moreover, because the result relies on the designer's skill, the quality of the generated face video cannot be guaranteed.
  • The embodiments of the present application provide a video generation method, device, electronic device, and storage medium, which are used to simplify the process of generating a dynamic face video based on a two-dimensional face image.
  • In a first aspect, a video generation method is provided, including:
  • performing key point recognition and three-dimensional reconstruction on a two-dimensional face image to be processed to obtain three-dimensional face morphable model (3DMM) parameters of the two-dimensional face image to be processed, where the 3DMM parameters include face shape parameters and facial feature parameters;
  • for each frame of image in a face video template, adjusting the facial feature parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of that frame, to obtain adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to that frame; and constructing a three-dimensional model based on the adjusted facial feature parameters corresponding to that frame, the face shape parameters of the to-be-processed two-dimensional face image, and the facial feature parameters of the to-be-processed two-dimensional face image, to obtain a target frame face image corresponding to each frame of image; and
  • obtaining, based on the target frame face images corresponding to the frames, a target face video corresponding to the two-dimensional face image to be processed.
  • In another aspect, a video generation device is provided, including:
  • a parameter acquisition unit, configured to perform key point recognition and three-dimensional reconstruction on the two-dimensional face image to be processed to obtain the 3DMM parameters of the two-dimensional face image to be processed, where the 3DMM parameters include face shape parameters and facial feature parameters;
  • a target frame face image acquisition unit, configured to, for each frame of image in the face video template, adjust the facial feature parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of that frame to obtain the adjusted facial feature parameters corresponding to that frame, and to construct a three-dimensional model based on the adjusted facial feature parameters, the face shape parameters of the to-be-processed two-dimensional face image, and the facial feature parameters of the to-be-processed two-dimensional face image, to obtain a target frame face image corresponding to each frame of image; and
  • a video generation unit, configured to obtain, based on the target frame face images corresponding to the frames, a target face video corresponding to the to-be-processed two-dimensional face image.
  • In another aspect, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, where the processor is configured to execute the method described in any one of the above aspects and their possible implementations.
  • In a fourth aspect of the present application, a computer-readable storage medium is provided. The computer-readable storage medium carries one or more computer instruction programs which, when executed by one or more processors, cause the one or more processors to execute the method described in the first aspect and any one of its possible implementations.
  • The embodiments of the application can generate, for the two-dimensional face image to be processed, a target face video whose facial feature information is consistent with that of the face video template, which simplifies the process of generating a dynamic target face video from the to-be-processed two-dimensional face image and improves the efficiency of generating the target face video.
  • FIG. 1 is a schematic flowchart of a video generation method provided by an embodiment of this application.
  • FIG. 2 is a schematic diagram of a two-dimensional grid model provided by an embodiment of this application.
  • FIG. 3 is a schematic diagram of a process for obtaining the facial feature parameters of each frame of image in a face video template provided by an embodiment of this application.
  • FIG. 4 is a schematic diagram of attitude angle information provided by an embodiment of this application.
  • FIG. 5 is a schematic diagram of the adjusted mouth area of a target frame face image provided by an embodiment of this application.
  • FIG. 6 is a schematic diagram of a process of obtaining the target frame face image corresponding to any frame image in a face video template according to an embodiment of this application.
  • FIG. 7 is a schematic diagram of a to-be-processed two-dimensional face image provided by an embodiment of this application.
  • FIG. 8 is a schematic diagram of a frame of image in a face video template provided by an embodiment of this application.
  • FIG. 9 is a schematic diagram of a target frame face image provided by an embodiment of this application.
  • FIG. 10 is a schematic structural diagram of a video generation device provided by an embodiment of this application.
  • FIG. 11 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
  • Basel Face Model 2009 (BFM2009): a three-dimensional mesh model (3D face model) used for face recognition invariant to pose and illumination.
  • 3D Morphable Model (3DMM): a three-dimensional face morphing model defined by a set of parameters, divided into shape, reflectance, projection, identity, and so on. Given such a set of parameters, a three-dimensional model can be generated, and two-dimensional pictures can be generated as well; conversely, such a set of 3DMM parameters can be predicted from a two-dimensional picture, thereby predicting the three-dimensional model corresponding to that picture.
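The linear-combination structure of a 3DMM can be sketched as follows. This is a toy illustration, not the patent's implementation: the tiny dimensions, random bases, and the names `mean_shape`, `id_basis`, `exp_basis`, and `reconstruct` are all assumptions; BFM2009's real mesh has tens of thousands of vertices and about 199 identity components.

```python
import numpy as np

# Toy dimensions: 5 vertices (15 coordinates), 3 identity components, 2
# expression components. Placeholder sizes for illustration only.
rng = np.random.default_rng(0)
n_coords, n_id, n_exp = 15, 3, 2

mean_shape = rng.normal(size=n_coords)           # mean face S_mean
id_basis = rng.normal(size=(n_coords, n_id))     # identity (face shape) basis
exp_basis = rng.normal(size=(n_coords, n_exp))   # expression basis

def reconstruct(alpha, beta):
    """Linear 3DMM: vertices = S_mean + A_id @ alpha + A_exp @ beta."""
    return mean_shape + id_basis @ alpha + exp_basis @ beta

alpha = np.zeros(n_id)    # face shape parameters
beta = np.zeros(n_exp)    # expression parameters
neutral = reconstruct(alpha, beta)
# With all parameters at zero, the model returns the mean face.
assert np.allclose(neutral, mean_shape)
```

Varying `beta` while keeping `alpha` fixed is what lets the method change expression without changing the face shape.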
  • In the related art, dynamic expressions need to be added to the two-dimensional face image uploaded by a user to form a dynamic expression package. The inventor realized that, starting from a two-dimensional face image, designers often manually adjust the image or use animation tools to make multiple frames of facial expression images, and then generate a face video with expression changes; the process is complicated and labor-intensive.
  • The embodiments of the present application therefore design a video generation method, device, electronic device, and storage medium to simplify the process of generating a face video from a two-dimensional face image. The method includes: obtaining, based on a 3DMM model, the 3DMM parameters of the two-dimensional face image to be processed, where the 3DMM parameters include face shape parameters and facial feature parameters; adjusting the 3DMM parameters of the to-be-processed two-dimensional face image based on the facial feature parameters of each frame of image in the face template video, to obtain the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame; and constructing, based on the adjusted facial feature parameters and the face shape parameters of the to-be-processed two-dimensional face image, a target frame face image for each frame, so that the facial feature information in each obtained target frame face image tends to be consistent with the facial feature information of the corresponding frame of the face template video.
  • an embodiment of the present application provides a video generation method, which specifically includes the following steps:
  • Step S101: Perform key point recognition and three-dimensional reconstruction on the two-dimensional face image to be processed to obtain the three-dimensional face morphable model (3DMM) parameters of the two-dimensional face image to be processed.
  • The 3DMM parameters include face shape parameters and facial feature parameters.
  • The above key point recognition may, but is not limited to, use inference with a mature neural network model to obtain a first set number of two-dimensional key points of the face in the to-be-processed two-dimensional face image, where the neural network model can include, but is not limited to, Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Deep Neural Networks (DNN), and the like; the first set number is chosen so as to ensure the authenticity of the face in the finally generated target face video.
  • The recognized two-dimensional key points can then be three-dimensionally reconstructed, for example using the 3DMM of BFM2009, the 3DMM of BFM2017, or an ordinary 3DMM method, to obtain the 3DMM parameters of the to-be-processed two-dimensional face image.
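The fitting step can be sketched as recovering model parameters from detected landmarks. The patent uses a neural network detector and a BFM-style 3DMM; the stand-in below skips detection and projection entirely and solves a plain linear least-squares problem, so the basis, the landmark count of 101 (matching the set number mentioned for template frames), and the function `fit_3dmm_params` are all illustrative assumptions.

```python
import numpy as np

# Stand-in linear model: 2D landmark positions = mean + basis @ params.
rng = np.random.default_rng(1)
n_landmarks, n_params = 101, 4
basis = rng.normal(size=(2 * n_landmarks, n_params))   # landmark basis (toy)
mean_landmarks = rng.normal(size=2 * n_landmarks)      # mean landmark layout

def fit_3dmm_params(observed):
    """Least-squares fit: argmin_p || mean + basis @ p - observed ||."""
    p, *_ = np.linalg.lstsq(basis, observed - mean_landmarks, rcond=None)
    return p

# Synthesize landmarks from known parameters, then recover them.
true_p = np.array([0.5, -1.0, 0.25, 2.0])
observed = mean_landmarks + basis @ true_p
recovered = fit_3dmm_params(observed)
assert np.allclose(recovered, true_p)
```

A real fit would additionally estimate pose and projection, but the shape of the problem (landmarks in, parameters out) is the same.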
  • Step S102: For each frame of image in the face video template, adjust the facial feature parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of that frame, to obtain the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to that frame; and construct a three-dimensional model from the adjusted facial feature parameters, the face shape parameters of the to-be-processed two-dimensional face image, and the facial feature parameters of the to-be-processed two-dimensional face image, to obtain a target frame face image corresponding to each frame of image.
  • The purpose of adjusting the facial feature parameters of the two-dimensional face image to be processed is to make the facial features represented by the adjusted facial feature parameters tend to be consistent with the facial features represented by the facial feature parameters of each frame of image; that is, the facial feature information represented by the adjusted facial feature parameters corresponding to each frame of image tends to be consistent with the facial feature information represented by the facial feature parameters of that frame.
  • In some embodiments, the facial feature parameters of the to-be-processed two-dimensional face image can be adjusted separately according to the facial feature parameters of each frame of image in the following manner, to obtain the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image:
  • the expression parameters of each frame of image and the adjusted posture angle information of the to-be-processed two-dimensional face image corresponding to that frame are together determined as the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to that frame.
  • That is, the expression parameters in the facial feature parameters of each frame of image are taken as one part of the adjusted facial feature parameters, so that the adjusted facial feature parameters corresponding to each frame retain the facial expression features of that frame of the face video template; meanwhile, the posture angle information of the to-be-processed two-dimensional face image is adjusted based on the posture angle information of each frame of image, and the adjusted posture angle information is taken as the other part of the adjusted facial feature parameters, so that the adjusted facial feature parameters corresponding to each frame retain the posture features of the face in the to-be-processed two-dimensional face image while following the posture features of the face in that frame.
  • the target frame face image corresponding to each frame of image can be obtained in the following manner:
  • the pixel value of each pixel in the adjusted two-dimensional mesh model 2dmesh_new is replaced with the pixel value of the corresponding pixel in the two-dimensional mesh model 2dmesh_ori before adjustment to obtain the target frame face image corresponding to each frame of image.
  • the two-dimensional grid model can be regarded as a projection of the three-dimensional grid model on a two-dimensional plane. Please refer to FIG. 2.
  • An embodiment of the present application also provides a schematic diagram of a two-dimensional grid model, which includes a group of discrete points and the small triangles enclosed by these points. Each small triangle can be regarded as a pixel, and the color in each small triangle can be regarded as that pixel's value.
  • The adjusted two-dimensional mesh model 2dmesh_new and the two-dimensional mesh model before adjustment, 2dmesh_ori, are both two-dimensional mesh models corresponding to the face in the to-be-processed two-dimensional image, but some expression-related pixels in 2dmesh_new are changed relative to 2dmesh_ori.
  • For example, 2dmesh_new may be a two-dimensional grid model in which some pixels of the mouth area 201 in Figure 2 have changed; the pixels in 2dmesh_new therefore have a correspondence with the pixels in 2dmesh_ori, and according to this correspondence the pixel value of each pixel in 2dmesh_new can be replaced with the pixel value of the corresponding pixel in 2dmesh_ori.
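Because the two meshes share the same topology and differ only in vertex positions, the pixel-value transfer amounts to an index-wise copy. The dictionary layout, triangle lists, and color tuples below are invented for illustration; the patent does not specify a data structure.

```python
# Both meshes list the same triangles ("pixels"); only vertex positions moved.
mesh_ori = {
    "triangles": [(0, 1, 2), (1, 2, 3)],
    "colors": [(200, 150, 130), (190, 140, 120)],  # one color per triangle
}
mesh_new = {
    "triangles": [(0, 1, 2), (1, 2, 3)],           # same connectivity
    "colors": [None, None],                         # appearance not yet known
}

# Replace each pixel value in the adjusted mesh with the value of the
# corresponding pixel in the mesh before adjustment (index-wise copy).
mesh_new["colors"] = list(mesh_ori["colors"])

assert mesh_new["colors"][0] == (200, 150, 130)
```

The geometry of `mesh_new` (the moved mouth vertices) is untouched; only appearance is borrowed from `mesh_ori`, which is what makes the rendered face keep the source image's texture under the new expression and pose.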
  • Step S103: Obtain a target face video corresponding to the to-be-processed two-dimensional face image based on the target frame face image corresponding to each frame of image.
  • In some embodiments, the target frame face images corresponding to the frames can be arranged in the order of the frames to obtain the target face video corresponding to the to-be-processed two-dimensional face image.
  • In some embodiments, before the facial feature parameters of the to-be-processed two-dimensional face image are adjusted according to the facial feature parameters of each frame of image, the method further includes: performing key point recognition on each frame of image; and performing three-dimensional reconstruction on each frame of image according to the key point recognition result to obtain the facial feature parameters in the 3DMM parameters of each frame of image.
  • As with the two-dimensional face image to be processed, key point recognition and three-dimensional reconstruction are performed on each frame of image. The key point recognition can, but is not limited to, use inference with a mature neural network model to obtain a second set number of two-dimensional key points of each frame of image, where the neural network model can include but is not limited to CNN, RNN, or DNN; in order to ensure the authenticity of the face in the finally generated target face video, the second set number can be, but is not limited to, set to 101.
  • The identified two-dimensional key points can then be reconstructed in three dimensions, for example using the 3DMM of BFM2009, the 3DMM of BFM2017, or an ordinary 3DMM method, to obtain the facial feature parameters in the 3DMM parameters of each frame of image.
  • In some embodiments, a process for obtaining the facial feature parameters of each frame of image in a face video template is provided, which may include:
  • Step S301: Input a face video template and obtain each frame of image of the face video template.
  • Step S302: Perform key point identification on each frame of image to obtain the key points of each frame of image.
  • Step S303: Perform three-dimensional reconstruction on the key points of each frame of image through the 3DMM of BFM2009.
  • Step S304: According to the result of the three-dimensional reconstruction of the key points of each frame of image, extract the facial feature parameters in the 3DMM parameters of each frame of image.
  • the extracted facial feature parameters of each frame of image may be stored as a preprocessing template for later use when generating a target face video from the two-dimensional face image to be processed.
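Steps S301 to S304 can be sketched as a small preprocessing loop whose output is stored as the template. The landmark detector and the 3DMM reconstruction below are stubs (the patent uses a neural network and the 3DMM of BFM2009); only the control flow mirrors the text, and all function names and return shapes are assumptions.

```python
def detect_keypoints(frame):
    """Placeholder for S302: neural-network landmark detection."""
    return [(x, x) for x in range(3)]  # toy landmark list

def reconstruct_3dmm(keypoints):
    """Placeholder for S303: 3DMM reconstruction from key points."""
    return {"expression": [0.1],
            "pose": {"yaw": 2.0, "pitch": 0.0, "roll": 0.0}}

def preprocess_template(frames):
    """S301-S304: extract and store per-frame facial feature parameters."""
    template = []
    for frame in frames:                  # S301: each frame of the template
        kps = detect_keypoints(frame)     # S302: key point identification
        params = reconstruct_3dmm(kps)    # S303: three-dimensional reconstruction
        template.append(params)           # S304: extracted feature parameters
    return template

template = preprocess_template(["frame0", "frame1"])
assert len(template) == 2
```

Once stored, this template can be reused for every user image, which is where the efficiency gain over per-video manual animation comes from.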
  • The aforementioned attitude angle information includes at least one attitude angle parameter among the pitch angle pitch, the yaw angle yaw, and the roll angle roll.
  • FIG. 4 shows the pitch, yaw, and roll angles.
  • In some embodiments, the posture angle information of the two-dimensional face image to be processed can be adjusted based on the posture angle information of each frame of image in the following manner, to obtain the adjusted posture angle information of the to-be-processed two-dimensional face image corresponding to each frame of image:
  • for each of the aforementioned at least one attitude angle parameters, determine the average of that attitude angle parameter over all frame images in the face video template;
  • based on each frame's attitude angle parameters, the corresponding averages, and the attitude angle parameters of the to-be-processed two-dimensional face image, determine the adjusted posture angle information corresponding to each frame of image.
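The per-angle template averages (dst.meanyaw, dst.meanpitch, dst.meanroll in the formulas referenced below) are plain means over all frames. A minimal sketch, with invented example values:

```python
# Pose angles of three template frames (illustrative values, in degrees).
frames_pose = [
    {"yaw": 10.0, "pitch": 2.0, "roll": 1.0},
    {"yaw": -4.0, "pitch": 0.0, "roll": 3.0},
    {"yaw": 6.0,  "pitch": 4.0, "roll": 2.0},
]

def mean_pose(frames):
    """Average each attitude angle parameter over all template frames."""
    keys = frames[0].keys()
    return {k: sum(f[k] for f in frames) / len(frames) for k in keys}

means = mean_pose(frames_pose)
assert means == {"yaw": 4.0, "pitch": 2.0, "roll": 2.0}
```

These averages let the adjustment use each frame's deviation from the template's typical pose rather than its absolute pose.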
  • The principle of Formula 1 can be used to adjust the yaw angle yaw in the facial feature parameters of the to-be-processed two-dimensional face image based on the yaw angle yaw in the facial feature parameters of each frame of image, to obtain the adjusted yaw angle of the to-be-processed two-dimensional face image corresponding to each frame of image, where:
  • src1.yaw is the adjusted yaw angle of the to-be-processed two-dimensional face image corresponding to each frame of image;
  • dst.yaw is the yaw angle in the facial feature parameters of each frame of image;
  • dst.meanyaw is the average of the yaw angle in the facial feature parameters of all frame images in the face video template;
  • k1 is the adjustment parameter of the yaw angle.
  • If the adjustment of the yaw angle of the to-be-processed two-dimensional face image is too large, the face is obviously deformed; if it is too small, the face barely follows the template. The above k1 can therefore be set to, but is not limited to, 0.2 or 0.3.
  • Similarly, Formula 2 can be used to adjust the pitch angle pitch in the facial feature parameters of the to-be-processed two-dimensional face image based on the pitch angle pitch in the facial feature parameters of each frame of image, to obtain the adjusted pitch angle of the to-be-processed two-dimensional face image corresponding to each frame of image, where:
  • src1.pitch is the adjusted pitch angle of the to-be-processed two-dimensional face image corresponding to each frame of image;
  • dst.pitch is the pitch angle in the facial feature parameters of each frame of image;
  • dst.meanpitch is the average of the pitch angle in the facial feature parameters of all frame images in the face video template;
  • k2 is the adjustment parameter of the pitch angle.
  • If the adjustment of the pitch angle of the to-be-processed two-dimensional face image is too large, the face is obviously deformed; if it is too small, the face barely follows the template. The above k2 can therefore be set to, but is not limited to, 0.2 or 0.3.
  • Formula 3 can be used to adjust the roll angle roll in the facial feature parameters of the to-be-processed two-dimensional face image based on the roll angle roll in the facial feature parameters of each frame of image, to obtain the adjusted roll angle of the to-be-processed two-dimensional face image corresponding to each frame of image, where:
  • src1.roll is the adjusted roll angle of the to-be-processed two-dimensional face image corresponding to each frame of image;
  • dst.roll is the roll angle in the facial feature parameters of each frame of image;
  • dst.meanroll is the average of the roll angle in the facial feature parameters of all frame images in the face video template;
  • k3 is the adjustment parameter of the roll angle.
  • If the adjustment of the roll angle of the to-be-processed two-dimensional face image is too large, the face is obviously deformed; if it is too small, the face barely follows the template. The above k3 can be set to, for example, 0.1 or 0.2.
  • In the embodiments of the present application, the adjustment parameter of the roll angle may be, but is not limited to, slightly smaller than the adjustment parameter of the pitch angle or the yaw angle.
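The equation bodies of Formulas 1 to 3 are not reproduced in this text. From the variable descriptions (src1, dst, dst.mean*, and the small coefficients k1 to k3), a plausible form is src1.angle = src.angle + k * (dst.angle - dst.mean_angle), i.e. the source pose follows each frame's deviation from the template average, scaled by k. The sketch below applies that assumed update to all three angles; it is a hedged reconstruction, not the patent's verbatim formulas, and `src` (the pose of the to-be-processed image) entering the update is part of the assumption.

```python
# Example coefficients taken from the values suggested in the text:
# k1 (yaw) and k2 (pitch) around 0.2, k3 (roll) slightly smaller.
K = {"yaw": 0.2, "pitch": 0.2, "roll": 0.1}

def adjust_pose(src_pose, frame_pose, mean_pose, k=K):
    """Assumed update: src1 = src + k * (dst - dst.mean), per angle."""
    return {
        angle: src_pose[angle] + k[angle] * (frame_pose[angle] - mean_pose[angle])
        for angle in src_pose
    }

src = {"yaw": 5.0, "pitch": 0.0, "roll": 1.0}      # pose of the image to process
frame = {"yaw": 15.0, "pitch": -4.0, "roll": 3.0}  # pose of one template frame
mean = {"yaw": 5.0, "pitch": -2.0, "roll": 2.0}    # template-wide averages
out = adjust_pose(src, frame, mean)
# yaw: 5 + 0.2*(15-5) = 7.0; pitch: 0 + 0.2*(-4+2) = -0.4; roll: 1 + 0.1*(3-2) = 1.1
```

With small k, the output stays close to the source pose, matching the text's concern that too large an adjustment visibly deforms the face.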
  • If the mouth area of the face in a frame of image in the face video template is open (that is, the face in the face video template is smiling with an open mouth) while the mouth of the face in the to-be-processed two-dimensional face image is closed, or conversely the mouth in the face video template is closed while the mouth in the to-be-processed two-dimensional face image is open, the expression of the face in the obtained target frame face image may be abnormal. Therefore, after the pixel value of each pixel in the adjusted two-dimensional grid model is replaced with the pixel value of the corresponding pixel in the two-dimensional grid model before adjustment, the mouth area is further corrected for each frame of image.
  • If the face in a frame of image has an open mouth, the mouth area in the two-dimensional grid model corresponding to that frame is relatively large. The mouth edge points of the corresponding target frame face image can then be adjusted based on that two-dimensional grid model, so that the adjusted mouth edge points enclose the same mouth-area range as in the frame's two-dimensional grid model, after which the mouth area enclosed by the adjusted mouth edge points is filled with pixels based on a preset mouth grid template.
  • If the face in the frame of image has a closed mouth, the mouth area in the two-dimensional grid model corresponding to that frame is relatively small, and the mouth edge points of the corresponding target frame face image are adjusted based on that model. Since the mouth in that frame is closed, the adjusted mouth edge points enclose a smaller mouth area; even if the enclosed mouth area is filled based on the preset mouth grid template, the mouth area enclosed by the adjusted mouth edge points remains small.
  • Specifically, after the pixel value of each pixel in the adjusted two-dimensional grid model is replaced with the pixel value of the corresponding pixel in the two-dimensional grid model before adjustment, 16 mouth edge points can be detected through key point recognition, and the positions of these 16 mouth edge points are then adjusted. Because the teeth are blocked and appear darker when the mouth is closed, the pixel value of each pixel in the mouth area enclosed by the adjusted mouth edge points is replaced with the pixel value of the corresponding pixel in the preset mouth grid template. To make the adjusted mouth area fuse better with the other parts of the target frame face image, alpha blending can be used at the boundary of the adjusted mouth area.
  • Fig. 5 is a schematic diagram of the adjusted mouth area of a target frame face image.
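The alpha-blend step at the mouth boundary can be sketched per pixel: near the boundary, the mouth-template pixel and the underlying face pixel are mixed by a weight alpha. The pixel values and weights below are illustrative, and how alpha varies across the boundary (e.g. with distance) is an assumption the patent text does not detail.

```python
def alpha_blend(mouth_pixel, face_pixel, alpha):
    """alpha=1 -> pure mouth-template pixel, alpha=0 -> pure face pixel."""
    return tuple(
        round(alpha * m + (1.0 - alpha) * f)
        for m, f in zip(mouth_pixel, face_pixel)
    )

mouth = (220, 210, 200)   # bright tooth pixel from the mouth grid template
face = (120, 90, 80)      # darker underlying face pixel
inside = alpha_blend(mouth, face, alpha=1.0)   # deep inside the mouth area
border = alpha_blend(mouth, face, alpha=0.5)   # on the boundary

assert inside == (220, 210, 200)
assert border == (170, 150, 140)
```

Ramping alpha from 1 inside the mouth area down to 0 outside it is what removes the hard seam between the filled mouth and the rest of the face.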
  • The following provides a process for obtaining the target frame face image corresponding to any frame image in the face video template, which specifically includes the following steps:
  • Step S601: Perform key point recognition and three-dimensional reconstruction on the to-be-processed two-dimensional face image to obtain the face shape parameters and facial feature parameters of the to-be-processed two-dimensional face image, where the facial feature parameters include expression parameters and posture angle information.
  • Step S602: Acquire the facial feature parameters of the frame image, where the facial feature parameters include expression parameters and posture angle information.
  • Step S603: Adjust the posture angle information of the to-be-processed two-dimensional face image based on the posture angle information of the frame image, to obtain the adjusted posture angle information of the to-be-processed two-dimensional face image corresponding to the frame image.
  • Step S604: Determine the expression parameters of the frame image and the adjusted posture angle information corresponding to the frame image as the adjusted facial feature parameters corresponding to the frame image.
  • Step S605: Construct a three-dimensional model according to the facial feature parameters and the face shape parameters of the to-be-processed two-dimensional face image, to obtain the three-dimensional mesh model before adjustment, 3dmesh_ori.
  • Step S606: Construct a three-dimensional model according to the adjusted facial feature parameters corresponding to the frame image and the face shape parameters of the to-be-processed two-dimensional face image, to obtain the adjusted three-dimensional mesh model 3dmesh_new.
  • Step S607: Project 3dmesh_ori and 3dmesh_new onto the same plane to obtain the two-dimensional mesh model before adjustment, 2dmesh_ori, and the adjusted two-dimensional mesh model, 2dmesh_new; and replace the pixel value of each pixel in 2dmesh_new with the pixel value of the corresponding pixel in 2dmesh_ori, to obtain the target frame face image corresponding to the frame image.
  • Step S608: Identify the mouth edge points of the target frame face image; adjust the mouth edge points based on the mouth area in the two-dimensional grid model corresponding to the frame image; and replace the pixel value of each pixel in the mouth area enclosed by the adjusted mouth edge points with the pixel value of the corresponding pixel in the preset mouth grid template.
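The per-frame pipeline above can be condensed into a runnable outline in which every heavy stage is stubbed out; only the data flow mirrors steps S601 to S607 (the S608 mouth correction is omitted), and all function bodies and parameter shapes are placeholders invented for illustration.

```python
def fit(image):
    """Stub for S601/S602: key points + 3DMM fit of an image."""
    return {"shape": [1.0], "expression": [0.0], "pose": {"yaw": 0.0}}

def build_mesh(shape, expression, pose):
    """Stub for S605/S606: construct a 3D mesh from 3DMM parameters."""
    return {"params": (tuple(shape), tuple(expression), pose["yaw"])}

def project(mesh):
    """Stub for S607: project a 3D mesh onto the 2D plane."""
    return {"colors": [mesh["params"][2]]}

def target_frame(src_image, frame_params):
    src = fit(src_image)                                  # S601
    adjusted_pose = dict(frame_params["pose"])            # S603/S604 (simplified)
    mesh_ori = build_mesh(src["shape"], src["expression"], src["pose"])  # S605
    mesh_new = build_mesh(src["shape"], frame_params["expression"],
                          adjusted_pose)                  # S606
    out = project(mesh_new)                               # S607: project both...
    out["colors"] = project(mesh_ori)["colors"]           # ...and copy pixel values
    return out

frame_params = {"expression": [0.5], "pose": {"yaw": 3.0}}
result = target_frame("src.png", frame_params)
# Appearance comes from the unadjusted mesh, geometry from the adjusted one.
assert result["colors"] == [0.0]
```

Running this once per template frame and concatenating the outputs in frame order yields the target face video of step S103.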
  • Figure 7 is a schematic diagram of a to-be-processed two-dimensional face image; Figure 8 is a frame of image in the face template video; and Figure 9 is a schematic diagram of the target frame face image obtained by adjusting the facial feature parameters of the to-be-processed two-dimensional face image based on the facial feature parameters of that frame of image, and then constructing the image according to the adjusted facial feature parameters, the facial feature parameters, and the face shape parameters of the to-be-processed two-dimensional face image.
  • In the embodiments of the present application, the posture angle information of the to-be-processed two-dimensional face image is adjusted based on the posture angle information of each frame of the face video template, and the target face video that adds dynamic expressions to the to-be-processed two-dimensional face image is obtained from the adjusted posture angle information corresponding to each frame, the expression parameters of each frame of the face video template, and the face shape parameters of the to-be-processed two-dimensional face image.
  • In this way, dynamic expressions are added to the to-be-processed two-dimensional face image while ensuring the authenticity of the faces in the obtained target face video and reducing the possibility that the shape of the face is deformed in the generated dynamic video.
  • An embodiment of the present application further provides a video generation device 1000, which includes:
  • a parameter acquisition unit 1001, configured to perform key point recognition and three-dimensional reconstruction on the to-be-processed two-dimensional face image to obtain the 3DMM parameters of the to-be-processed two-dimensional face image, where the 3DMM parameters include face shape parameters and facial feature parameters;
  • a target frame face image acquisition unit 1002, configured to, for each frame of image in the face video template, adjust the facial feature parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of that frame to obtain the adjusted facial feature parameters corresponding to that frame, and to construct a three-dimensional model from the adjusted facial feature parameters, the face shape parameters of the to-be-processed two-dimensional face image, and the facial feature parameters of the to-be-processed two-dimensional face image, to obtain a target frame face image corresponding to each frame of image; and
  • a video generation unit 1003, configured to obtain, based on the target frame face images corresponding to the frames, a target face video corresponding to the to-be-processed two-dimensional face image.
  • the aforementioned facial feature information includes facial expression parameters and posture angle parameters
  • the target frame face image acquisition unit 1002 is specifically configured to execute: adjusting the posture angle information of the to-be-processed two-dimensional face image based on the posture angle information of each frame of image, to obtain adjusted posture angle information of the to-be-processed two-dimensional face image corresponding to each frame of image; and
  • determining the expression parameters of each frame of image and the adjusted posture angle information of the to-be-processed two-dimensional face image corresponding to that frame as the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image.
  • the aforementioned attitude angle information includes at least one attitude angle parameter of a pitch angle, a yaw angle, and a roll angle
  • the target frame face image acquisition unit 1002 is specifically configured to execute:
  • for each of the aforementioned at least one attitude angle parameter, determine an average attitude angle parameter of that parameter over all frames of image in the face video template;
  • determine each piece of adjusted posture angle information corresponding to each frame of image.
  • the target frame face image acquisition unit 1002 is specifically configured to execute:
  • the pixel value of each pixel in the adjusted two-dimensional grid model is replaced with the pixel value of the corresponding pixel in the two-dimensional grid model before adjustment to obtain the target face image corresponding to each frame of image.
  • the target frame face image acquisition unit 1002 is further configured to execute:
  • the target frame face image acquisition unit 1002 is specifically configured to execute:
  • the key point recognition is performed on the target frame face image corresponding to each frame of image to obtain the edge points of the oral cavity;
  • the target frame face image acquisition unit 1002 is further configured to execute: adjust the facial feature parameters of the two-dimensional face image to be processed according to the facial feature parameters of each frame image to obtain Before the step of adjusting the facial feature parameters of the two-dimensional face image to be processed corresponding to each frame of image, perform key point recognition on each frame of image; perform three-dimensional reconstruction on each frame of image according to the key point recognition result, Obtain the facial feature parameters in the 3DMM parameters of each frame of image.
  • an embodiment of the present application provides an electronic device 1100, including a processor 1101, a memory 1102 for storing executable instructions of the above-mentioned processor; wherein, the above-mentioned processor 1101 is configured to execute any of the above-mentioned video Generation method.
  • An embodiment of the present application provides a computer-readable storage medium carrying one or more computer instruction programs which, when executed by one or more processors, cause the one or more processors to execute any of the above video generation methods.
  • a storage medium including instructions such as a memory including instructions, which may be executed by a processor of the above electronic device to complete the above method.
  • the storage medium may be a non-transitory computer-readable storage medium.
  • the aforementioned non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Processing Or Creating Images (AREA)

Abstract

A video generation method, apparatus, electronic device, and storage medium, relating to the field of computer technology and used to simplify the process of generating a face video from a two-dimensional face image. The method includes: performing key point recognition and three-dimensional reconstruction on a to-be-processed two-dimensional face image to obtain 3DMM parameters of the image, the 3DMM parameters including face shape parameters and facial feature information; for each frame of image in a face video template, adjusting the facial feature parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of that frame, to obtain adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image; constructing a three-dimensional model based on the adjusted facial feature parameters corresponding to each frame of image, the face shape parameters of the to-be-processed two-dimensional face image, and the facial feature parameters of the to-be-processed two-dimensional face image, to obtain a target frame face image corresponding to each frame of image; and obtaining a target face video based on the target frame face images.

Description

A video generation method, apparatus, electronic device, and storage medium
Cross-reference to related applications
This application claims priority to the Chinese patent application No. 202010420064.2, entitled "A video generation method, apparatus, electronic device, and storage medium", filed with the China National Intellectual Property Administration on May 18, 2020, the entire contents of which are incorporated herein by reference.
Technical field
The embodiments of this application relate to the field of computer technology, and in particular to a video generation method, apparatus, electronic device, and storage medium.
Background
In the related art, when a face video with expression changes is generated from a single two-dimensional face image, the two-dimensional face image is usually adjusted manually, or a designer creates multiple frames of facial expression images with an animation tool and then generates the face video with expression changes from them. The inventor realized that this generation process is complicated and labor-intensive, cannot produce face videos with expression changes at scale, and, because the result depends on the designer's skill, cannot guarantee the quality of the generated face video.
Summary
The embodiments of this application provide a video generation method, apparatus, electronic device, and storage medium for simplifying the process of generating a dynamic face video from a two-dimensional face image.
In a first aspect, an embodiment of this application provides a video generation method, including:
performing key point recognition and three-dimensional reconstruction on a to-be-processed two-dimensional face image to obtain three-dimensional face morphable model (3DMM) parameters of the to-be-processed two-dimensional face image, the 3DMM parameters including face shape parameters and facial feature parameters;
for each frame of image in a face video template, adjusting the facial feature parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of the frame of image, to obtain adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to the frame of image; and constructing a three-dimensional model based on the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to the frame of image, the face shape parameters of the to-be-processed two-dimensional face image, and the facial feature parameters of the to-be-processed two-dimensional face image, to obtain a target frame face image corresponding to the frame of image;
obtaining, based on the target frame face image corresponding to each frame of image, a target face video corresponding to the to-be-processed two-dimensional face image.
In a second aspect, an embodiment of this application provides a video generation apparatus, including:
a parameter acquisition unit configured to perform key point recognition and three-dimensional reconstruction on a to-be-processed two-dimensional face image to obtain 3DMM parameters of the to-be-processed two-dimensional face image, the 3DMM parameters including face shape parameters and facial feature parameters;
a target frame face image acquisition unit configured to, for each frame of image in a face video template, adjust the facial feature parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of the frame of image, to obtain adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to the frame of image, and to construct a three-dimensional model based on the adjusted facial feature parameters corresponding to the frame of image, the face shape parameters of the to-be-processed two-dimensional face image, and the facial feature parameters of the to-be-processed two-dimensional face image, to obtain a target frame face image corresponding to the frame of image;
a video generation unit configured to obtain, based on the target frame face image corresponding to each frame of image, a target face video corresponding to the to-be-processed two-dimensional face image.
In a third aspect, an embodiment of this application provides an electronic device including a memory, a processor, and a computer program stored on the memory and runnable on the processor, where the processor is configured to execute the process of any one of the first aspect and its possible implementations.
In a fourth aspect, this application provides a computer-readable storage medium carrying one or more computer instruction programs which, when executed by one or more processors, cause the one or more processors to execute the method of any one of the first aspect and its possible implementations.
The embodiments of this application can generate, for a to-be-processed two-dimensional face image, a target face video whose facial feature information is consistent with that of a face video template, which simplifies the process of generating a dynamic target face video from the to-be-processed two-dimensional face image and improves the efficiency of generating the target face video.
Brief description of the drawings
The accompanying drawings herein are incorporated into and constitute a part of this specification; they illustrate embodiments consistent with this application and, together with the specification, serve to explain the principles of the embodiments of this application, without unduly limiting this application.
Fig. 1 is a schematic flowchart of a video generation method provided by an embodiment of this application;
Fig. 2 is a schematic diagram of a two-dimensional mesh model provided by an embodiment of this application;
Fig. 3 is a schematic flowchart of obtaining the facial feature parameters of each frame of image in a face video template, provided by an embodiment of this application;
Fig. 4 is a schematic diagram of posture angle information provided by an embodiment of this application;
Fig. 5 is a schematic diagram of the adjusted oral cavity region of a target frame face image provided by an embodiment of this application;
Fig. 6 is a schematic diagram of a process of obtaining the target frame face image corresponding to an arbitrary frame of image in a face video template, provided by an embodiment of this application;
Fig. 7 is a schematic diagram of a to-be-processed two-dimensional face image provided by an embodiment of this application;
Fig. 8 is a schematic diagram of a frame of image in a face video template provided by an embodiment of this application;
Fig. 9 is a schematic diagram of a target frame face image provided by an embodiment of this application;
Fig. 10 is a schematic structural diagram of a video generation apparatus provided by an embodiment of this application;
Fig. 11 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
Detailed description
To enable those of ordinary skill in the art to better understand the technical solutions of this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings.
It should be noted that the terms "first", "second", and the like in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments of this application described herein can be implemented in orders other than those illustrated or described herein.
To facilitate a better understanding of the technical solutions of this application by those skilled in the art, the technical terms involved in this application are explained below.
Basel face model 2009 (bfm2009): a three-dimensional mesh model (3D face model) used for pose- and illumination-invariant face recognition.
3D Morphable Model (3DMM): a deformable three-dimensional face model defined by a set of parameters, which are divided into shape, albedo, projection, identity, and so on. Given such a set of parameters, a three-dimensional model can be generated, and a two-dimensional image can of course also be generated; conversely, a two-dimensional image can be used to predict such a set of 3DMM parameters, and thus the three-dimensional model corresponding to that image.
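The linear structure of a 3DMM described above can be sketched in a few lines of numpy. This is only an illustrative forward model with made-up array shapes, not the actual bfm2009 implementation: a face mesh is the mean shape plus linear combinations of shape and expression basis vectors.

```python
import numpy as np

def reconstruct_face(mean_shape, shape_basis, expr_basis, shape_params, expr_params):
    # Mesh = mean shape + linear shape offsets + linear expression offsets.
    # mean_shape: (3N,), bases: (3N, k_shape) / (3N, k_expr);
    # the flat vertex vector is reshaped to (N, 3) for N mesh vertices.
    verts = mean_shape + shape_basis @ shape_params + expr_basis @ expr_params
    return verts.reshape(-1, 3)
```

With a real basis, fitting a 2D image amounts to finding the `shape_params` and `expr_params` (plus pose and camera) that best reproject onto the detected key points.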
The design idea of the embodiments of this application is explained below. In some scenarios of the related art, dynamic expressions need to be added to a two-dimensional face image uploaded by a user to form an animated sticker. The inventor realized, however, that when a face video with expression changes is generated from a single two-dimensional face image, a designer usually adjusts the image manually or creates multiple frames of facial expression images with an animation tool, and then generates a face video with expression changes from them; this process is complicated, labor-intensive, and cannot be scaled, and the quality of the generated face video depends on the designer's skill. With the development of technology, expression-driven character approaches have appeared that generate a face video from a two-dimensional face image: a virtual character is created through three-dimensional reconstruction and is then rendered under driving expressions. The inventor found that the face rendered by this approach is a virtual avatar independent of the original two-dimensional face image; the heavy rendering makes the face lack realism, and the rendered face is detached from the background of the original two-dimensional face image, losing the texture and material quality of the original image.
In view of this, the embodiments of this application design a video generation method, apparatus, electronic device, and storage medium to simplify the process of generating a face video from a two-dimensional face image, including: obtaining 3DMM parameters of a to-be-processed two-dimensional face image based on a 3DMM model, the 3DMM parameters including face shape parameters and facial feature parameters; adjusting the 3DMM parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of each frame of image in a face video template, to obtain adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image; constructing a three-dimensional model based on the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image, the face shape parameters of the to-be-processed two-dimensional face image, and the facial feature parameters of the to-be-processed two-dimensional face image, to obtain a target frame face image corresponding to each frame of image; and generating, based on the target frame face images corresponding to the frames of image, a target face video corresponding to the to-be-processed two-dimensional face image.
The facial feature parameters of each target frame face image tend to be consistent with those of the corresponding frame of image in the face video template, so that the facial feature information in the obtained target frame face images tends to be consistent with that of the corresponding frames in the face video template.
The solutions of the embodiments of this application are described in detail below with reference to the accompanying drawings. As shown in Fig. 1, an embodiment of this application provides a video generation method, which specifically includes the following steps.
Step S101: perform key point recognition and three-dimensional reconstruction on a to-be-processed two-dimensional face image to obtain 3DMM parameters of the to-be-processed two-dimensional face image, the 3DMM parameters including face shape parameters and facial feature information.
In a possible implementation, the key point recognition may, without limitation, use a mature neural network model to infer a first set number of two-dimensional key points of the face in the to-be-processed two-dimensional face image, where the neural network model may include, without limitation, a convolutional neural network (CNN), a recurrent neural network (RNN), a deep neural network (DNN), or the like. To ensure the realism of the face in the finally generated target face video, the first set number may be, without limitation, set to 101.
After key point recognition is performed on the to-be-processed two-dimensional face image, three-dimensional reconstruction may be performed on the recognized two-dimensional key points, for example using the 3dmm of bfm2009, the 3dmm of bfm2017, or an ordinary 3dmm method, to obtain the 3DMM parameters of the to-be-processed two-dimensional face image.
Step S102: for each frame of image in a face video template, adjust the facial feature parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of the frame of image, to obtain adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to the frame of image; and construct a three-dimensional model based on the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to the frame of image, the face shape parameters of the to-be-processed two-dimensional face image, and the facial feature parameters of the to-be-processed two-dimensional face image, to obtain a target frame face image corresponding to the frame of image.
It should be noted that the purpose of adjusting the facial feature parameters of the to-be-processed two-dimensional face image is to make the facial features represented by the adjusted parameters tend to be consistent with the facial features represented by the facial feature parameters of each frame of image; that is, the facial feature information represented by the adjusted facial feature parameters corresponding to each frame of image tends to be consistent with the facial feature information of that frame.
As an embodiment, if the facial feature information includes expression parameters and posture angle information of the face, the facial feature parameters of the to-be-processed two-dimensional face image may be adjusted according to the facial feature parameters of each frame of image in the following manner, to obtain the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image:
adjusting the posture angle information of the to-be-processed two-dimensional face image based on the posture angle information of each frame of image, to obtain adjusted posture angle information of the to-be-processed two-dimensional face image corresponding to each frame of image;
determining the expression parameters of each frame of image and the adjusted posture angle information of the to-be-processed two-dimensional face image corresponding to that frame as the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image.
Here, the expression parameters in the facial feature parameters of each frame of image are taken as part of the adjusted facial feature parameters corresponding to that frame; that is, the adjusted facial feature parameters corresponding to each frame of image retain the expression features of the face in that frame of the face video template. The posture angle information of the to-be-processed two-dimensional face image is adjusted based on the posture angle information corresponding to each frame of image, and the adjusted posture angle information corresponding to each frame of image is taken as the other part of the adjusted facial feature parameters. In other words, the adjusted facial feature parameters corresponding to each frame of image retain the posture features of the face in the to-be-processed two-dimensional face image after adjustment according to the posture features of the face in that frame.
As an embodiment, the target frame face image corresponding to each frame of image may be obtained in the following manner:
for each frame of image, constructing a three-dimensional model according to the facial feature parameters of the to-be-processed two-dimensional image and the face shape parameters of the to-be-processed two-dimensional face image, to obtain a pre-adjustment three-dimensional mesh model 3dmesh_ori;
constructing a three-dimensional model according to the adjusted facial feature parameters corresponding to each frame of image and the face shape parameters of the to-be-processed two-dimensional face image, to obtain an adjusted three-dimensional mesh model 3dmesh_new;
projecting the pre-adjustment three-dimensional mesh model 3dmesh_ori and the adjusted three-dimensional mesh model 3dmesh_new onto the same plane, to obtain a pre-adjustment two-dimensional mesh model 2dmesh_ori and an adjusted two-dimensional mesh model 2dmesh_new;
replacing the pixel value of each pixel in the adjusted two-dimensional mesh model 2dmesh_new with the pixel value of the corresponding pixel in the pre-adjustment two-dimensional mesh model 2dmesh_ori, to obtain the target frame face image corresponding to each frame of image.
A two-dimensional mesh model can be regarded as a projection of a three-dimensional mesh model onto a two-dimensional plane. Referring to Fig. 2, an embodiment of this application further provides a schematic diagram of a two-dimensional mesh model, which consists of a set of discrete points and the small triangles they enclose; each small triangle can be regarded as a pixel, and the color inside each small triangle can be regarded as the pixel value of that pixel.
Both the adjusted two-dimensional mesh model 2dmesh_new and the pre-adjustment two-dimensional mesh model 2dmesh_ori are two-dimensional mesh models of the face in the to-be-processed two-dimensional image; only some expression-related pixels of 2dmesh_new have changed relative to 2dmesh_ori. For example, if Fig. 2 is 2dmesh_ori, then 2dmesh_new may be a two-dimensional mesh model in which the pixels of the oral cavity region 201 in Fig. 2 have changed somewhat. There is therefore a correspondence between the pixels in 2dmesh_new and those in 2dmesh_ori, and according to this correspondence the pixel value of each pixel in 2dmesh_new can be replaced with the pixel value of the corresponding pixel in 2dmesh_ori.
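The projection and index-based texture transfer described above can be sketched as follows. This is a simplified illustration, not the patent's actual implementation: it assumes an orthographic camera and represents each triangle's color as a single value, whereas a real pipeline may use a perspective camera and rasterized per-pixel colors.

```python
import numpy as np

def project_orthographic(verts3d):
    # Project a 3D mesh onto the image plane by dropping the z coordinate
    # (an orthographic simplification of the "project onto the same plane" step).
    return np.asarray(verts3d, dtype=float)[:, :2]

def transfer_colors(verts2d_new, triangles, colors_ori):
    # Triangle i of the adjusted mesh corresponds to triangle i of the
    # pre-adjustment mesh, so each original color is copied by index onto
    # the deformed geometry: new shape, original texture.
    return [(verts2d_new[tri], colors_ori[i]) for i, tri in enumerate(triangles)]
```

Because the correspondence is by triangle index, the deformed face keeps the texture of the original image rather than a rendered appearance, which is what preserves realism.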
Step S103: obtain, based on the target frame face image corresponding to each frame of image, a target face video corresponding to the to-be-processed two-dimensional face image.
In a possible implementation, the target frame face images corresponding to the frames of image may be arranged in the order of the frames in the face video template, to obtain the target face video corresponding to the to-be-processed two-dimensional face image.
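The assembly step is a plain re-ordering. As a minimal sketch (the mapping name and frame representation are illustrative), the generated target frames are emitted in the template's frame order:

```python
def assemble_target_video(target_frames):
    # target_frames maps a template frame index to its generated target
    # frame face image; sorting by index reproduces the template's frame
    # order as the output video's frame sequence.
    return [target_frames[i] for i in sorted(target_frames)]
```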
As an embodiment, before the step in step S102 of adjusting the facial feature parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of each frame of image to obtain the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image, the method further includes: performing key point recognition on each frame of image; and performing three-dimensional reconstruction on each frame of image according to the key point recognition result, to obtain the facial feature parameters in the 3DMM parameters of each frame of image.
In a possible implementation, the key point recognition may, without limitation, use a mature neural network model to infer a second set number of two-dimensional key points of each frame of image, where the neural network model may include, without limitation, a CNN, an RNN, a DNN, or the like; to ensure the realism of the face in the finally generated target face video, the second set number may be, without limitation, set to 101.
After key point recognition is performed on each frame of image, three-dimensional reconstruction may be performed on the recognized two-dimensional key points, for example using the 3dmm of bfm2009, the 3dmm of bfm2017, or an ordinary 3dmm method, to obtain the facial feature parameters in the 3DMM parameters of each frame of image.
As shown in Fig. 3, a process of obtaining the facial feature parameters of each frame of image in the face video template is provided, which may include:
Step S301: input the face video template and obtain each frame of image of the face video template.
Step S302: perform key point recognition on each frame of image to obtain the key points of each frame of image.
Step S303: perform three-dimensional reconstruction on the key points of each frame of image through the 3dmm of bfm2009.
Step S304: extract, according to the result of the three-dimensional reconstruction of the key points of each frame of image, the facial feature parameters in the 3DMM parameters of each frame of image.
After step S304, the extracted facial feature parameters of each frame of image may be stored as a preprocessed template, for later use when generating a target face video for a to-be-processed two-dimensional face image.
As an embodiment, the posture angle information includes at least one of the posture angle parameters yaw, pitch, and roll. Referring to Fig. 4, a schematic diagram of the yaw, pitch, and roll angles is given: with the center point of the person's head as the origin, a three-dimensional coordinate system is established in which the x-axis points from the origin into the image, the y-axis points toward the top of the figure, and the z-axis points toward the right of the figure; the yaw angle is rotation about the y-axis, the pitch angle is rotation about the x-axis, and the roll angle is rotation about the z-axis.
In step S102, the posture angle information of the to-be-processed two-dimensional face image may be adjusted based on the posture angle information of each frame of image in the following manner, to obtain the adjusted posture angle information of the to-be-processed two-dimensional face image corresponding to each frame of image:
for each of the at least one posture angle parameter, determining an average posture angle parameter of that parameter over all frames of image in the face video template;
determining a deviation angle corresponding to each posture angle parameter of each frame of image, the deviation angle being the deviation of that posture angle parameter from the corresponding average posture angle parameter;
determining each piece of adjusted posture angle information corresponding to each frame of image based on each posture angle parameter of the to-be-processed two-dimensional face image and the deviation angle corresponding to that posture angle parameter in each frame of image.
Further, based on the principle of Formula 1 below, the yaw angle in the facial feature parameters of the to-be-processed two-dimensional face image may be adjusted based on the yaw angle in the facial feature parameters of each frame of image, to obtain the adjusted yaw angle of the to-be-processed two-dimensional face image corresponding to each frame of image:
Formula 1: src1.yaw = src.yaw + (dst.yaw - dst.meanyaw) × k1;
In Formula 1, src1.yaw is the adjusted yaw angle of the to-be-processed two-dimensional face image corresponding to each frame of image, src.yaw is the yaw angle of the to-be-processed two-dimensional face image, dst.yaw is the yaw angle in the facial feature parameters of the frame of image, dst.meanyaw is the average of the yaw angles in the facial feature parameters of all frames of image in the face video template, and k1 is the adjustment coefficient of the yaw angle.
Here, to avoid an excessively large yaw adjustment causing the to-be-processed two-dimensional face image to deform visibly, and to avoid an excessively small yaw adjustment causing no change in the image, k1 may be, without limitation, set to 0.2 or 0.3.
Further, through Formula 2 below, the pitch angle in the facial feature parameters of the to-be-processed two-dimensional face image may be adjusted based on the pitch angle in the facial feature parameters of each frame of image, to obtain the adjusted pitch angle of the to-be-processed two-dimensional face image corresponding to each frame of image:
Formula 2: src1.pitch = src.pitch + (dst.pitch - dst.meanpitch) × k2;
In Formula 2, src1.pitch is the adjusted pitch angle of the to-be-processed two-dimensional face image corresponding to each frame of image, src.pitch is the pitch angle of the to-be-processed two-dimensional face image, dst.pitch is the pitch angle in the facial feature parameters of the frame of image, dst.meanpitch is the average of the pitch angles in the facial feature parameters of all frames of image in the face video template, and k2 is the adjustment coefficient of the pitch angle.
Here, to avoid an excessively large pitch adjustment causing the to-be-processed two-dimensional face image to deform visibly, and to avoid an excessively small pitch adjustment causing no change in the image, k2 may be, without limitation, set to 0.2 or 0.3.
Further, through Formula 3 below, the roll angle in the facial feature parameters of the to-be-processed two-dimensional face image may be adjusted based on the roll angle in the facial feature parameters of each frame of image, to obtain the adjusted roll angle of the to-be-processed two-dimensional face image corresponding to each frame of image:
Formula 3: src1.roll = src.roll + (dst.roll - dst.meanroll) × k3;
In Formula 3, src1.roll is the adjusted roll angle of the to-be-processed two-dimensional face image corresponding to each frame of image, src.roll is the roll angle of the to-be-processed two-dimensional face image, dst.roll is the roll angle in the facial feature parameters of the frame of image, dst.meanroll is the average of the roll angles in the facial feature parameters of all frames of image in the face video template, and k3 is the adjustment coefficient of the roll angle.
Here, to avoid an excessively large roll adjustment causing the to-be-processed two-dimensional face image to deform visibly, and to avoid an excessively small roll adjustment causing no change in the image, k3 may be, without limitation, set to 0.1 or 0.2.
It should be noted that the inventor considered that adjusting the roll angle causes the face to twist: an excessively large roll adjustment coefficient k3 causes the face and background to twist too much, while a small k3 causes no twist and makes the face appear stiff. Therefore, the roll adjustment coefficient in the embodiments of this application may be, without limitation, slightly smaller than the yaw or pitch adjustment coefficient.
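Formulas 1-3 share the same structure and can be applied jointly to all three angles. The sketch below vectorizes them with numpy; the function name and the default coefficients (0.2, 0.2, 0.1, from the ranges suggested above) are illustrative choices, not mandated values:

```python
import numpy as np

def adjust_pose(src_pose, template_poses, k=(0.2, 0.2, 0.1)):
    # Formulas 1-3: shift each source angle (yaw, pitch, roll) by the
    # template frame's deviation from the template's mean angle, damped
    # by k so the source face moves without visible distortion.
    template_poses = np.asarray(template_poses, dtype=float)  # (n_frames, 3)
    mean_pose = template_poses.mean(axis=0)                   # per-angle mean over all frames
    # One adjusted (yaw, pitch, roll) triple per template frame.
    return np.asarray(src_pose, dtype=float) + (template_poses - mean_pose) * np.asarray(k, dtype=float)
```

Using the deviation from the template's mean, rather than the template angle itself, keeps the source face near its own original pose while still following the template's motion.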
As an embodiment, in step S102, if the oral cavity region of the face in a frame of image in the face video template is open (that is, the face in the template is smiling with an open mouth) while the face in the to-be-processed two-dimensional face image has a closed mouth, or if the face in the template has a closed mouth while the face in the to-be-processed two-dimensional face image has an open mouth, the expression of the face in the obtained target frame face image may be abnormal. Therefore, after the step of replacing the pixel value of each pixel in the adjusted two-dimensional mesh model with the pixel value of the corresponding pixel in the pre-adjustment two-dimensional mesh model to obtain the target frame face image corresponding to each frame of image, key point recognition may further be performed on the target frame face image corresponding to each frame of image to obtain oral cavity edge points; the oral cavity edge points of the target frame face image corresponding to each frame of image are adjusted based on the oral cavity region in the two-dimensional mesh model corresponding to that frame, and the pixel value of each pixel in the oral cavity region enclosed by the adjusted edge points is replaced with the pixel value of the corresponding pixel in a preset oral cavity mesh template.
That is, for a frame of image in the face video template, if the face in that frame has an open mouth, the oral cavity region in the two-dimensional mesh model corresponding to that frame is relatively large; the oral cavity edge points of the corresponding target frame face image can then be adjusted based on that mesh model so that the region enclosed by the adjusted edge points matches the oral cavity region of the mesh model, and the pixels of the enclosed region are filled based on the preset oral cavity mesh template. If the face in that frame has a closed mouth, the oral cavity region in the corresponding two-dimensional mesh model is relatively small; the oral cavity edge points of the corresponding target frame face image are adjusted based on that mesh model, and since the face in that frame has a closed mouth, the region enclosed by the adjusted edge points is small, so even if its pixels are filled from the preset oral cavity mesh template, the affected region remains small.
Considering further improving the accuracy of adjusting the oral cavity region of the target frame face image, in the embodiments of this application it is also possible, after the step of replacing the pixel value of each pixel in the adjusted two-dimensional mesh model with the pixel value of the corresponding pixel in the pre-adjustment two-dimensional mesh model to obtain the target frame face image corresponding to each frame of image, to detect whether the oral cavity region of each frame of image in the face video template is closed; if an image whose oral cavity region is not closed is detected among the frames, key point recognition is performed on the target frame face image corresponding to each frame whose oral cavity region is not closed to obtain oral cavity edge points;
the oral cavity edge points in the target frame face image corresponding to each frame whose oral cavity region is not closed are adjusted based on the oral cavity region in the two-dimensional mesh model corresponding to that frame, and the pixel value of each pixel in the oral cavity region determined by the adjusted edge points is replaced with the pixel value of the corresponding pixel in the preset oral cavity mesh template.
For an image whose oral cavity region is detected as closed, the oral cavity edge points of its corresponding target face image need not be adjusted in the above manner.
To adjust the oral cavity region of the target frame face image more accurately, in the embodiments of this application 16 oral cavity edge points may be detected by key point recognition, and the positions of these 16 edge points are then adjusted. Because the teeth darken due to occlusion when the mouth is closed, after the pixel value of each pixel in the oral cavity region enclosed by the adjusted edge points is replaced with the pixel value of the corresponding pixel in the preset oral cavity mesh template, alpha blending may be used at the boundary of the adjusted oral cavity region to fuse the mouth boundary, so that the adjusted oral cavity region blends better with the rest of the target frame face image. Fig. 5 is a schematic diagram of the adjusted oral cavity region of a target frame face image.
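The boundary fusion mentioned above is standard per-pixel alpha blending. A minimal sketch (function and variable names are illustrative; generating the falloff weight map is a separate step not shown here):

```python
import numpy as np

def blend_mouth(mouth_patch, face_patch, alpha):
    # Per-pixel alpha blend at the mouth boundary: alpha is 1 inside the
    # refilled mouth region and falls off to 0 at its edge, so the pasted
    # mouth fades smoothly into the surrounding face instead of showing a seam.
    a = np.asarray(alpha, dtype=float)[..., None]  # (H, W, 1): broadcast over RGB
    return a * np.asarray(mouth_patch, dtype=float) + (1.0 - a) * np.asarray(face_patch, dtype=float)
```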
As shown in Fig. 6, a process of obtaining the target frame face image corresponding to an arbitrary frame of image in the face video template is provided below, which specifically includes the following steps.
Step S601: perform key point recognition and three-dimensional reconstruction on the to-be-processed two-dimensional face image to obtain the face shape parameters and facial feature parameters of the to-be-processed two-dimensional face image, the facial feature parameters including expression parameters and posture angle information.
Step S602: obtain the facial feature parameters of the arbitrary frame of image, the facial feature parameters including expression parameters and posture angle information.
Step S603: adjust the posture angle information of the to-be-processed two-dimensional image based on the posture angle information of the arbitrary frame of image, to obtain the adjusted posture angle information of the to-be-processed two-dimensional face image corresponding to the arbitrary frame of image.
Step S604: determine the expression parameters of the arbitrary frame of image and the adjusted posture angle information of the to-be-processed two-dimensional face image corresponding to that frame as the adjusted facial feature parameters corresponding to the arbitrary frame of image.
Step S605: construct a three-dimensional model according to the facial feature parameters of the to-be-processed two-dimensional image and the face shape parameters of the to-be-processed two-dimensional image, to obtain the pre-adjustment three-dimensional mesh model 3dmesh_ori.
Step S606: construct a three-dimensional model according to the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to the arbitrary frame of image and the face shape parameters of the to-be-processed two-dimensional face image, to obtain the adjusted three-dimensional mesh model 3dmesh_new.
Step S607: project 3dmesh_ori and 3dmesh_new onto the same plane to obtain the pre-adjustment two-dimensional mesh model 2dmesh_ori and the adjusted two-dimensional mesh model 2dmesh_new; and replace the pixel value of each pixel in 2dmesh_new with the pixel value of the corresponding pixel in 2dmesh_ori, to obtain the target frame face image corresponding to the arbitrary frame of image.
Step S608: recognize the oral cavity edge points of the target frame face image; adjust the oral cavity edge points based on the oral cavity region in the two-dimensional mesh model corresponding to the arbitrary frame of image, and replace the pixel value of each pixel in the oral cavity region enclosed by the adjusted edge points with the pixel value of the corresponding pixel in the preset oral cavity mesh template.
Referring to Fig. 7, a schematic diagram of a to-be-processed two-dimensional face image is given; Fig. 8 shows a frame of image in a face video template; Fig. 9 is a schematic diagram of the target frame face image obtained by adjusting the facial feature parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of that frame of image, and by constructing a model from the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to that frame, the facial feature parameters of the to-be-processed two-dimensional face image, and the face shape parameters of the to-be-processed two-dimensional face image.
In the embodiments of this application, the posture angle information of the to-be-processed two-dimensional face image is adjusted based on the posture angle information of each frame of image in the face video template, and a target face video that adds dynamic expressions to the to-be-processed two-dimensional face image is obtained based on the adjusted posture angle information corresponding to each frame of image, the expression parameters of each frame of image in the face video template, and the face shape parameters of the to-be-processed two-dimensional face image. On the one hand, this simplifies the process of generating a dynamic video from the to-be-processed two-dimensional face image; on the other hand, dynamic expressions are added to the to-be-processed two-dimensional face image while the realism of the face in the obtained target face video is ensured and the possibility that the shape of the face deforms in the generated dynamic video is reduced.
As shown in Fig. 10, based on the same inventive concept, an embodiment of this application further provides a video generation apparatus 1000, including:
a parameter acquisition unit 1001 configured to perform key point recognition and three-dimensional reconstruction on a to-be-processed two-dimensional face image to obtain 3DMM parameters of the to-be-processed two-dimensional face image, the 3DMM parameters including face shape parameters and facial feature parameters;
a target frame face image acquisition unit 1002 configured to, for each frame of image in a face video template, adjust the facial feature parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of the frame of image to obtain adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image, and to construct a three-dimensional model based on the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image, the face shape parameters of the to-be-processed two-dimensional face image, and the facial feature parameters of the to-be-processed two-dimensional face image, to obtain a target frame face image corresponding to each frame of image;
a video generation unit 1003 configured to obtain, based on the target frame face image corresponding to each frame of image, a target face video corresponding to the to-be-processed two-dimensional face image.
Optionally, the facial feature information includes expression parameters and posture angle parameters of the face, and the target frame face image acquisition unit 1002 is specifically configured to: adjust the posture angle information of the to-be-processed two-dimensional face image based on the posture angle information of each frame of image to obtain adjusted posture angle information of the to-be-processed two-dimensional face image corresponding to each frame of image; and determine the expression parameters of each frame of image and the adjusted posture angle information of the to-be-processed two-dimensional face image corresponding to that frame as the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image.
In a possible implementation, the posture angle information includes at least one of a yaw angle, a pitch angle, and a roll angle, and the target frame face image acquisition unit 1002 is specifically configured to:
for each of the at least one posture angle parameter, determine an average posture angle parameter of that parameter over all frames of image in the face video template;
determine a deviation angle corresponding to each posture angle parameter of each frame of image, the deviation angle being the deviation of that posture angle parameter from the corresponding average posture angle parameter;
determine each piece of adjusted posture angle information corresponding to each frame of image based on each posture angle parameter of the to-be-processed two-dimensional face image and the deviation angle corresponding to that posture angle parameter in each frame of image.
In a possible implementation, the target frame face image acquisition unit 1002 is specifically configured to:
for each frame of image, construct a three-dimensional model according to the facial feature parameters of the to-be-processed two-dimensional image and the face shape parameters of the to-be-processed two-dimensional image, to obtain a pre-adjustment three-dimensional mesh model;
construct a three-dimensional model according to the adjusted facial feature parameters corresponding to each frame of image and the face shape parameters of the to-be-processed two-dimensional face image, to obtain an adjusted three-dimensional mesh model;
project the pre-adjustment three-dimensional mesh model and the adjusted three-dimensional mesh model onto the same plane, to obtain a pre-adjustment two-dimensional mesh model and an adjusted two-dimensional mesh model;
replace the pixel value of each pixel in the adjusted two-dimensional mesh model with the pixel value of the corresponding pixel in the pre-adjustment two-dimensional mesh model, to obtain the target frame face image corresponding to each frame of image.
In a possible implementation, the target frame face image acquisition unit 1002 is further configured to:
after the step of replacing the pixel value of each pixel in the adjusted two-dimensional mesh model with the pixel value of the corresponding pixel in the pre-adjustment two-dimensional mesh model to obtain the target frame face image corresponding to each frame of image, perform key point recognition on the target frame face image corresponding to each frame of image to obtain oral cavity edge points;
adjust the oral cavity edge points in the target frame face image corresponding to each frame of image based on the oral cavity region in the two-dimensional mesh model corresponding to that frame, and replace the pixel value of each pixel in the oral cavity region determined by the adjusted edge points with the pixel value of the corresponding pixel in a preset oral cavity mesh template.
In a possible implementation, the target frame face image acquisition unit 1002 is specifically configured to:
after the step of replacing the pixel value of each pixel in the adjusted two-dimensional mesh model with the pixel value of the corresponding pixel in the pre-adjustment two-dimensional mesh model to obtain the target frame face image corresponding to each frame of image, if an image whose oral cavity region is not closed is detected among the frames of image, perform key point recognition on the target frame face image corresponding to each frame whose oral cavity region is not closed, to obtain oral cavity edge points;
adjust the oral cavity edge points in the target frame face image corresponding to each frame whose oral cavity region is not closed based on the oral cavity region in the two-dimensional mesh model corresponding to that frame, and replace the pixel value of each pixel in the oral cavity region determined by the adjusted edge points with the pixel value of the corresponding pixel in the preset oral cavity mesh template.
In a possible implementation, the target frame face image acquisition unit 1002 is further configured to: before the step of adjusting the facial feature parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of each frame of image to obtain the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image, perform key point recognition on each frame of image; and perform three-dimensional reconstruction on each frame of image according to the key point recognition result, to obtain the facial feature parameters in the 3DMM parameters of each frame of image.
As shown in Fig. 11, an embodiment of this application provides an electronic device 1100, including a processor 1101 and a memory 1102 for storing instructions executable by the processor, where the processor 1101 is configured to execute any one of the above video generation methods.
An embodiment of this application provides a computer-readable storage medium carrying one or more computer instruction programs which, when executed by one or more processors, cause the one or more processors to execute any one of the above video generation methods.
In an exemplary embodiment, a storage medium including instructions is further provided, for example a memory including instructions executable by a processor of the above electronic device to complete the above method. Optionally, the storage medium may be a non-transitory computer-readable storage medium; for example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Those skilled in the art will readily conceive of other embodiments of this application after considering the specification and practicing the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of this application that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and the embodiments are to be regarded as exemplary only, with the true scope and spirit of this application being indicated by the following claims.
It should be understood that this application is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes may be made without departing from its scope. The scope of this application is limited only by the appended claims.

Claims (16)

  1. A video generation method, comprising:
    performing key point recognition and three-dimensional reconstruction on a to-be-processed two-dimensional face image to obtain three-dimensional face morphable model (3DMM) parameters of the to-be-processed two-dimensional face image, the 3DMM parameters comprising face shape parameters and facial feature parameters;
    for each frame of image in a face video template, adjusting the facial feature parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of the frame of image, to obtain adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to the frame of image; and constructing a three-dimensional model based on the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to the frame of image, the face shape parameters of the to-be-processed two-dimensional face image, and the facial feature parameters of the to-be-processed two-dimensional face image, to obtain a target frame face image corresponding to the frame of image;
    obtaining, based on the target frame face image corresponding to each frame of image, a target face video corresponding to the to-be-processed two-dimensional face image.
  2. The method according to claim 1, wherein the facial feature parameters comprise expression parameters and posture angle information of the face, and the step of adjusting the facial feature parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of each frame of image to obtain the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image comprises:
    adjusting the posture angle information of the to-be-processed two-dimensional face image based on the posture angle information of each frame of image, to obtain adjusted posture angle information of the to-be-processed two-dimensional face image corresponding to each frame of image;
    determining the expression parameters of each frame of image and the adjusted posture angle information of the to-be-processed two-dimensional face image corresponding to that frame as the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image.
  3. The method according to claim 2, wherein the posture angle information comprises at least one of a yaw angle, a pitch angle, and a roll angle, and the step of adjusting the posture angle information of the to-be-processed two-dimensional face image based on the posture angle information of each frame of image to obtain the adjusted posture angle information of the to-be-processed two-dimensional face image corresponding to each frame of image comprises:
    for each of the at least one posture angle parameter, determining an average posture angle parameter of that parameter over all frames of image in the face video template;
    determining a deviation angle corresponding to each posture angle parameter of each frame of image, the deviation angle being the deviation of that posture angle parameter from the corresponding average posture angle parameter;
    determining each piece of adjusted posture angle information corresponding to each frame of image based on each posture angle parameter of the to-be-processed two-dimensional face image and the deviation angle corresponding to that posture angle parameter in each frame of image.
  4. The method according to claim 1, wherein the step of constructing a three-dimensional model based on the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image, the face shape parameters of the to-be-processed two-dimensional face image, and the facial feature parameters of the to-be-processed two-dimensional face image to obtain the target frame face image corresponding to each frame of image comprises:
    for each frame of image, constructing a three-dimensional model according to the facial feature parameters of the to-be-processed two-dimensional image and the face shape parameters of the to-be-processed two-dimensional image, to obtain a pre-adjustment three-dimensional mesh model;
    constructing a three-dimensional model according to the adjusted facial feature parameters corresponding to each frame of image and the face shape parameters of the to-be-processed two-dimensional face image, to obtain an adjusted three-dimensional mesh model;
    projecting the pre-adjustment three-dimensional mesh model and the adjusted three-dimensional mesh model onto the same plane, to obtain a pre-adjustment two-dimensional mesh model and an adjusted two-dimensional mesh model;
    replacing the pixel value of each pixel in the adjusted two-dimensional mesh model with the pixel value of the corresponding pixel in the pre-adjustment two-dimensional mesh model, to obtain the target frame face image corresponding to each frame of image.
  5. The method according to claim 4, further comprising, after the step of replacing the pixel value of each pixel in the adjusted two-dimensional mesh model with the pixel value of the corresponding pixel in the pre-adjustment two-dimensional mesh model to obtain the target frame face image corresponding to each frame of image:
    performing key point recognition on the target frame face image corresponding to each frame of image to obtain oral cavity edge points;
    adjusting the oral cavity edge points in the target frame face image corresponding to each frame of image based on the oral cavity region in the two-dimensional mesh model corresponding to that frame, and replacing the pixel value of each pixel in the oral cavity region determined by the adjusted edge points with the pixel value of the corresponding pixel in a preset oral cavity mesh template.
  6. The method according to claim 4, further comprising, after the step of replacing the pixel value of each pixel in the adjusted two-dimensional mesh model with the pixel value of the corresponding pixel in the pre-adjustment two-dimensional mesh model to obtain the target frame face image corresponding to each frame of image:
    if an image whose oral cavity region is not closed is detected among the frames of image, performing key point recognition on the target frame face image corresponding to each frame whose oral cavity region is not closed, to obtain oral cavity edge points;
    adjusting the oral cavity edge points in the target frame face image corresponding to each frame whose oral cavity region is not closed based on the oral cavity region in the two-dimensional mesh model corresponding to that frame, and replacing the pixel value of each pixel in the oral cavity region determined by the adjusted edge points with the pixel value of the corresponding pixel in a preset oral cavity mesh template.
  7. The method according to any one of claims 1 to 6, further comprising, before the step of adjusting the facial feature parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of each frame of image to obtain the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image:
    performing key point recognition on each frame of image;
    performing three-dimensional reconstruction on each frame of image according to the key point recognition result, to obtain the facial feature parameters in the 3DMM parameters of each frame of image.
  8. A video generation apparatus, comprising:
    a parameter acquisition unit configured to perform key point recognition and three-dimensional reconstruction on a to-be-processed two-dimensional face image to obtain three-dimensional face morphable model (3DMM) parameters of the to-be-processed two-dimensional face image, the 3DMM parameters comprising face shape parameters and facial feature parameters;
    a target frame face image acquisition unit configured to, for each frame of image in a face video template, adjust the facial feature parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of the frame of image, to obtain adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to the frame of image, and to construct a three-dimensional model based on the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to the frame of image, the face shape parameters of the to-be-processed two-dimensional face image, and the facial feature parameters of the to-be-processed two-dimensional face image, to obtain a target frame face image corresponding to the frame of image;
    a video generation unit configured to obtain, based on the target frame face image corresponding to each frame of image, a target face video corresponding to the to-be-processed two-dimensional face image.
  9. The apparatus according to claim 8, wherein the facial feature information comprises expression parameters and posture angle parameters of the face, and the target frame face image acquisition unit is specifically configured to:
    adjust the posture angle information of the to-be-processed two-dimensional face image based on the posture angle information of each frame of image, to obtain adjusted posture angle information of the to-be-processed two-dimensional face image corresponding to each frame of image;
    determine the expression parameters of each frame of image and the adjusted posture angle information of the to-be-processed two-dimensional face image corresponding to that frame as the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image.
  10. The apparatus according to claim 9, wherein the posture angle information comprises at least one of a yaw angle, a pitch angle, and a roll angle, and the target frame face image acquisition unit is specifically configured to:
    for each of the at least one posture angle parameter, determine an average posture angle parameter of that parameter over all frames of image in the face video template;
    determine a deviation angle corresponding to each posture angle parameter of each frame of image, the deviation angle being the deviation of that posture angle parameter from the corresponding average posture angle parameter;
    determine each piece of adjusted posture angle information corresponding to each frame of image based on each posture angle parameter of the to-be-processed two-dimensional face image and the deviation angle corresponding to that posture angle parameter in each frame of image.
  11. The apparatus according to claim 8, wherein the target frame face image acquisition unit is specifically configured to:
    for each frame of image, construct a three-dimensional model according to the facial feature parameters of the to-be-processed two-dimensional image and the face shape parameters of the to-be-processed two-dimensional image, to obtain a pre-adjustment three-dimensional mesh model;
    construct a three-dimensional model according to the adjusted facial feature parameters corresponding to each frame of image and the face shape parameters of the to-be-processed two-dimensional face image, to obtain an adjusted three-dimensional mesh model;
    project the pre-adjustment three-dimensional mesh model and the adjusted three-dimensional mesh model onto the same plane, to obtain a pre-adjustment two-dimensional mesh model and an adjusted two-dimensional mesh model;
    replace the pixel value of each pixel in the adjusted two-dimensional mesh model with the pixel value of the corresponding pixel in the pre-adjustment two-dimensional mesh model, to obtain the target frame face image corresponding to each frame of image.
  12. The apparatus according to claim 11, wherein the target frame face image acquisition unit is further configured to:
    after the step of replacing the pixel value of each pixel in the adjusted two-dimensional mesh model with the pixel value of the corresponding pixel in the pre-adjustment two-dimensional mesh model to obtain the target frame face image corresponding to each frame of image, perform key point recognition on the target frame face image corresponding to each frame of image to obtain oral cavity edge points;
    adjust the oral cavity edge points in the target frame face image corresponding to each frame of image based on the oral cavity region in the two-dimensional mesh model corresponding to that frame, and replace the pixel value of each pixel in the oral cavity region determined by the adjusted edge points with the pixel value of the corresponding pixel in a preset oral cavity mesh template.
  13. The apparatus according to claim 11, wherein the target frame face image acquisition unit is specifically configured to:
    after the step of replacing the pixel value of each pixel in the adjusted two-dimensional mesh model with the pixel value of the corresponding pixel in the pre-adjustment two-dimensional mesh model to obtain the target frame face image corresponding to each frame of image, if an image whose oral cavity region is not closed is detected among the frames of image, perform key point recognition on the target frame face image corresponding to each frame whose oral cavity region is not closed, to obtain oral cavity edge points;
    adjust the oral cavity edge points in the target frame face image corresponding to each frame whose oral cavity region is not closed based on the oral cavity region in the two-dimensional mesh model corresponding to that frame, and replace the pixel value of each pixel in the oral cavity region determined by the adjusted edge points with the pixel value of the corresponding pixel in a preset oral cavity mesh template.
  14. The apparatus according to any one of claims 8 to 13, wherein the target frame face image acquisition unit is further configured to:
    before the step of adjusting the facial feature parameters of the to-be-processed two-dimensional face image according to the facial feature parameters of each frame of image to obtain the adjusted facial feature parameters of the to-be-processed two-dimensional face image corresponding to each frame of image, perform key point recognition on each frame of image;
    perform three-dimensional reconstruction on each frame of image according to the key point recognition result, to obtain the facial feature parameters in the 3DMM parameters of each frame of image.
  15. An electronic device, comprising a processor and a memory for storing instructions executable by the processor;
    wherein the processor is configured to execute the method according to any one of claims 1 to 7.
  16. A computer-readable storage medium carrying one or more computer instruction programs which, when executed by one or more processors, cause the one or more processors to execute the method according to any one of claims 1 to 7.
PCT/CN2020/126223 2020-05-18 2020-11-03 A video generation method, apparatus, electronic device, and storage medium WO2021232690A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010420064.2 2020-05-18
CN202010420064.2A CN113689538B (zh) 2020-05-18 A video generation method, apparatus, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
WO2021232690A1 true WO2021232690A1 (zh) 2021-11-25

Family

ID=78575542

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/126223 WO2021232690A1 (zh) 2020-05-18 2020-11-03 一种视频生成方法、装置、电子设备及存储介质

Country Status (2)

Country Link
CN (1) CN113689538B (zh)
WO (1) WO2021232690A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114626979A (zh) * 2022-03-18 2022-06-14 广州虎牙科技有限公司 Face driving method and apparatus, electronic device, and storage medium
CN116453198A (zh) * 2023-05-06 2023-07-18 广州视景医疗软件有限公司 Gaze calibration method and apparatus based on head posture difference
WO2023241298A1 (zh) * 2022-06-16 2023-12-21 虹软科技股份有限公司 Video generation method and apparatus, storage medium, and electronic device
CN117593442A (zh) * 2023-11-28 2024-02-23 拓元(广州)智慧科技有限公司 Portrait generation method based on multi-stage fine-grained rendering

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114845067B (zh) * 2022-07-04 2022-11-04 中科计算技术创新研究院 Deep video propagation method for face editing based on latent-space disentanglement
CN118037939A (zh) * 2022-11-11 2024-05-14 广州视源电子科技股份有限公司 Virtual video image generation method, apparatus, device, and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107067429A (zh) * 2017-03-17 2017-08-18 徐迪 Deep-learning-based video editing system and method for three-dimensional face reconstruction and face replacement
CN110675475A (zh) * 2019-08-19 2020-01-10 腾讯科技(深圳)有限公司 Face model generation method, apparatus, device, and storage medium
CN110677598A (zh) * 2019-09-18 2020-01-10 北京市商汤科技开发有限公司 Video generation method and apparatus, electronic device, and computer storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109978984A (zh) * 2017-12-27 2019-07-05 Tcl集团股份有限公司 Three-dimensional face reconstruction method and terminal device
CN108062791A (zh) * 2018-01-12 2018-05-22 北京奇虎科技有限公司 Method and apparatus for reconstructing a three-dimensional face model
CN110796719A (zh) * 2018-07-16 2020-02-14 北京奇幻科技有限公司 Real-time facial expression reconstruction method
CN110866864A (zh) * 2018-08-27 2020-03-06 阿里巴巴集团控股有限公司 Face posture estimation / three-dimensional face reconstruction method and apparatus, and electronic device
CN109712080A (zh) * 2018-10-12 2019-05-03 迈格威科技有限公司 Image processing method, image processing apparatus, and storage medium
CN110956691B (zh) * 2019-11-21 2023-06-06 Oppo广东移动通信有限公司 Three-dimensional face reconstruction method, apparatus, device, and storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107067429A (zh) * 2017-03-17 2017-08-18 徐迪 Deep-learning-based video editing system and method for three-dimensional face reconstruction and face replacement
CN110675475A (zh) * 2019-08-19 2020-01-10 腾讯科技(深圳)有限公司 Face model generation method, apparatus, device, and storage medium
CN110677598A (zh) * 2019-09-18 2020-01-10 北京市商汤科技开发有限公司 Video generation method and apparatus, electronic device, and computer storage medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114626979A (zh) * 2022-03-18 2022-06-14 广州虎牙科技有限公司 Face driving method and apparatus, electronic device, and storage medium
WO2023241298A1 (zh) * 2022-06-16 2023-12-21 虹软科技股份有限公司 Video generation method and apparatus, storage medium, and electronic device
CN116453198A (zh) * 2023-05-06 2023-07-18 广州视景医疗软件有限公司 Gaze calibration method and apparatus based on head posture difference
CN116453198B (zh) * 2023-05-06 2023-08-25 广州视景医疗软件有限公司 Gaze calibration method and apparatus based on head posture difference
CN117593442A (zh) * 2023-11-28 2024-02-23 拓元(广州)智慧科技有限公司 Portrait generation method based on multi-stage fine-grained rendering
CN117593442B (zh) * 2023-11-28 2024-05-03 拓元(广州)智慧科技有限公司 Portrait generation method based on multi-stage fine-grained rendering

Also Published As

Publication number Publication date
CN113689538B (zh) 2024-05-21
CN113689538A (zh) 2021-11-23

Similar Documents

Publication Publication Date Title
WO2021232690A1 (zh) A video generation method, apparatus, electronic device, and storage medium
WO2018201551A1 (zh) Facial image fusion method and apparatus, and computing device
US9639914B2 (en) Portrait deformation method and apparatus
CN112669447B (zh) Model avatar creation method and apparatus, electronic device, and storage medium
EP3992919B1 (en) Three-dimensional facial model generation method and apparatus, device, and medium
Zhou et al. Parametric reshaping of human bodies in images
Yang et al. Facial expression editing in video using a temporally-smooth factorization
WO2022095721A1 (zh) Training method, apparatus, and device for a parameter estimation model, and storage medium
WO2022143645A1 (zh) Three-dimensional face reconstruction method, apparatus, device, and storage medium
CN111652123B (zh) Image processing and image synthesis method, apparatus, and storage medium
US10467793B2 (en) Computer implemented method and device
US20130127827A1 (en) Multiview Face Content Creation
CN113628327B (zh) Head three-dimensional reconstruction method and device
WO2020108304A1 (zh) Face mesh model reconstruction method, apparatus, device, and storage medium
EP3991140A1 (en) Portrait editing and synthesis
CN111243051B (zh) Line-drawing generation method and system based on a portrait photo, and storage medium
US20180225882A1 (en) Method and device for editing a facial image
CN113592988A (zh) Three-dimensional virtual character image generation method and apparatus
CA3173542A1 (en) Techniques for re-aging faces in images and video frames
CN113223137B (zh) Generation method and apparatus for a perspective-projection face point cloud image, and electronic device
US10467822B2 (en) Reducing collision-based defects in motion-stylization of video content depicting closely spaced features
JP2023089947A (ja) Facial appearance tracking system and method
WO2021197230A1 (zh) Three-dimensional head model construction method, apparatus, system, and storage medium
TWI844180B (zh) Image processing method and virtual reality display system
WO2023169023A1 (zh) Expression model generation method, apparatus, device, and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20937033

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 20.03.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 20937033

Country of ref document: EP

Kind code of ref document: A1