WO2022147736A1 - Method and apparatus for constructing a virtual image, device, and storage medium - Google Patents

Method and apparatus for constructing a virtual image, device, and storage medium

Info

Publication number
WO2022147736A1
WO2022147736A1 · PCT/CN2021/070727 · CN2021070727W
Authority
WO
WIPO (PCT)
Prior art keywords
face
expression
expression base
personalized
image
Prior art date
Application number
PCT/CN2021/070727
Other languages
English (en)
Chinese (zh)
Inventor
谢新林
Original Assignee
广州视源电子科技股份有限公司
Priority date
Filing date
Publication date
Application filed by 广州视源电子科技股份有限公司
Priority to PCT/CN2021/070727
Priority to CN202180024686.6A (CN115335865A)
Publication of WO2022147736A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings

Definitions

  • the embodiments of the present application relate to the technical field of image processing, and in particular, to a method, apparatus, device, and storage medium for constructing a virtual image.
  • Embodiments of the present application provide a virtual image construction method, apparatus, device, and storage medium, so as to solve the technical problem of stuttering (lag) caused by transmitting real face images in the related art.
  • an embodiment of the present application provides a method for constructing a virtual image, including:
  • acquiring current frame image data, the current frame image data comprising a face image of a target object;
  • an embodiment of the present application further provides a virtual image construction device, including:
  • an image acquisition module for acquiring current frame image data, where the current frame image data includes a face image of a target object
  • an expression base building module for constructing a neutral facial expression base and a plurality of individualized facial expression bases of the target object according to the current frame image data
  • a face model building module used for constructing a three-dimensional face model of the target object according to the neutral facial expression base and a plurality of individual facial expression bases;
  • a parameter determination module configured to determine the pose parameters of the three-dimensional face model and the weight coefficients of each facial personalized expression base when the three-dimensional face model is mapped to the face image;
  • a parameter sending module configured to send the pose parameters and the weight coefficients to a remote device, so that the remote device generates a virtual image corresponding to the face image according to the pose parameters and the weight coefficients.
  • an embodiment of the present application further provides a virtual image construction device, including:
  • one or more processors;
  • memory for storing one or more programs
  • wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the virtual image construction method described in the first aspect.
  • an embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored, which, when executed by a processor, implements the virtual image construction method described in the first aspect.
  • The above virtual image construction method, apparatus, device, and storage medium acquire current frame image data containing the target object's face image, construct the target object's neutral facial expression base and a plurality of facial personalized expression bases from the current frame image data, and then build a 3D face model from the neutral expression base and the personalized expression bases. The weight coefficients and pose parameters obtained when the 3D face model is mapped to the face image are determined and sent to the remote device, so that the remote device can display the virtual image corresponding to the face image using those pose parameters and weight coefficients. This solves the technical problem of stuttering caused by transmitting real face images in the related art.
  • the transmitted weight coefficients and pose parameters can enable the remote device to display the corresponding virtual image, effectively protecting the privacy of the target object and preventing information leakage.
  • the virtual image accurately follows the expressions and poses in the face image, ensuring the imaging quality at the remote device.
  • FIG. 1 is a flowchart of a method for constructing a virtual image provided by an embodiment of the present application
  • FIG. 2 is a schematic diagram of current frame image data provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a virtual image provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a reference three-dimensional face model provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a face image provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a three-dimensional face model of a target object provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of key points of a human face provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of an expression refinement partition provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of expression transfer provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of selecting key points of a face provided by an embodiment of the present application.
  • FIG. 13 is a schematic diagram of a mutually exclusive expression base provided by an embodiment of the present application.
  • FIG. 14 is a schematic diagram of another mutually exclusive expression base provided by an embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of an apparatus for constructing a virtual image provided by an embodiment of the present application.
  • FIG. 16 is a schematic structural diagram of a virtual image construction device according to an embodiment of the present application.
  • The virtual image construction method provided by the embodiment of the present application may be executed by a virtual image construction device; the virtual image construction device may be implemented by means of software and/or hardware, and may be composed of two or more physical entities or of a single physical entity.
  • the virtual image construction device may be a smart device such as a computer, a mobile phone, a tablet computer, or an interactive smart tablet.
  • the virtual image construction device is applied in the scenario of video communication using network communication technology, such as online conferences and online classes.
  • In this scenario, in addition to the virtual image construction device, other devices participating in the video communication are also involved.
  • the other devices can be one or more, and the other devices can also be smart devices such as computers, mobile phones, tablet computers, or interactive smart tablets.
  • The virtual image construction device executes the virtual image construction method provided in this embodiment, so that when the face image of the local user is collected it is processed, enabling the other devices to display the virtual image obtained from that face image.
  • the other devices are remote devices with respect to the virtual image construction device.
  • the virtual image construction method provided in this embodiment can also be executed when the remote device collects the face image of the user.
  • In that case, the remote device can also be considered a virtual image construction device, while the local device is used to display the corresponding virtual image.
  • For example, in an online class, the device used by the lecturer can be considered the virtual image construction device, and the devices used by the students can be considered remote devices.
  • the device used by the current speaker may be considered as a virtual image construction device, and the devices used by other participants may be considered as remote devices.
  • the virtual image construction device is installed with at least one type of operating system, wherein the operating system includes but is not limited to an Android system, an iOS system, and/or a Windows system.
  • the virtual image construction device can install at least one application program based on the operating system, and the application program can be an application program that comes with the operating system, or it can be an application program downloaded from a third-party device or server.
  • The type of application program is not limited in this embodiment. It can be understood that the virtual image construction method provided by the embodiment of the present application may also be implemented as an application program itself.
  • The virtual image construction device is installed with at least an application program for executing the virtual image construction method provided by the embodiment of the present application, and the virtual image construction method is executed when that application program runs.
  • FIG. 1 is a flowchart of a method for constructing a virtual image according to an embodiment of the present application.
  • the virtual image construction method specifically includes:
  • Step 110 Acquire current frame image data, where the current frame image data includes a face image of the target object.
  • the virtual image construction device may collect image data through an image collection device (eg, a camera) installed by itself.
  • the currently collected image data is recorded as the current frame image data.
  • The current frame image data includes the face image of the target object, where the target object refers to the object for which a virtual image needs to be generated; any object that can be recognized as a face image can be considered the target object and does not need to be specified in advance. For example, in an online classroom scenario, the target object can be the lecturer using the virtual image construction device, and the face image of the target object refers to the lecturer's face image.
  • the number of target objects in the image data of the current frame is one or more.
  • In this embodiment, the case where the current frame image data includes one target object is used as an example. In practical applications, when there are multiple target objects, each target object is processed in the same way as the current target object.
  • The technical means used to confirm whether the current frame image data contains a face image is not limited in this embodiment.
  • For example, a face detection algorithm based on deep learning is used to detect the face image area in the current frame image data; if a face image area is detected, it is determined that the current frame image data contains a face image; otherwise, it is determined that it does not contain a face image.
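  • A minimal sketch of this detection step, using OpenCV's bundled Haar-cascade face detector as a stand-in for the deep-learning detector described above (the patent does not name a specific model; the parameters below are illustrative):

```python
import cv2

# OpenCV's bundled frontal-face Haar cascade, standing in for the
# deep-learning face detector described in the embodiment.
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def contains_face(frame_bgr):
    """Return True if the current frame image data contains a face image."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return len(faces) > 0
```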
  • Step 120 construct a neutral facial expression base and a plurality of personalized facial expression bases of the target object according to the current frame image data.
  • the expression base can be understood as a three-dimensional face model containing the position information of the key points of the face, and the expression of the person can be reflected by the position information of the key points of the face in the expression base.
  • the key points of the face can be obtained from the key parts of the face.
  • the key parts of the face include eyebrows, eyes, nose, mouth, and cheeks, etc.
  • The face key points are located at these key parts and are used to describe the current action of each key part. The action of each key part can thus be determined from the face key points, and from them the face pose, face position, and facial expression can be determined; that is, the face key points carry the semantic information of the face.
  • facial expressions are divided into neutral expressions and personalized expressions.
  • Neutral expression refers to the shape of the face without any expression, which can reflect the identity of the face.
  • The face identity is a specific description of the shape of the face; it describes the key parts of the face. For example, the key parts described by a face identity may be big eyes, a high nose bridge, and thin lips.
  • Personalized expressions refer to expressions made by a human face, such as eyes closed, mouth open, frowning, etc.
  • the expression bases are divided into neutral expression bases and personalized facial expression bases.
  • the neutral facial expression base can be understood as an expression base representing neutral expressions, and the shape of the human face in three-dimensional space can be confirmed through the neutral facial expression base.
  • The facial personalized expression base refers to an expression base containing a personalized expression, and each facial personalized expression base corresponds to one personalized expression. Understandably, since human facial expressions are very rich, expressing all of them would require building a large number of facial personalized expression bases, which would greatly increase the amount of data processing. Therefore, in this embodiment, only facial personalized expression bases for basic expressions are constructed; the specific set of basic expressions can be chosen according to the actual situation, and the various expressions of a human face can be obtained by combining the basic expressions and the neutral expression.
  • For example, the basic expressions for the eyes include: left eye closed, left eye wide open, right eye closed, and right eye wide open. Various eye expressions can then be obtained from these four basic expressions and the neutral expression; for instance, a slightly squinting expression can be obtained by linearly superimposing left eye closed, right eye closed, and the neutral expression.
  • the neutral facial expression base and individual facial expression bases of the target object are constructed by using the facial image of the current frame image data.
  • prior information can be introduced.
  • the prior information is obtained by collecting a large amount of 3D face data, which can reflect the average coordinate data, face identity base vector and personalized expression base vector of a large amount of 3D face data, and a 3D face model can be constructed through the prior information , it can be understood that there are differences in the 3D face models obtained when different coefficients are set for the prior information.
  • the 3D face model can be regarded as a reference 3D face model, that is, a 3D face model corresponding to the target object face image in the current frame image data can be obtained by adjusting the reference 3D face model.
  • Specifically, first obtain the two-dimensional coordinates of each face key point in the face image of the target object, then project the three-dimensional key points of the reference three-dimensional face model (that is, the face key points in the reference three-dimensional face model) onto the two-dimensional plane to determine their two-dimensional coordinates, and then calculate the error, in the two-dimensional plane, between the three-dimensional key points of the reference three-dimensional face model and the face key points in the face image.
  • The three-dimensional key points for which the error is calculated have a corresponding relationship with the face key points; that is, each corresponding pair of a three-dimensional key point and a face key point occupies the same relative position in its respective image. For example, a corresponding three-dimensional key point and face key point are both the left boundary point of an eye.
  • Afterwards, the positions of the three-dimensional key points in the reference three-dimensional face model are adjusted according to the calculated error, so that when the three-dimensional key points of the adjusted reference three-dimensional face model are projected onto the two-dimensional plane, their coordinates coincide as closely as possible with those of the face key points in the face image.
  • the adjustment of the positions of the three-dimensional key points can be realized by adjusting the coefficients used by the prior information.
  • After the adjustment, the reference 3D face model can be considered the neutral facial expression base of the target object. It should be noted that in practical applications other methods can also be used to construct the neutral facial expression base; for example, a neural network can be used, where the face image or the key points in the face image are input into the neural network, which outputs the corresponding neutral facial expression base.
  • the neutral facial expression base is processed to obtain the personalized facial expression base of the target object.
  • the prior information is also introduced when constructing the face personalized expression base.
  • each basic expression corresponds to a piece of prior information, and that prior information represents the three-dimensional face model corresponding to the basic expression;
  • the neutral expression also corresponds to a piece of prior information, and that prior information represents the three-dimensional face model corresponding to the neutral expression;
  • the basic expression and the neutral expression in the a priori information above belong to the same face.
  • Specifically, first calculate the transfer deformation variables between the prior information corresponding to the neutral expression and the prior information corresponding to the basic expression; that is, after the prior information corresponding to the neutral expression is transformed by the transfer deformation variables, the prior information corresponding to the basic expression is obtained.
  • the neutral facial expression base is converted according to the transfer deformation variable, so as to obtain the personalized facial expression base of the target object under the basic expression. It can be understood that, according to the above method, the personalized facial expression base of the target object under each basic expression can be obtained.
  • Step 130 constructing a three-dimensional face model of the target object according to the neutral facial expression base and multiple personalized facial expression bases.
  • Specifically, a weight coefficient is set for each facial personalized expression base, and the facial personalized expression bases and the neutral facial expression base are then linearly weighted using these coefficients, so that a three-dimensional face model of the target object with an expression can be obtained.
  • The 3D face model is expressed as B = B_0 + Σ_{i=1}^{n} α_i · (B_i − B_0), where B represents the three-dimensional face model, B_0 represents the neutral facial expression base, B_i represents the i-th facial personalized expression base, 1 ≤ i ≤ n, n is the total number of facial personalized expression bases, and α_i represents the weight coefficient corresponding to B_i.
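  • A sketch of this linear combination, assuming each expression base is stored as an N x 3 vertex array (the names below are illustrative):

```python
import numpy as np

def blend_face(B0, bases, alphas):
    """Combine the neutral base B0 (N x 3) with n personalized bases.

    B = B0 + sum_i alpha_i * (B_i - B0), following the formula above.
    """
    B = B0.copy()
    for B_i, a_i in zip(bases, alphas):
        B += a_i * (B_i - B0)
    return B
```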
  • Step 140 Determine the pose parameters of the three-dimensional face model when the three-dimensional face model is mapped to the face image and the weight coefficients of the individualized expression bases of each face.
  • the weight coefficients of the individualized expression bases of each face in the three-dimensional human face model can be continuously adjusted, so that the expressions represented by the three-dimensional human face model are close to the expressions of the human face image.
  • the pose parameters can also be understood as rigid transformation parameters.
  • the rigid transformation refers to changing the position, orientation and size of the three-dimensional face model without changing the shape.
  • the rigid transformation parameter refers to a parameter used when performing rigid transformation on the three-dimensional face model.
  • the rigid transformation parameter includes: a rigid rotation matrix, a translation vector, and a scaling factor. The rigid rotation matrix is used to change the orientation of the 3D face model, the translation vector is used to change the position of the 3D face model, and the scaling factor is used to change the size of the 3D face model.
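  • A sketch of applying these pose parameters to the model vertices; the scaling factor s, rotation matrix R, and translation vector t are assumed to be given, and the 2D mapping shown is a simple weak-perspective drop of the z coordinate:

```python
import numpy as np

def apply_pose(vertices, s, R, t):
    """Rigidly transform N x 3 vertices: v' = s * R @ v + t."""
    return s * vertices @ R.T + t

def project_to_image_plane(vertices_3d):
    """Weak-perspective style projection: keep x, y and drop z."""
    return vertices_3d[:, :2]
```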
  • the difference between the two-dimensional image and the face image when the three-dimensional face model is mapped to the two-dimensional plane is determined by constructing an error parameter formula.
  • the error parameter formula is constructed by the coordinate difference between the two-dimensional image when the three-dimensional face model is mapped to the two-dimensional plane and the key points of the face in the face image. It is understandable that the coordinates of the key points of the face corresponding to the 3D face model can be determined by the weight coefficients and pose parameters.
  • In the error parameter formula, the weight coefficients and pose parameters can be considered unknown quantities. The weight coefficients and pose parameters are adjusted continuously so that the coordinates of the face key points corresponding to the 3D face model approach the face key point coordinates in the face image; as a result, the error parameter becomes smaller and smaller, the expression represented by the 3D face model matches the real expression of the face image more and more closely, and the action of the 3D face model becomes more and more consistent with the head action of the target object in the face image. Specifically, when the calculated error parameter reaches the desired value, the currently used weight coefficients and pose parameters are taken as the finally obtained weight coefficients and pose parameters.
  • Whether the error parameter reaches the desired value can be determined by setting the number of adjustments of the weight coefficients and pose parameters (i.e., when the number of adjustments reaches a certain count, the error parameter is deemed to have reached the desired value), or by setting a parameter threshold (i.e., when the error parameter falls below the threshold, it is deemed to have reached the desired value). Generally speaking, once the error parameter reaches the expected value, the expression represented by the 3D face model can be considered sufficiently close to the real expression of the face image, and the action of the 3D face model sufficiently consistent with the head action of the target object in the face image.
  • Step 150 Send the pose parameters and the weight coefficients to the remote device, so that the remote device generates a virtual image corresponding to the face image according to the pose parameters and the weight coefficients.
  • the pose parameters and weight coefficients are sent to a remote device for video communication with the virtual image construction device.
  • a virtual image is stored in the remote device, and the virtual image may be a cartoon image, which may be a two-dimensional virtual image or a three-dimensional virtual image.
  • a three-dimensional virtual image is used as an example.
  • Storing the three-dimensional virtual image in the remote device specifically includes storing a neutral expression base and personalized expression bases of the three-dimensional virtual image, wherein each personalized expression base of the three-dimensional virtual image has the same expression as the corresponding facial personalized expression base.
  • The user of the remote device can install an application program on the remote device; the application program can receive the pose parameters and weight coefficients sent by the virtual image construction device and generate, according to the pose parameters and weight coefficients, a virtual image corresponding to the face image. The remote device stores the virtual image when the application is installed.
  • When the application program is upgraded or updated, the virtual image stored in the remote device can be updated at the same time.
  • the remote device when it generates a virtual image corresponding to a face image, it can render and display a preset three-dimensional virtual image through a graphics rendering framework of an open source graphics library (Open Graphics Library, OpenGL).
  • the individualized expression base and neutral expression base of the three-dimensional virtual image are linearly weighted according to the weight coefficient, so as to obtain a three-dimensional virtual image containing expressions, wherein the linear weighting method is the same as that of constructing the three-dimensional face model in step 130.
  • After generating the three-dimensional virtual image containing the expression, the graphics rendering framework performs the corresponding rigid transformation on it according to the pose parameters and displays it once the rigid transformation is complete.
  • FIG. 2 is a schematic diagram of current frame image data provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a virtual image provided by an embodiment of the present application.
  • Afterwards, the next frame of image data can be acquired and used as the current frame image data, and the above process is repeated so that the remote device displays a continuous virtual image.
  • The virtual image construction device can also be provided with a function control for enabling the virtual image, and the function control can be implemented as a physical button or a virtual button. When the function control is triggered, the above method is executed so that the remote device displays the virtual image. When the function control is no longer triggered, the current frame image data only needs to be sent to the remote device, so that the remote device displays the current frame image. In this way, the user can decide whether to display the real image according to their own needs, improving the user experience.
  • In summary, by acquiring current frame image data containing the target object's face image, constructing the target object's neutral facial expression base and a plurality of facial personalized expression bases from the current frame image data, building a 3D face model from the neutral expression base and the personalized expression bases, determining the weight coefficients and pose parameters when the 3D face model is mapped to the face image, and sending the pose parameters and weight coefficients to the remote device so that the remote device displays the virtual image corresponding to the face image from those pose parameters and weight coefficients, the technical problems of information leakage and stuttering caused by transmitting real face images in the related art are solved.
  • the transmitted weight coefficients and pose parameters can enable the remote device to display the corresponding virtual image, effectively protecting the privacy of the target object and preventing information leakage.
  • the virtual image accurately follows the expressions and poses in the face image, ensuring the imaging quality at the remote device.
  • FIG. 4 is a flowchart of another virtual image construction method provided by an embodiment of the present application. This embodiment is a refinement of the above-mentioned embodiment. Referring to FIG. 4, the virtual image construction method specifically includes:
  • Step 210 Acquire current frame image data, where the current frame image data includes a face image of the target object.
  • Step 220 Construct a neutral facial expression base of the target object according to the current frame image data and the preset a priori information of the facial model.
  • the prior information of the face model refers to the prior information used when constructing the neutral facial expression base of the target object
  • The reference three-dimensional face model can be constructed through the prior information of the face model, and in this embodiment the reference three-dimensional face model has a neutral expression.
  • the face neutral expression base of the target object is obtained by fitting the face model parameters through the face image.
  • FIG. 5 is a schematic diagram of a reference three-dimensional face model provided by an embodiment of the present application
  • FIG. 6 is a schematic diagram of a face image provided by an embodiment of the present application
  • FIG. 7 is a schematic diagram of a three-dimensional face model of a target object provided by an embodiment of the present application.
  • The reference three-dimensional face model shown in FIG. 5 is constructed through the prior information of the face model. The reference 3D face model shown in FIG. 5 is then fitted to the face image shown in FIG. 6 to obtain the three-dimensional face model of the target object shown in FIG. 7.
  • Since the three-dimensional face model shown in FIG. 7 has no expression, it can be used as the neutral facial expression base. It should be noted that FIG. 7 shows a side view of the three-dimensional face model.
  • In an embodiment, the prior information of the face model can be constructed based on the three-dimensional face data in the publicly available BFM (Basel Face Model) database, and each item of three-dimensional face data can be considered a three-dimensional face model.
  • The representation of the model is not limited in this embodiment.
  • For example, principal component analysis (PCA) is used to extract 200 three-dimensional face data from the BFM database to obtain a bilinear model, where the bilinear model is constructed from the 200 three-dimensional face data and can be written as M = MU + PC_id · α_id + PC_exp · α_exp.
  • M is the reference 3D face model.
  • MU is the average coordinate data of the 200 3D face data; MU contains 3h values in total, where h is the average number of point-cloud points of the 200 3D face data and each point contains coordinates on the x, y, and z axes. A three-dimensional face can be constructed through MU.
  • PC_id is the face identity base vector obtained from the 200 three-dimensional face data; the face identity of the reference 3D face model (for example, the neutral, expressionless face shape) is obtained by superimposing PC_id on MU. PC_exp is the personalized expression base vector obtained from the 200 3D face data; the personalized expression of the reference 3D face model is obtained by superimposing PC_exp on MU. α_id is the coefficient corresponding to the face identity base vector, and α_exp is the coefficient corresponding to the personalized expression base vector. That is, PC_id and PC_exp are linearly weighted by α_id and α_exp, and the weighted results are added to the average coordinate data of the 3D face data to obtain the reference 3D face model. It can be understood that MU, PC_id, and PC_exp constitute the prior information of the face model.
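  • A sketch of assembling the reference 3D face model from this prior information, assuming MU is a length-3h vector and PC_id, PC_exp are matrices whose columns are the identity and expression base vectors (shapes and names are illustrative):

```python
import numpy as np

def reference_face_model(MU, PC_id, PC_exp, alpha_id, alpha_exp):
    """M = MU + PC_id @ alpha_id + PC_exp @ alpha_exp, reshaped to (h, 3)."""
    M = MU + PC_id @ alpha_id + PC_exp @ alpha_exp
    return M.reshape(-1, 3)
```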
  • Afterwards, the reference 3D face model can be mapped to the 2D plane to determine the difference between the resulting 2D image and the face image, and the coefficients α_id and α_exp used with the face model prior information are then adjusted according to that difference, so that the adjusted reference 3D face model, when mapped to the 2D plane, is highly similar to or the same as the face image.
  • step 220 when determining the difference between the two-dimensional image corresponding to the reference three-dimensional face model and the face image, it is specifically determined by the key points of the face.
  • step 220 includes steps 221-223:
  • Step 221: Detect the face image in the current frame image data.
  • Specifically, a face recognition algorithm is used to detect the location area where the face is located in the current frame image data, and the location area where the face is located is then extracted to obtain the face image.
  • Step 222 Perform facial key point positioning on the face image to obtain a key point coordinate array.
  • the face key points are detected in the face image, and the coordinates of the detected face key points are obtained, and the coordinates of each face key point are formed into a key point coordinate array.
  • FIG. 8 is a schematic diagram of face key points provided by the embodiment of the present application.
  • a total of 68 face key points are detected in the current face image, and each face key point has corresponding face semantic information.
  • the coordinates of the 68 face key points in the face image are arranged in a certain order to form a key point coordinate array.
  • the method further includes: performing a filtering operation and a smoothing operation on the key point coordinate array.
  • The filtering operation refers to adjusting the key point coordinate array of the current frame in combination with the key point coordinate array of the previous frame image data, so as to ensure a smooth transition between the key point coordinate array of the previous frame and that of the current frame, and so that the key point coordinate arrays of all frames during the video communication change smoothly.
  • In an embodiment, the filtering operation is implemented by means of Kalman filtering. When Kalman filtering is performed on the key point coordinate data of the current frame, the key point coordinate array of the current frame and that of the previous frame are weighted, and the weighted result is used to update the key point coordinate data of the current frame.
  • the smoothing operation is used to avoid the situation that some face key points are outliers, so that the coordinate curve between adjacent face key points is smooth.
  • the PCA algorithm is used to perform a smoothing operation on the filtered key point coordinate array to update the key point coordinate array.
  • the key point coordinate array used subsequently is the key point coordinate array after filtering and smoothing operations.
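  • A sketch of the per-frame filtering described above, as a simple weighted blend of the current and previous key point arrays; this stands in for the full Kalman filter, and the fixed gain value is illustrative:

```python
import numpy as np

def filter_keypoints(current, previous, gain=0.6):
    """Blend the current 68 x 2 key point array with the previous frame's.

    A Kalman-style update reduces to a weighted average when the gain is
    fixed; a gain close to 1 trusts the current detection more.
    """
    if previous is None:
        return current
    return gain * current + (1.0 - gain) * previous
```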
  • Step 223 Determine the neutral facial expression base of the target object according to the facial image, the coordinate array of key points and the preset prior information of the facial model.
  • The energy constraint can be written as E_lan(p) = Σ_{f_j ∈ F} ω_conf,j · ‖ Π(Φ(v_j)) − f_j ‖², where E_lan(p) represents the energy constraint between the reference 3D face model and the face image, and p represents the parameters used by the reference 3D face model; p includes the coefficient α_id corresponding to the face identity base vector, the coefficient α_exp corresponding to the personalized expression base vector, the weak perspective projection matrix Π, and the rigid transformation matrix Φ. The weak perspective projection is mainly used to project 3D spatial point information (such as the reference 3D face model) onto the 2D imaging plane; the projection matrix is the matrix used when projecting the reference 3D face model onto the 2D plane, and the rigid transformation matrix may include a rigid rotation matrix, a translation vector, and a scale factor.
  • ω_conf,j represents the detection confidence of the j-th face key point in the face image, f_j represents the coordinates of the j-th face key point in the face image, F represents the key point coordinate array, and v_j represents the corresponding three-dimensional key point in the reference 3D face model.
  • The more similar the coordinates of each three-dimensional key point and the corresponding face key point in the face image are when the reference three-dimensional face model is mapped to the two-dimensional plane, the smaller E_lan(p) is, and the closer the reference three-dimensional face model is to the face image.
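  • A sketch of evaluating this landmark energy for a given parameter set, assuming the projections of the model's 3D key points to the image plane have already been computed (the argument names are illustrative):

```python
import numpy as np

def landmark_energy(projected_keypoints, detected_keypoints, confidences):
    """E_lan = sum_j w_conf_j * || projected_j - f_j ||^2.

    projected_keypoints: K x 2 projections of the reference model's 3D key points
    detected_keypoints:  K x 2 detected face key points f_j
    confidences:         length-K detection confidences w_conf_j
    """
    residuals = projected_keypoints - detected_keypoints
    return float(np.sum(confidences * np.sum(residuals ** 2, axis=1)))
```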
  • Step 230: According to the neutral facial expression base, the preset reference neutral expression base, and each reference personalized expression base, determine each facial personalized expression base of the target object, where each reference personalized expression base corresponds to one facial personalized expression base.
  • the reference neutral expression base is a preset expression base representing neutral expressions.
  • the reference personalized expression base is an expression base obtained by adding a preset basic expression on the basis of the reference neutral expression base.
  • Each reference personalized expression base has a corresponding physical meaning.
  • the facial action coding system (Facial Action Coding System, FACS) is used to define each facial muscle action as a different action unit AU value or AD value, that is, to classify each basic expression by muscle action. For example, the AU value corresponding to "the inner eyebrow is raised upward" is recorded as AU1.
  • In addition, each AU value also includes a refinement value, which indicates the movement range of the muscle. For example, an AU value with the refinement value AU1(0.2) means that the current basic expression is the inner eyebrow being raised, with a raise degree of 0.2.
  • the AU value corresponding to "eyes closed” is denoted as AU43
  • AU43(0) indicates that the eyes are normally opened
  • AU43(1) indicates that the eyes are completely closed.
  • FIG. 9 is a schematic diagram of the expression refinement partition. Referring to FIG. 9, from left to right are the refinement values corresponding to the degrees of eye closure during the process from fully open to fully closed.
  • 26 basic expressions are defined according to muscle movements, and each basic expression corresponds to a reference personalized expression base. At this time, each reference personalized expression base, corresponding basic expression and AU value are shown in the following table:
  | Blendshape No. | Custom expression | FACS definition | Blendshape No. | Custom expression | FACS definition |
  | --- | --- | --- | --- | --- | --- |
  | 0 | left eye closed | AU43 | 13 | right corner of mouth up | AU12 |
  | 1 | right eye closed | AU43 | 14 | left mouth corner abduction | AU20 |
  | 2 | left eye widened | AU5 | 15 | right mouth corner abduction | AU20 |
  | 3 | right eye widened | AU5 | 16 | upper lip adduction | AU28 |
  | 4 | frown | AU4 | 17 | lower lip adduction | AU28 |
  | 5 | frown | AU4 | 18 | lower lip outward | AD29 |
  | 6 | raised eyebrows | AU1 | 19 | upper lip up | AU10 |
  | 7 | raise left eyebrow | AU2 | 20 | lower lip down | AU16 |
  | 8 | raise right eyebrow | AU2 | 21 | left corner of mouth down | AU17 |
  | 9 | open mouth | AU26 | 22 | right corner of mouth down | AU17 |
  | 10 | chin left | AD30 | 23 | pouting | AU18 |
  | 11 | chin right | AD30 | 24 | cheeks bulge | AD34 |
  | 12 | left corner of mouth up | AU12 | 25 | wrinkled nose | AU9 |
  • In the table, Blendshape denotes the personalized expression base, 0-25 are the numbers of the 26 personalized expression bases, the custom expression is the basic expression corresponding to each expression base, and the FACS definition is the AU value or AD value corresponding to each personalized expression base.
  • In this way, the deformation information required when the reference neutral expression base is transformed into a reference personalized expression base can be determined; this deformation information can also be regarded as the transfer deformation variables applied to the reference neutral expression base.
  • the deformation information is obtained by means of three-dimensional mesh deformation.
  • step 230 includes steps 231-232:
  • Step 231 Determine deformation information according to the reference neutral expression base and the reference personalized expression base.
  • Specifically, the face key points in the reference neutral expression base are triangulated in their arrangement order using the Delaunay triangulation algorithm, dividing the reference neutral expression base into a plurality of triangular patches; the three vertices of each patch are three face key points that form a triangle, and together the triangular patches form a three-dimensional mesh representing the reference neutral expression base.
  • Similarly, the reference personalized expression base can be divided into multiple triangular patches; the three vertices of each patch are three face key points that form a triangle, and together the triangular patches form a three-dimensional mesh representing the reference personalized expression base.
  • Each triangular patch in the reference personalized expression base is in one-to-one correspondence with a triangular patch in the reference neutral expression base. Based on this correspondence, the deformation information that maps a triangular patch in the reference neutral expression base to the corresponding triangular patch in the reference personalized expression base can be determined. The deformation information of a triangular patch represents the transfer deformation variables (rotation matrix, translation vector, scaling factor, etc.) applied when deforming that patch in the reference neutral expression base so that the deformed patch is the same as the corresponding patch in the reference personalized expression base. Each triangular patch corresponds to one piece of deformation information, and the deformation information of all triangular patches together constitutes the deformation information from the reference neutral expression base to the current reference personalized expression base. Understandably, each reference personalized expression base has corresponding deformation information.
  • Step 232 determining the personalized facial expression base of the target object according to the deformation information and the neutral facial expression base.
  • Specifically, three-dimensional mesh registration is first performed between the neutral facial expression base and the reference neutral expression base: a three-dimensional space transformation (such as scaling, rotation, and translation) is applied to each triangular patch in the reference neutral expression base so that the transformed triangular patches are in one-to-one correspondence with the triangular patches in the neutral facial expression base. The reference neutral expression base after this three-dimensional space transformation can be called the deformed reference neutral expression base. The three-dimensional coordinates of each triangular patch in the deformed reference neutral expression base and of the corresponding triangular patch in the neutral facial expression base are highly similar or identical.
  • During the registration, smoothing constraints and key point constraints are applied to the three-dimensional coordinates of the key points of each patch in the deformed reference neutral expression base; 3D smoothing can be used for the smoothing constraints, and the PCA algorithm can be used for the key point constraints.
  • After registration, the deformed reference neutral expression base is highly similar to or the same as the neutral facial expression base.
  • Afterwards, the correspondence between the triangular patches in the deformed reference neutral expression base and those in the neutral facial expression base can be determined through a k-d tree, thereby determining the correspondence between the triangular patches in the reference neutral expression base and those in the neutral facial expression base. A k-d tree can be understood as a data structure that organizes points in a k-dimensional Euclidean space.
  • Then, for each triangular patch in the reference neutral expression base, the deformation information used to transform it to a given reference personalized expression base is applied to the corresponding triangular patch in the neutral facial expression base, i.e., the patch is deformed, and the deformed patch is used as the corresponding triangular patch in the facial personalized expression base. After every triangular patch in the neutral facial expression base has been processed, the facial personalized expression base corresponding to that reference personalized expression base is obtained. After processing in this manner, each reference personalized expression base has a corresponding facial personalized expression base.
  • The above processing can be calculated by the deformation formula of a triangular patch, in which V_T represents the vertex-related information of the corresponding triangular patch in the facial personalized expression base and V_S represents the vertex-related information of the corresponding triangular patch in the reference neutral expression base, with V_T = [v_T2 − v_T1, v_T3 − v_T1, v_T4 − v_T1], where v_T1, v_T2, and v_T3 are the vertices of the corresponding triangular patch in the neutral facial expression base and v_T4 is the normal vector of the triangular patch.
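  • A rough sketch of transferring one triangle's deformation, in the spirit of standard triangle-mesh deformation transfer; the exact formula in the patent is not fully reproduced above, so the affine-transfer step below is an assumption. The fourth vertex is built from the triangle normal as described:

```python
import numpy as np

def edge_matrix(v1, v2, v3):
    """Build [v2-v1, v3-v1, v4-v1] where v4 is offset along the unit normal."""
    n = np.cross(v2 - v1, v3 - v1)
    v4 = v1 + n / np.linalg.norm(n)
    return np.column_stack([v2 - v1, v3 - v1, v4 - v1])

def transfer_triangle(ref_neutral_tri, ref_expr_tri, face_neutral_tri):
    """Apply the reference neutral-to-expression deformation to a face triangle.

    Each argument is a 3 x 3 array of triangle vertices. The per-triangle
    deformation gradient Q maps reference-neutral edges to reference-expression
    edges; applying Q to the face-neutral edges gives the deformed face edges.
    """
    V_ref = edge_matrix(*ref_neutral_tri)
    V_exp = edge_matrix(*ref_expr_tri)
    Q = V_exp @ np.linalg.inv(V_ref)          # transfer deformation variables
    V_face = edge_matrix(*face_neutral_tri)
    deformed_edges = Q @ V_face
    v1 = face_neutral_tri[0]
    return np.vstack([v1,
                      v1 + deformed_edges[:, 0],
                      v1 + deformed_edges[:, 1]])
```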
  • FIG. 10 is a schematic diagram of expression transfer provided by an embodiment of the present application.
  • the first column in the first row is a reference neutral expression base
  • the second to fourth columns in the first row are three reference personalized expression bases, and the corresponding basic expressions are closed right eye, open mouth and pouting.
  • the first column in the second row is the neutral facial expression base, and the deformation information is determined according to the reference neutral expression base and each reference personalized expression base.
  • The second to fourth columns of the second row in FIG. 10 are the facial personalized expression bases obtained from the reference personalized expression bases in the second to fourth columns of the first row. The basic expressions corresponding to these three facial personalized expression bases are closing the right eye, opening the mouth, and pouting; that is, the basic expressions are transferred from the reference personalized expression bases to the facial personalized expression bases.
  • Step 240 constructing a three-dimensional face model of the target object according to the neutral facial expression base and multiple personalized facial expression bases.
  • Step 250 constructing an error parameter formula when the three-dimensional face model is mapped to the face image.
  • the error parameter formula can also be understood as an energy function.
  • the construction rule of the error parameter formula can be set according to the actual situation.
  • In one embodiment, the error parameter formula is constructed by minimizing the residual. In this case, the error parameter formula is E = Σ_k ‖ s · R · B_k + t − f_k ‖², where B = B_0 + Σ_{i=1}^{n} α_i · (B_i − B_0).
  • E represents the error parameter, B represents the three-dimensional face model, B_0 represents the neutral facial expression base of the target object, B_i represents the i-th facial personalized expression base of the target object, 1 ≤ i ≤ n, n is the total number of facial personalized expression bases, and α_i represents the weight coefficient corresponding to B_i.
  • B_k represents the k-th face key point in the three-dimensional face model, f_k represents the k-th face key point in the face image, s represents the scaling factor when the 3D face model is mapped to the face image, R represents the rigid rotation matrix when the 3D face model is mapped to the face image, and t represents the translation vector when the 3D face model is mapped to the face image; s, R, and t are the pose parameters.
  • the error parameter formula is constructed by means of the linear least squares method, that is, the above-mentioned minimized residuals are converted into the form of solving by the linear least squares method.
  • The least squares method (also known as the method of least squares) makes it easy to obtain the unknown data (in this embodiment, α_i, s, R, and t are the unknowns) while minimizing the sum of squared errors between the obtained data and the actual data.
  • At this time, the error parameter formula can be expressed as E'_exp = ‖ A · α − b ‖², where A = s · R · ΔB and b = f − t − s · R · B_0.
  • E'_exp represents the error parameter, ΔB = [B_1 − B_0, B_2 − B_0, ..., B_n − B_0], B_0 represents the neutral facial expression base, B_i represents the i-th facial personalized expression base, 1 ≤ i ≤ n, and n represents the total number of facial personalized expression bases.
  • α represents the weight coefficient vector (α_1, α_2, ..., α_n), and α_i represents the weight coefficient of the i-th facial personalized expression base.
  • s represents the scaling factor, R the rigid rotation matrix, and t the translation vector when the 3D face model is mapped to the face image; f represents the face key points in the face image.
  • A · α can reflect the difference between the facial personalized expression bases and the neutral facial expression base in the two-dimensional plane, and b can reflect the difference between the neutral facial expression base and the face image in the two-dimensional plane. It can be understood that the closer the 3D face model is to the face image, the smaller the difference between A · α and b.
  • The error parameter formula is a system of linear equations, and the solution is α = (AᵀA)⁻¹ · Aᵀb.
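  • A sketch of this closed-form solve, using numpy's least-squares routine rather than forming (AᵀA)⁻¹ explicitly, which is numerically safer; A and b are assumed to be built as described above:

```python
import numpy as np

def solve_weights_unconstrained(A, b):
    """Solve min_alpha ||A @ alpha - b||^2; equivalent to (A^T A)^-1 A^T b."""
    alpha, *_ = np.linalg.lstsq(A, b, rcond=None)
    return alpha
```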
  • When α is solved by the linear least squares method as in the previous embodiment, the error parameter formula is a system of linear equations and the solution α = (AᵀA)⁻¹ · Aᵀb can take both positive and negative values; negative values are meaningless for the 3D face model, that is, a weight coefficient cannot be negative.
  • In addition, each time α is solved, it is solved from the differences between face key points when the 3D face model is mapped; if the detection of the face key points in the face image is wrong, the accuracy of the calculation results is affected. For example, the mouth in the face image is closed, but due to key point detection errors there is a certain distance between the key points of the upper lip and the lower lip (the two key points should coincide almost exactly or completely), so the mouth may be recognized as open in the subsequent calculation. Therefore, in this embodiment, quadratic programming and dynamic constraints are applied to α to avoid this problem. The dynamic constraint on α can be expressed as C · α ≤ d, that is, the error parameter formula becomes E_exp = ‖ A · α − b ‖² subject to C · α ≤ d.
  • C represents the constraint parameter of α and d represents the value range of α; together, C and d impose the constraints on α. Here eye denotes the identity matrix, and eye(n) is the identity matrix corresponding to the n facial personalized expression bases; C is built from eye(n) so that each weight coefficient is bounded. The specific value of d can be set according to the actual situation; for example, if α should lie in the range 0.5-1, d can be set so that α is bounded between 0.5 and 1.
  • ones(n) represents the upper bounds of the n weight coefficients and zero(n) represents the lower bounds; each contains n values, one per weight coefficient. Generally, a weight coefficient should be between 0 and 1, so ones(n) can be n ones, zero(n) can be n zeros, and d is formed from ones(n) and zero(n).
  • the value range of the weight coefficient can be fixed between 0-1 to prevent the occurrence of negative numbers.
  • Alternatively, instead of using the fixed bounds ones(n) and zero(n) as the upper and lower bounds of the n weight coefficients, value constraint matrices p_n and q_n can be used. p_n and q_n are determined from the relative distances of the face key points in the face image, where a relative distance refers to the pixel distance between face key points in the face image. Each facial personalized expression base corresponds to one p value and one q value; the n p values form p_n and the n q values form q_n.
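  • A sketch of the constrained solve, assuming SciPy is available; for box constraints of the form q_n ≤ α ≤ p_n, the quadratic program with C · α ≤ d reduces to a bounded least-squares problem:

```python
import numpy as np
from scipy.optimize import lsq_linear

def solve_weights_constrained(A, b, q_n, p_n):
    """Solve min_alpha ||A @ alpha - b||^2 subject to q_n <= alpha <= p_n.

    q_n and p_n are length-n lower and upper bounds (e.g. zeros and ones,
    or per-expression bounds derived from key point relative distances).
    """
    result = lsq_linear(A, b, bounds=(q_n, p_n))
    return result.x
```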
  • the weight coefficients corresponding to different facial personalized expression bases can have different value ranges.
  • FIG. 11 is a schematic diagram of face key point selection according to an embodiment of the present application.
  • In FIG. 11, there are 6 face key points corresponding to the left eye, among which face key point P1 and face key point P2 are a pair of key points located on the upper eyelid and lower eyelid of the left eye respectively, and face key point P3 and face key point P4 are another pair of key points located on the upper eyelid and lower eyelid of the left eye respectively.
  • The distance used to determine whether the left eye is closed can be regarded as the relative distance of the left-eye face key points. The relative distance L of the left-eye face key points is computed from the distance between p_1 and p_2 and the distance between p_3 and p_4; it can be understood that L is a pixel distance. Here p_1 represents the two-dimensional coordinates of face key point P1 in the face image, p_2 those of face key point P2, p_3 those of face key point P3, and p_4 those of face key point P4.
  • When the relative distance indicates that the left eye is closed, the weight coefficient corresponding to the facial personalized expression base representing "left eye closed" should be larger, so a larger value range can be set for that weight coefficient, for example 0.9-1. In this case, the p value corresponding to "left eye closed" in p_n can be set to 1 and the q value corresponding to "left eye closed" in q_n can be set to 0.9, so that the value range of the weight coefficient of the "left eye closed" facial personalized expression base in α is between 0.9 and 1.
  • Similarly, whether the right eye is closed can be determined by calculating the relative distance of the two pairs of right-eye face key points (the face key points in the box), and a reasonable value range can then be set for the weight coefficient of the facial personalized expression base corresponding to "right eye closed".
  • In practice, the following are predetermined for each facial personalized expression base: the method for calculating the relative distance of its face key points, an error distance, the p and q values to use when the relative distance does not exceed the error distance, and the p and q values to use when it does. When constructing the error parameter formula, the p and q values are then determined by calculating the relative distance of the face key points, which in turn determines the value range of the weight coefficient. The relative distances of the face key points and the corresponding error distances can thus be regarded as prior information on the weight coefficients. In this way, errors in the detection of face key points can be tolerated, ensuring the accuracy of subsequent processing.
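  • A sketch of deriving per-expression bounds from the key point relative distance, for the "left eye closed" example above; the error-distance threshold and the bound values are illustrative assumptions:

```python
import numpy as np

def left_eye_bounds(p1, p2, p3, p4, error_distance=3.0):
    """Return (q, p) bounds for the 'left eye closed' weight coefficient.

    p1..p4 are the 2D pixel coordinates of the upper/lower eyelid key points.
    If the eyelid key points lie within the error distance, the eye is treated
    as closed and the coefficient is confined to a high range (e.g. 0.9-1.0);
    otherwise the full range 0-1 is allowed.
    """
    relative_distance = 0.5 * (np.linalg.norm(p1 - p2) + np.linalg.norm(p3 - p4))
    if relative_distance <= error_distance:
        return 0.9, 1.0   # eye judged closed: force a large weight
    return 0.0, 1.0       # otherwise leave the coefficient in the 0-1 range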
  • under these constraints, the error parameter formula can be written as E_exp = ‖Aω − b‖², subject to Cω ≤ d, where:
  • E_exp represents the error parameter
  • ω represents the weight coefficient vector (ω_1, ω_2, …, ω_n), n represents the total number of face personalized expression bases, and ω_i represents the weight coefficient of the i-th face personalized expression base, 1 ≤ i ≤ n
  • A = sRΔB, where s is the scaling factor when the 3D face model is mapped to the face image and R is the rigid rotation matrix when the 3D face model is mapped to the face image
  • ΔB = [B_1 − B_0, B_2 − B_0, …, B_n − B_0], where B_0 represents the neutral facial expression base and B_i represents the i-th face personalized expression base
  • b = f − t − sRB_0, where f represents the face key points in the face image and t represents the translation vector when the 3D face model is mapped to the face image
  • s, R and t are the pose parameters
  • C represents the constraint parameter of ω, and d represents the value range of ω
  • the manner of determining C and d may refer to the foregoing embodiment.
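  • As a rough illustration of minimizing the data term over ω with the pose (s, R, t) fixed, the sketch below assumes that the constraint Cω ≤ d reduces to per-coefficient bounds q_n ≤ ω ≤ p_n and solves the resulting box-constrained least-squares problem with SciPy; the weak-perspective projection, array shapes and function names are assumptions, not details taken from the patent.

```python
# Hedged sketch: min over ω of ‖Aω − b‖² with per-coefficient bounds, pose held fixed.
# The weak-perspective mapping (first two rows of R) is an assumption.
import numpy as np
from scipy.optimize import lsq_linear

def solve_weights(s, R, t, B0, B_list, f2d, lower, upper):
    """B0, B_list[i]: (K, 3) model key points of the neutral / i-th personalized base.
    f2d: (K, 2) detected face key points; lower, upper: (n,) bounds q_n and p_n."""
    P = s * np.asarray(R)[:2, :]                                          # 2x3 projection
    A = np.stack([(P @ (Bi - B0).T).T.ravel() for Bi in B_list], axis=1)  # A = s·R·ΔB
    b = (f2d - np.asarray(t) - (P @ B0.T).T).ravel()                      # b = f − t − s·R·B0
    return lsq_linear(A, b, bounds=(lower, upper)).x                      # ω within [q_n, p_n]
```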
  • the error parameter formula can also be constructed by an L1-regularized optimization method. To ensure that the weight coefficients stay within the correct value ranges, the L1 regularization can be combined with gradient projection when constructing the error parameter formula: each time the weight coefficients are calculated using the L1-regularized formula, the gradient step on the weight coefficients is projected back into their value ranges, ensuring that the finally calculated weight coefficients lie within the corresponding ranges.
  • in this case, the constructed error parameter formula adds an L1 regularization term to the above data term, where:
  • the L1 regularization coefficient can be set according to the actual situation
  • ω_j is the weight coefficient of the j-th face personalized expression base
  • ω_k is the weight coefficient of the k-th face personalized expression base
  • n is the total number of face personalized expression bases
  • m is the total number of …
  • the formula also involves the weight coefficient of the j-th face personalized expression base obtained when processing the previous frame of image data. According to this formula, the weight coefficient of each face personalized expression base can be calculated, and the pose parameters can then be calculated according to the weight coefficients.
  • this error parameter formula is the one used in the subsequent calculation process; a sketch of the gradient-projection step is given below.
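  • A minimal sketch of the gradient-projection update described above follows; the exact objective (data term plus an L1 penalty and a term tying ω to the previous frame's weights), the coefficient values and the step size are assumptions layered on the description, not the patent's formula.

```python
# Hedged sketch of one projected-gradient update for the L1-regularized problem.
import numpy as np

def projected_gradient_step(A, b, omega, omega_prev, lower, upper, lam=0.01, mu=0.1, lr=1e-3):
    """One step: gradient of ‖Aω−b‖² + lam·‖ω‖₁ + mu·‖ω−ω_prev‖², then projection into [lower, upper]."""
    grad = 2.0 * A.T @ (A @ omega - b)          # data term gradient
    grad += lam * np.sign(omega)                # (sub)gradient of the L1 term
    grad += 2.0 * mu * (omega - omega_prev)     # term referring to the previous-frame weights
    omega_new = omega - lr * grad
    return np.clip(omega_new, lower, upper)     # gradient projection into the value ranges
```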
  • Step 260 Determine, according to the error parameter formula, the pose parameters of the three-dimensional face model and the weight coefficients of the face personalized expression bases when the error parameter is the smallest.
  • the unknowns in the error parameter formula include the pose parameters and the weight coefficients. Therefore, the pose parameters and weight coefficients used in the error parameter formula when the error parameter is the smallest can be determined through the error parameter formula, and these are then taken as the finally calculated pose parameters and weight coefficients.
  • the calculation may be performed in an alternating iterative manner. For example, first set initialization values for the weight coefficients, then substitute them into the error parameter formula to fix the weight coefficients, and perform the calculation to determine the value of the pose parameters when the error parameter is the smallest in the current calculation process. After that, substitute the calculated pose parameters back into the error parameter formula to fix the pose parameters, and perform the calculation to determine the value of the weight coefficients when the error parameter is the smallest in the current calculation process.
  • step 260 includes steps 261-267:
  • Step 261 Obtain the initialization weight coefficients of each face personalized expression base, and use the initialization weight coefficients as the current weight coefficients.
  • the initialization weight coefficient refers to a preset weight coefficient, that is, a weight coefficient is preset for each individual face expression base.
  • the specific value of the initialization weight coefficient can be set according to the actual situation. For example, according to the value range of the weight coefficient of the face personalized expression base, a value boundary is selected as the initialization weight coefficient of the face personalized expression base.
  • for ease of description, the currently used weight coefficients are recorded as the current weight coefficients.
  • Step 262 Substitute the current weight coefficient into the error parameter formula, and calculate the candidate pose parameters of the three-dimensional face model when the error parameter is the smallest.
  • the current weight coefficient is substituted into the error parameter formula, so that the weight coefficient in the error parameter formula is a fixed value (the value of the current weight coefficient).
  • the unknowns are only the pose parameters.
  • the calculation is performed according to the error parameter formula to determine the specific value of the pose parameter when the error parameter is the smallest in the current calculation process.
  • the pose parameters obtained by this calculation are recorded as candidate pose parameters.
  • the candidate pose parameters can be understood as intermediate values, and the purpose of calculating the candidate pose parameters is to obtain the final pose parameters.
  • Step 263 Substitute the candidate pose parameters into the error parameter formula, and calculate the candidate weight coefficients of the individualized expression bases for each face when the error parameters are the smallest.
  • the currently calculated candidate pose parameters are substituted into the error parameter formula, so that the pose parameters in the error parameter formula are fixed values.
  • at this time, the only unknowns in the error parameter formula are the weight coefficients.
  • the calculation is performed according to the error parameter formula to determine the specific value of the weight coefficient when the error parameter is the smallest in the current calculation process.
  • the weight coefficient obtained by this calculation is recorded as the candidate weight coefficient.
  • the candidate weight coefficient can be understood as an intermediate value, and the purpose of calculating the candidate weight coefficient is to obtain the final weight coefficient.
  • Step 264 update the current number of iterations.
  • an iterative calculation process refers to a process of obtaining candidate pose parameters and candidate weight coefficients after substituting the current weight coefficients into the error parameter formula. After the candidate pose parameters and the candidate weight coefficients are obtained, it is determined that one iteration calculation is completed, and the number of iterations is updated, that is, the current number of iterations is increased by 1. It can be understood that after each set of candidate weight coefficients is obtained, the number of iterations is incremented by 1, and the candidate weight coefficients and candidate pose parameters calculated by the latest iteration are used as the current and final candidate weight coefficients and candidate pose parameters.
  • Step 265 Determine whether the number of iterations reaches the number threshold, and when the number of iterations does not reach the number threshold, perform step 266. When the number of iterations reaches the number threshold, step 267 is executed.
  • the number of times threshold is used to confirm whether to stop the iterative calculation.
  • the number of times threshold may be set in combination with the actual situation. For example, an appropriate number of times threshold may be determined in combination with historical experience data. In this embodiment, the number of times threshold is 5. Exemplarily, after updating the number of iterations, it is determined whether the current number of iterations reaches the number threshold; if not, the iterative calculation continues and step 266 is executed, and if so, the iterative calculation stops and step 267 is executed.
  • Step 266 take the candidate weight coefficient as the current weight coefficient, and return to step 262 .
  • the candidate weight coefficient obtained by this iterative calculation is used as the current weight coefficient, and the process returns to step 262 to start a new iterative calculation.
  • Step 267 Use the finally obtained candidate pose parameters as the pose parameters of the three-dimensional face model, and use the finally obtained candidate weight coefficients as the weight coefficients of the face personalized expression base.
  • the candidate pose parameters and the candidate weight coefficients finally obtained refer to the candidate pose parameters and the candidate weight coefficients calculated by the latest iteration when the number of iterations reaches the number threshold.
  • the iterative calculation is stopped, and the finally obtained candidate pose parameters and candidate weight coefficients are used as the pose parameters of the final 3D face model and the weight coefficients of the face personalized expression base.
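  • The alternating scheme of steps 261–267 can be summarized by the sketch below; solve_pose and solve_weights stand in for the two per-variable minimizations of the error parameter formula and are not defined by the patent.

```python
# Hedged sketch of the alternating iteration in steps 261–267.
# solve_weights could be the box-constrained least-squares sketch shown earlier.
def alternate_optimization(init_weights, solve_pose, solve_weights, max_iters=5):
    weights = init_weights                     # step 261: initialization weight coefficients
    pose = None
    for _ in range(max_iters):                 # step 265: number threshold (5 in this embodiment)
        pose = solve_pose(weights)             # step 262: fix ω, minimize over (s, R, t)
        weights = solve_weights(pose)          # step 263: fix pose, minimize over ω
    return pose, weights                       # step 267: final pose parameters and weight coefficients
```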
  • Step 270 Send the pose parameters and the weight coefficients to the remote device, so that the remote device generates a virtual image corresponding to the face image according to the pose parameters and the weight coefficients.
  • in this embodiment, by acquiring the current frame image data containing the face image of the target object, the neutral facial expression base of the target object is constructed according to the current frame image data and the preset prior information of the face model; the face personalized expression bases are then obtained according to the neutral facial expression base, the reference neutral expression base and the reference personalized expression bases; the 3D face model is constructed according to the face personalized expression bases and the neutral expression base; and the error parameter formula when the 3D face model is mapped to the face image is constructed.
  • then, according to the error parameter formula, the weight coefficients of the face personalized expression bases and the pose parameters of the 3D face model when the error parameter is the smallest are determined, and the weight coefficients and pose parameters are sent to the remote device.
  • each basic expression corresponds to a face personalized expression base, which makes the expressions contained in the 3D face model more abundant, thereby ensuring that the obtained pose parameters and weight coefficients are close to the real face image.
  • the basic expressions defined by refining FACS mainly split left-right symmetric expressions into separate bases.
  • therefore, even when the expression in the face image is asymmetric, it can be effectively captured and driven, so that the obtained pose parameters and weight coefficients remain close to the real face image.
  • the error parameter formula can be converted into a linear solution formula, which simplifies the calculation process.
  • FIG. 12 is a flowchart of still another virtual image construction method provided by an embodiment of the present application.
  • the virtual image construction method specifically includes:
  • Step 310 Acquire current frame image data, where the current frame image data includes a face image of the target object.
  • Step 320 construct a neutral facial expression base and a plurality of personalized facial expression bases of the target object according to the current frame image data.
  • Step 330 constructing a three-dimensional face model of the target object according to the neutral facial expression base and multiple personalized facial expression bases.
  • Step 340 constructing an error parameter formula when the three-dimensional face model is mapped to the face image.
  • the error parameter formula adopted is E_exp = ‖Aω − b‖², subject to Cω ≤ d, where:
  • E_exp represents the error parameter
  • ω represents the weight coefficient vector (ω_1, ω_2, …, ω_n), n represents the total number of face personalized expression bases, and ω_i represents the weight coefficient of the i-th face personalized expression base, 1 ≤ i ≤ n
  • A = sRΔB, where s is the scaling factor and R is the rigid rotation matrix when the 3D face model is mapped to the face image
  • ΔB = [B_1 − B_0, B_2 − B_0, …, B_n − B_0], where B_0 represents the neutral facial expression base and B_i represents the i-th face personalized expression base
  • b = f − t − sRB_0, where f represents the face key points in the face image and t represents the translation vector when the 3D face model is mapped to the face image
  • s, R and t are the pose parameters
  • C represents the constraint parameter of ω, and d represents the value range of ω
  • ones(n) represents the upper bound of the values of the n weight coefficients, and zero(n) represents the lower bound of the values of the n weight coefficients
  • p_n and q_n are the value constraint matrices, determined according to the relative distance of the face key points in the face image
  • for example, when the left eye in the face image is closed, the weight coefficient corresponding to the face personalized expression base representing the closed left eye should be relatively large. Therefore, a larger value range can be set for this weight coefficient, such as 0.9–1: the p value corresponding to the closed left eye in p_n can be set to 1, and the q value corresponding to the closed left eye in q_n can be set to 0.9, so that the weight coefficient of the face personalized expression base representing the closed left eye in ω ranges from 0.9 to 1.
  • after calculating the relative distance of the mouth face key points in the face image, if the relative distance does not exceed the error distance (for example, L ≤ 3), the mouth is considered closed; the p value is set to 0.1 and the q value to 0, so that the value range of the weight coefficient of the face personalized expression base representing an open mouth is between 0 and 0.1. If the relative distance of the face key points exceeds the error distance (for example, L > 3), the p value is set to 1 and the q value to 0, so that the value range of the weight coefficient of the face personalized expression base representing an open mouth is between 0 and 1. In this way, the relative distance of the face key points and the corresponding error distance are used as prior information on the weight coefficients, errors caused by incorrect detection of face key points can be tolerated, and the accuracy of the subsequent processing is ensured.
  • Step 350 searching for mutually exclusive expression bases in the personalized expression bases of each face.
  • FIG. 13 is a schematic diagram of a mutually exclusive expression base provided by an embodiment of the present application. Referring to FIG. 13, from the reader's point of view, the lips and chin in the left face personalized expression base are moved to the left, and the lips and chin in the right face personalized expression base are moved to the right. A human face can only make one of these expressions, not both at the same time. For another example, FIG. 14 is a schematic diagram of another mutually exclusive expression base provided by an embodiment of the present application.
  • in FIG. 14, the expression of the face personalized expression base on the left is an open mouth, and the expression of the face personalized expression base on the right is puffed-out cheeks. A human face cannot puff out its cheeks while opening its mouth, so these can also be considered mutually exclusive expression bases.
  • the expressions that cannot appear at the same time in mutually exclusive expression bases refer not only to the basic expressions corresponding to individual face personalized expression bases, but also to superimposed expressions; the face personalized expression bases involved in such superimposed expressions are likewise mutually exclusive expression bases. For example, one superimposed expression is wrinkling the nose while frowning the left eyebrow, and another superimposed expression is raising the eyebrows while lifting the left eyebrow tail. These two superimposed expressions cannot appear on a human face at the same time, so the face personalized expression bases of wrinkling the nose and frowning the left eyebrow and the face personalized expression bases of raising the eyebrows and lifting the left eyebrow tail are mutually exclusive expression bases.
  • the mutually exclusive expression base can be constructed manually, and the virtual image construction device directly obtains the mutually exclusive expression base.
  • the virtual image construction device can also gradually add basic expressions in the same three-dimensional face model and superimpose the basic expressions to determine whether the three-dimensional face model can display all expressions at the same time, thereby determining mutually exclusive expression bases.
  • the mutually exclusive expression bases in the 26 face personalized expression bases are shown in the following table:
  • the face personalized expression base included in mutual exclusion 1 and the face personalized expression base included in mutual exclusion 2 in the same row are mutually exclusive expression bases. It should be noted that “B” in Table 2 corresponds to “Blendshape” in Table 1, and the number after “B” in Table 2 is the number of "Blendshape”.
  • Step 360 Group the individualized expression bases of faces according to mutually exclusive expression bases to obtain multiple expression base groups, and any two individualized facial expression bases in each expression base group are not mutually exclusive.
  • the facial personalized expression bases are grouped according to the mutually exclusive expression bases, and at this time, each grouping is recorded as an expression basis set.
  • the face personalized expression bases in each expression base group are not mutually exclusive. For example, if an expression base group contains the personalized facial expression base corresponding to B1, then it will not contain the personalized facial expression base corresponding to B3. If an expression base group contains the personalized facial expression bases corresponding to B4 and B25, then it will not contain the personalized facial expression bases corresponding to B6 and B7.
  • each expression base group does not contain mutually exclusive facial personalized expression bases.
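  • A possible grouping procedure is sketched below; the greedy strategy is an assumption, since the patent only requires that no group contain a mutually exclusive pair.

```python
# Hedged sketch of grouping face personalized expression bases (steps 350–360)
# so that no group contains a mutually exclusive pair.
def group_expression_bases(n_bases, exclusive_pairs):
    """n_bases: total number of personalized expression bases (indexed 0..n-1).
    exclusive_pairs: iterable of (i, j) index pairs that are mutually exclusive."""
    exclusive = {(min(i, j), max(i, j)) for i, j in exclusive_pairs}
    groups = []
    for base in range(n_bases):
        placed = False
        for group in groups:
            if all((min(base, other), max(base, other)) not in exclusive for other in group):
                group.append(base)
                placed = True
                break
        if not placed:
            groups.append([base])
    return groups

# e.g. with bases 1/3 and 4/6 mutually exclusive (hypothetical indices):
# group_expression_bases(7, [(1, 3), (4, 6)]) -> [[0, 1, 2, 4, 5], [3, 6]]
```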
  • Step 370 According to the error parameter formula, calculate the minimum error parameter corresponding to each expression base group, together with the pose parameters of the three-dimensional face model and the weight coefficients of the face personalized expression bases in the expression base group when that minimum error parameter is reached.
  • the calculation is performed in units of expression base groups, and one expression base group is optimized at a time. Since the calculation process is the same for each expression base group, the embodiment takes the calculation of one expression base group as an example for description.
  • according to the error parameter formula, the minimum error parameter, the weight coefficients of the face personalized expression bases in the expression base group, and the pose parameters of the three-dimensional face model when that minimum error parameter is reached are calculated. Since the expression base group does not contain all the face personalized expression bases, the weight coefficients of the face personalized expression bases not included in the group can be fixed to 0 during the calculation, so as to reduce the number of weight coefficients to solve. It can be understood that the iterative calculation method can also be used here; for details, refer to the process described in step 260, the only difference being that, in the final result, the weight coefficients of the face personalized expression bases not included in the expression base group are 0.
  • since each expression base group contains different face personalized expression bases, the minimum error parameters, weight coefficients and pose parameters obtained may differ across expression base groups. Therefore, after each expression base group is calculated in the above manner, each expression base group corresponds to its own minimum error parameter, weight coefficients and pose parameters.
  • Step 380 From the minimum error parameters corresponding to each expression basis set, select the smallest minimum error parameter.
  • Step 390 Use the pose parameter and weight coefficient corresponding to the smallest minimum error parameter as the finally obtained pose parameter and weight coefficient.
  • when the smallest minimum error parameter is used, the corresponding three-dimensional face model is the closest to the face image.
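  • Steps 370–390 can be summarized by the sketch below; optimize_group stands in for the per-group minimization of the error parameter formula and is an assumption, not a function defined by the patent.

```python
# Hedged sketch: optimize each expression base group independently and keep the
# group with the smallest minimum error (weights of bases outside the group are 0).
def best_group_solution(groups, optimize_group):
    """optimize_group(group) -> (error, pose, weights) for one expression base group."""
    best = None
    for group in groups:
        error, pose, weights = optimize_group(group)
        if best is None or error < best[0]:
            best = (error, pose, weights)
    return best[1], best[2]   # pose parameters and weight coefficients finally used
```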
  • Step 3100 Send the pose parameters and the weight coefficients to the remote device, so that the remote device generates a virtual image corresponding to the face image according to the pose parameters and the weight coefficients.
  • in one approach, when sending the weight coefficients, the weight coefficients of the face personalized expression bases not included in the expression base group are set to 0 and sent to the remote device together with the other weight coefficients.
  • in another approach, only the weight coefficients of the face personalized expression bases in the expression base group corresponding to the smallest minimum error parameter are sent; the remote device searches for the corresponding personalized expression bases according to the received weight coefficients and constructs the corresponding virtual image from the searched personalized expression bases and their weight coefficients, instead of using all the personalized expression bases.
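  • As a rough illustration of the second sending strategy, the sketch below packs the pose parameters and only the winning group's weight coefficients into a JSON payload; the payload layout and field names are assumptions, since the patent does not specify a transport format.

```python
# Hedged sketch of packaging the parameters sent to the remote device (step 3100).
import json

def pack_parameters(s, R, t, group, weights):
    """group: indices of the expression bases in the winning expression base group;
    weights: their weight coefficients, in the same order."""
    payload = {
        "scale": float(s),
        "rotation": [[float(x) for x in row] for row in R],   # 3x3 rigid rotation matrix
        "translation": [float(x) for x in t],
        "weights": {int(i): float(w) for i, w in zip(group, weights)},  # only the group's bases
    }
    return json.dumps(payload)
```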
  • in this way, the technical problem of freezing caused by transmitting real face images is solved, the demand for network bandwidth is reduced, the privacy of the target object is effectively protected, and the imaging quality at the remote device is ensured.
  • in addition, the pose parameters and weight coefficients are calculated in units of expression base groups, which reduces the number of weight coefficients to be solved each time and thus the expression base search space, making the expression coefficient solution more accurate and efficient; at the same time, fewer face personalized expression bases are used to express the expression of the face image.
  • FIG. 15 is a schematic structural diagram of an apparatus for constructing a virtual image provided by an embodiment of the present application.
  • the virtual image construction apparatus includes: an image acquisition module 401 , an expression base construction module 402 , a face model construction module 403 , a parameter determination module 404 and a parameter transmission module 405 .
  • the image acquisition module 401 is used to acquire the current frame image data, and the current frame image data includes the face image of the target object;
  • the expression base construction module 402 is used to construct the neutral facial expression base and a plurality of face personalized expression bases of the target object according to the current frame image data;
  • the face model building module 403 is used to construct a three-dimensional face model of the target object according to the neutral facial expression base and a plurality of face personalized expression bases;
  • the parameter determination module 404 is used to determine, when the three-dimensional face model is mapped to the face image, the pose parameters of the three-dimensional face model and the weight coefficients of the face personalized expression bases;
  • the parameter sending module 405 is used to send the pose parameters and the weight coefficients to the remote device, so that the remote device generates a virtual image corresponding to the face image according to the pose parameters and the weight coefficients.
  • the parameter determination module 404 includes: a formula construction unit for constructing an error parameter formula when the three-dimensional face model is mapped to a face image; a formula calculation unit for determining the error parameter according to the error parameter formula The minimum pose parameters of the 3D face model and the weight coefficients of the individualized expression bases of each face.
  • in an embodiment, the error parameter formula constructed by the formula construction unit is E_exp = ‖Aω − b‖², subject to Cω ≤ d, where ω_i is the weight coefficient of the i-th face personalized expression base, 1 ≤ i ≤ n, A = sRΔB, s represents the scaling factor when the 3D face model is mapped to the face image, R represents the rigid rotation matrix when the 3D face model is mapped to the face image, ΔB = [B_1 − B_0, B_2 − B_0, …, B_n − B_0], B_0 represents the neutral facial expression base, B_i represents the i-th face personalized expression base, b = f − t − sRB_0, f represents the face key points in the face image, t represents the translation vector when the 3D face model is mapped to the face image, s, R and t are the pose parameters, C represents the constraint parameter of ω, and d represents the value range of ω.
  • ones(n) represents the upper bound of the values of the n weight coefficients, zero(n) represents the lower bound of the values of the n weight coefficients, and p_n and q_n are the value constraint matrices, determined according to the relative distance of the face key points in the face image.
  • in an embodiment, the apparatus further includes: an expression base search module, used to find the mutually exclusive expression bases among the face personalized expression bases before the pose parameters of the three-dimensional face model and the weight coefficients of the face personalized expression bases when the error parameter is the smallest are determined according to the error parameter formula;
  • the expression base grouping module is used to group the personalized expression bases of each face according to the mutually exclusive expression bases, and obtain multiple expression base groups, each of which is Any two face-personalized expression bases in the expression base group are not mutually exclusive.
  • the formula calculation unit includes: a group calculation subunit, used to calculate, according to the error parameter formula, the minimum error parameter corresponding to each expression base group, together with the pose parameters of the three-dimensional face model and the weight coefficients of the face personalized expression bases in the expression base group when that minimum error parameter is reached; a first parameter selection subunit, used to select the smallest minimum error parameter among the minimum error parameters corresponding to the expression base groups; and a second parameter selection subunit, used to take the pose parameters and weight coefficients corresponding to the smallest minimum error parameter as the finally obtained pose parameters and weight coefficients.
  • the expression base construction module 402 includes: a neutral expression base construction unit, configured to construct a neutral expression base of the target object according to the current frame image data and the preset prior information of the face model;
  • a personalized expression base construction unit, used to determine the face personalized expression bases of the target object according to the neutral facial expression base, the preset reference neutral expression base and each reference personalized expression base, where each reference personalized expression base corresponds to one face personalized expression base.
  • in an embodiment, the neutral expression base construction unit includes: a face image detection subunit, used to detect the face image in the current frame image data; a key point locating subunit, used to perform face key point positioning on the face image to obtain a key point coordinate array; and a neutral expression base determination subunit, used to determine the neutral facial expression base of the target object according to the face image, the key point coordinate array and the preset prior information of the face model.
  • in an embodiment, the personalized expression base construction unit includes: a deformation information determination subunit, used to determine deformation information according to the reference neutral expression base and the reference personalized expression base; and a personalized expression base determination subunit, used to determine the face personalized expression base of the target object according to the deformation information and the neutral facial expression base.
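  • As a simple illustration of this unit, the sketch below derives a personalized expression base by transferring the per-vertex offset between the reference neutral and reference personalized bases onto the target's neutral base; this naive delta transfer is an assumption, and the patent's deformation information computation may differ (for example, triangle-based deformation transfer).

```python
# Hedged sketch: naive delta transfer from reference bases to the target's neutral base.
import numpy as np

def personalized_base(target_neutral, ref_neutral, ref_personalized):
    """All inputs: (V, 3) vertex arrays. Returns the target's personalized expression base."""
    deformation = ref_personalized - ref_neutral      # deformation information from the reference pair
    return target_neutral + deformation               # apply it to the target's neutral expression base
```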
  • the virtual image construction device provided above can be used to execute the virtual image construction method provided by any of the above embodiments, and has corresponding functions and beneficial effects.
  • the units and modules included are only divided according to functional logic, but are not limited to the above division, as long as the corresponding functions can be realized; in addition, The specific names of the functional units are only for the convenience of distinguishing from each other, and are not used to limit the protection scope of the present invention.
  • FIG. 16 is a schematic structural diagram of a virtual image construction device according to an embodiment of the present application.
  • the virtual image construction device includes a processor 50, a memory 51, an input device 52, and an output device 53; the number of processors 50 in the virtual image construction device may be one or more, and one processor 50 is taken as an example in FIG. 16.
  • the processor 50 , the memory 51 , the input device 52 , and the output device 53 in the virtual image construction device may be connected by a bus or in other ways, and the connection by a bus is taken as an example in FIG. 16 .
  • the memory 51 can be used to store software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the virtual image construction method in the embodiment of the present invention (for example, the image acquisition module 401, expression base construction module 402, face model construction module 403, parameter determination module 404 and parameter transmission module 405).
  • the processor 50 executes various functional applications and data processing of the virtual image construction device by running the software programs, instructions and modules stored in the memory 51, that is, implements the above virtual image construction method.
  • the memory 51 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the virtual image construction apparatus, and the like.
  • the memory 51 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
  • memory 51 may further include memory located remotely relative to processor 50, and these remote memories may be connected to the virtual image construction device through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
  • the input device 52 can be used to receive input digital or character information, and generate key signal input related to user settings and function control of the virtual image construction device, and also includes image capture devices, audio capture devices, and the like.
  • the output device 53 may include a display device such as a display screen.
  • the virtual image construction apparatus may further include communication means for data communication with other apparatuses.
  • the above virtual image construction device includes the virtual image construction apparatus, and can be used to execute any of the virtual image construction methods above, with corresponding functions and beneficial effects.
  • embodiments of the present application also provide a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to execute the relevant operations in the virtual image construction method provided by any embodiment of the present application, with corresponding functions and beneficial effects.
  • the embodiments of the present application may be provided as a method, a system, or a computer program product.
  • the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects.
  • the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • the present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the present application. It will be understood that each flow and/or block in the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions.
  • These computer program instructions may be provided to the processor of a general purpose computer, special purpose computer, embedded processor or other programmable data processing device to produce a machine such that the instructions executed by the processor of the computer or other programmable data processing device produce Means for implementing the functions specified in a flow or flow of a flowchart and/or a block or blocks of a block diagram.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory result in an article of manufacture comprising instruction means, the instructions The apparatus implements the functions specified in the flow or flow of the flowcharts and/or the block or blocks of the block diagrams.
  • a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • Memory may include non-persistent memory in computer readable media, random access memory (RAM) and/or non-volatile memory in the form of, for example, read only memory (ROM) or flash memory (flash RAM).
  • Computer-readable media includes both persistent and non-permanent, removable and non-removable media, and storage of information may be implemented by any method or technology.
  • Information may be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), Flash Memory or other memory technology, Compact Disc Read Only Memory (CD-ROM), Digital Versatile Disc (DVD) or other optical storage, Magnetic tape cassettes, magnetic tape magnetic disk storage or other magnetic storage devices or any other non-transmission medium that can be used to store information that can be accessed by a computing device.
  • computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Processing Or Creating Images (AREA)

Abstract

Embodiments of the present invention relate to the technical field of image processing. Disclosed are a virtual image construction method and apparatus, a device, and a storage medium. The method comprises: obtaining current frame image data, the current frame image data comprising a face image of a target object; constructing a neutral facial expression base and a plurality of face personalized expression bases of the target object according to the current frame image data; constructing a three-dimensional face model of the target object according to the neutral facial expression base and the plurality of face personalized expression bases; when the three-dimensional face model is mapped to the face image, determining pose parameters of the three-dimensional face model and weight coefficients of the face personalized expression bases; and sending the pose parameters and the weight coefficients to a remote device, so that the remote device generates, according to the pose parameters and the weight coefficients, a virtual image corresponding to the face image. By using this method, the technical problem in the prior art of lag caused by transmitting a real face image can be solved.
PCT/CN2021/070727 2021-01-07 2021-01-07 Procédé et appareil de construction d'image virtuelle, dispositif et support de stockage WO2022147736A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2021/070727 WO2022147736A1 (fr) 2021-01-07 2021-01-07 Procédé et appareil de construction d'image virtuelle, dispositif et support de stockage
CN202180024686.6A CN115335865A (zh) 2021-01-07 2021-01-07 虚拟图像构建方法、装置、设备及存储介质

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/070727 WO2022147736A1 (fr) 2021-01-07 2021-01-07 Procédé et appareil de construction d'image virtuelle, dispositif et support de stockage

Publications (1)

Publication Number Publication Date
WO2022147736A1 true WO2022147736A1 (fr) 2022-07-14

Family

ID=82357818

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/070727 WO2022147736A1 (fr) 2021-01-07 2021-01-07 Procédé et appareil de construction d'image virtuelle, dispositif et support de stockage

Country Status (2)

Country Link
CN (1) CN115335865A (fr)
WO (1) WO2022147736A1 (fr)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1920886A (zh) * 2006-09-14 2007-02-28 浙江大学 基于视频流的三维动态人脸表情建模方法
CN105528805A (zh) * 2015-12-25 2016-04-27 苏州丽多数字科技有限公司 一种虚拟人脸动画合成方法
WO2017137947A1 (fr) * 2016-02-10 2017-08-17 Vats Nitin Production réaliste de visage parlant avec expression au moyen d'images, de texte et de voix
CN111814652A (zh) * 2020-07-03 2020-10-23 广州视源电子科技股份有限公司 虚拟人像渲染方法、装置以及存储介质

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220230399A1 (en) * 2021-01-19 2022-07-21 Samsung Electronics Co., Ltd. Extended reality interaction in synchronous virtual spaces using heterogeneous devices
US11995776B2 (en) * 2021-01-19 2024-05-28 Samsung Electronics Co., Ltd. Extended reality interaction in synchronous virtual spaces using heterogeneous devices
CN114972661A (zh) * 2022-08-01 2022-08-30 深圳元象信息科技有限公司 人脸模型构建方法、人脸图像生成方法、设备及存储介质
CN115222895A (zh) * 2022-08-30 2022-10-21 北京百度网讯科技有限公司 图像生成方法、装置、设备以及存储介质
CN116453222A (zh) * 2023-04-19 2023-07-18 北京百度网讯科技有限公司 目标对象姿态确定方法、训练方法、装置以及存储介质
CN116453222B (zh) * 2023-04-19 2024-06-11 北京百度网讯科技有限公司 目标对象姿态确定方法、训练方法、装置以及存储介质
CN117746381A (zh) * 2023-12-12 2024-03-22 北京迁移科技有限公司 位姿估计模型配置方法及位姿估计方法

Also Published As

Publication number Publication date
CN115335865A (zh) 2022-11-11

Similar Documents

Publication Publication Date Title
WO2022147736A1 (fr) Procédé et appareil de construction d'image virtuelle, dispositif et support de stockage
US10679046B1 (en) Machine learning systems and methods of estimating body shape from images
CN111598998B (zh) 三维虚拟模型重建方法、装置、计算机设备和存储介质
US11210804B2 (en) Methods, devices and computer program products for global bundle adjustment of 3D images
US11399141B2 (en) Processing holographic videos
WO2023050992A1 (fr) Procédé et appareil d'apprentissage de réseau pour la reconstruction faciale, et dispositif et support de stockage
CN111161395B (zh) 一种人脸表情的跟踪方法、装置及电子设备
Li et al. Object detection in the context of mobile augmented reality
EP4307233A1 (fr) Procédé et appareil de traitement de données, dispositif électronique et support de stockage lisible par ordinateur
CN111723707B (zh) 一种基于视觉显著性的注视点估计方法及装置
US20200349754A1 (en) Methods, devices and computer program products for generating 3d models
CN113643366B (zh) 一种多视角三维对象姿态估计方法及装置
US20220198731A1 (en) Pixel-aligned volumetric avatars
CN111815768B (zh) 三维人脸重建方法和装置
Chang et al. Salgaze: Personalizing gaze estimation using visual saliency
Fornalczyk et al. Robust face model based approach to head pose estimation
US11158122B2 (en) Surface geometry object model training and inference
Wang et al. Handling occlusion and large displacement through improved RGB-D scene flow estimation
CN115460372A (zh) 虚拟图像构建方法、装置、设备及存储介质
CN115937365A (zh) 用于人脸重建的网络训练方法、装置、设备及存储介质
Lee et al. Real-time camera tracking using a particle filter and multiple feature trackers
Jian et al. Realistic face animation generation from videos
US20230290101A1 (en) Data processing method and apparatus, electronic device, and computer-readable storage medium
TWI819639B (zh) 深度估計模型之訓練方法、裝置、電子設備及存儲介質
CN116228808A (zh) 表情跟踪方法、装置、设备以及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21916793

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21916793

Country of ref document: EP

Kind code of ref document: A1