CN115222866A - Rendering method and device of virtual prop, electronic equipment and storage medium


Info

Publication number
CN115222866A
Authority
CN
China
Prior art keywords: model, prop, sub, virtual, determining
Legal status: Pending
Application number: CN202110431222.9A
Other languages: Chinese (zh)
Inventors: 张纪绪, 邹瑞波
Current Assignee: Tencent Technology Shenzhen Co Ltd
Original Assignee: Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202110431222.9A
Publication of CN115222866A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 - 3D [Three Dimensional] image rendering
    • G06T15/02 - Non-photorealistic rendering
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 - Three dimensional [3D] modelling, e.g. data description of 3D objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Graphics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The application provides a rendering method and apparatus for a virtual prop, an electronic device and a storage medium. The method includes: performing three-dimensional reconstruction on a target object based on a video frame image including the target object in a video to obtain a three-dimensional object model corresponding to the target object; acquiring a virtual prop model corresponding to a virtual prop of the three-dimensional object model and a plurality of sub-prop models included in the virtual prop model; determining a tracking point of each sub-prop model, and determining model coordinates of each sub-prop model; obtaining model parameters of the three-dimensional object model, and determining world coordinates of each sub-prop model based on the model parameters of the three-dimensional object model, the tracking point of each sub-prop model and the model coordinates of each sub-prop model; and rendering the virtual prop based on the world coordinates of each sub-prop model to present the target object equipped with the virtual prop in the video. Through the application, the virtual prop can be attached to the target object more accurately, and the rendering effect is improved.

Description

Rendering method and device of virtual prop, electronic equipment and storage medium
Technical Field
The present application relates to the field of video processing technologies, and in particular, to a method and an apparatus for rendering a virtual item, an electronic device, and a storage medium.
Background
With the rapid development of the mobile internet, competition in industries such as short video has become increasingly intense, and various video shooting effects have emerged, for example, adding virtual props such as clothes and armor to objects in a video during shooting. In the related art, when rendering a virtual prop for a target object in a video, a Lens Studio effect editor is used to bind the relative position of the target object and the virtual prop based on a three-dimensional model mesh attachment function. However, the virtual prop rendered in this way cannot fit the target object well, and the rendering effect is poor.
Disclosure of Invention
The embodiment of the application provides a rendering method and device of a virtual item, electronic equipment and a storage medium, which can enable the virtual item to be more accurately attached to a target object and improve the rendering effect.
The technical scheme of the embodiment of the application is realized as follows:
the embodiment of the application provides a rendering method of a virtual item, which comprises the following steps:
performing three-dimensional reconstruction on a target object based on a video frame image including the target object in a video to obtain a three-dimensional object model corresponding to the target object;
acquiring a virtual prop model corresponding to a virtual prop of the three-dimensional object model and a plurality of sub prop models included in the virtual prop model;
determining a tracking point of each of the sub-prop models, and determining model coordinates of each of the sub-prop models; the tracking point is used for indicating the connection position between the corresponding sub-prop model and the three-dimensional object model;
obtaining model parameters of the three-dimensional object model, and determining world coordinates of each sub-prop model based on the model parameters of the three-dimensional object model, tracking points of each sub-prop model and model coordinates of each sub-prop model;
rendering the virtual prop based on the world coordinates of each of the sub-prop models to present the target object equipped with the virtual prop in the video.
In the foregoing solution, the determining, based on the model parameters of the three-dimensional object model, the tracking point of each of the sub-prop models, and the model coordinates of each of the sub-prop models, the world coordinates of each of the sub-prop models includes:
determining rotation parameters of each sub prop model relative to an initial sub prop model based on the tracking point of each sub prop model;
acquiring world coordinates of tracking points of each sub prop model;
and determining the world coordinates of each sub prop model based on the model coordinates of each sub prop model by combining the model parameters of the three-dimensional object model, the world coordinates of the tracking point of each sub prop model and the rotation parameters of each sub prop model.
In the foregoing solution, the determining, based on the tracking point of each of the sub-prop models, a rotation parameter of each of the sub-prop models with respect to an initial sub-prop model includes:
performing the following processing for each sub-prop model to determine the rotation parameter of each sub-prop model relative to the initial sub-prop model:
acquiring a normal direction vector of a tracking point of the sub prop model and a standard direction vector of the initial sub prop model, and
determining a rotation vector corresponding to the sub prop model and a projection vector of the standard direction vector on a normal plane of the normal direction vector based on the normal direction vector and the standard direction vector;
determining, based on the rotation vector and the projection vector, a rotation parameter of the sub-prop model relative to the initial sub-prop model.
In the above scheme, the determining a rotation vector corresponding to the sub-prop model based on the normal direction vector and the standard direction vector includes:
synthesizing the normal direction vector and the standard direction vector to obtain a first synthesized direction vector;
determining a first rotation quaternion corresponding to the normal direction vector and a second rotation quaternion corresponding to the first synthetic direction vector;
and determining a rotation vector corresponding to the sub prop model based on the first rotation quaternion and the second rotation quaternion.
The embodiment of the present application further provides a rendering apparatus for a virtual item, including:
the reconstruction module is used for performing three-dimensional reconstruction on a target object based on a video frame image including the target object in a video to obtain a three-dimensional object model corresponding to the target object;
the acquisition module is used for acquiring a virtual prop model corresponding to a virtual prop of the three-dimensional object model and a plurality of sub prop models included in the virtual prop model;
the first determining module is used for determining a tracking point of each sub prop model and determining a model coordinate of each sub prop model; the tracking points are used for indicating the connection positions between the corresponding sub prop models and the three-dimensional object model;
the second determination module is used for obtaining model parameters of the three-dimensional object model and determining world coordinates of each sub-prop model based on the model parameters of the three-dimensional object model, tracking points of each sub-prop model and model coordinates of each sub-prop model;
and the rendering module is used for rendering the virtual prop based on the world coordinates of each sub prop model so as to present the target object assembled with the virtual prop in the video.
In the above scheme, the reconstruction module is further configured to perform image recognition processing on the video frame image to obtain two-dimensional image coordinates of object key points of the target object in the video frame image;
determining three-dimensional model coordinates of the object key points based on the two-dimensional image coordinates;
and performing three-dimensional reconstruction on the target object based on the three-dimensional model coordinates of the key points of the object to obtain a three-dimensional object model corresponding to the target object.
In the foregoing solution, the first determining module is further configured to perform the following processing for each of the sub-prop models to determine the model coordinates of each of the sub-prop models:
establishing a model coordinate system where the child prop model is located by taking the tracking point of the child prop model as the origin of the model coordinate system, and
and determining model coordinates of the sub prop model based on the model coordinate system.
In the foregoing solution, the second determining module is further configured to determine a first model vertex and a second model vertex on the standard three-dimensional object model, and determine a first distance between the first model vertex and the second model vertex, when the model parameter is a scaling parameter of the three-dimensional object model relative to the standard three-dimensional object model;
determining a third model vertex corresponding to the first model vertex and a fourth model vertex corresponding to the second model vertex on the three-dimensional object model;
determining a second distance between the third model vertex and the fourth model vertex;
determining a scaling parameter for the three-dimensional object model based on the first distance and the second distance.
In the above scheme, the second determining module is further configured to determine, based on the tracking point of each of the sub-prop models, a rotation parameter of each of the sub-prop models with respect to the initial sub-prop model;
acquiring world coordinates of tracking points of each sub prop model;
and determining the world coordinates of each sub prop model based on the model coordinates of each sub prop model by combining the model parameters of the three-dimensional object model, the world coordinates of the tracking point of each sub prop model and the rotation parameters of each sub prop model.
In the foregoing solution, the second determining module is further configured to obtain model coordinates of a model vertex corresponding to each tracking point in the three-dimensional object model, and use the model coordinates of the model vertex as model coordinates of the corresponding tracking point;
obtaining model coordinate transformation parameters, and
and determining the world coordinates of the tracking points of the sub prop models based on the model coordinates of the tracking points and the model coordinate transformation parameters.
In the foregoing scheme, the second determining module is further configured to perform the following processing for each of the sub-prop models to determine a rotation parameter of each of the sub-prop models relative to an initial sub-prop model:
acquiring a normal direction vector of a tracking point of the sub prop model and a standard direction vector of the initial sub prop model, and
determining a rotation vector corresponding to the sub-prop model and a projection vector of the standard direction vector on a normal plane of the normal direction vector based on the normal direction vector and the standard direction vector;
determining, based on the rotation vector and the projection vector, a rotation parameter of the sub-prop model relative to the initial sub-prop model.
In the above scheme, the second determining module is further configured to determine a model vertex of a tracking point corresponding to the sub-prop model in the three-dimensional object model, and determine a plurality of model triangles where the model vertex is located;
determining the surface normal direction vector of each model triangle, averaging the determined surface normal direction vectors, and taking the average result as the average normal direction vector of the model vertex;
and determining the average normal direction vector of the model vertex as the normal direction vector of the tracking point of the sub-prop model.
In the above solution, the second determining module is further configured to synthesize the normal direction vector and the standard direction vector to obtain a first synthesized direction vector;
determining a first rotation quaternion corresponding to the normal direction vector and a second rotation quaternion corresponding to the first synthetic direction vector;
and determining a rotation vector corresponding to the sub prop model based on the first rotation quaternion and the second rotation quaternion.
In the above scheme, the second determining module is further configured to rotate the rotation vector in the direction of the standard direction vector to obtain an intermediate rotation vector;
synthesizing the intermediate rotation vector and the projection vector to obtain a second synthetic direction vector;
determining a third rotation quaternion corresponding to the projection vector and a fourth rotation quaternion corresponding to the second synthetic direction vector;
determining rotation parameters of the sub-prop model relative to the initial sub-prop model based on the third rotation quaternion, the fourth rotation quaternion and the rotation vector.
In the above scheme, the second determining module is further configured to determine a world coordinate transformation parameter of each of the sub-prop models based on the world coordinates of the tracking point of each of the sub-prop models, the rotation parameters and the model parameters;
and multiply the model coordinates of each sub-prop model by the corresponding world coordinate transformation parameter to obtain the world coordinates of each sub-prop model.
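As a non-limiting illustration of this step, the following minimal Python sketch composes a world transformation from a tracking point's world position, a rotation quaternion and a scaling parameter, and applies it to model coordinates. The function names and the column-vector 4x4 convention are assumptions made for illustration and are not prescribed by this embodiment.

import numpy as np

def quat_to_matrix(q):
    # q = (w, x, y, z), assumed to be a unit quaternion.
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def make_world_transform(tracking_point_world, rotation_quat, scale):
    # Compose uniform scale, rotation and translation into one 4x4 matrix
    # (column-vector convention: world = T * R * S * model).
    m = np.eye(4)
    m[:3, :3] = quat_to_matrix(rotation_quat) * scale
    m[:3, 3] = tracking_point_world
    return m

def model_to_world(model_coords, world_transform):
    # model_coords: (N, 3) vertex positions in the sub-prop model's local frame.
    homo = np.hstack([np.asarray(model_coords, dtype=float),
                      np.ones((len(model_coords), 1))])
    return (world_transform @ homo.T).T[:, :3]
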
In the above scheme, the rendering module is further configured to perform vertex rendering on the virtual prop model corresponding to the virtual prop based on the world coordinates of each of the sub prop models to obtain an intermediate virtual prop;
determining pixel colors of pixel points contained in the virtual props based on preset material attributes corresponding to the virtual props;
and performing pixel rendering on the intermediate virtual prop based on the pixel color of a pixel point contained in the virtual prop to obtain the virtual prop so as to present a target object assembled with the virtual prop in the video.
In the above scheme, the apparatus further comprises:
a presentation module for presenting a video capture interface for video frame image capture of the target object, and
presenting prop function items in the video acquisition interface;
presenting at least one candidate virtual prop for selection in response to a triggering operation for the prop function item;
and responding to the selection operation aiming at the target candidate virtual prop, and determining that the target candidate virtual prop is the virtual prop of the three-dimensional object model.
An embodiment of the present application further provides an electronic device, including:
a memory for storing executable instructions;
and the processor is used for realizing the rendering method of the virtual prop provided by the embodiment of the application when the executable instructions stored in the memory are executed.
The embodiment of the present application further provides a computer-readable storage medium, which stores executable instructions, and when the executable instructions are executed by a processor, the method for rendering the virtual prop provided in the embodiment of the present application is implemented.
The embodiment of the application has the following beneficial effects:
in the embodiment of the application, after a video frame image including a target object is obtained, the target object is subjected to three-dimensional reconstruction based on the video frame image to obtain a corresponding three-dimensional object model. A plurality of sub-prop models corresponding to the virtual prop of the three-dimensional object model are then obtained, and the tracking points and model coordinates of the sub-prop models are determined. Model parameters of the three-dimensional object model are obtained, and the world coordinates of each sub-prop model are determined based on the model parameters, the tracking points and the model coordinates of the sub-prop models, so that the virtual prop is rendered based on the world coordinates of the sub-prop models to present the target object equipped with the virtual prop.
Here, the tracking point is used for indicating the connection position between the corresponding sub-prop model and the three-dimensional object model. Therefore, for each video image frame, the position of each sub-prop model relative to the three-dimensional object model can be accurately determined based on the tracking point, that is, the world coordinates of each sub-prop model are accurately determined; and because the model parameters of the three-dimensional object model are obtained, the determined world coordinates make the virtual prop fit the three-dimensional object model more closely.
Drawings
Fig. 1 is a schematic architecture diagram of a system 100 for rendering a virtual prop according to an embodiment of the present disclosure;
fig. 2 is a schematic structural diagram of an electronic device 500 implementing a rendering method of a virtual item according to an embodiment of the present application;
fig. 3 is a schematic flowchart of a method for rendering a virtual item according to an embodiment of the present application;
fig. 4 is a presentation schematic diagram of a target object equipped with a virtual prop in a video provided by an embodiment of the present application;
fig. 5 is a schematic diagram illustrating a selection process of a virtual item according to an embodiment of the present application;
fig. 6 is a schematic flowchart of a rendering method of a virtual item according to an embodiment of the present application;
FIG. 7 is an anchor point schematic diagram of a sub-prop model provided by an embodiment of the present application;
FIG. 8 is an axis schematic diagram of a sub-prop model provided in an embodiment of the present application;
fig. 9 is a schematic structural diagram of rendering apparatus 555 of a virtual prop according to an embodiment of the present application.
Detailed Description
In order to make the objectives, technical solutions and advantages of the present application clearer, the present application will be described in further detail with reference to the attached drawings, the described embodiments should not be considered as limiting the present application, and all other embodiments obtained by a person of ordinary skill in the art without creative efforts shall fall within the protection scope of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.
In the following description, the terms "first/second/third" are only used to distinguish similar objects and do not denote a particular order or importance. It is understood that "first/second/third" may be interchanged in a particular order or sequence where permitted, so that the embodiments of the present application described herein can be implemented in an order other than that shown or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
Before further detailed description of the embodiments of the present application, terms and expressions referred to in the embodiments of the present application will be described, and the terms and expressions referred to in the embodiments of the present application will be used for the following explanation.
1) Client: an application program running in a terminal for providing various services, for example an instant messaging client or a video playing client.
2) In response to: used for indicating the condition or state on which a performed operation depends; when the condition or state on which it depends is satisfied, the one or more operations performed may be carried out in real time or with a set delay; unless otherwise specified, there is no restriction on the order in which the operations are performed.
3) PinToMesh: a three-dimensional model mesh attachment function; specifically, a point on a three-dimensional object A is selected, and a three-dimensional object B follows that point as it moves and is rendered in world space.
4) Physically Based Rendering (PBR): a realistic rendering method based on physics, which generates the true color of each pixel of a rendered object according to the material properties and surface properties of real-world objects, combined with a physically correct illumination calculation method.
Based on the above explanations of the terms involved in the embodiments of the present application, the following describes the rendering system of virtual props provided in the embodiments of the present application. Referring to fig. 1, fig. 1 is a schematic architecture diagram of a rendering system 100 for virtual props provided in the embodiment of the present application. In order to support an exemplary application, terminals (terminal 400-1 and terminal 400-2 are shown as examples) are connected to a server 200 through a network 300, where the network 300 may be a wide area network or a local area network, or a combination of the two, and data transmission is implemented using a wireless or wired link.
Terminals (e.g., terminal 400-1 and terminal 400-2) for presenting a video capture interface for video frame image capture of a target object at a graphical interface 410 (graphical interface 410-1 and graphical interface 410-2 are exemplarily shown); acquiring a video frame image of a target object through a video acquisition interface, and transmitting the acquired video frame image including the target object to the server 200;
the server 200 is configured to receive a video frame image including a target object, and perform three-dimensional reconstruction on the target object based on the video frame image including the target object to obtain a three-dimensional object model corresponding to the target object; acquire a virtual prop model corresponding to a virtual prop of the three-dimensional object model and a plurality of sub-prop models included in the virtual prop model; determine a tracking point of each sub-prop model, and determine model coordinates of each sub-prop model; acquire model parameters of the three-dimensional object model, and determine world coordinates of each sub-prop model based on the model parameters of the three-dimensional object model, the tracking points of each sub-prop model and the model coordinates of each sub-prop model; render the virtual prop based on the world coordinates of each sub-prop model, and send the rendered virtual prop to the terminal;
and the terminals (such as the terminal 400-1 and the terminal 400-2) are used for receiving the virtual item corresponding to the target object sent by the server and presenting the target object equipped with the virtual item in the video.
In practical application, the server 200 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a CDN, a big data and artificial intelligence platform, and the like. The terminals (e.g., the terminal 400-1 and the terminal 400-2) may be, but are not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart television, a smart watch, and the like. The terminals (e.g., terminal 400-1 and terminal 400-2) and the server 200 may be directly or indirectly connected through wired or wireless communication, and the application is not limited thereto.
Referring to fig. 2, fig. 2 is a schematic structural diagram of an electronic device 500 for implementing a method for rendering a virtual item according to an embodiment of the present disclosure. In practical application, the electronic device 500 may be a server or a terminal shown in fig. 1, and taking the electronic device 500 as the terminal shown in fig. 1 as an example, an electronic device implementing the rendering method of the virtual item in the embodiment of the present application is described, where the electronic device 500 provided in the embodiment of the present application includes: at least one processor 510, memory 550, at least one network interface 520, and a user interface 530. The various components in the electronic device 500 are coupled together by a bus system 540. It is understood that the bus system 540 is used to enable communications among the components. The bus system 540 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 540 in fig. 2.
The processor 510 may be an integrated circuit chip having signal processing capabilities, such as a general purpose processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic device, or discrete hardware components, where the general purpose processor may be a microprocessor or any conventional processor.
The user interface 530 includes one or more output devices 531 enabling presentation of media content, including one or more speakers and/or one or more visual display screens. The user interface 530 also includes one or more input devices 532, including user interface components to facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.
The memory 550 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, and the like. Memory 550 optionally includes one or more storage devices physically located remote from processor 510.
The memory 550 may be volatile memory or nonvolatile memory, and may also include both volatile and nonvolatile memory. The nonvolatile memory may be a Read Only Memory (ROM), and the volatile memory may be a Random Access Memory (RAM). The memory 550 described in the embodiments herein is intended to comprise any suitable type of memory.
In some embodiments, memory 550 may be capable of storing data to support various operations, examples of which include programs, modules, and data structures, or subsets or supersets thereof, as exemplified below.
An operating system 551 including system programs for processing various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and processing hardware-based tasks;
a network communication module 552 for communicating with other computing devices via one or more (wired or wireless) network interfaces 520, exemplary network interfaces 520 including: Bluetooth, Wireless Fidelity (WiFi), Universal Serial Bus (USB), etc.;
a presentation module 553 for enabling presentation of information (e.g., a user interface for operating peripheral devices and displaying content and information) via one or more output devices 531 (e.g., a display screen, speakers, etc.) associated with the user interface 530;
an input processing module 554 to detect one or more user inputs or interactions from one of the one or more input devices 532 and to translate the detected inputs or interactions.
In some embodiments, the rendering apparatus of the virtual item provided in this embodiment may be implemented in software, and fig. 2 illustrates a rendering apparatus 555 of the virtual item stored in a memory 550, which may be software in the form of programs and plug-ins, and includes the following software modules: the reconstruction module 5551, the obtaining module 5552, the first determining module 5553, the second determining module 5554 and the rendering module 5555 are logical and thus may be arbitrarily combined or further split according to the implemented functions, which will be described below.
In other embodiments, the rendering device of the virtual prop provided in this embodiment may be implemented by combining software and hardware. As an example, the rendering device of the virtual prop provided in this embodiment may be a processor in the form of a hardware decoding processor, which is programmed to execute the rendering method of the virtual prop provided in this embodiment; for example, the processor in the form of a hardware decoding processor may employ one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field Programmable Gate Arrays (FPGAs), or other electronic elements.
Based on the above description of the rendering system of the virtual item and the electronic device provided in this embodiment of the present application, the following description describes a rendering method of the virtual item provided in this embodiment of the present application. In some embodiments, the rendering method of the virtual item provided in the embodiment of the present application may be implemented by a server or a terminal alone, or implemented by a server and a terminal in a cooperation manner, and the rendering method of the virtual item provided in the embodiment of the present application is described below by taking a terminal as an example. Referring to fig. 3, fig. 3 is a schematic flow chart of a rendering method of a virtual item provided in the embodiment of the present application, where the rendering method of a virtual item provided in the embodiment of the present application includes:
step 101: and the terminal carries out three-dimensional reconstruction on the target object based on the video frame image including the target object in the video to obtain a three-dimensional object model corresponding to the target object.
Here, the terminal may be installed with a client having a video shooting function, such as a short video APP, and the terminal may perform corresponding steps according to an operation instruction of a user for the client. In practical application, when a user has a video shooting requirement, an operation instruction for a client can be triggered, and the terminal responds to the operation instruction, operates the client and presents a video acquisition interface for acquiring video frame images.
The user can shoot the video based on the video acquisition interface, and in the embodiment of the application, when the user shoots the video, the user can also enable a target object (such as the user) in the shot video to be assembled with a virtual prop, such as virtual clothes, virtual armor and the like. As shown in fig. 4, fig. 4 is a presentation schematic diagram of a video provided by an embodiment of the present application, where a target object is equipped with a clothing virtual prop in a video frame image.
In some embodiments, the terminal may determine the virtual item corresponding to the target object by: presenting a video acquisition interface for acquiring video frame images of a target object, and presenting prop function items in the video acquisition interface; presenting at least one candidate virtual prop for selection in response to a triggering operation for the prop function item; and responding to the selection operation aiming at the target candidate virtual prop, and determining the target candidate virtual prop to be a virtual prop of the three-dimensional object model.
In practical application, when the terminal presents a video acquisition interface for acquiring video frame images of the target object, prop function items can be presented in the video acquisition interface; when the terminal receives a trigger operation aiming at a prop function item, at least one candidate virtual prop for selection is presented in response to the trigger operation, such as a virtual armor prop, a virtual machine armor prop, a virtual clothing prop and the like; and receiving a selection operation aiming at the target candidate virtual prop, and determining that the selected target candidate virtual prop is a virtual prop of the three-dimensional object model, namely the virtual prop corresponding to the target object.
As an example, referring to fig. 5, fig. 5 is a schematic diagram of a selection process of a virtual item provided in the embodiment of the present application. Here, the terminal presents a video capture interface for capturing video frame images of the target object, and presents a prop function item "prop" in the video capture interface, as shown in a diagram a in fig. 5; in response to a triggering operation for a prop function item 'prop', presenting at least one candidate virtual prop for selection, including virtual clothes, virtual armor and the like, as shown in a diagram B in FIG. 5; in response to the selection operation for the target candidate virtual item "virtual armor", it is determined that the target candidate virtual item "virtual armor" is a virtual item of the three-dimensional object model, that is, when the target object takes a video, the target object in the video is made to be equipped with the virtual item "virtual armor", as shown in fig. 5C.
Here, in order to assemble the virtual prop to the target object in the video, three-dimensional reconstruction of the target object is required to determine a virtual prop model based on the reconstructed three-dimensional object model, so that the virtual prop model is rendered to present the target object assembled with the virtual prop in the video frame image. In some embodiments, the terminal may perform three-dimensional reconstruction on the target object in the following manner to obtain a three-dimensional object model corresponding to the target object: performing image recognition processing on the video frame image to obtain two-dimensional image coordinates of object key points of a target object in the video frame image; determining three-dimensional model coordinates of key points of the object based on the two-dimensional image coordinates; and performing three-dimensional reconstruction on the target object based on the three-dimensional model coordinates of the key points of the object to obtain a three-dimensional object model corresponding to the target object.
In practical application, for a captured video, the video is divided into a plurality of video frame images so that each video frame image is processed and the virtual prop is assembled on the target object in the video; alternatively, each video frame image acquired in real time is processed so that the virtual prop is assembled on the target object in the video in real time.
The following processing is performed for each video frame image including the target object: and performing three-dimensional reconstruction on the target object based on the video frame image comprising the target object to obtain a three-dimensional object model corresponding to the target object. Specifically, the terminal may perform three-dimensional reconstruction on the target object in the following manner to obtain a three-dimensional object model corresponding to the target object: firstly, performing image recognition processing on a video frame image to obtain two-dimensional image coordinates of object key points of a target object in the video frame image, and during actual implementation, performing image recognition processing on the video frame image through a pre-trained neural network model (such as a convolutional neural network model, a deep neural network model and the like), and recognizing an area where the target object is located in the video frame image to determine the two-dimensional image coordinates of the object key points of the target object, namely 2D key point coordinate information.
Then, based on the two-dimensional image coordinates of the object key points of the target object, the three-dimensional model coordinates of the object key points are determined. In practical implementation, the three-dimensional model coordinates of the object key points can be predicted through a pre-trained neural network model (such as a convolutional neural network model or a deep neural network model); here, the object key points may be skeletal key points of the target object.
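The following minimal Python sketch outlines this two-stage key point pipeline. The callables detector_2d and lifter_3d stand in for the pre-trained neural network models mentioned above; their architectures and interfaces are assumptions made only for illustration.

import numpy as np

def reconstruct_keypoints(frame, detector_2d, lifter_3d):
    # detector_2d: image -> 2D object key points; lifter_3d: 2D key points -> 3D model coordinates.
    # Both are placeholders for whatever pre-trained models the pipeline actually uses.
    keypoints_2d = np.asarray(detector_2d(frame))        # (K, 2) image coordinates
    keypoints_3d = np.asarray(lifter_3d(keypoints_2d))   # (K, 3) three-dimensional model coordinates
    return keypoints_2d, keypoints_3d
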
After the three-dimensional coordinates of the object key points of the target object are obtained, three-dimensional reconstruction is carried out on the target object based on the three-dimensional model coordinates of the object key points to obtain a three-dimensional object model corresponding to the target object. In practical applications, after obtaining 2D/3D object key point information, if a three-dimensional object model of the target object is to be restored, two further important pieces of information need to be predicted: the body shape of the target object and the 3D rotation of the joints. The body shape simply refers to a person's height, weight and build; although the object key points can provide part of this figure information, it is difficult to accurately recover the figure, particularly weight and build, by relying on the key points alone. Similarly, although the 3D key points contain partial joint angle information, the joints have more degrees of freedom, especially rotation, than the key points alone can capture.
From the perspective of technical solutions, 3D mesh reconstruction of the human body can be divided into Deep Learning based and Fitting based approaches. The Fitting-based methods start from an existing human body model, such as SMPL or MANO, and obtain the optimal model parameters for each video image frame by minimizing the error between the points projected onto the image by the human body model (namely the object key points in the image) and the known object key points. Most Deep Learning based methods also rely on a human body model and estimate the parameters of the human body model through a neural network, such as VIBE; such algorithms are generally called model-based methods. In addition, since the 3D rotation of the joints is difficult to estimate from images, model-free methods, in contrast to model-based schemes, directly regress the coordinates of dense 3D points on the human mesh, apply constraints by adding prior information such as body shape and position to the loss, and mainly rely on a graph neural network to model the topological structure between the different joint points of the human body.
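For illustration only, the following Python sketch shows the Fitting idea described above: the parameters of a SMPL-like body model are optimized per frame so that its projected joints match the detected 2D key points. Here body_model, camera.project and the parameter layout are hypothetical stand-ins, not part of this embodiment.

import numpy as np
from scipy.optimize import minimize

def fit_body_model(keypoints_2d, body_model, camera, init_params):
    # body_model(params) -> (K, 3) 3D joints of a SMPL-like parametric model;
    # camera.project(points_3d) -> (K, 2) image coordinates.
    def reprojection_error(params):
        joints_3d = body_model(params)
        projected = camera.project(joints_3d)
        return np.sum((projected - keypoints_2d) ** 2)

    result = minimize(reprojection_error, init_params, method="L-BFGS-B")
    return result.x  # per-frame optimal shape/pose parameters
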
Step 102: and acquiring a virtual prop model corresponding to the virtual prop of the three-dimensional object model and a plurality of sub prop models included in the virtual prop model.
Here, after the terminal performs three-dimensional reconstruction on the target object and obtains a three-dimensional object model of the target object, a virtual item model corresponding to a virtual item of the three-dimensional object model and a plurality of sub item models included in the virtual item model are obtained. Here, the virtual prop of the three-dimensional object model is a target candidate virtual prop selected by the user when shooting a video, such as a virtual armor prop; the virtual prop model corresponding to the virtual prop is a pre-constructed and default initial virtual prop model, and the virtual prop model is composed of at least two sub prop models, such as a virtual prop model 'armor model', and is composed of a plurality of sub prop models 'armor component models', such as a foot component model of armor, a head component model of armor, a hand component model of armor, and the like.
Step 103: and determining the tracking point of each sub prop model and determining the model coordinates of each sub prop model.
And the tracking point is used for indicating the connecting position between the corresponding prop model and the three-dimensional object model.
In practical applications, when a video is shot, the three-dimensional object model corresponding to the target object in a video frame image is mostly inconsistent with the form of the pre-constructed, default initial virtual prop model. Therefore, operations such as translation, rotation and scaling need to be performed on the initial virtual prop model so that the pose or posture of the virtual prop obtained by rendering the virtual prop model is adapted to the pose or posture of the target object. To this end, in the embodiment of the application, the virtual prop model is divided into a plurality of sub-prop models, and a tracking point is provided for each sub-prop model to indicate the connection position between the corresponding sub-prop model and the three-dimensional object model, so that the position of the sub-prop model relative to the target object can be determined according to the tracking point.
After a virtual prop model corresponding to a virtual prop of the three-dimensional object model and a plurality of sub prop models included in the virtual prop model are obtained, a tracking point of each sub prop model is determined, wherein the tracking point is preset during design of the virtual prop model and is used for indicating a connection position between each sub prop model and the three-dimensional object model (such as a three-dimensional human body model).
Therefore, in practical application, based on the tracking point, the sub prop models corresponding to each part of the three-dimensional object model of the target object can be determined, that is, based on the tracking point, the position of each sub prop model relative to the three-dimensional object model can be determined to determine the virtual prop corresponding to the target object, so that the posture or the attitude of the virtual prop is adapted to the posture or the attitude of the target object.
After the tracking points of the respective sub prop models are determined, model coordinates of the sub prop models are further determined. In some embodiments, the terminal may determine the model coordinates of each of the sub-prop models by: performing the following processing for each of the sub prop models to determine model coordinates of each of the sub prop models: and establishing a model coordinate system where the sub prop model is located by taking the tracking point of the sub prop model as the origin of the model coordinate system, and determining the model coordinates of the sub prop model based on the model coordinate system.
In practical application, in order to ensure that the positions of the respective sub prop models are strictly aligned to form an overall virtual prop model, when determining the model coordinates of the respective sub prop models, the following processing may be performed for each sub prop model: and establishing a model coordinate system where the sub prop model is located by taking the tracking point of the sub prop model as the origin of the model coordinate system, and determining the model coordinates of the sub prop model based on the established model coordinate system.
Here, the tracking point is a tracking point for providing position change of the sub prop model, and the tracking point of each sub prop model is used as an origin of the model coordinate system to construct the model coordinate system where the corresponding sub prop model is located, so that the positions of the sub prop models can be strictly aligned to form the whole virtual prop model.
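A minimal sketch of this step, assuming the sub-prop model's vertices and its tracking point are given in a common authoring frame, is as follows; the helper name is illustrative only.

import numpy as np

def to_tracking_point_frame(vertices, tracking_point):
    # vertices: (N, 3) positions of the sub-prop model as authored;
    # tracking_point: (3,) position of its tracking point in the same frame.
    # Subtracting the tracking point makes it the origin of the model coordinate system.
    return np.asarray(vertices, dtype=float) - np.asarray(tracking_point, dtype=float)
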
Step 104: and obtaining model parameters of the three-dimensional object model, and determining the world coordinates of each sub prop model based on the model parameters of the three-dimensional object model, the tracking points of each sub prop model and the model coordinates of each sub prop model.
Here, in order to render the virtual prop corresponding to the target object, the world coordinates of each sub-prop model need to be acquired. In the embodiment of the application, determining the world coordinates of each sub-prop model requires obtaining the model parameters of the three-dimensional object model. In practical application, the model parameters may be scaling parameters of the three-dimensional object model relative to a standard three-dimensional object model. Specifically, during video shooting, the distance between the human body and the camera affects the proportion of the human body on the screen, and further affects the overall size of the corresponding virtual prop; therefore, the scaling parameters of the virtual prop under different human body proportions need to be estimated. In practical implementation, the scaling parameters of the sub-prop models under different human body proportions may be represented by the scaling parameters of the three-dimensional object model relative to the standard three-dimensional object model.
In some embodiments, the terminal may obtain the model parameters of the three-dimensional object model by: when the model parameters are scaling parameters of the three-dimensional object model relative to the standard three-dimensional object model, determining a first model vertex and a second model vertex on the standard three-dimensional object model, and determining a first distance between the first model vertex and the second model vertex; determining a third model vertex corresponding to the first model vertex and a fourth model vertex corresponding to the second model vertex on the three-dimensional object model; determining a second distance between the third model vertex and the fourth model vertex; based on the first distance and the second distance, a scaling parameter of the three-dimensional object model is determined.
Here, in practical applications, two model vertices, labeled X and Y, may be selected on a standard three-dimensional object model of a default size, and the Euclidean distance a from X to Y is calculated. When the model parameter is the scaling parameter of the three-dimensional object model relative to the standard three-dimensional object model, the real-time distance d between the point A corresponding to X and the point B corresponding to Y on the three-dimensional object model corresponding to each video frame image is dynamically calculated, and the scaling parameter is calculated as d/a.
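For example, assuming the standard and live three-dimensional object models share the same vertex indexing, the scaling parameter d/a described above could be computed with a sketch such as the following (the vertex indices idx_x and idx_y are hypothetical).

import numpy as np

def scaling_parameter(standard_model, live_model, idx_x, idx_y):
    # standard_model / live_model: (N, 3) vertex arrays with matching indexing.
    std = np.asarray(standard_model, dtype=float)
    live = np.asarray(live_model, dtype=float)
    a = np.linalg.norm(std[idx_x] - std[idx_y])   # fixed reference distance on the standard model
    d = np.linalg.norm(live[idx_x] - live[idx_y]) # per-frame distance on the reconstructed model
    return d / a
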
In some embodiments, the terminal may determine the world coordinates of each of the sub-prop models based on the model parameters of the three-dimensional object model, the tracking points of each of the sub-prop models, and the model coordinates of each of the sub-prop models by: determining the rotation parameters of each sub-prop model relative to the initial sub-prop model based on the tracking points of each sub-prop model; acquiring world coordinates of the tracking points of each sub-prop model; and determining the world coordinates of each sub-prop model based on the model coordinates of each sub-prop model, by combining the model parameters of the three-dimensional object model, the world coordinates of the tracking points of each sub-prop model and the rotation parameters of each sub-prop model.
In some embodiments, the terminal may obtain the world coordinates of the tracking point of each sub-prop model by: obtaining model coordinates of the model vertices corresponding to the tracking points in the three-dimensional object model, and taking the model coordinates of the model vertices as the model coordinates of the corresponding tracking points; and obtaining model coordinate transformation parameters, and determining the world coordinates of the tracking points of each sub-prop model based on the model coordinates of each tracking point and the model coordinate transformation parameters.
Here, in practical applications, since the tracking points are used to indicate the connection positions between the corresponding sub-prop models and the three-dimensional object model, the model coordinates of the model vertices corresponding to the connection positions between the sub-prop models and the three-dimensional object model (i.e., the model coordinates of the model vertices corresponding to the tracking points in the three-dimensional object model) may be used as the model coordinates of the corresponding tracking points; the model coordinates of the tracking points are then converted to world coordinates of the tracking points. Specifically, a model coordinate transformation parameter, such as a model coordinate transformation matrix, may be obtained, where the model coordinate transformation parameter may be preset, and then the model coordinates of each tracking point are multiplied by the model coordinate transformation parameter to obtain the world coordinates of the tracking points of each sub-prop model.
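A minimal sketch of this conversion, assuming the model coordinate transformation parameter is a 4x4 model-to-world matrix in column-vector convention, might look as follows.

import numpy as np

def tracking_point_world(vertex_model_coords, model_matrix):
    # model_matrix: the 4x4 model coordinate transformation parameter (model-to-world).
    v = np.append(np.asarray(vertex_model_coords, dtype=float), 1.0)
    return (np.asarray(model_matrix, dtype=float) @ v)[:3]
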
In some embodiments, the terminal may determine the rotation parameters of each of the sub-prop models relative to the initial sub-prop model based on the tracking point of each of the sub-prop models by: performing the following processing for each sub-prop model to determine the rotation parameters of each sub-prop model relative to the initial sub-prop model: acquiring a normal direction vector of the tracking point of the sub-prop model and a standard direction vector of the initial sub-prop model, and determining a rotation vector corresponding to the sub-prop model and a projection vector of the standard direction vector on a normal plane of the normal direction vector based on the normal direction vector and the standard direction vector; and determining the rotation parameters of the sub-prop model relative to the initial sub-prop model based on the rotation vector and the projection vector.
Here, the following processing is performed for each of the sub-prop models to determine a rotation parameter of each of the sub-prop models with respect to the initial sub-prop model:
firstly, a normal direction vector of a tracking point of a sub prop model and a standard direction vector of an initial sub prop model are obtained. Here, the standard direction vector of the initial sub-prop model is the direction vector of the red axis X axis of the model coordinate system. In some embodiments, the terminal may obtain the normal direction vector of the tracking point of the prop model by: determining model vertexes of tracking points of corresponding sub-prop models in the three-dimensional object model, and determining a plurality of model triangles in which the model vertexes are located; determining the surface normal direction vector of each model triangle, averaging the determined surface normal direction vectors, and taking the average result as the average normal direction vector of the model vertex; and determining the average normal direction vector of the model vertex as the normal direction vector of the tracking point of the prop model.
In practical application, model vertexes corresponding to tracking points of the sub-prop model in the three-dimensional object model correspond to a plurality of model triangles, and each model triangle is composed of a model vertex including the tracking point of the corresponding sub-prop model in the three-dimensional object model and other two model vertexes around the model vertex. When determining the normal direction vectors of the tracking points of the sub prop models, firstly determining the surface normal direction vectors of each model triangle, then averaging the determined surface normal direction vectors, and taking the averaged result as the average normal direction vector of the model vertex; and determining the average normal direction vector of the model vertex as the normal direction vector of the tracking point of the prop model.
For example, suppose the model vertex in the three-dimensional object model corresponding to the tracking point is A, the three vertices of a model triangle are A, B and C, the coordinates of the vector AB are (ax, ay, az), the coordinates of the vector AC are (bx, by, bz), and the face normal vector is denoted as R. The face normal direction vector of the model triangle is then calculated as follows:
R = AB × AC = (ay*bz - az*by, az*bx - ax*bz, ax*by - ay*bx);
After the face normal direction vector of each model triangle containing the model vertex A corresponding to the tracking point is obtained, all the face normal direction vectors are summed, averaged and normalized to obtain the average normal direction vector of the model vertex A, and this average normal direction vector is determined as the normal direction vector of the corresponding tracking point.
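The averaging of face normals described above can be sketched as follows; the triangle winding order is assumed to be consistent so that the face normals all point to the same side of the mesh.

import numpy as np

def tracking_point_normal(adjacent_triangles):
    # adjacent_triangles: list of (A, B, C) vertex positions, each triangle
    # containing the model vertex that corresponds to the tracking point.
    normals = []
    for a, b, c in adjacent_triangles:
        n = np.cross(np.asarray(b, dtype=float) - np.asarray(a, dtype=float),
                     np.asarray(c, dtype=float) - np.asarray(a, dtype=float))
        normals.append(n / np.linalg.norm(n))
    avg = np.mean(normals, axis=0)
    return avg / np.linalg.norm(avg)  # averaged and normalized, as described above
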
And secondly, determining a rotation vector corresponding to the sub prop model and a projection vector of the standard direction vector on a normal plane of the normal direction vector based on the normal direction vector and the standard direction vector. In some embodiments, the terminal may determine the rotation vector corresponding to the sub-prop model based on the normal direction vector and the standard direction vector by: synthesizing the normal direction vector and the standard direction vector to obtain a first synthesized direction vector; determining a first rotation quaternion corresponding to the normal direction vector and a second rotation quaternion corresponding to the first synthetic direction vector; and determining a rotation vector corresponding to the sub prop model based on the first rotation quaternion and the second rotation quaternion.
Here, in practical applications, the rotation vector may be a rotation quaternion. Denote the rotation quaternion as QuatR, and let the synthesized vector of the red-axis direction vector A (i.e., the standard direction vector) and the normal direction vector B be M = (A.normal + B.normal).normal. The first rotation quaternion corresponding to the normal direction vector is Quat(0.0, -bx, -by, -bz), and the second rotation quaternion corresponding to the first synthesized direction vector is Quat(0.0, M.x, M.y, M.z). The rotation quaternion turning the direction of vector A to the direction of vector B (i.e., the rotation vector corresponding to the sub-prop model) can then be calculated by the following formula:
QuatR=Quat(0.0,-bx,-by,-bz)*Quat(0.0,M.x,M.y,M.z);
here, Quat is a quaternion data structure comprising four components x, y, z, and w; A.normal denotes the normalized red-axis direction vector A, and B.normal denotes the normalized normal direction vector B.
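A minimal sketch of the quaternion construction above, under the assumption that quaternions are stored in (w, x, y, z) order with the Hamilton product convention (names and conventions are assumptions of this sketch):

import numpy as np

def quat_mul(q1, q2):
    # Hamilton product of quaternions given as (w, x, y, z).
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def rotation_quat_a_to_b(A, B):
    # QuatR turning direction A (standard/red-axis vector) to direction B (normal vector).
    A = np.asarray(A, dtype=float)
    A = A / np.linalg.norm(A)                                  # A.normal
    B = np.asarray(B, dtype=float)
    B = B / np.linalg.norm(B)                                  # B.normal
    M = A + B
    M = M / np.linalg.norm(M)                                  # M = (A.normal + B.normal).normal
    quat_b = np.array([0.0, -B[0], -B[1], -B[2]])              # Quat(0.0, -bx, -by, -bz)
    quat_m = np.array([0.0,  M[0],  M[1],  M[2]])              # Quat(0.0, M.x, M.y, M.z)
    return quat_mul(quat_b, quat_m)                            # QuatR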
In practical applications, the terminal may determine the projection vector of the standard direction vector on the normal plane of the normal direction vector, based on the normal direction vector and the standard direction vector, as follows. Exemplarily, assume that the standard direction vector (i.e., the red-axis direction vector) is A = (ax, ay, az) and the normal direction vector of the tracking point is B = (bx, by, bz); denote the dot product of vector A and vector B as c = A·B = ax*bx + ay*by + az*bz, and denote the squared length of the normal vector as Bmag. Then the projection vector D of the standard direction vector A on the normal plane of the normal direction vector B is:
D = (ax - bx*c/Bmag, ay - by*c/Bmag, az - bz*c/Bmag).
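Expressed as a small Python helper (again an illustrative sketch under the same assumptions as the earlier sketches):

import numpy as np

def project_onto_normal_plane(A, B):
    # Projection D of the standard direction vector A onto the plane whose normal is B.
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    c = float(np.dot(A, B))            # c = ax*bx + ay*by + az*bz
    Bmag = float(np.dot(B, B))         # squared length of the normal vector
    return A - B * (c / Bmag)          # D = (ax - bx*c/Bmag, ay - by*c/Bmag, az - bz*c/Bmag)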
And thirdly, the rotation parameter of the sub-prop model relative to the initial sub-prop model is determined based on the rotation vector and the projection vector. In some embodiments, the terminal may determine the rotation parameter of the sub-prop model relative to the initial sub-prop model based on the rotation vector and the projection vector as follows: rotating the rotation vector according to the direction of the standard direction vector to obtain an intermediate rotation vector; synthesizing the intermediate rotation vector and the projection vector to obtain a second synthesized direction vector; determining a third rotation quaternion corresponding to the projection vector and a fourth rotation quaternion corresponding to the second synthesized direction vector; and determining the rotation parameter of the sub-prop model relative to the initial sub-prop model based on the third rotation quaternion, the fourth rotation quaternion, and the rotation vector.
Here, the rotation vector is first rotated according to the direction of the standard direction vector to obtain the intermediate rotation vector; specifically, the rotation vector QuatR is applied to the default red-axis direction (1.0, 0.0, 0.0) to obtain an intermediate rotation vector VecR;
then, the intermediate rotation vector VecR and the projection vector D are synthesized to obtain a second synthesized direction vector; meanwhile, a third rotation quaternion corresponding to the projection vector and a fourth rotation quaternion corresponding to the second synthesized direction vector are determined, the third rotation quaternion and the fourth rotation quaternion are multiplied, and the result of that multiplication is finally multiplied by the rotation vector QuatR to obtain the final rotation parameter QuatF of the sub-prop model relative to the initial sub-prop model.
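Under one possible reading of the steps above — namely that VecR is the default red-axis direction (1.0, 0.0, 0.0) rotated by QuatR, and that the third and fourth quaternions are built from VecR and D in the same way QuatR was built — the final rotation parameter can be sketched as follows, reusing the quat_mul, rotation_quat_a_to_b, and project_onto_normal_plane helpers sketched earlier:

import numpy as np

def rotate_vec_by_quat(q, v):
    # Rotate vector v by unit quaternion q = (w, x, y, z): q * (0, v) * conjugate(q).
    qv = np.array([0.0, v[0], v[1], v[2]])
    q_conj = np.array([q[0], -q[1], -q[2], -q[3]])
    return quat_mul(quat_mul(q, qv), q_conj)[1:]

def final_rotation(red_axis_dir, normal):
    # red_axis_dir: specified red-axis (standard) direction vector of the sub-prop model.
    # normal: normal direction vector of the tracking point.
    quat_r = rotation_quat_a_to_b(red_axis_dir, normal)              # QuatR
    vec_r = rotate_vec_by_quat(quat_r, np.array([1.0, 0.0, 0.0]))    # intermediate vector VecR (assumed reading)
    D = project_onto_normal_plane(red_axis_dir, normal)              # projection vector D
    quat_t = rotation_quat_a_to_b(vec_r, D)                          # same construction, turning VecR to D
    return quat_mul(quat_t, quat_r)                                  # QuatF = (third * fourth) * QuatR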
In some embodiments, the terminal may determine the world coordinates of each sub-prop model based on the model coordinates of each sub-prop model, in combination with the model parameters of the three-dimensional object model, the world coordinates of the tracking point of each sub-prop model, and the rotation parameters of each sub-prop model, as follows: determining world coordinate transformation parameters of each sub-prop model based on the world coordinates of the tracking point of each sub-prop model, the rotation parameters, and the model parameters; and multiplying the model coordinates of each sub-prop model by the corresponding world coordinate transformation parameters to obtain the world coordinates of each sub-prop model.
Here, after the model parameters of the three-dimensional object model, the world coordinates of the tracking point of each sub-prop model, and the rotation parameters of each sub-prop model are determined as described in the above embodiments, the world coordinate transformation parameters of each sub-prop model are determined based on the world coordinates of the tracking point, the rotation parameters, and the model parameters; specifically, a world coordinate transformation matrix is constructed for each sub-prop model from these quantities. The model coordinates of each sub-prop model are then multiplied by the corresponding world coordinate transformation parameters to obtain the world coordinates of each sub-prop model.
Here, the model parameter of the three-dimensional object model (specifically, the scaling parameter of the three-dimensional object model relative to the standard three-dimensional object model) represents the scaling of the sub-prop model corresponding to the respective body part, the rotation parameter is the rotation of the sub-prop model relative to the initial sub-prop model, and the world coordinates of the tracking point of the sub-prop model represent the position of the sub-prop model. The determined world coordinate transformation parameters of each sub-prop model therefore describe the transformation (translation of position, rotation of direction, and scaling of model size) of the virtual prop model formed by the plurality of sub-prop models (i.e., the virtual prop model corresponding to the target object) relative to the default initial virtual prop model. Based on these world coordinate transformation parameters, the world coordinates corresponding to the virtual prop model can be accurately determined; these world coordinates describe the positions, in the video frame image, of the model vertices included in the virtual prop model corresponding to the target object, so that the rendered virtual prop fits the target object more accurately, i.e., matches the posture and body type of the target object.
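As an illustrative sketch of how such a transformation could be assembled and applied (assuming column vectors, a 4x4 homogeneous matrix, and a uniform scale factor; the helper names are not from the original text):

import numpy as np

def quat_to_matrix(q):
    # 3x3 rotation matrix from a unit quaternion (w, x, y, z).
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)],
        [2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)],
        [2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)],
    ])

def world_transform(track_point_world, rotation_quat, scale):
    # 4x4 world coordinate transformation: translation (tracking-point world position),
    # rotation (rotation parameter), and scaling (model parameter) combined.
    m = np.eye(4)
    m[:3, :3] = quat_to_matrix(rotation_quat) * scale
    m[:3, 3] = np.asarray(track_point_world, dtype=float)
    return m

def model_to_world(model_coord, transform):
    # World coordinates of a sub-prop model point = transformation applied to its model coordinates.
    p = np.append(np.asarray(model_coord, dtype=float), 1.0)
    return (transform @ p)[:3]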
Step 105: and rendering the virtual prop based on the world coordinates of each sub prop model so as to present the target object assembled with the virtual prop in the video.
Here, after the world coordinates of each of the sub-prop models are determined, the virtual prop is rendered based on the world coordinates of each of the sub-prop models to obtain a rendered virtual prop, so that the target object equipped with the virtual prop is presented in the video frame image.
In some embodiments, the terminal may render the virtual prop based on the world coordinates of each sub-prop model, so as to present the target object equipped with the virtual prop, as follows: performing vertex rendering on the virtual prop model corresponding to the virtual prop based on the world coordinates of each sub-prop model to obtain an intermediate virtual prop; determining the pixel colors of the pixel points contained in the virtual prop based on the preset material attributes corresponding to the virtual prop; and performing pixel rendering on the intermediate virtual prop based on the pixel colors of the pixel points contained in the virtual prop to obtain the virtual prop, so as to present the target object equipped with the virtual prop.
After the world coordinates of each sub-prop model are determined, vertex rendering is performed on the virtual prop model corresponding to the virtual prop based on the world coordinates of each sub-prop model to obtain an intermediate virtual prop; the pixel colors of the pixel points contained in the virtual prop are then determined based on the preset material attributes corresponding to the virtual prop; and pixel rendering is performed on the intermediate virtual prop based on these pixel colors to obtain the virtual prop, so as to present the target object equipped with the virtual prop.
In practical application, a PBR-based rendering pipeline extends the special-effect rendering engine with a custom material capability, i.e., materials can be configured as needed. Based on the custom material attributes, a PBR (physically based rendering) method, combined with a physically correct illumination calculation, generates the real pixel color of each pixel point of the rendered object according to the material attributes and the surface characteristics of real-world objects; pixel rendering is then performed on the intermediate virtual prop based on the pixel colors of the pixel points contained in the virtual prop, to obtain the virtual prop.
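As a greatly simplified stand-in for the per-pixel stage described above (a Lambert-plus-ambient term rather than the full PBR illumination model; every name here is an illustrative assumption):

import numpy as np

def shade_pixel(base_color, normal, light_dir, light_color, ambient=0.1):
    # Simplified per-pixel shading: ambient plus diffuse contribution from one light.
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    l = np.asarray(light_dir, dtype=float)
    l = l / np.linalg.norm(l)
    diffuse = max(float(np.dot(n, l)), 0.0)
    color = np.asarray(base_color, dtype=float) * (ambient + diffuse * np.asarray(light_color, dtype=float))
    return np.clip(color, 0.0, 1.0)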
By applying the embodiments of the present application, after a video frame image including a target object is obtained, three-dimensional reconstruction is performed on the target object based on the video frame image to obtain a corresponding three-dimensional object model. A plurality of sub-prop models corresponding to the virtual prop of the three-dimensional object model are then obtained, and the tracking points and model coordinates of the sub-prop models are determined. The model parameters of the three-dimensional object model are further obtained, and the world coordinates of the sub-prop models are determined based on the model parameters of the three-dimensional object model and the tracking points and model coordinates of the sub-prop models, so that the virtual prop is rendered based on the world coordinates of the sub-prop models to present the target object equipped with the virtual prop.
Here, the tracking point indicates the connection position between the corresponding sub-prop model and the three-dimensional object model. Therefore, for each video frame image, the position of each sub-prop model relative to the three-dimensional object model can be accurately determined based on the tracking point, i.e., the world coordinates of each sub-prop model are accurately determined. The model parameters of the three-dimensional object model are further obtained, so that the determined world coordinates of each sub-prop model fit the three-dimensional object model more closely. Rendering the virtual prop based on the world coordinates determined in this way allows the rendered virtual prop to be attached to the target object more accurately, improving the rendering effect of equipping the target object with the virtual prop in the video.
An exemplary application of the embodiments of the present application in a practical application scenario will be described below.
In the embodiment of the application, regarding the rendering of mobile-end virtual items (such as virtual clothing), related work includes, firstly, a technical scheme for rendering virtual clothing based on the Unity engine; secondly, a technical scheme of a mesh-adsorption function, which mainly provides the ability to bind the relative positional relationship of one three-dimensional object to another and has no direct relationship with prop assembly in this field; and thirdly, an ACGPN virtual try-on technology based on deep learning, which generates planar clothes with different textures through a deep neural network and aligns the clothes through a thin-plate spline interpolation algorithm. However, the applicant has found in practice that:
firstly, the Unity engine only supports prop effect rendering for offline, fixed virtual object models and offers no solution for a three-dimensional object model that is dynamically reconstructed online, so it cannot meet the demand for content that varies from person to person in video shooting scenarios; it also lacks the capability to compute the relative position and rotation between a virtual prop and each part of an object (such as a human body); in addition, the engine is large and monolithic, so deep customized development cannot be carried out on existing mainstream special-effect engines. Secondly, the technical scheme of the mesh-adsorption function only provides relative binding of positions and cannot guarantee the accuracy of the rotation direction of the virtual prop, so the fit of the virtual prop to the object has notable flaws. Finally, the deep-learning-based ACGPN virtual try-on technology has a large performance overhead on mobile devices, covers only a small proportion of devices, limits the modeling and material effects of virtual props, gives designers little controllable flexibility, and cannot meet the special-effect requirements of mobile shooting scenarios.
Based on this, the embodiment of the present application provides a rendering method for a virtual item that can accurately attach each virtual component of the virtual item to a dynamically created three-dimensional object model, while ensuring that the position and rotation direction of the virtual item remain correct as the object moves. The performance requirement of the whole scheme is relatively low, so it can cover common mobile phones. In terms of the universality of virtual item resources, virtual item digital assets in formats such as obj/glb/fbx are supported, and functions for quickly replacing virtual item resources are provided.
The scheme provided by the application can be widely applied to mobile-end video shooting scenarios. As shown in fig. 4 and fig. 5, in a mobile-end short-video shooting scenario, a set of automatic outfit-change special effects is provided based on a three-dimensional object reconstruction algorithm, so that people in videos can wear virtual armor.
Next, the rendering method of a virtual item provided in the embodiment of the present application is described in detail. As shown in fig. 6, fig. 6 is a schematic flow chart of the rendering method of a virtual item provided in the embodiment of the present application, which includes the following three flow units: (1) a three-dimensional object model reconstruction unit; (2) a PinToMesh-based virtual prop splicing unit; and (3) a PBR-based virtual item rendering unit.
Next, (1) the three-dimensional object model reconstruction unit is explained first. The three-dimensional object model reconstruction unit detects the 2D position of an object in a video image frame through an image processing algorithm, predicts the coordinate value of each surface point of the three-dimensional object model in the three-dimensional world coordinate system based on a pre-trained neural network model, and assembles vertices and indices into rendering data that can be submitted to the GPU. In addition, in order to correctly calculate the orientation of the virtual item (such as a virtual clothing item) relative to the three-dimensional object model in the next step, the normal information of the three-dimensional object model needs to be calculated in real time. The method comprises the following steps:
step 201: and detecting the point location of the three-dimensional object model.
Here, the captured video is first split into individual video frame images; image recognition processing is then performed on each video frame image (for example, through a pre-trained neural network model) to determine the region where the target object is located and obtain the two-dimensional coordinates, in the video frame image, of each 2D key point of the target object; the depth of each 2D key point is then predicted by combining the temporal information of the preceding and following image frames, yielding the three-dimensional coordinates of the skeletal key points of the target object. Finally, by combining the 2D/3D key point coordinates with the posture and body type of the target object, the 3D mesh of the target object in the model coordinate system is obtained, i.e., the point location information of each model vertex of the three-dimensional object model, such as the vertex position coordinates and texture coordinates.
In practical applications, the applicant reconstructs the object's 3D mesh in the model coordinate system using about 7,000 vertices and 15,000 patches, and the 3D mesh reconstructed from the images needs to be filtered to remove jitter, so as to obtain the 3D mesh of the target object in the video frame image.
Step 202: and constructing the vertex coordinates and indexes of the three-dimensional object model.
Here, according to the point location information of each model vertex of the three-dimensional object model calculated in the previous step, the vertex position coordinates, texture coordinates, and triangle index information (the labels of the vertices forming each triangle) are assembled into a data structure recognizable by the rendering engine. In practical applications, this data structure may consist of each vertex's position coordinates xyz and texture coordinates uv together with the triangle index information; it is submitted to the rendering engine in a prescribed order before each video frame image is rendered, and the three-dimensional object model is rendered by the rendering engine.
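Purely as an illustration of the kind of layout described above (the field names and container are assumptions and do not reflect the engine's actual format):

import numpy as np

# Illustrative per-frame mesh data: vertex position coordinates (xyz), texture
# coordinates (uv), and triangle index information referring to vertex labels.
mesh_data = {
    "positions": np.zeros((7000, 3), dtype=np.float32),   # vertex position coordinates xyz
    "uvs":       np.zeros((7000, 2), dtype=np.float32),   # vertex texture coordinates uv
    "indices":   np.zeros((15000, 3), dtype=np.uint32),   # triangle index information
}
# Data organized like this would be submitted to the rendering engine in its
# expected order before each video frame image is rendered.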
Step 203: and (4) reconstructing a normal of the three-dimensional object model.
Since the normal information of the surface of the three-dimensional object model is needed to calculate the correct orientation of the virtual prop, a normal direction vector corresponding to the tracking point of each sub-prop model needs to be calculated on the reconstructed three-dimensional object model. Here, a tracking point is the fitting or connection position between a sub-prop model included in the preset virtual prop model (for example, each clothing component included in the virtual clothing prop) and the three-dimensional object model. In practical applications, each sub-prop model may correspond to two tracking points: a tracking point providing the change in position and a tracking point providing the change in orientation.
The normal information of the surface of the three-dimensional object model is calculated as follows: first, the surface normal of each triangle containing the tracking point is calculated; then all of these normals are summed, averaged, and normalized to obtain the average normal at the point.
For example, if the model vertex in the three-dimensional object model corresponding to the tracking point is A, the three vertices of a model triangle are A, B, and C, the coordinates of the edge vector a = AB are (ax, ay, az), the coordinates of the edge vector b = AC are (bx, by, bz), and the normal vector is denoted as R, then the surface normal direction vector of the model triangle is calculated as follows:
R=a x b=(ay*bz-az*by,az*bx-ax*bz,ax*by-ay*bx);
after the surface normal direction vector of every model triangle containing the model vertex A (the vertex in the three-dimensional object model corresponding to the tracking point) is obtained, all of these normal direction vectors are summed, averaged, and normalized to obtain the average normal direction vector of the model vertex A, and this average normal direction vector is determined as the normal direction vector of the corresponding tracking point.
Next, (2) the PinToMesh-based virtual item splicing unit is described. The PinToMesh-based virtual item splicing unit mainly solves for the coordinate position and the rotation angle of the virtual prop relative to the three-dimensional object model.
Step 204: and calculating the world coordinates of the tracking points of the various sub prop models included in the virtual prop model of each video frame image in a three-dimensional world coordinate system.
Here, according to the point location result of the three-dimensional object model reconstruction in the previous step, the model coordinates, in the model coordinate system, of the model vertex at the assigned index number corresponding to the tracking point of each sub-prop model can be obtained, so the model coordinates of the model vertex corresponding to each tracking point in the three-dimensional object model are used as the model coordinates of the corresponding tracking point. A preset model coordinate transformation matrix is then acquired, and the model coordinates of the tracking point are converted into world coordinates through the model coordinate transformation matrix, where the world coordinates of the tracking point are equal to the model coordinates of the tracking point multiplied by the model coordinate transformation matrix.
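A one-line sketch of that conversion (the 4x4 matrix and column-vector convention are assumptions of this illustration):

import numpy as np

def tracking_point_world(model_coord, model_coord_transform):
    # World coordinates of the tracking point = model coordinates of the tracking
    # point multiplied by the model coordinate transformation matrix.
    p = np.append(np.asarray(model_coord, dtype=float), 1.0)
    return (model_coord_transform @ p)[:3]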
After the world coordinates of the tracking point in the world coordinate system are obtained, the position given by the world coordinates of the tracking point is set as the anchor point of the model, the anchor point being the tracking point that provides the change in position. As shown in fig. 7, fig. 7 is a schematic view of the anchor point of a sub-prop model provided in the embodiment of the present application: the highlighted small square in the left figure is the anchor point of the sub-prop model, the sub-prop model being the hand component of the virtual clothing model, and the right figure is a schematic view of the hand sub-prop model. In practical application, the anchor point of each sub-prop model is located at the origin (0, 0, 0) of its model coordinate system, so as to ensure that the positions of the sub-prop models are strictly aligned to form an integral virtual prop model.
Step 205: and calculating the relative rotation information of the virtual prop model.
Here, the relative rotation information of the virtual prop model consists of the rotation parameters of each sub-prop model relative to the initial sub-prop model. When, for each video frame image, the rotation parameters of a sub-prop model relative to the initial sub-prop model are calculated for the three-dimensional object model in that frame, the normal direction vector of the tracking point of the sub-prop model and the red-axis (X-axis) direction vector specified for the sub-prop model (i.e., the above-mentioned standard direction vector) need to be obtained.
The normal direction vector of the tracking point of each sub-prop model has been calculated in the previous step. The specified red-axis direction vector is a direction vector parallel to the central-axis direction vector, the central axis being the axis of the sub-prop model through which the corresponding part of the target object passes. As shown in fig. 8, fig. 8 is an axial schematic view of a sub-prop model provided in the embodiment of the present application. Here, the sub-prop model is the hand component of the clothing, the red axis represents the x-axis, the green axis represents the y-axis, and the blue axis represents the z-axis; the central-axis direction vector is parallel to the red-axis (x-axis) direction vector, but points in the negative direction of the red axis.
Next, the calculation of the rotation parameter includes:
firstly, the projection vector of the red-axis direction vector of the sub-prop model onto the normal plane corresponding to the normal direction vector of the tracking point is calculated. Let the red-axis direction vector of the sub-prop model be A = (ax, ay, az) and the normal direction vector of the tracking point be B = (bx, by, bz); let the dot product of vector A and vector B be c = A·B = ax*bx + ay*by + az*bz, and let the squared length of the normal vector be Bmag. Then the projection vector D of A = (ax, ay, az) on the normal plane is:
D = (ax - bx*c/Bmag, ay - by*c/Bmag, az - bz*c/Bmag).
and secondly, calculating a rotation quaternion of the vector of the red axis direction of the sub prop model turning to the vector of the normal direction.
Here, denote the rotation quaternion as QuatR, and let the synthesized vector of the red-axis direction vector and the normal direction vector be M = (A.normal + B.normal).normal; QuatR can then be calculated by the following formula:
QuatR=Quat(0.0,-bx,-by,-bz)*Quat(0.0,M.x,M.y,M.z);
Where Quat is a quaternion data structure comprising four components x, y, z, w.
And thirdly, the final rotation parameter QuatF, obtained after excluding from QuatR the rotation within the tangent-plane direction, is calculated.
Here, an intermediate direction vector VecR is first obtained by applying QuatR to the default red-axis direction (1.0, 0.0, 0.0); the same calculation as in the second step is then performed with VecR and the projection vector D obtained in the first step, and the result is finally multiplied by QuatR to obtain the rotation parameter QuatF.
In this way, the rotation parameters that keep the virtual prop model pointing in the correct direction are obtained. In addition, in the video, the distance between the target object and the camera affects the proportion of the screen occupied by the target object and therefore the overall size of the virtual prop, so the scaling parameters of the virtual prop model under different object proportions need to be estimated. The scaling parameter of the virtual prop model is the scaling parameter of each sub-prop model, and it can be represented by the scaling parameter of the three-dimensional object model relative to the standard three-dimensional object model.
Step 206: and calculating scaling parameters of the three-dimensional object model and updating a world coordinate transformation matrix.
Here, to calculate the scaling parameter of the three-dimensional object model of a frame of video frame image relative to the standard three-dimensional object model, two tracking points, labeled X and Y, may be selected on the torso of a default-size standard three-dimensional object model, and the Euclidean distance a between X and Y (i.e., the first distance) is calculated. Then, for each frame of video frame image, the real-time distance d (i.e., the second distance) between the corresponding points A (corresponding to X) and B (corresponding to Y) on the reconstructed three-dimensional object model of that frame is dynamically calculated, and the scaling parameter is calculated as d/a.
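The scaling parameter just described reduces to a ratio of two Euclidean distances; a minimal sketch (the function and argument names are illustrative assumptions):

import numpy as np

def scaling_parameter(std_X, std_Y, recon_A, recon_B):
    # First distance a: between torso tracking points X and Y on the standard model.
    a = np.linalg.norm(np.asarray(std_X, dtype=float) - np.asarray(std_Y, dtype=float))
    # Second distance d: between the corresponding points A and B on the
    # reconstructed three-dimensional object model of the current frame.
    d = np.linalg.norm(np.asarray(recon_A, dtype=float) - np.asarray(recon_B, dtype=float))
    return d / a                        # scaling parameter = d / a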
After the world coordinates (x, y, z) of the tracking point of the sub-prop model in the video frame image, the rotation parameter QuatF, and the scaling parameter are obtained through calculation, a 4x4 world coordinate transformation matrix of the sub-prop model is constructed based on them, so that the world coordinate transformation matrix of each sub-prop model is obtained. Finally, the world coordinates of each sub-prop model are determined based on the world coordinate transformation matrix; specifically, the product of the model coordinates of the sub-prop model and its world coordinate transformation matrix equals the world coordinates of the sub-prop model.
Next, (3) a PBR-based virtual item rendering unit is explained.
Step 207: and (5) self-defining material analysis.
Step 208: and rendering the virtual prop.
PBR is a set of rendering techniques based on a physical model of real-world illumination; it uses global illumination formulas that conform more closely to real physical laws to simulate light and shadow, thereby achieving realistic rendering effects. In the present application, the PBR-based rendering pipeline extends the special-effect rendering engine with a custom material function, so that the PBR effect of the virtual prop is drawn correctly in the intended space.
Specifically, the custom material is parsed to obtain the custom material attributes; a realistic, PBR-based rendering method, combined with a physically correct illumination calculation, generates the real pixel color of each pixel point of the rendered object according to the material attributes and the surface characteristics of real-world objects; pixel rendering of the virtual prop is then performed based on the pixel colors of the pixel points included in the virtual prop.
The application also provides a custom material system based on a self-developed engine, where the custom material mainly comprises three parts: Pass, Material, and Lshader. Specifically,
and the Material is used for managing the virtual item model rendering resource configuration, and comprises model resources, map resources, an Lshader and the like. One virtual prop model corresponds to one Material file, and a plurality of virtual prop models can share the same Material to reduce DrawCall (namely, the number of times of GPU rendering is submitted). All the configurations of the virtual item required in the PBR rendering process are defined by configuration Material.
Lshader is used to manage the rendering states, the Shader (shader program), and the Passes. It mainly defines the correct order in which the virtual prop model is drawn on the GPU and the states that need to be set.
Pass is used to define the rendering content of each rendering submission, including which data needs to be submitted to the GPU during rendering and how a single rendering pass finally outputs the correct color result through the vertex shader and the fragment shader.
Therefore, a user-definable material system is built by constructing these three main functional units, and the problem that a fixed PBR pipeline renders incorrectly when the camera's intrinsic and extrinsic parameters change is solved by passing the coordinate-system conversion matrices as uniform parameters. Meanwhile, because the shading language can be customized, the worn appearance of the virtual prop can be iterated more flexibly according to designers' requirements.
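Purely to illustrate how the three parts relate to one another, a hypothetical configuration sketch follows (every key, value, and file name here is invented for illustration and does not reflect the engine's real format):

# Hypothetical sketch of a three-part custom material configuration.
material = {
    "model":    "virtual_armor_hand.glb",       # model resource (illustrative file name)
    "textures": ["albedo.png", "normal.png"],   # map (texture) resources
    "lshader": {
        "render_state": {"depth_test": True, "cull": "back"},
        "passes": [
            {   # one Pass: what a single rendering submission sends to the GPU
                "vertex_shader":   "pbr_vertex",
                "fragment_shader": "pbr_fragment",
                "uniforms": ["model_matrix", "view_matrix", "projection_matrix"],
            },
        ],
    },
}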
By applying the embodiment of the application, a full pipeline from the construction of the virtual three-dimensional object model to the final physically based on-screen rendering is established; effect and performance are balanced for mobile-end shooting scenarios, so the scheme can run on most mobile phones. Compared with similar technologies, it has clear advantages on mobile platforms in terms of the fit of the virtual prop, stability, rendering effect, and performance overhead.
Continuing with the description of the rendering apparatus 555 for virtual items provided in this embodiment, in some embodiments, the rendering apparatus for virtual items may be implemented by using a software module. Referring to fig. 9, fig. 9 is a schematic structural diagram of a rendering apparatus 555 of a virtual prop provided in the embodiment of the present application, where the rendering apparatus 555 of a virtual prop provided in the embodiment of the present application includes:
the reconstruction module 5551 is configured to perform three-dimensional reconstruction on a target object based on a video frame image including the target object in a video, so as to obtain a three-dimensional object model corresponding to the target object;
an obtaining module 5552, configured to obtain a virtual item model corresponding to a virtual item of the three-dimensional object model, and multiple sub item models included in the virtual item model;
a first determining module 5553, configured to determine a tracking point of each of the sub-prop models, and determine model coordinates of each of the sub-prop models; the tracking point is used for indicating the connection position between the corresponding sub-prop model and the three-dimensional object model;
a second determining module 5554, configured to obtain model parameters of the three-dimensional object model, and determine world coordinates of each of the sub-prop models based on the model parameters of the three-dimensional object model, the tracking points of each of the sub-prop models, and the model coordinates of each of the sub-prop models;
a rendering module 5555, configured to render the virtual prop based on the world coordinates of each of the sub-prop models, so as to present the target object equipped with the virtual prop in the video.
In some embodiments, the reconstruction module 5551 is further configured to perform image recognition processing on the video frame image to obtain two-dimensional image coordinates of an object key point of the target object in the video frame image;
determining three-dimensional model coordinates of the object key points based on the two-dimensional image coordinates;
and performing three-dimensional reconstruction on the target object based on the three-dimensional model coordinates of the key points of the object to obtain a three-dimensional object model corresponding to the target object.
In some embodiments, the first determining module 5553 is further configured to perform the following processing for each of the sub-prop models to determine model coordinates of each of the sub-prop models:
establishing a model coordinate system where the child prop model is located by taking the tracking point of the child prop model as the origin of the model coordinate system, and
and determining model coordinates of the sub prop model based on the model coordinate system.
In some embodiments, the second determining module 5554 is further configured to determine a first model vertex and a second model vertex on the standard three-dimensional object model and determine a first distance between the first model vertex and the second model vertex when the model parameter is a scaling parameter of the three-dimensional object model relative to the standard three-dimensional object model;
determining a third model vertex corresponding to the first model vertex and a fourth model vertex corresponding to the second model vertex on the three-dimensional object model;
determining a second distance between the third model vertex and the fourth model vertex;
determining a scaling parameter for the three-dimensional object model based on the first distance and the second distance.
In some embodiments, the second determining module 5554 is further configured to determine, based on the tracking point of each of the sub-prop models, a rotation parameter of each of the sub-prop models relative to the initial sub-prop model;
acquiring world coordinates of the tracking points of each of the sub-prop models;
and determining the world coordinates of each sub prop model based on the model coordinates of each sub prop model by combining the model parameters of the three-dimensional object model, the world coordinates of the tracking point of each sub prop model and the rotation parameters of each sub prop model.
In some embodiments, the second determining module 5554 is further configured to obtain model coordinates of a model vertex corresponding to each tracking point in the three-dimensional object model, and use the model coordinates of the model vertex as the model coordinates of the corresponding tracking point;
obtaining the coordinate transformation parameters of the model, and
and determining the world coordinates of the tracking points of each sub-prop model based on the model coordinates of each tracking point and the model coordinate transformation parameters.
In some embodiments, the second determining module 5554 is further configured to, for each of the prop models, determine a rotation parameter of each of the prop models relative to an initial prop model by:
acquiring a normal direction vector of a tracking point of the sub prop model and a standard direction vector of the initial sub prop model, and
determining a rotation vector corresponding to the prop model and a projection vector of the standard direction vector on a normal plane of the normal direction vector based on the normal direction vector and the standard direction vector;
determining, based on the rotation vector and the projection vector, a rotation parameter of the child prop model relative to the initial child prop model.
In some embodiments, the second determining module 5554 is further configured to determine model vertices in the three-dimensional object model corresponding to the tracking points of the prop sub model, and determine a plurality of model triangles where the model vertices are located;
determining the surface normal direction vector of each model triangle, averaging the determined surface normal direction vectors, and taking the average result as the average normal direction vector of the model vertex;
and determining the average normal direction vector of the model vertex as the normal direction vector of the tracking point of the prop model.
In some embodiments, the second determining module 5554 is further configured to synthesize the normal direction vector and a standard direction vector to obtain a first synthesized direction vector;
determining a first rotation quaternion corresponding to the normal direction vector and a second rotation quaternion corresponding to the first synthetic direction vector;
and determining a rotation vector corresponding to the sub prop model based on the first rotation quaternion and the second rotation quaternion.
In some embodiments, the second determining module 5554 is further configured to rotate the rotation vector in a direction of a standard direction vector, so as to obtain an intermediate rotation vector;
synthesizing the intermediate rotation vector and the projection vector to obtain a second synthetic direction vector;
determining a third rotation quaternion corresponding to the projection vector and a fourth rotation quaternion corresponding to the second synthetic direction vector;
determining rotation parameters of the sub-prop model relative to the initial sub-prop model based on the third rotation quaternion, the fourth rotation quaternion and the rotation vector.
In some embodiments, the second determining module 5554 is further configured to determine world coordinate transformation parameters of each of the sub-prop models based on the world coordinates of the tracking point of each of the sub-prop models, the rotation parameters, and the model parameters;
and multiplying the model coordinates of each of the sub-prop models by the corresponding world coordinate transformation parameters to obtain the world coordinates of each of the sub-prop models.
In some embodiments, the rendering module 5555 is further configured to perform vertex rendering on a virtual item model corresponding to the virtual item based on the world coordinates of each of the sub item models, so as to obtain an intermediate virtual item;
determining the pixel color of a pixel point contained in the virtual prop based on the preset material attribute corresponding to the virtual prop;
and performing pixel rendering on the intermediate virtual item based on the pixel color of the pixel point contained in the virtual item to obtain the virtual item, so as to present the target object assembled with the virtual item in the video.
In some embodiments, the apparatus further comprises:
a presentation module for presenting a video capture interface for video frame image capture of the target object, and
presenting prop function items in the video acquisition interface;
presenting at least one candidate virtual prop for selection in response to a triggering operation for the prop function item;
and responding to the selection operation aiming at the target candidate virtual prop, and determining the target candidate virtual prop to be the virtual prop of the three-dimensional object model.
By applying the embodiments of the present application, after a video frame image including a target object is obtained, three-dimensional reconstruction is performed on the target object based on the video frame image to obtain a corresponding three-dimensional object model. A plurality of sub-prop models corresponding to the virtual prop of the three-dimensional object model are then obtained, and the tracking points and model coordinates of the sub-prop models are determined. The model parameters of the three-dimensional object model are further obtained, and the world coordinates of the sub-prop models are determined based on the model parameters of the three-dimensional object model and the tracking points and model coordinates of the sub-prop models, so that the virtual prop is rendered based on the world coordinates of the sub-prop models to present the target object equipped with the virtual prop.
Here, the tracking point indicates the connection position between the corresponding sub-prop model and the three-dimensional object model. Therefore, for each video frame image, the position of each sub-prop model relative to the three-dimensional object model can be accurately determined based on the tracking point, i.e., the world coordinates of each sub-prop model are accurately determined. The model parameters of the three-dimensional object model are further obtained, so that the determined world coordinates of each sub-prop model fit the three-dimensional object model more closely. Rendering the virtual prop based on the world coordinates determined in this way allows the rendered virtual prop to be attached to the target object more accurately, improving the rendering effect of equipping the target object with the virtual prop in the video.
An embodiment of the present application further provides an electronic device, where the electronic device includes:
a memory for storing executable instructions;
and the processor is used for realizing the rendering method of the virtual prop provided by the embodiment of the application when the executable instructions stored in the memory are executed.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and executes the computer instructions, so that the computer device executes the rendering method of the virtual item provided in the embodiment of the application.
The embodiment of the present application further provides a computer-readable storage medium, which stores executable instructions, and when the executable instructions are executed by a processor, the method for rendering the virtual prop provided in the embodiment of the present application is implemented.
In some embodiments, the computer-readable storage medium may be memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash, magnetic surface memory, optical disk, or CD-ROM; or may be various devices including one or any combination of the above memories.
In some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may, but need not, correspond to files in a file system, and may be stored in a portion of a file that holds other programs or data, such as in one or more scripts in a hypertext Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.
The above description is only an example of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims (15)

1. A method for rendering a virtual item, the method comprising:
performing three-dimensional reconstruction on a target object based on a video frame image including the target object in a video to obtain a three-dimensional object model corresponding to the target object;
acquiring a virtual prop model corresponding to a virtual prop of the three-dimensional object model and a plurality of sub prop models included in the virtual prop model;
determining a tracking point of each sub prop model, and determining a model coordinate of each sub prop model; the tracking point is used for indicating the connecting position between the corresponding prop model and the three-dimensional object model;
obtaining model parameters of the three-dimensional object model, and determining world coordinates of each sub-prop model based on the model parameters of the three-dimensional object model, tracking points of each sub-prop model and model coordinates of each sub-prop model;
rendering the virtual prop based on the world coordinates of each of the sub-prop models to present the target object equipped with the virtual prop in the video.
2. The method of claim 1, wherein the three-dimensional reconstruction of the target object based on the video frame image including the target object in the video to obtain a three-dimensional object model corresponding to the target object comprises:
carrying out image recognition processing on the video frame image to obtain two-dimensional image coordinates of object key points of the target object in the video frame image;
determining three-dimensional model coordinates of the object key points based on the two-dimensional image coordinates;
and performing three-dimensional reconstruction on the target object based on the three-dimensional model coordinates of the key points of the object to obtain a three-dimensional object model corresponding to the target object.
3. The method of claim 1, wherein said determining model coordinates for each of said sub-prop models comprises:
performing the following processing for each of the prop models to determine model coordinates of each of the prop models:
establishing a model coordinate system where the child prop model is located by taking the tracking point of the child prop model as the origin of the model coordinate system, and
and determining model coordinates of the sub prop model based on the model coordinate system.
4. The method of claim 1, wherein obtaining model parameters of the three-dimensional object model comprises:
when the model parameter is a scaling parameter of the three-dimensional object model relative to a standard three-dimensional object model, determining a first model vertex and a second model vertex on the standard three-dimensional object model, and determining a first distance between the first model vertex and the second model vertex;
determining a third model vertex corresponding to the first model vertex and a fourth model vertex corresponding to the second model vertex on the three-dimensional object model;
determining a second distance between the third model vertex and the fourth model vertex;
determining a scaling parameter for the three-dimensional object model based on the first distance and the second distance.
5. The method of claim 1, wherein the determining world coordinates of each of the sub-prop models based on model parameters of the three-dimensional object model, tracking points of each of the sub-prop models, and model coordinates of each of the sub-prop models comprises:
determining rotation parameters of each sub prop model relative to an initial sub prop model based on the tracking point of each sub prop model;
acquiring world coordinates of tracking points of each sub prop model;
and determining the world coordinates of each sub-prop model based on the model coordinates of each sub-prop model by combining the model parameters of the three-dimensional object model, the world coordinates of the tracking point of each sub-prop model and the rotation parameters of each sub-prop model.
6. The method of claim 5, wherein the obtaining world coordinates of the tracking point of each of the sub prop models comprises:
obtaining model coordinates of model vertexes corresponding to the tracking points in the three-dimensional object model, and taking the model coordinates of the model vertexes as model coordinates of the corresponding tracking points;
obtaining the coordinate transformation parameters of the model, and
and determining the world coordinates of the tracking points of the sub prop models based on the model coordinates of the tracking points and the model coordinate transformation parameters.
7. The method of claim 5, wherein determining rotation parameters of each of the sub-prop models relative to an initial sub-prop model based on the tracking points of each of the sub-prop models comprises:
performing the following processing aiming at each sub prop model to determine the rotation parameters of each sub prop model relative to the initial sub prop model:
acquiring a normal direction vector of a tracking point of the sub prop model and a standard direction vector of the initial sub prop model, and
determining a rotation vector corresponding to the prop model and a projection vector of the standard direction vector on a normal plane of the normal direction vector based on the normal direction vector and the standard direction vector;
determining, based on the rotation vector and the projection vector, a rotation parameter of the child prop model relative to the initial child prop model.
8. The method according to claim 7, wherein the obtaining of the normal direction vector of the tracking point of the sub prop model comprises:
determining model vertexes of tracking points corresponding to the prop models in the three-dimensional object models, and determining a plurality of model triangles where the model vertexes are located;
determining the surface normal direction vector of each model triangle, averaging the determined surface normal direction vectors, and taking the average result as the average normal direction vector of the model vertex;
and determining the average normal direction vector of the model vertex as the normal direction vector of the tracking point of the prop model.
9. The method of claim 7, wherein the determining rotation parameters of the child prop model relative to the initial child prop model based on the rotation vector and the projection vector comprises:
rotating the rotation vector according to the direction of a standard direction vector to obtain a middle rotation vector;
synthesizing the intermediate rotation vector and the projection vector to obtain a second synthetic direction vector;
determining a third rotation quaternion corresponding to the projection vector and a fourth rotation quaternion corresponding to the second synthetic direction vector;
determining rotation parameters of the sub-prop model relative to the initial sub-prop model based on the third rotation quaternion, the fourth rotation quaternion, and the rotation vector.
10. The method of claim 5, wherein the determining the world coordinates of each of the sub-prop models based on the model coordinates of each of the sub-prop models in combination with the model parameters of the three-dimensional object model, the world coordinates of the tracking point of each of the sub-prop models, and the rotation parameters of each of the sub-prop models comprises:
determining a world coordinate transformation parameter of each of the sub prop models based on the world coordinates of the tracking point of each of the sub prop models, the rotation parameter, and the model parameter;
and multiplying the model coordinates of each sub prop model by the world coordinate change parameters respectively to obtain the world coordinates of each sub prop model.
11. The method of claim 1, wherein the rendering the virtual prop based on the world coordinates of each of the sub-prop models to present a target object equipped with the virtual prop in the video comprises:
performing vertex rendering on the virtual item model corresponding to the virtual item based on the world coordinates of the sub item models to obtain a middle virtual item;
determining pixel colors of pixel points contained in the virtual props based on preset material attributes corresponding to the virtual props;
and performing pixel rendering on the intermediate virtual item based on the pixel color of the pixel point contained in the virtual item to obtain the virtual item, so as to present the target object assembled with the virtual item in the video.
12. The method according to claim 1, wherein before obtaining the virtual item model corresponding to the virtual item of the three-dimensional object model, the method further comprises:
presenting a video capture interface for video frame image capture of the target object, and
presenting prop function items in the video acquisition interface;
presenting at least one candidate virtual prop for selection in response to a triggering operation for the prop function item;
and responding to the selection operation aiming at the target candidate virtual prop, and determining the target candidate virtual prop to be the virtual prop of the three-dimensional object model.
13. An apparatus for rendering a virtual item, the apparatus comprising:
the reconstruction module is used for performing three-dimensional reconstruction on a target object based on a video frame image including the target object in a video to obtain a three-dimensional object model corresponding to the target object;
the acquisition module is used for acquiring a virtual prop model corresponding to a virtual prop of the three-dimensional object model and a plurality of sub prop models included in the virtual prop model;
the first determining module is used for determining a tracking point of each sub-prop model and determining a model coordinate of each sub-prop model; the tracking point is used for indicating the connecting position between the corresponding prop model and the three-dimensional object model;
the second determining module is used for acquiring model parameters of the three-dimensional object model and determining world coordinates of each sub-prop model based on the model parameters of the three-dimensional object model, tracking points of each sub-prop model and model coordinates of each sub-prop model;
and the rendering module is used for rendering the virtual prop based on the world coordinates of each sub prop model so as to present the target object assembled with the virtual prop in the video.
14. An electronic device, characterized in that the electronic device comprises:
a memory for storing executable instructions;
a processor, configured to execute the executable instructions stored in the memory, to implement the method of rendering a virtual item as claimed in any one of claims 1 to 12.
15. A computer-readable storage medium, having stored thereon executable instructions for, when executed, implementing a method of rendering a virtual item as claimed in any one of claims 1 to 12.
CN202110431222.9A 2021-04-21 2021-04-21 Rendering method and device of virtual prop, electronic equipment and storage medium Pending CN115222866A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110431222.9A CN115222866A (en) 2021-04-21 2021-04-21 Rendering method and device of virtual prop, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110431222.9A CN115222866A (en) 2021-04-21 2021-04-21 Rendering method and device of virtual prop, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115222866A true CN115222866A (en) 2022-10-21

Family

ID=83604342

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110431222.9A Pending CN115222866A (en) 2021-04-21 2021-04-21 Rendering method and device of virtual prop, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115222866A (en)

Similar Documents

Publication Publication Date Title
US11972529B2 (en) Augmented reality system
US10628666B2 (en) Cloud server body scan data system
US11640672B2 (en) Method and system for wireless ultra-low footprint body scanning
US20180144237A1 (en) System and method for body scanning and avatar creation
CN110458924B (en) Three-dimensional face model establishing method and device and electronic equipment
CN116228943B (en) Virtual object face reconstruction method, face reconstruction network training method and device
CN114742956B (en) Model processing method, device, equipment and computer readable storage medium
US20140285513A1 (en) Animation of a virtual object
CN112843704B (en) Animation model processing method, device, equipment and storage medium
CN114219001A (en) Model fusion method and related device
CN116883578B (en) Image processing method, device and related equipment
WO2018182938A1 (en) Method and system for wireless ultra-low footprint body scanning
CN116958344A (en) Animation generation method and device for virtual image, computer equipment and storage medium
Dong et al. Real‐Time Large Crowd Rendering with Efficient Character and Instance Management on GPU
CN115222866A (en) Rendering method and device of virtual prop, electronic equipment and storage medium
US11562522B2 (en) Method and system for identifying incompatibility between versions of compiled software code
CN114452646A (en) Virtual object perspective processing method and device and computer equipment
Hempe et al. A semantics-based, active render framework to realize complex eRobotics applications with realistic virtual testing environments
Angel et al. Introduction to modern OpenGL programming
Fu et al. FAKIR: An algorithm for revealing the anatomy and pose of statues from raw point sets
Novello et al. GPU Ray Tracing in Non-Euclidean Spaces
Durnev Essentials of Augmented Reality Software Development under Android Platform
Mazany et al. Articulated 3D human model and its animation for testing and learning algorithms of multi-camera systems
Wei et al. Application of three dimensional roaming technology in virtual campus system
Sláma Editor lezeckých tras

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40075602

Country of ref document: HK

SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination