CN117745986A - Method and device for realizing virtual-real crossing of object in XR virtual-real synthesis - Google Patents

Method and device for realizing virtual-real crossing of object in XR virtual-real synthesis

Info

Publication number
CN117745986A
Authority
CN
China
Prior art keywords
foreground
camera
picture
virtual
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311616920.1A
Other languages
Chinese (zh)
Inventor
郑培枫
袁慧晶
陈子恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Digital Video Beijing Ltd
Original Assignee
China Digital Video Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Digital Video Beijing Ltd
Priority to CN202311616920.1A
Publication of CN117745986A
Legal status: Pending

Abstract

The invention provides a method, a device, an electronic device and a storage medium for realizing virtual-real crossing of an object in XR virtual-real synthesis. The method comprises the following steps: building a three-dimensional model in the virtual scene based on the LED large screen and setting a custom depth buffer value for it; setting the three-dimensional model to be invisible while keeping its custom depth buffer value detectable; setting the part of the traversing object located in front of the space at the custom depth buffer value to be invisible, and performing screen-projection rendering and projection; performing delay processing on the camera tracking data and obtaining the camera shooting picture corresponding to the delayed tracking data; setting the part of the traversing object located behind the space at the custom depth buffer value to be invisible, and performing foreground rendering to obtain a foreground picture; and mixing the foreground picture with the picture shot by the camera. In this way, a virtual object (or character) can cross the LED screen accurately and seamlessly in XR film and television shooting synthesis, and the seamless compositing effect of the virtual-real composite picture is ensured.

Description

Method and device for realizing virtual-real crossing of object in XR virtual-real synthesis
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method, an apparatus, an electronic device, and a storage medium for implementing virtual-real traversal of an object in XR virtual-real synthesis.
Background
With the adoption of large LED screens, a large LED screen is used in place of a blue box to realize virtual-real synthesis in film shooting: the LED large screen directly outputs the virtual background, the presenter stands in front of the large screen while the camera shoots it, and the captured picture enters the virtual foreground implantation system for final virtual-real synthesis. Because this technique integrates and extends VR and AR, it is called XR (Extended Reality). However, the effect of a virtual object (or character) crossing accurately and seamlessly between the inside and outside of the LED screen is difficult to achieve, and the lack of precise control over the front and rear parts of the traversing object during rendering and screen projection may lead to an unsatisfactory composite effect.
Disclosure of Invention
The embodiment of the invention provides a method, a device, an electronic device and a storage medium for realizing virtual-real crossing of an object in XR virtual-real synthesis, aiming at solving the above problems in the background art.
In order to solve the technical problems, the invention is realized as follows:
in a first aspect, an embodiment of the present invention provides a method for implementing virtual-real traversal of an object in XR virtual-real synthesis, where the method includes:
based on the LED large screen, a three-dimensional model with the same shape and size as the LED large screen is built in a virtual scene, and a self-defined depth buffer value is set for the three-dimensional model;
setting the three-dimensional model to be invisible, while its custom depth buffer value can be detected in the respective rendering flows of the screen-projection rendering server and the foreground rendering server;
setting, in the screen-projection rendering server, that the part of the traversing object located in front of the space at the custom depth buffer value is invisible, and performing screen-projection rendering and projection of the traversing object according to current camera tracking data, wherein the camera tracking data is data for tracking the position and pose of the camera;
in the foreground rendering server, delay processing is carried out on current camera tracking data, and a camera shooting picture corresponding to the camera tracking data after delay processing is obtained; setting that the part of the traversing object located behind the space of the self-defined depth buffer value is invisible, and performing foreground rendering of the traversing object according to the delayed camera tracking data to obtain a foreground picture; and mixing the foreground picture with the picture shot by the camera.
Optionally, the method further comprises:
determining response time of an IO acquisition module, wherein the IO acquisition module is a module for acquiring a camera picture, and the response time refers to time from the moment when image data are captured by a camera to the moment when the image data are completely received by the IO acquisition module and are ready to be transmitted to the foreground rendering server;
determining the transmission rate of the IO acquisition module, and calculating the data transmission time of the IO acquisition module based on the transmission rate, wherein the transmission rate refers to the speed of the IO acquisition module for transmitting image data to the foreground rendering server;
determining the sum of the response time and the data transmission time of the IO acquisition module as a delay time;
performing foreground rendering of the traversing object according to the camera tracking data after delay processing to obtain a foreground picture, wherein the foreground picture comprises:
and performing foreground rendering of the traversing object according to the camera tracking data before the delay time length to obtain a foreground picture.
Optionally, the custom depth buffer value is determined based on the actual position of the LED large screen and the distance between the LED large screen and the viewer according to the following steps:
determining position data of the LED large screen in a physical space, wherein the position data comprises: three-dimensional coordinates of the LED large screen in a physical space and the orientation of the LED large screen;
determining position data of a viewer, the position data of the viewer including a head position of the viewer and an orientation of the viewer;
calculating the relative depth between the viewer and the LED large screen by using the position data of the LED large screen and the position data of the viewer;
and defining a range of depth buffer values, and mapping the relative depth between the viewer and the LED large screen into a target interval to obtain the custom depth buffer values.
Optionally, setting the three-dimensional model to invisible includes:
a special material or shader is applied to the three-dimensional model so that it does not output any pixels in the final rendering.
Optionally, mixing the foreground frame with the camera shooting frame includes:
and setting the calculation formula of the pixel color channel result R to R = R0 × A + R1 × (1 − A), wherein R0 is a pixel of the foreground picture, R1 is a pixel of the picture shot by the camera, and A is the transparency of the pixel of the foreground picture.
A second aspect of an embodiment of the present invention proposes a device for implementing virtual-to-real traversal of an object in XR virtual-to-real synthesis, the device comprising:
the building module is configured to build a three-dimensional model with the same shape and size as the LED large screen in a virtual scene based on the LED large screen, and set a self-defined depth buffer value for the three-dimensional model;
the first setting module is configured to set the three-dimensional model invisible, and can detect a custom depth buffer value in the respective rendering flows of the projection screen rendering server and the foreground rendering server;
the second setting module is configured to set that a part of the traversing object positioned in front of the space of the self-defined depth buffer value is invisible in the screen-throwing rendering server, and perform screen-throwing rendering and screen throwing of the traversing object according to current camera tracking data, wherein the camera tracking data is data for tracking the position and the pose of a camera;
the delay module is configured to perform delay processing on current camera tracking data in the foreground rendering server, and obtain a camera shooting picture corresponding to the camera tracking data after delay processing;
the third setting module is configured to set that a part of the traversing object located behind the space of the self-defined depth buffer value is invisible, and perform foreground rendering of the traversing object according to the camera tracking data after delay processing to obtain a foreground picture; and mixing the foreground picture with the picture shot by the camera.
Optionally, the delay module further includes:
a first sub-module configured to determine a response time of the IO acquisition module, the IO acquisition module being a module for acquiring a camera picture, the response time being a time from a moment when image data is captured by the camera to a moment when the image data is completely received by the IO acquisition module and ready for transmission to the foreground rendering server;
the second sub-module is configured to determine a transmission rate of the IO acquisition module, and calculate data transmission time of the IO acquisition module based on the transmission rate, wherein the transmission rate refers to a speed of the IO acquisition module for transmitting image data to the foreground rendering server;
the third sub-module is used for determining the sum of the response time and the data transmission time of the IO acquisition module as the delay time;
and the fourth sub-module is configured to perform foreground rendering of the traversing object according to the camera tracking data before the delay time length to obtain a foreground picture.
Optionally, the establishing module further includes:
a fifth sub-module configured to determine position data of the LED large screen in a physical space, the position data comprising: three-dimensional coordinates of the LED large screen in a physical space and the orientation of the LED large screen;
a sixth sub-module configured to determine position data of a viewer, the position data of the viewer including a head position of the viewer and an orientation of the viewer;
a seventh sub-module configured to calculate a relative depth between a viewer and the LED large screen using the position data of the LED large screen and the position data of the viewer;
and an eighth sub-module, configured to define a range of depth buffer values, and map a relative depth between the viewer and the LED large screen into a target interval, so as to obtain the custom depth buffer values.
A third aspect of the embodiments of the invention proposes an electronic device comprising a processor, a memory and a computer program stored on the memory and capable of running on the processor, the computer program when executed by the processor implementing the steps of a method for implementing virtual-real traversal of an object in XR virtual-real composition.
In a fourth aspect, an embodiment of the present invention proposes a computer readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, implements the steps of a method for implementing virtual-real traversal of an object in XR virtual-real synthesis.
The embodiment of the invention has the following advantages:
In the invention, a three-dimensional model identical in shape and size to the actual LED screen is built in the virtual scene and its custom depth buffer is set to a specified value, so that the virtual-real crossing effect is realized and a virtual object (or character) can cross the inside and outside of the LED screen accurately and seamlessly. In the screen-projection rendering server, the part of the traversing object located in front of the space at the specified custom depth buffer value is invisible. In the foreground implantation rendering server, the part of the traversing object located behind the space at the specified custom depth buffer value is invisible, so that the rendering range of the traversing object can be controlled precisely and an unsatisfactory composite effect is avoided. In addition, delay processing is applied to the camera tracking data in the foreground rendering server to guarantee the seamless compositing of the virtual-real composite picture; the delay processing solves the problem that the picture generated on the LEDs has not yet been acquired when the foreground rendering server renders. Thus, the technical scheme of the invention can achieve an accurate and seamless crossing of a virtual object (or character) through the LED screen in XR film and television shooting synthesis, and ensure the seamless composite effect of the virtual-real composite picture.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic view of a scene of a vehicle traversing inside and outside a large LED screen in an embodiment of the invention;
FIG. 2 is a flow chart of a method for realizing virtual-real traversal of an object in XR virtual-real synthesis according to an embodiment of the invention;
fig. 3 is a block diagram of an apparatus for implementing virtual-real traversal of an object in XR virtual-real synthesis according to an embodiment of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention can be applied to LED-based XR film and television shooting composite products to meet the requirement that a virtual object (or character) crosses between the inside and outside of the screen. Fig. 1 is a schematic view of a scene in which a vehicle crosses the inside and outside of an LED large screen in an embodiment of the present invention. As shown in fig. 1, in a television program, a host stands in front of the LED large screen; on the LED large screen, a virtual background is rendered by a three-dimensional engine to form a virtual scene, including streets and buildings with light, shadow and colour; a shooting picture is collected by the camera, which is the picture captured by the actual camera and includes both the image shown on the LED large screen and the image of the host in front of the screen; the shooting picture then enters the virtual foreground implantation system for final virtual-real synthesis. In this scene, the technical scheme of the invention can realize the accurate and seamless crossing effect of a virtual object (such as an automobile on the virtual street) across the inside and outside of the LED screen.
The embodiment of the invention provides a method for realizing virtual-real crossing of an object in XR virtual-real synthesis, which is suitable for an XR film and television shooting synthetic product based on an LED. Fig. 2 is a flow chart of a method for implementing virtual-real traversal of an object in XR virtual-real synthesis according to an embodiment of the invention, as shown in fig. 2, the method includes:
step S101, in a virtual scene, based on an LED large screen, a three-dimensional model with the same shape and size as the LED large screen is built in the virtual scene, and a self-defined depth buffer value is set for the three-dimensional model.
In this embodiment, according to the specific shape and size of the LED large screen, a three-dimensional model with the same shape and size is built in the virtual environment, so that the physical characteristics of the LED large screen, including the size, shape and position, can be reproduced in the virtual environment, so that the virtual environment and the LED screen in the real world are better fused, and a more real interactive experience is provided for the user. Depth Buffering (Depth Buffering) is a technique in computer graphics for processing Depth information of objects in image data to decide which parts should be rendered in front and which parts should be hidden in the back. In XR applications, specifying custom depth buffer values means that the position and manner of display of virtual objects in three-dimensional space can be precisely controlled. This is critical for creating realistic three-dimensional scenes.
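For illustration only (the rendering engine and its API are not specified in this embodiment, so all type and field names below are hypothetical), the screen proxy model and its custom depth buffer value can be represented as a plain data structure, for example in C++:

```cpp
#include <array>
#include <cstdint>

// Hypothetical proxy for the LED screen inside the virtual scene.
// Shape and size are copied from the physical screen so that the
// virtual model and the real screen coincide exactly.
struct LedScreenProxy {
    std::array<float, 3> position{};    // three-dimensional coordinates in the scene
    std::array<float, 3> eulerDeg{};    // pitch, yaw, roll of the screen plane
    float widthM  = 0.0f;               // physical width in metres
    float heightM = 0.0f;               // physical height in metres

    // The model is excluded from the colour output of both rendering
    // servers, but its custom depth value stays detectable.
    bool visibleInColorPass = false;
    uint8_t customDepthValue = 0;       // engine-specific stencil-like value
};

// Sketch of step S101: build the proxy from measured screen data and
// assign the custom depth value that later drives the visibility split.
LedScreenProxy buildScreenProxy(float widthM, float heightM,
                                std::array<float, 3> pos,
                                std::array<float, 3> eulerDeg,
                                uint8_t customDepth) {
    LedScreenProxy proxy;
    proxy.widthM = widthM;
    proxy.heightM = heightM;
    proxy.position = pos;
    proxy.eulerDeg = eulerDeg;
    proxy.customDepthValue = customDepth;
    return proxy;
}
```

The custom depth value assigned here is the value that the two rendering servers later test against when deciding which part of a traversing object to hide.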
Step S102, setting the three-dimensional model to be invisible, while its custom depth buffer value can be detected in the respective rendering flows of the screen-projection rendering server and the foreground rendering server.
"Invisible" here is not about what the audience's naked eye can or cannot see directly; it means, with respect to the rendering process, that the three-dimensional model is not displayed in the final composite picture, so that visually it appears to "disappear" for the viewer. By setting the three-dimensional model to be invisible while keeping its custom depth buffer value detectable, real objects that overlap with the model can be identified during rendering and processed accordingly. By detecting the custom depth buffer value, it can be determined which parts are virtual objects and which parts are real objects, thereby realizing the virtual-real crossing effect.
Step S103, in the screen-projection rendering server, the part of the traversing object located in front of the space at the custom depth buffer value is set to be invisible, and screen-projection rendering and projection of the traversing object are performed according to current camera tracking data, wherein the camera tracking data is data for tracking the position and pose of the camera.
In the embodiments of the invention, "front" and "rear" are defined with the camera as the reference point: the closer to the camera, the further toward the front; the farther from the camera, the further toward the rear. Taking the scene shown in fig. 1 as an example, assume that the traversing object in the virtual scene is an automobile model and we want the automobile model to cross the LED screen. In the screen-projection rendering server, the part of the automobile model located in front of the space at the custom depth buffer value is set to be invisible. This means that during rendering only the part of the automobile model behind the space at the custom depth buffer value (corresponding to the side inside the LED screen) is displayed in the final composite picture. For example, in the second case in fig. 1, the head of the automobile model is the invisible part and is hidden in the final rendered picture, so the audience only sees the tail on the LED screen, while the head is rendered by the foreground rendering server; together they give the audience the impression that the vehicle is crossing the LED screen.
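The visibility split described above can be sketched as a per-fragment depth comparison; the sketch below is illustrative only and assumes that, for each pixel, the camera-space depth of the traversing-object fragment and the depth of the screen surface recovered through the custom depth buffer value are both available:

```cpp
enum class RenderRole { ScreenProjection, Foreground };

// Decide whether a fragment of the traversing object is drawn.
// "Depth" is distance from the camera: smaller = nearer = "in front".
// On the projection server the part in front of the screen surface is
// hidden; on the foreground server the part behind it is hidden.
bool fragmentVisible(RenderRole role,
                     float objectDepth,        // camera-space depth of the object fragment
                     float screenSurfaceDepth) // depth recovered via the custom depth value
{
    const bool inFrontOfScreen = objectDepth < screenSurfaceDepth;
    if (role == RenderRole::ScreenProjection) {
        return !inFrontOfScreen;   // only the part behind the screen is projected onto the LEDs
    }
    return inFrontOfScreen;        // only the part in front of the screen is kept as foreground
}
```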
The camera tracking data refers to data that tracks the position, pose and motion of the camera in real time. In virtual-real synthesis, the camera tracking data is used to determine the position and pose of objects in the virtual scene for the corresponding rendering and compositing. Specifically, the camera tracking data includes the following information:
Position: three-dimensional coordinates of the camera, representing where the camera is located in the scene;
Direction: the orientation of the camera, indicating the direction in which the camera is pointing;
Viewing angle: the field-of-view range of the camera, representing the scene range the camera can see;
Pose: the rotation angles of the camera, indicating its tilt and rotation state.
By acquiring and updating the camera tracking data in real time, the position and pose of the traversing object can be kept consistent with the actual shooting picture during synthesis. In the screen-projection rendering server, the position of the traversing object in the virtual scene can be determined from the camera tracking data, and the corresponding rendering and screen-projection operations are performed to realize the virtual-real crossing effect of the object.
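For illustration, one camera tracking sample carrying the information listed above could be represented as follows (the struct and field names are hypothetical, not mandated by this embodiment):

```cpp
#include <array>
#include <cstdint>

// One sample of camera tracking data, stamped per video frame.
struct CameraTrackingSample {
    uint64_t frameIndex = 0;              // frame the sample belongs to
    std::array<float, 3> position{};      // camera position in the scene
    std::array<float, 3> direction{};     // unit vector the camera points along
    float fovDeg = 0.0f;                  // horizontal field of view
    std::array<float, 3> eulerDeg{};      // pitch, yaw, roll of the camera
};
```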
Step S104, in the foreground rendering server, delay processing is carried out on the current camera tracking data, and a camera shooting picture corresponding to the camera tracking data after delay processing is obtained; setting that the part of the traversing object located behind the space of the self-defined depth buffer value is invisible, and performing foreground rendering of the traversing object according to the delayed camera tracking data to obtain a foreground picture; and mixing the foreground picture with the picture shot by the camera.
In virtual-real synthesis, a certain delay exists when the foreground rendering server acquires the camera picture: at the moment the foreground rendering server renders, the picture generated on the LEDs for that frame has not yet been acquired. As a result, real-time camera tracking data cannot be used directly when rendering in the foreground rendering server. To solve this problem, delay processing is applied so that the camera tracking data is synchronized with the LED-generated picture, ensuring the seamless compositing of the virtual-real composite picture. Specifically, delay processing of the camera tracking data can be configured in the foreground rendering server, i.e. the camera tracking data is taken from a certain number of frames earlier; this allows the foreground rendering server to use, at render time, exactly the camera tracking data corresponding to the picture generated on the LEDs, thereby realizing the seamless compositing of the virtual-real composite picture.
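A minimal sketch of such delay processing, assuming one tracking sample arrives per video frame and the delay is expressed as a whole number of frames (reusing the CameraTrackingSample struct sketched above), is a small history buffer:

```cpp
#include <cstddef>
#include <deque>
#include <optional>

// Keeps recent tracking samples so the foreground rendering server can
// read the sample that matches the (delayed) camera frame it has just
// acquired, instead of the newest one.
class TrackingDelayBuffer {
public:
    explicit TrackingDelayBuffer(std::size_t delayFrames)
        : delayFrames_(delayFrames) {}

    void push(const CameraTrackingSample& sample) {
        history_.push_back(sample);
        // Keep only what the configured delay needs.
        while (history_.size() > delayFrames_ + 1) history_.pop_front();
    }

    // Returns the sample from delayFrames_ frames ago, if available.
    std::optional<CameraTrackingSample> delayed() const {
        if (history_.size() <= delayFrames_) return std::nullopt;
        return history_[history_.size() - 1 - delayFrames_];
    }

private:
    std::size_t delayFrames_;
    std::deque<CameraTrackingSample> history_;
};
```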
Still taking the scene shown in fig. 1 as an example, in the foreground rendering server the part of the automobile model located behind the space at the custom depth buffer value is set to be invisible. This means that during rendering only the part of the automobile model in front of the space at the custom depth buffer value (corresponding to the side outside the LED screen) is displayed in the final composite picture. For example, in the second case in fig. 1, the tail of the automobile model is the part set invisible and is hidden in the final rendered picture, so the viewer only sees the head in the final foreground, while the tail is rendered and projected by the screen-projection rendering server; together they give the audience the impression that the vehicle is crossing the LED screen.
In a possible embodiment, the method further comprises:
step S201, determining a response time of the IO acquisition module, where the IO acquisition module is a module for acquiring a camera picture, and the response time is a time from a moment when the camera captures image data to a moment when the image data is completely received by the IO acquisition module and is ready to be transmitted to the foreground rendering server.
Step S202, determining a transmission rate of the IO acquisition module, and calculating data transmission time of the IO acquisition module based on the transmission rate, wherein the transmission rate refers to the speed of the IO acquisition module for transmitting image data to the foreground rendering server.
And step S203, determining the sum of the response time and the data transmission time of the IO acquisition module as the delay time.
Step S204, according to the camera tracking data before the delay time, performing foreground rendering of the traversing object to obtain a foreground picture.
The system configures the foreground rendering server to render the traversing object together with the picture shot by the camera according to the camera tracking data from one delay duration earlier; in other words, when rendering, the rendering server uses the camera tracking data from the delay duration ago instead of real-time data. In this way, the rendered content stays synchronized with the actual shooting picture, so that real and virtual content are composited seamlessly.
For example, in a live scene with virtual-real compositing, the camera may be tracking the presenter, the LED large screen and other on-set elements. If the time from the IO acquisition module capturing an image to the data reaching the foreground rendering server (i.e. the response time plus the data transmission time) is 100 milliseconds, then when rendering these elements the rendering server needs to use the camera tracking data from 100 milliseconds ago instead of the current data. In this way, however the camera moves, the rendered picture remains consistent with the picture shot in real time, achieving seamless fusion of the virtual and the real.
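Under the same assumptions, the delay duration and the corresponding number of frames to look back could be computed as follows (the concrete figures in the comment are illustrative, not measured values):

```cpp
#include <cmath>
#include <cstddef>

// Delay = IO response time + data transmission time, where transmission
// time is the size of one frame divided by the IO module's transfer rate.
double delayMilliseconds(double responseMs, double frameBytes, double bytesPerMs) {
    return responseMs + frameBytes / bytesPerMs;
}

// Convert the delay into whole frames for the tracking-history lookup.
std::size_t delayFrames(double delayMs, double frameRateHz) {
    return static_cast<std::size_t>(std::ceil(delayMs * frameRateHz / 1000.0));
}

// Example: a 40 ms response plus a 1920x1080 RGBA frame (~8.3 MB) over a
// link sustaining ~138 kB per millisecond (~60 ms) gives roughly 100 ms in
// total, i.e. 5 frames of delay at 50 fps.
```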
In a possible implementation, the custom depth buffer value is determined based on the actual position of the LED large screen and the distance between the LED large screen and the viewer, according to the following steps:
step S301, determining position data of the LED large screen in a physical space, where the position data includes: and the three-dimensional coordinates of the LED large screen in the physical space and the orientation of the LED large screen.
The position data of the LED large screen in the physical space includes its three-dimensional coordinates (x, y, z) and its orientation. The orientation can be represented by three Euler angles describing rotation about the coordinate axes, typically the X, Y and Z axes: pitch is rotation about the X axis, yaw is rotation about the Y axis, and roll is rotation about the Z axis. This orientation is the basis for ensuring that the virtual content is properly aligned with the real environment.
Step S302, position data of the viewer including a head position of the viewer and an orientation of the viewer is determined.
And step S303, calculating the relative depth between the viewer and the LED large screen by using the position data of the LED large screen and the position data of the viewer.
Assume that in a virtual news studio the LED large screen is placed 30 meters in front of the audience. Based on the position data of the LED large screen and of the viewer, a specific relative depth value can be obtained, which is critical for ensuring correct interaction between virtual content (e.g. augmented-reality elements) and the real environment (e.g. physical objects in the studio). If a virtual news anchor is presented on the LED screen, we need to ensure that this virtual anchor visually appears to really sit behind the desk, rather than appearing to the audience to float abnormally in the air or being erroneously rendered in front of a physical object.
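As an illustrative sketch, if the screen orientation is expressed as a unit normal pointing toward the audience (an assumption made for this example), the relative depth in step S303 can be taken as the distance from the viewer's head to the screen plane along that normal:

```cpp
#include <array>

// Signed distance from the viewer's head position to the LED screen
// plane, measured along the screen's outward-facing unit normal.
// Positive values mean the viewer stands in front of the screen.
float relativeDepth(const std::array<float, 3>& screenPos,
                    const std::array<float, 3>& screenNormal, // unit vector toward the audience
                    const std::array<float, 3>& viewerHeadPos) {
    const std::array<float, 3> toViewer{viewerHeadPos[0] - screenPos[0],
                                        viewerHeadPos[1] - screenPos[1],
                                        viewerHeadPos[2] - screenPos[2]};
    return toViewer[0] * screenNormal[0] +
           toViewer[1] * screenNormal[1] +
           toViewer[2] * screenNormal[2];
}
```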
And step S304, defining a range of depth buffer values, and mapping the relative depth between the viewer and the LED large screen into a target interval to obtain the self-defined depth buffer values.
This step maps the actual depth between the viewer and the screen into a predefined depth-buffer interval, so that when virtual objects are rendered their position in three-dimensional space can be handled more accurately; its main purpose is to ensure a correct representation of depth and correct occlusion relationships between objects. First, a depth-buffer interval is defined, for example 0 to 1, representing the virtual depth range from the point closest to the screen (0) to the farthest point (1). Suppose the viewer is 2 meters from the LED screen, and it is known that the viewer can come as close as 0.5 meters and stand at most 5 meters away. This actual depth of 2 meters needs to be mapped into the defined depth-buffer interval of 0 to 1. By calculation, a distance of 2 meters is mapped to a point of the depth-buffer interval, for example approximately 0.33 under a linear mapping. This means that in the depth buffer the viewer's position is treated as a depth value of about 0.33 from the screen; the value changes accordingly as the viewer moves, ensuring that the representation of the automobile model on the screen matches the viewer's angle and position. For example, as the viewer approaches the screen, the depth value decreases, and the rendered traversing object is enlarged and its angle adjusted to simulate the visual effect of the viewer approaching it.
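A minimal sketch of this mapping, assuming a clamped linear normalisation over the working range (0.5 m to 5 m in the example above):

```cpp
#include <algorithm>

// Map an actual viewer-to-screen distance (metres) into the custom
// depth-buffer interval [0, 1], clamping to the defined working range.
float mapToDepthBuffer(float actualDepthM, float nearM, float farM) {
    const float t = (actualDepthM - nearM) / (farM - nearM); // linear normalisation
    return std::clamp(t, 0.0f, 1.0f);
}

// mapToDepthBuffer(2.0f, 0.5f, 5.0f) == 1.5f / 4.5f, i.e. approximately 0.33.
```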
In one possible embodiment, setting the three-dimensional model to be invisible includes:
a special material or shader is applied to the three-dimensional model so that it does not output any pixels in the final rendering.
A material defines the surface properties of a three-dimensional model, such as colour and gloss, while a shader is a program running on the graphics processor. To render the three-dimensional model invisible, a special material or shader can be applied, configured to output no pixels, or only completely transparent pixels, during rendering. With this applied, the rendering engine skips the pixel output of these models, or outputs completely transparent pixels, so that the models visually "vanish".
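On the engine side this behaviour amounts to excluding the model from the colour pass while still writing its custom depth; the sketch below is hypothetical (the pass names and structures are not taken from any particular engine):

```cpp
#include <cstdint>
#include <vector>

// Pseudo-driver for the two passes relevant to the screen proxy model:
// the colour pass skips it entirely, while the custom depth pass still
// records its depth value so traversing objects can be split against it.
struct DrawItem {
    bool visibleInColorPass;   // false for the screen proxy (transparent material)
    uint8_t customDepthValue;  // non-zero marks the proxy in the custom depth buffer
};

void renderColorPass(const std::vector<DrawItem>& items) {
    for (const auto& item : items) {
        if (!item.visibleInColorPass) continue;  // no pixels emitted for the proxy
        // ... draw the item's geometry into the colour target ...
    }
}

void renderCustomDepthPass(const std::vector<DrawItem>& items) {
    for (const auto& item : items) {
        if (item.customDepthValue == 0) continue;
        // ... draw the item's depth only, tagged with customDepthValue ...
    }
}
```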
In a possible embodiment, mixing the foreground frame with the camera shot frame includes:
and setting the calculation formula of the pixel color channel result R to R = R0 × A + R1 × (1 − A), wherein R0 is a pixel of the foreground picture, R1 is a pixel of the picture shot by the camera, and A is the transparency of the pixel of the foreground picture.
In virtual-real synthesis, the foreground picture and the picture shot by the camera are mixed in Alpha blending mode to realize seamless fusion of the virtual object and the actually shot picture. Specifically, by adjusting the transparency of the foreground picture pixels, the transparency of the virtual object in the composite picture can be controlled so that it blends with the actually shot picture.
In the scene shown in fig. 1, assume that a virtual automobile model is to be composited with the actually shot background. During compositing, the foreground picture of the automobile model is mixed with the background picture shot by the camera. First, the transparency A of the foreground picture pixels is adjusted according to the specific scene requirements to control how transparent the automobile model is in the composite picture. If A is 1, the foreground picture is completely opaque and the automobile model is fully displayed in the composite picture; if A is 0, the foreground picture is completely transparent and the automobile model is not displayed; if A is 0.5, the foreground picture is semi-transparent and the automobile model is displayed with a certain transparency. By adjusting the transparency of the foreground picture pixels and mixing the foreground picture with the picture shot by the camera in Alpha blending mode, seamless fusion of the virtual object and the actually shot picture can be realized, so that the composite picture looks more real and natural.
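As a sketch, the per-pixel Alpha blending described above (R = R0 × A + R1 × (1 − A), applied to each colour channel) could look as follows, assuming an RGBA foreground buffer and an RGB camera buffer of the same resolution:

```cpp
#include <algorithm>
#include <cstdint>

// Blend one colour channel: R = R0 * A + R1 * (1 - A), where R0 is the
// foreground (virtual) pixel, R1 the camera pixel and A the foreground alpha.
inline uint8_t blendChannel(uint8_t fg, uint8_t cam, float alpha) {
    const float out = fg * alpha + cam * (1.0f - alpha);
    return static_cast<uint8_t>(std::clamp(out, 0.0f, 255.0f));
}

// Blend a whole interleaved RGBA foreground buffer over an RGB camera
// buffer of the same resolution, writing the composite into dst (RGB).
void alphaBlend(const uint8_t* fgRgba, const uint8_t* camRgb,
                uint8_t* dst, int pixelCount) {
    for (int i = 0; i < pixelCount; ++i) {
        const float a = fgRgba[i * 4 + 3] / 255.0f;          // foreground transparency A
        for (int c = 0; c < 3; ++c) {
            dst[i * 3 + c] = blendChannel(fgRgba[i * 4 + c], camRgb[i * 3 + c], a);
        }
    }
}
```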
In the invention, a three-dimensional model identical in shape and size to the actual LED screen is built in the virtual scene and its custom depth buffer is set to a specified value, so that the virtual-real crossing effect is realized and a virtual object (or character) can cross the inside and outside of the LED screen accurately and seamlessly. In the screen-projection rendering server, the part of the traversing object located in front of the space at the specified custom depth buffer value is invisible. In the foreground implantation rendering server, the part of the traversing object located behind the space at the specified custom depth buffer value is invisible, so that the rendering range of the traversing object can be controlled precisely and an unsatisfactory composite effect is avoided. In addition, delay processing is applied to the camera tracking data in the foreground rendering server to guarantee the seamless compositing of the virtual-real composite picture; the delay processing solves the problem that the picture generated on the LEDs has not yet been acquired when the foreground rendering server renders. Thus, the technical scheme of the invention can achieve an accurate and seamless crossing of a virtual object (or character) through the LED screen in XR film and television shooting synthesis, and ensure the seamless composite effect of the virtual-real composite picture.
The embodiment of the invention also provides a device for realizing virtual-real crossing of an object in XR virtual-real synthesis. Fig. 3 shows a structural block diagram of the device for realizing virtual-real crossing of an object in XR virtual-real synthesis in an embodiment of the invention. As shown in fig. 3:
the device comprises:
the building module 401 is configured to build a three-dimensional model with the same shape and size as the LED large screen in a virtual scene based on the LED large screen, and set a custom depth buffer value for the three-dimensional model;
a first setting module 402, configured to set the three-dimensional model to be invisible, and may detect a custom depth buffer value in a rendering flow of each of the projection rendering server and the foreground rendering server;
a second setting module 403, configured to set, in the screen-projection rendering server, that a part of the traversing object located before the space of the custom depth buffer value is invisible, and perform screen-projection rendering and screen projection of the traversing object according to current camera tracking data, where the camera tracking data is data for tracking a position and an attitude of a camera;
the delay module 404 is configured to perform delay processing on current camera tracking data in the foreground rendering server, and obtain a camera shooting picture corresponding to the camera tracking data after delay processing;
a third setting module 405, configured to set that a portion of the traversing object located behind the space of the custom depth buffer value is invisible, and perform foreground rendering of the traversing object according to the delayed camera tracking data, so as to obtain a foreground picture; and mixing the foreground picture with the picture shot by the camera.
In one possible implementation, the delay module 404 further includes:
a first sub-module 501 configured to determine a response time of the IO acquisition module, which is a module for acquiring a camera picture, from a moment when the camera captures image data, to a time when the image data is completely received by the IO acquisition module and ready for transmission to the foreground rendering server;
a second sub-module 502 configured to determine a transmission rate of the IO acquisition module, and calculate a data transmission time of the IO acquisition module based on the transmission rate, where the transmission rate is a speed at which the IO acquisition module sends image data to the foreground rendering server;
a third sub-module 503, configured to determine a sum of the response time and the data transmission time of the IO acquisition module as a delay time length;
and a fourth sub-module 504, configured to perform foreground rendering of the traversing object according to the camera tracking data before the delay time length, so as to obtain a foreground picture.
In a possible implementation, the establishing module 401 includes:
a fifth sub-module 601 configured to determine position data of the LED large screen in a physical space, the position data comprising: three-dimensional coordinates of the LED large screen in a physical space and the orientation of the LED large screen;
a sixth sub-module 602 configured to determine position data of a viewer, the position data of the viewer including a head position of the viewer and an orientation of the viewer;
a seventh sub-module 603 configured to calculate a relative depth between a viewer and the LED large screen using the position data of the LED large screen and the position data of the viewer;
an eighth sub-module 604 is configured to define a range of depth buffer values, and map a relative depth between the viewer and the LED large screen into a target interval, to obtain the custom depth buffer values.
The embodiment of the invention also provides electronic equipment, which comprises a processor, a memory and a computer program stored in the memory and capable of running on the processor, wherein the computer program realizes the processes of the method embodiment for realizing the virtual-real crossing of the object in the XR virtual-real synthesis when being executed by the processor, can achieve the same technical effect, and is not repeated here for avoiding repetition.
The embodiment of the invention also provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the above-mentioned processes of the method embodiment for implementing virtual-real traversal of objects in XR virtual-real synthesis, and can achieve the same technical effects, so that repetition is avoided, and no further description is given here. In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described by differences from other embodiments, and identical and similar parts between the embodiments are all enough to be referred to each other.
It will be apparent to those skilled in the art that embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the invention may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other like elements in a process, method, article or terminal device comprising the element. The above detailed description of the method, the device, the electronic equipment and the storage medium for realizing the virtual-real crossing of the object in the XR virtual-real synthesis provided by the invention applies specific examples to illustrate the principle and the implementation of the invention, and the above examples are only used for helping to understand the method and the core idea of the invention; meanwhile, as those skilled in the art will have variations in the specific embodiments and application scope in accordance with the ideas of the present invention, the present description should not be construed as limiting the present invention in view of the above.

Claims (10)

1. A method for implementing virtual-to-real traversal of an object in XR virtual-to-real synthesis, comprising:
based on the LED large screen, a three-dimensional model with the same shape and size as the LED large screen is built in a virtual scene, and a self-defined depth buffer value is set for the three-dimensional model;
setting the three-dimensional model to be invisible, and detecting a self-defined depth buffer value in the respective rendering flows of the projection rendering server and the foreground rendering server;
setting that a part of the traversing object positioned in front of the space of the self-defined depth buffer value is invisible in the screen-throwing rendering server, and performing screen-throwing rendering and screen throwing of the traversing object according to current camera tracking data, wherein the camera tracking data is data for tracking the position and the pose of a camera;
in the foreground rendering server, delay processing is carried out on current camera tracking data, and a camera shooting picture corresponding to the camera tracking data after delay processing is obtained; setting that the part of the traversing object located behind the space of the self-defined depth buffer value is invisible, and performing foreground rendering of the traversing object according to the delayed camera tracking data to obtain a foreground picture; and mixing the foreground picture with the picture shot by the camera.
2. The method according to claim 1, wherein the method further comprises:
determining response time of an IO acquisition module, wherein the IO acquisition module is a module for acquiring a camera picture, and the response time refers to time from the moment when image data are captured by a camera to the moment when the image data are completely received by the IO acquisition module and are ready to be transmitted to the foreground rendering server;
determining the transmission rate of the IO acquisition module, and calculating the data transmission time of the IO acquisition module based on the transmission rate, wherein the transmission rate refers to the speed of the IO acquisition module for transmitting image data to the foreground rendering server;
determining the sum of the response time and the data transmission time of the IO acquisition module as a delay time;
performing foreground rendering of the traversing object according to the camera tracking data after delay processing to obtain a foreground picture, wherein the foreground picture comprises:
and performing foreground rendering of the traversing object according to the camera tracking data before the delay time length to obtain a foreground picture.
3. The method of claim 1, wherein the custom depth buffer value is determined based on an actual position of the LED large screen and a distance of the LED large screen to a viewer, according to the steps of:
determining position data of the LED large screen in a physical space, wherein the position data comprises: three-dimensional coordinates of the LED large screen in a physical space and the orientation of the LED large screen;
determining position data of a viewer, the position data of the viewer including a head position of the viewer and an orientation of the viewer;
calculating the relative depth between the viewer and the LED large screen by using the position data of the LED large screen and the position data of the viewer;
and defining a range of depth buffer values, and mapping the relative depth between the viewer and the LED large screen into a target interval to obtain the custom depth buffer values.
4. The method of claim 1, wherein setting the three-dimensional model to be invisible comprises:
a special material or shader is applied to the three-dimensional model so that it does not display any pixels in the final rendered output.
5. The method of claim 1, wherein mixing the foreground frame with the camera shot frame comprises:
and setting the calculation formula of the pixel color channel result R to R = R0 × A + R1 × (1 − A), wherein R0 is a pixel of the foreground picture, R1 is a pixel of the picture shot by the camera, and A is the transparency of the pixel of the foreground picture.
6. An apparatus for implementing virtual-to-real traversal of an object in XR virtual-to-real synthesis, the apparatus comprising:
the building module is configured to build a three-dimensional model with the same shape and size as the LED large screen in a virtual scene based on the LED large screen, and set a self-defined depth buffer value for the three-dimensional model;
the first setting module is configured to set the three-dimensional model invisible, and can detect a custom depth buffer value in the respective rendering flows of the projection screen rendering server and the foreground rendering server;
the second setting module is configured to set that a part of the traversing object positioned in front of the space of the self-defined depth buffer value is invisible in the screen-throwing rendering server, and perform screen-throwing rendering and screen throwing of the traversing object according to current camera tracking data, wherein the camera tracking data is data for tracking the position and the pose of a camera;
the delay module is configured to perform delay processing on current camera tracking data in the foreground rendering server, and obtain a camera shooting picture corresponding to the camera tracking data after delay processing;
the third setting module is configured to set that a part of the traversing object located behind the space of the self-defined depth buffer value is invisible, and perform foreground rendering of the traversing object according to the camera tracking data after delay processing to obtain a foreground picture; and mixing the foreground picture with the picture shot by the camera.
7. The apparatus of claim 6, wherein the delay module further comprises:
a first sub-module configured to determine a response time of the IO acquisition module, the IO acquisition module being a module for acquiring a camera picture, the response time being a time from a moment when image data is captured by the camera to a moment when the image data is completely received by the IO acquisition module and ready for transmission to the foreground rendering server;
the second sub-module is configured to determine a transmission rate of the IO acquisition module, and calculate data transmission time of the IO acquisition module based on the transmission rate, wherein the transmission rate refers to a speed of the IO acquisition module for transmitting image data to the foreground rendering server;
the third sub-module is used for determining the sum of the response time and the data transmission time of the IO acquisition module as the delay time;
and the fourth sub-module is configured to perform foreground rendering of the traversing object according to the camera tracking data before the delay time length to obtain a foreground picture.
8. The apparatus of claim 6, wherein the means for establishing further comprises:
a fifth sub-module configured to determine position data of the LED large screen in a physical space, the position data comprising: three-dimensional coordinates of the LED large screen in a physical space and the orientation of the LED large screen;
a sixth sub-module configured to determine position data of a viewer, the position data of the viewer including a head position of the viewer and an orientation of the viewer;
a seventh sub-module configured to calculate a relative depth between a viewer and the LED large screen using the position data of the LED large screen and the position data of the viewer;
and an eighth sub-module, configured to define a range of depth buffer values, and map a relative depth between the viewer and the LED large screen into a target interval, so as to obtain the custom depth buffer values.
9. An electronic device, comprising: a processor, a memory and a computer program stored on the memory and capable of running on the processor, which when executed by the processor performs the steps of the method according to any of claims 1-5.
10. A computer readable storage medium, characterized in that it has stored thereon a computer program which, when executed by a processor, implements the steps of the method according to any of claims 1-5.
CN202311616920.1A 2023-11-29 2023-11-29 Method and device for realizing virtual-real crossing of object in XR virtual-real synthesis Pending CN117745986A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311616920.1A CN117745986A (en) 2023-11-29 2023-11-29 Method and device for realizing virtual-real crossing of object in XR virtual-real synthesis

Publications (1)

Publication Number Publication Date
CN117745986A true CN117745986A (en) 2024-03-22

Family

ID=90249914

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination