WO2022057308A1

WO2022057308A1 - Display method and apparatus, display device, and computer-readable storage medium

Info

Publication number: WO2022057308A1
Application number: PCT/CN2021/096358
Authority: WO
Inventors: 欧华富; 石盛传; 赵代平; 李国雄
Original assignee: 北京市商汤科技开发有限公司
Priority date: 2020-09-16
Filing date: 2021-05-27
Publication date: 2022-03-24
Also published as: CN112132940A

Abstract

A display method and apparatus, a display device, and a storage medium. The method comprises: acquiring current posture information corresponding to the current video frame collected from a real scene (S101); acquiring historical posture information of a historical video frame before the current video frame, and determining virtual object posture data on the basis of the historical posture information and the current posture information (S102); and by using the virtual object posture data, rendering, in the current video frame displayed by a display device, a virtual object corresponding to the posture data virtual object (S103), such that no matter where the start position of the display device is, the display device can adjust and display the virtual object according to the offset of the current posture information and of the historical posture information, and can display the virtual object in an accurate position.

Description

Display method, apparatus, display device and computer-readable storage medium

Cross-reference notes

This patent application claims the priority of the Chinese patent application number 2020109761713 filed on September 16, 2020, and the application name is "display method, device, display device and storage medium", the full text of which is incorporated into this application by reference middle.

technical field

The present disclosure relates to the technical field of image processing, and in particular, to a display method, an apparatus, a display device, and a computer-readable storage medium.

Background technique

The existing three-dimensional model display method is to control the traveling position of the display device through the motor. During the sliding process, the corresponding Augmented Reality (AR) model is displayed according to the position information of the display device. The disadvantage is that the display position of the AR model is bound to the initial position of the sliding display device. Once the initial position of the device changes, the display position of the AR model will also change, resulting in a delay between the AR model and the real scene, and the AR rendering effect is not good. good.

SUMMARY OF THE INVENTION

An embodiment of the present disclosure provides a display method, the method includes:

Obtain the current pose information of the current video frame collected from the real scene;

Obtain the historical pose information of the historical video frame before the current video frame, and determine virtual object pose data based on the historical pose information and the current pose information;

Using the virtual object pose data, rendering the virtual object in the current video frame displayed by the display device;

The augmented reality effect in which the real scene and the virtual object are superimposed is displayed through the display device.

In the above solution, the obtaining of the current pose information of the current video frame collected from the real scene includes:

The current video frame of the real scene is collected by the collection part of the display device;

The current video frame is processed using a positioning algorithm to obtain the current pose information of the current video frame in the camera sensor coordinate system.

In the above solution, the obtaining of the historical pose information of the historical video frame before the current video frame, based on the historical pose information and the current pose information, determining the virtual object pose data, including:

Obtain the historical pose information of the historical video frame before the current video frame, and determine the historical pose offset information based on the historical pose information;

The virtual object pose data is obtained based on the historical pose offset information and the current pose information.

In the above solution, determining the historical pose offset information based on the historical pose information includes:

Obtain the first pose information of the historical video frame in the visual space coordinate system; and, use a positioning algorithm to process the historical video frame to obtain the second pose of the historical video frame in the camera sensor coordinate system information; wherein, the historical pose information includes the first pose information and the second pose information;

The historical pose offset information is determined based on an offset between the second pose information and the first pose information.

Obtain the first pose information of the historical video frame in the visual space coordinate system;

acquiring the third sensing data when the acquisition part of the display device acquires the historical video frame;

Determine historical sensing pose information based on the offset between the first sensing data and the third sensing data; the historical pose information includes the first pose information and the historical sensing position attitude information; the first sensing data includes the data of the initial video frame collected by the collecting part when the display device is started;

The historical pose offset information is determined based on an offset between the first pose information and the historical sensed pose information.

In the above solution, the obtaining of the virtual object pose data based on the historical pose offset information and the current pose information includes:

According to the historical pose offset information, the current pose information is offset to obtain corrected pose information;

The virtual object pose data corresponding to the corrected pose information is determined.

acquiring the first sensing data of the acquisition part of the display device when collecting the initial video frame, and the second sensing data of the acquisition part of the display device when collecting the current video frame;

Calculate the offset between the first sensing data and the second sensing data, and determine the current pose information when the display device collects the current video frame.

In the above solution, the virtual object pose data includes: the coordinate position of each pixel constituting the virtual object; the virtual object is rendered in the current video frame corresponding to the display device by using the virtual object pose data, include:

The coordinate position of each pixel in the virtual object is mapped to the rendering engine coordinate system to obtain the target coordinate position of each pixel;

Using a rendering engine, the virtual object is rendered at the target coordinate position in the current video frame.

Embodiments of the present disclosure provide a display device, which is used based on a display device, including:

The collection part is configured to obtain the current pose information of the current video frame collected from the real scene, and obtain the historical pose information of the historical video frame before the current video frame;

a processing part configured to determine virtual object pose data based on the historical pose information and the current pose information;

a rendering part, configured to use the virtual object pose data to render the virtual object corresponding to the virtual object pose data in the current video frame displayed by the display device;

The presentation part is configured to display, through the display device, an augmented reality effect in which the real scene and the virtual object are superimposed.

In the above device, the augmented reality effect includes one of the following:

At least part of at least one real object in the real scene is occluded by the virtual object;

The virtual object is rendered at the edge of the target real object in the real scene;

The virtual object is rendered in the background area in the real scene.

In the above device, the processing part is further configured to determine the virtual object model corresponding to the display object in the preset three-dimensional virtual scene based on the real scene image; the preset three-dimensional virtual scene is based on the real scene image. The virtual model obtained by scene modeling; obtain the judgment result of whether the virtual object model has preset rendering data; in the case that the judgment result is characterized as the virtual object model has preset rendering data, the The preset rendering data is used as the virtual object data.

In the above device, the processing part is further configured to determine the current pose information of the display object in the real scene according to the real scene image; the virtual object model corresponding to the current pose information in the preset three-dimensional virtual scene; the real coordinate system is the coordinate system corresponding to the real scene; the virtual coordinate system is the corresponding coordinate system of the preset three-dimensional virtual scene coordinate system.

In the above device, the processing part is further configured to determine the position area corresponding to the current pose information in the preset three-dimensional virtual scene according to the preset mapping relationship between the real coordinate system and the virtual coordinate system; The corresponding preset virtual model in the location area is used as the virtual object model.

In the above device, the preset three-dimensional virtual scene is a model reconstructed in real time, or a model pre-stored in the cloud.

In the above device, the image acquisition unit includes a binocular camera; the display device further includes a modeling part, and the acquisition part is further configured to determine the display object based on the display object included in the real scene image and the display device. Before describing the virtual object data matching the display object, obtain the image information and depth information of the real scene image through the binocular camera; the modeling part is also configured to be based on the image information and the depth information of the real scene image. depth information; three-dimensional modeling is performed on the display object in the real scene image to obtain the preset three-dimensional virtual scene.

In the above device, the display device further includes an update part, and the update part is configured to display the augmented reality AR effect in which the real scene image and the virtual object are superimposed on the display device, During the movement of the display device, the collected real scene image is updated, and an updated virtual object is obtained based on the updated real scene image; the display part is also configured to display the real scene image on the display device in real time. An augmented reality AR effect in which the updated real scene image is superimposed with the updated virtual object.

In the above device, at least one display device is arranged around the display object, and each display device in the at least one display device is used to collect the display in real time at the respective current position according to the respective collection direction of the display object. The real scene image of the object is obtained, and the corresponding virtual object is obtained based on the real scene image collected respectively, and the augmented reality AR effect in which the corresponding real scene image and the virtual object are superimposed is displayed.

An embodiment of the present disclosure provides a display device that moves along a target trajectory, including:

a display screen configured to display an augmented reality effect in which the real scene and the virtual object are superimposed on the display device;

a memory configured to store executable instructions;

When the processor is configured to execute the executable instructions stored in the memory, in combination with the display screen, the method described in the embodiments of the present disclosure is implemented.

Embodiments of the present disclosure provide a computer-readable storage medium, which stores executable instructions and is configured to implement the methods described in the embodiments of the present disclosure when executed by a processor.

The embodiments of the present disclosure have the following beneficial effects:

By obtaining the current pose information of the current video frame collected from the real scene, obtaining the historical pose information of the historical video frame before the current video frame, and determining the virtual object based on the historical pose information and the current pose information pose data, using the virtual object pose data to render the virtual object corresponding to the virtual object pose data in the current video frame displayed by the display device, so that no matter where the initial position of the display device is, the display device Both can adjust the display of virtual objects according to the offset of current pose information and historical pose information, display virtual objects to accurate positions, and improve the fusion of virtual objects and real scenes and the AR rendering effect.

Description of drawings

FIG. 1 is a schematic diagram of an application scenario provided by an embodiment of the present disclosure;

FIG. 2 is a schematic diagram of another application scenario provided by an embodiment of the present disclosure;

FIG. 3 is a schematic diagram of another application scenario provided by an embodiment of the present disclosure;

FIG. 4 is an optional schematic flowchart of a display method provided by an embodiment of the present disclosure;

5 is a schematic diagram of a device structure of a display device provided by an embodiment of the present disclosure;

FIG. 6a is a schematic diagram of the relationship between three camera coordinate systems according to an embodiment of the present disclosure;

FIG. 6b is a schematic diagram of a world coordinate system defined by a SLAM algorithm according to an embodiment of the present disclosure;

6c is a schematic diagram of the relationship between the world coordinate system and the visual space world coordinate system adopted by the rendering engine provided by the embodiment of the present disclosure;

7a is a schematic diagram of an optional effect of a virtual object provided by an embodiment of the present disclosure;

7b is a schematic diagram of an optional effect of a virtual object provided by an embodiment of the present disclosure;

7c is a schematic diagram of an optional effect of a virtual object provided by an embodiment of the present disclosure;

7d is a schematic diagram of an optional effect of a virtual object provided by an embodiment of the present disclosure;

FIG. 8 is an optional schematic flowchart of a display method provided by an embodiment of the present disclosure;

FIG. 9 is an optional schematic flowchart of a display method provided by an embodiment of the present disclosure;

FIG. 10 is an optional schematic flowchart of a display method provided by an embodiment of the present disclosure;

11 is an optional schematic flowchart of a display method provided by an embodiment of the present disclosure;

12a is a schematic diagram of an optional effect of displaying a sand table model through a display screen provided by an embodiment of the present disclosure;

12b is a schematic diagram of an optional effect of displaying a sand table model through a display screen provided by an embodiment of the present disclosure;

12c is a schematic diagram of an optional effect of displaying the sand table model through a display screen provided by an embodiment of the present disclosure;

12d is a schematic diagram of an optional effect of displaying a sand table model through a display screen provided by an embodiment of the present disclosure;

12e is a schematic diagram of an optional effect of displaying a sand table model through a display screen provided by an embodiment of the present disclosure;

FIG. 13a is a schematic diagram of a device structure of a display device provided by an embodiment of the present disclosure;

13b is a schematic diagram of a device structure of a display device provided by an embodiment of the present disclosure;

14 is an optional schematic structural diagram of a display device provided by an embodiment of the present disclosure;

FIG. 15 is an optional schematic structural diagram of a display device provided by an embodiment of the present disclosure.

detailed description

In order to make the purpose, technical solutions and advantages of the present disclosure clearer, the technical solutions of the present disclosure will be further elaborated below with reference to the accompanying drawings and embodiments. The described embodiments should not be regarded as limitations of the present disclosure. All other embodiments obtained under the premise of not making creative efforts fall within the protection scope of the present disclosure.

In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" can be the same or a different subset of all possible embodiments, and Can be combined with each other without conflict.

If a similar description of "first/second" appears in the published documents, the following description will be added. In the following description, the term "first\second\third" involved is only to distinguish similar objects, and does not mean For a specific ordering of objects, it is understood that "first\second\third" may be interchanged in a specific order or sequence where permitted, so that the embodiments of the present disclosure described herein can be performed in an order other than that shown or described.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the embodiments of the present disclosure belong. The terminology used herein is for the purpose of describing the embodiments of the present disclosure only and is not intended to limit the present disclosure.

Embodiments of the present disclosure provide a display method, apparatus, display device, and computer-readable storage medium, which can improve the intuitiveness and richness of display. The display method provided by the embodiment of the present disclosure is applied to a display device. The following describes an exemplary application of the display device provided by the embodiment of the present disclosure. The display device provided by the embodiment of the present disclosure can be implemented as AR glasses, a notebook computer, a tablet computer, a desktop computer Computers, set-top boxes, mobile devices (eg, mobile phones, portable music players, personal digital assistants, dedicated messaging devices, portable game devices) and other various terminals with display screens.

The following provides exemplary descriptions of application scenarios applicable to the embodiments of the present disclosure. Application scenarios of the embodiments of the present disclosure include but are not limited to object display scenarios, such as real estate sand table display, construction site building display, or other object display scenarios.

For example, please refer to FIG. 1 , which is a schematic diagram of an application scenario provided by an embodiment of the present disclosure. As shown in FIG. 1 , the display screen 101 can be installed in a building and can slide along a preset track. In other embodiments, the display screen 101 may be positioned at the edge of the building or outside the building. The display screen 101 can be used to collect the real scene image of the building, and superimpose the virtual effect related to the building on the real scene image of the building, thereby presenting the AR effect. In some embodiments, when the display screen 101 slides to the building A, the display screen 101 collects a real image of the building A, and the building model of the building A can be determined as A' according to the real image of the building A, and the display screen 101 According to the preset rendering data corresponding to A', the virtual effect image of building A is obtained by rendering, and the virtual effect image of building A is superimposed on the real image of building A to display the AR effect. When the display screen 101 slides to the building B, the display screen 101 can obtain the real image of the building B, and determine the building model of the building B as B', and then superimpose the virtual effect image of the building B on the real image of the building B. The content displayed on the screen is updated, and the AR effect on each moving position is displayed in real time during the movement of the display screen 101 .

FIG. 2 is a schematic diagram of another application scenario provided by an embodiment of the present disclosure. As shown in FIG. 2 , the display device in the embodiment of the present disclosure may also be a terminal device 102, and a user may hold or wear the terminal device 102 to enter between buildings , and by photographing the building, the terminal device 102 displays an AR effect in which the building image and the virtual effect image related to the building are superimposed.

FIG. 3 is a schematic diagram of another application scenario provided by an embodiment of the present disclosure. As shown in FIG. 3 , the display device 101 is set on a preset sliding track 102, the display device 101 can move along the sliding track 102, and the display device 101 is set In front of the display table, at least one target entity 103 is arranged on the display table, and a virtual object 104 is displayed on the display device 101 , and the virtual object 104 is used for explaining the target entity 103 . Exemplarily, during the explanation of the virtual object 104, a virtual effect image related to the target entity 103 may be displayed on the screen of the display device 101 based on the image of the target entity 103 collected during the movement of the display device 101 to present the target entity 103. AR effect superimposed with virtual effect image.

Referring to FIG. 4 , FIG. 4 is an optional schematic flowchart of a display method provided by an embodiment of the present disclosure, which will be described in conjunction with the steps shown in FIG. 4 .

Step S101: Obtain current pose information of a current video frame collected from a real scene.

Exemplarily, the display method provided in the embodiment of the present disclosure is applied to a display device. As shown in FIG. 5 , the display device 500 can be fixed on a bracket 530 with pulleys, movably connected with a preset sliding track 510 on the booth 520 , and can slide left and right along the preset sliding track 510 . Exemplarily, the display method provided in the embodiment of the present disclosure may also be applied to a scenario in which a user moves with a hand-held display device. When the user moves with the display device in hand, the display device displays virtual objects and real scenes for fusion. The display method provided in the embodiments of the present disclosure is mainly used to solve the problem that when the movable display device is not started at the initial position, the virtual object displayed by the display device is difficult to be integrated with the real scene, and the AR rendering effect is not good.

In the embodiment of the present disclosure, the display device may collect an image of the current real scene through the acquisition part, and determine the current pose information of the display device according to the current video frame in the image or the image at the current moment. Exemplarily, the real scene may be a sand table model or other exhibits on a booth, a construction site in a building, an interior scene of a building, a street scene, or other objects, and the augmented reality is presented by superimposing virtual objects on the real scene. Effect. The collection range of the collection part may include all the displayed objects, or may only include some of the displayed objects. The embodiment of the present disclosure does not limit the collection range of the collection part.

In this embodiment of the present disclosure, the pose information of the display device may include the location of the display component used to display the virtual object when the display device moves on the sliding track, when the user moves the display device by hand, or when the intelligent mobile display device navigates and moves autonomously. position and/or display angle. The current pose information corresponding to the current video frame can be acquired through the acquisition part set on the display device. The acquisition part in the embodiment of the present disclosure may include: a binocular camera, a sensor, a gyroscope, an inertial measurement unit (Inertial measurement unit, IMU), and the like.

For example, it may be based on a global positioning system (Global Positioning System, GPS), a global navigation satellite system (Global Navigation Satellite System, GLONASS), and may also include an angular velocity sensor for determining the display angle of the display device.

In the embodiment of the present disclosure, there is a pose offset when the display device is started at a non-initial position. When the display device needs to display a virtual object in the current video frame, it first needs to obtain the current pose information of the current video frame, and convert the current pose information to the current video frame. It is superimposed with the pose offset, and then the virtual object can be displayed at the position corresponding to the display object. In the embodiment of the present disclosure, after the display device is turned on, the collection part collects the video image in the real scene in real time, selects the video frame at the current moment of the video image as the current video frame, processes the current video frame, and obtains the current video frame. Current pose information.

In some embodiments of the present disclosure, the display device may use a positioning algorithm to process the current video frame to obtain current pose information of the current video frame. The positioning algorithm is, for example, a map construction (simultaneous localization and mapping, SLAM) algorithm, and the SLAM algorithm can calculate the current pose information based on a pre-defined virtual coordinate system in combination with the current video frame.

In the embodiment of the present disclosure, the acquisition part may be a sensor provided on the display device. After the display device is turned on, the sensor acquires the position coordinate information of the display device in real time. The current pose information corresponding to the current video frame may be the position coordinate information of the display device acquired by the sensor at the current moment. In some embodiments of the present disclosure, the acquisition part may include an angle sensor disposed on the display device, and the angle sensor acquires display angle information of the display device in real time after the display device is turned on. The current pose information corresponding to the current video frame may include the display angle information of the display device acquired by the angle sensor at the current moment. The pose offset may be the difference between the position coordinate information when the display device is started at a non-initial position and the position coordinate information when the display device is started at the initial position. The pose offset can also be the difference between the display angle information when the display device is started at a non-initial position when the display device is started at a non-initial position, and the display angle information when the display device is started at an initial position.

In some embodiments of the present disclosure, the current pose information corresponding to the current video frame may also include combination information of the position coordinate information obtained by the sensor and the display angle information obtained by the angle sensor.

Step S102: Obtain historical pose information of a historical video frame before the current video frame, and determine virtual object pose data based on the historical pose information and the current pose information.

In the embodiment of the present disclosure, the acquisition part (for example, the binocular camera in the acquisition part) of the display device acquires the image of the current real scene in real time. After the display device starts up for a predetermined duration (exemplarily, the duration is set to be within 1-3 seconds), the historical video frames of the display device are acquired through the acquisition part set on the display device. The historical video frame is a video frame at any moment before the current moment. The display device can use a positioning algorithm to process the historical video frames to obtain historical pose information of the historical video frames. In the embodiments of the present disclosure, the positioning algorithms are, for example, SLAM algorithms, visual positioning algorithms, and inertial sensor-based positioning algorithms, which are not limited in the embodiments of the present disclosure. Exemplarily, the display device processes the historical video frames through a visual positioning algorithm to obtain the first pose information of the historical video frames, the display device processes the historical video frames through the SLAM algorithm to obtain the second pose information of the historical video frames, and displays the information. The device processes the inertial sensor data at the corresponding moment of the historical video frame through the positioning algorithm based on the inertial sensor to obtain the historical sensing pose information of the historical video frame. The first pose information includes visual space pose information, the second pose information includes camera pose information, and the historical sensing pose information includes inertial sensing pose information. Exemplarily, the processing part of the display device may calculate the offset based on the first pose information and the second pose information to obtain historical pose offset information.

In the embodiment of the present disclosure, the processing part of the display device acquires the current pose information and the historical pose offset information, and the processing part of the display device superimposes the current pose information on the historical pose offset information to obtain virtual object pose data . The processing part sends the virtual object pose data to the rendering part of the display device, and the rendering part can render the virtual object at the corresponding position, which improves the fusion degree of the virtual object and the real scene and the AR rendering effect.

In the embodiments of the present disclosure, for the convenience of explaining the pose information, the concept of a coordinate system is introduced here. First, the virtual camera sensor coordinate system slam_c, the visual space coordinate system phy_c, the rendering engine camera coordinate system render_c, the camera sensor world coordinate system slam_w, the rendering engine camera world coordinate system render_w and the visual space world coordinate system phy_w can be defined respectively. Among them, the camera sensor coordinate system slam_c is the camera coordinate system defined by the SLAM algorithm, the camera sensor world coordinate system slam_w is the world coordinate system defined by the SLAM algorithm, the visual space coordinate system phy_c is the camera coordinate system defined by the visual positioning algorithm, and the visual space world coordinate system The system phy_w is the world coordinate system defined by the visual positioning algorithm, the rendering engine camera coordinate system render_c is the camera coordinate system defined by the rendering engine algorithm, and the rendering engine camera world coordinate system render_w is the world coordinate system defined by the rendering engine algorithm.

Since in the embodiment of the present disclosure, the acquisition part on the display device includes a camera, the origins of the camera sensor coordinate system slam_c, the visual space coordinate system phy_c and the rendering engine camera coordinate system render_c defined based on the camera are coincident. Among them, the camera sensor world coordinate system slam_w defined by the SLAM algorithm is related to the starting position of the SLAM algorithm, and the starting position is the origin position of the camera sensor world coordinate system slam_w. The origin of the rendering engine camera world coordinate system render_w and the visual space world coordinate system phy_w can be any point in the real scene.

In the camera sensor coordinate system slam_c, as shown in Figure 6a, Xslam_c represents the X axis of the coordinate system, Yslam_c represents the Y axis of the coordinate system, and Zslam_c represents the Z axis of the coordinate system. The display device can calculate the current video frame or historical video frame through the SLAM algorithm, and obtain the current pose information or historical pose information of the current video frame or historical video frame in the camera sensor coordinate system slam_c.

In the visual space coordinate system phy_c, as shown in Fig. 6a, Xphy_c represents the X axis of the coordinate system, Yphy_c represents the Y axis of the coordinate system, and Zphy_c represents the Z axis of the coordinate system. The display device may calculate the current video frame or the historical video frame through the visual positioning algorithm, and obtain the current pose information or historical pose information of the current video frame or the historical video frame in the visual space coordinate system phy_c. In the rendering engine camera coordinate system render_c, as shown in FIG. 6A , Xrender_c represents the X axis of the coordinate system, Yrender_c represents the Y axis of the coordinate system, and Zrender_c represents the Z axis of the coordinate system. The display device obtains the historical pose offset information by calculating the historical pose information, and superimposes the historical pose offset information and the current pose information to obtain virtual object pose data. The virtual object pose data includes the pose data of the virtual object in the camera sensor coordinate system slam_c, and the display device converts the virtual object pose data in the camera sensor coordinate system slam_c to the rendering engine camera coordinate system render_c to obtain the virtual object. The target position coordinates of the object can then be rendered by the rendering engine at a position corresponding to the rendering engine coordinate system according to the target position coordinates of the virtual object.

In the camera sensor world coordinate system slam_w, as shown in Figure 6b, Xslam_w, Yslam_w and Zslam_w are three coordinate axes.

In the rendering engine camera world coordinate system render_w, as shown in Figure 6c, Xrender_w, Yrender_w and Zrender_w are the three coordinate axes.

In the visual space world coordinate system phy_w, as shown in Figure 6c, Xphy_w, Yphy_w and Zphy_w are three coordinate axes. As shown in Figure 6c, the origins of the rendering engine camera world coordinate system render_w and the visual space world coordinate system phy_w (ie, the real world coordinate system) coincide, that is, the rendering space and the physical space should have a one-to-one correspondence.

Exemplarily, the historical pose information or current pose information in the visual space coordinate system phy_c includes the coordinate position of the display component of the display device in the visual space coordinate system phy_c, or includes the display component of the display device and the visual space coordinate system phy_c. The included angle of each coordinate axis in , or simultaneously includes the coordinate position of the display component of the display device in the visual space coordinate system phy_c and the included angle with each coordinate axis in the visual space coordinate phy_c, which is not specifically limited here.

Exemplarily, the historical pose information of the historical video frames of the display device includes first pose information and second pose information. The display device may calculate and obtain the first pose information of the historical video frame as (X1, Y1, Z1), and the second pose information as (X2, Y2, Z2). Among them, X1 in the first pose information (X1, Y1, Z1) is the coordinate point of the historical video frame on the X axis in the visual space coordinate system phy_c, and Y1 is the Y of the historical video frame in the visual space coordinate system phy_c The coordinate point on the axis, Z1 is the coordinate point on the Z axis of the historical video frame in the visual space coordinate system phy_c. X2 in the second pose information (X2, Y2, Z2) is the coordinate point of the historical video frame on the X axis in the camera sensor coordinate system slam_c, and Y2 is the historical video frame on the Y axis in the camera sensor coordinate system slam_c The coordinate point, Z2 is the coordinate point of the historical video frame on the Z axis in the camera sensor coordinate system slam_c. Then, the offset between the first pose information (X1, Y1, Z1) and the second pose information (X2, Y2, Z2) is obtained by the processing part of the display device, and the historical pose offset information is obtained. The historical pose offset information and the current pose information are fused to obtain the virtual object pose data.

Step S103, using the virtual object pose data to render the virtual object in the current video frame displayed by the display device.

In the embodiment of the present disclosure, after the virtual object pose data determined by the rendering part of the display device, according to the coordinate position of each pixel constituting the virtual object included in the virtual object pose data, the display object is associated with the display object in the real scene image. The virtual object corresponding to the pose data of the virtual object is rendered at the display position.

In this embodiment of the present disclosure, the virtual object rendered by the display device according to the pose data of the virtual object may include at least one of the following:

The virtual scene effect corresponding to the display object; as shown in Figure 7a, when the sand table model on the booth is displayed through the display screen on the slide rail, the virtual object can be the completed effect corresponding to the building model in the sand table, and The sand table area has different scene effects during the day and night.

The virtual detail map corresponding to the display object; as shown in Figure 7b, when the car on the booth is displayed through the display screen on the slide rail, the display screen is not started at the initial position, when the current pose information and After the historical pose information is obtained, the pose information of the virtual object can be determined, so as to display the virtual virtual object. The virtual object may be a detailed view of the structure inside the body part corresponding to the current position of the display screen.

The virtual three-dimensional animation effect corresponding to the display object; as shown in Figure 7 c, the virtual object can be the virtual three-dimensional animation effect 53 of the component on the car such as the steering wheel, and the display device can display the virtual three-dimensional animation effect 53 corresponding to the steering wheel component on the display screen. The upper area is played, and the steering wheel is displayed in an all-round rotation.

The virtual label corresponding to the display object. As shown in Fig. 7d, when displaying the sand table model on the booth through the display screen on the slide rail, the virtual object can display the description information corresponding to the building model in the form of a text label or a picture label.

In the embodiment of the present disclosure, after the display device determines the pose data of the virtual object through the current pose information and the historical pose information, the display device can obtain the display position associated with the display object from the real scene image. In this way, according to the virtual object pose data After rendering, after obtaining the virtual object, the virtual object is correspondingly superimposed on the display position associated with the display object in the real scene image, and then cooperate with the setting of the transparency of the virtual object data to achieve the enhanced display AR in which the real scene image and the virtual object are superimposed. Effect. As shown in Figures 7a to 7d above, the size of the virtual object and the display object in the real scene image is 1:1, and the virtual object is overlaid on the same position of the display object in the real scene image, so it is like the sand table completion effect, Virtual effects such as car body detail images can be overlaid and superimposed on part of the model image of the sand table, and displayed on the part of the car body image, thus presenting an AR effect in which the virtual image is further superimposed on the display object.

S104 , displaying an augmented reality effect in which the real scene and the virtual object are superimposed on the display device.

In the embodiment of the present disclosure, after the display device renders the virtual object, the AR effect in which the real scene image and the virtual object are superimposed may be displayed on the display screen on the display device.

Exemplarily, generally, the display device needs to determine the pose of the display device through the camera sensor coordinate system slam_c, so as to display the virtual object according to the pose of the display device. When the display device is not started at the initial position, the origin of the camera sensor coordinate system slam_c defined by the display device is not at the preset start position, resulting in an error in the positioning pose of the display device. In the embodiment of the present disclosure, the processing part on the display device obtains the historical pose offset information when the display device is not started at the initial position by processing the historical pose information. For example, the historical pose information includes the first pose information and the second pose information, the first pose information includes the visual space pose information of the historical video frame in the visual space coordinate system phy_c, and the visual space coordinate system phy_c is related to the real For scene matching, the second pose information includes the camera pose information of historical video frames in the camera sensor coordinate system slam_c. The display device may obtain historical pose offset information by calculating the first pose information and the second pose information. The historical pose offset information is the offset information of the camera sensor coordinate system slam_c defined by the display device in the real scene when the display device is not started at the initial position. The display device superimposes the current pose information and the historical pose offset information to correct the pose error of the display device, so as to determine the virtual object pose data. The display position is further rendered, so that the display device can display the virtual object in the display position associated with the object in the real scene when it is started at any position, and then display the AR effect in which the virtual object and the real scene image are superimposed, so that the virtual object displayed by the display device can be displayed. The object is highly integrated with the real scene, which optimizes the AR display effect.

In some possible implementations, referring to FIG. 8 , FIG. 8 is an optional schematic flowchart of the display method provided by an embodiment of the present disclosure. S101 shown in FIG. 4 can be implemented by S1011 to S1012 , which will be described in conjunction with each step. .

S1011. Collect the current video frame of the real scene through the collection part of the display device.

In the embodiment of the present disclosure, after the display device is turned on, since the display device is not started at the initial position, the virtual object displayed by the display device will not be displayed at the corresponding position. In this case, the video frame at the current moment after the display device is turned on is obtained through the acquisition part set on the display device.

S1012 , using a positioning algorithm to process the current video frame to obtain current pose information of the current video frame in the camera sensor coordinate system.

In this embodiment of the present disclosure, in order to obtain the current pose information of the current video frame. The positioning algorithm first extracts and matches the features of the current video frame, selects the key points of the current video frame, and then uses the basic matrix to calculate based on the relevant data of the key points to obtain the current pose information corresponding to the current video frame. In this embodiment of the present disclosure, the positioning algorithm includes a SLAM positioning algorithm. In the embodiment of the present disclosure, the display device can obtain the transformation matrix of the current pose information from the camera sensor coordinate system slam_c to the camera sensor world coordinate system slam_w through algorithm calculation, that is, the current pose information matrix. In the embodiment of the present disclosure, the front pose information matrix is: Tslam_w_slam_c_X. Among them, Tslam_w_slam_c_X represents the transformation matrix of the X-th video frame, that is, the pose information of the current video frame from the camera sensor coordinate system slam_c to the camera sensor world coordinate system slam_w.

In some possible implementations, referring to FIG. 9 , FIG. 9 is an optional schematic flowchart of the display method provided by the embodiment of the present disclosure. S101 shown in FIG. 4 can be implemented by S1013 to S1014 , which will be described in conjunction with each step. .

S1013: Acquire the first sensing data when the acquisition part of the display device collects the initial video frame, and the second sensor data when the acquisition part of the display device collects the current video frame.

In this embodiment of the present disclosure, in order to obtain the current pose information of the display device, a virtual sensing coordinate system imu_c may be defined first. The sensing coordinate system imu_c is a coordinate system defined by the inertial sensor positioning algorithm, and the origin and coordinate axis of the sensing coordinate system imu_c are defined with reference to the camera sensor coordinate system slam_c in S102 . After the display device is turned on, the acquisition part (which can be an inertial sensor) set on the display device obtains the first sensing data of the initial video frame when the display device is turned on, and the second transmission of the current video frame after the display device is turned on for a period of time. sense data. Exemplarily, the first sensing data includes acceleration data and orientation data of the initial video frame, and the second sensing data includes acceleration data and orientation data of the current video frame.

S1014. Based on the offset between the first sensing data and the second sensing data, determine the current pose information when the display device collects the current video frame.

In the embodiment of the present disclosure, the processing part of the display device calculates and obtains the offset between the first sensing data and the second sensing data. The display device can obtain the current pose information of the current video frame in the pre-established sensing coordinate system imu_c by calculating and converting the offset. In the embodiment of the present disclosure, the display device can obtain the transformation matrix of the current pose information from the sensing coordinate system imu_c to the sensing world coordinate system imu_w through algorithm calculation, that is, the current pose information matrix. In this embodiment of the present disclosure, the current pose matrix is: Timu_w_imu_c_X. Wherein, Timu_w_imu_c_X represents the transformation matrix of the X-th video frame, that is, the pose information of the current video frame from the sensing coordinate system imu_c to the sensing world coordinate system imu_w.

In some possible implementations, referring to FIG. 8 , FIG. 8 is an optional schematic flowchart of the display method provided by the embodiment of the present disclosure. S102 shown in FIG. 4 can be implemented by S1021 to S1022 , which will be described in conjunction with each step. .

S1021. Obtain historical pose information of a historical video frame before the current video frame, and determine the historical pose offset information based on the historical pose information.

In the embodiment of the present disclosure, the acquisition part of the display device acquires images of the real scene in real time, and the display device acquires historical video frames through the acquisition part set on the display device after the display device starts for a predetermined duration (usually set to a duration within 1-3 seconds). The processing part of the display device processes the information of the historical video frame to obtain the first pose information and the second pose information of the historical video frame. The processing part of the display device then calculates the historical pose offset information based on the first pose information and the second pose information of the historical video frame.

In some possible implementations, referring to FIG. 10 , FIG. 10 is an optional schematic flowchart of the display method provided by the embodiment of the present disclosure. S1021 shown in FIG. 8 can be implemented by S201 to S202 , which will be described in conjunction with each step. .

S201. Obtain the first pose information of the historical video frame in the visual space coordinate system; and use a positioning algorithm to process the historical video frame to obtain the second pose information of the historical video frame in the camera sensor coordinate system.

In the embodiment of the present disclosure, after the acquisition part of the display device acquires the historical video frame, the first pose information of the historical video frame in the visual space coordinate system phy_c can be obtained by calculating through the visual positioning algorithm. Then, the historical video frame is calculated by the SLAM algorithm to obtain the second pose information of the historical video frame in the camera sensor coordinate system slam_c. The first pose information includes the position coordinates of the historical video frame in the visual space coordinate system phy_c. The second pose information includes the position coordinates of the historical video frame in the camera sensor coordinate system slam_c.

In the embodiment of the present disclosure, the display device can obtain the transformation matrix of the first pose information from the visual space coordinate system phy_c to the visual space world coordinate system phy_w through algorithm calculation, that is, the first pose information matrix. In the embodiment of the present disclosure, the first pose information matrix is: Tphy_w_phy_c_N. Among them, Tphy_w_phy_c_N represents the transformation matrix of the pose information of the Nth historical video frame from the visual space coordinate system phy_c to the visual space world coordinate system phy_w.

In this embodiment of the present disclosure, the display device can obtain a transformation matrix of the second pose information from the camera sensor coordinate system slam_c to the camera sensor world coordinate system slam_w through algorithm calculation, that is, the second pose information matrix. In the embodiment of the present disclosure, the second pose information matrix is: Tslam_w_slam_c_N. Among them, Tslam_w_slam_c_N represents the transformation matrix of the pose information of the Nth historical video frame from the camera sensor coordinate system slam_c to the camera sensor world coordinate system slam_w.

S202. Determine historical pose offset information based on the offset between the second pose information and the first pose information.

In this embodiment of the present disclosure, the historical pose offset information may be an offset matrix Toffset 1. When calculating and obtaining the offset matrix Toffset 1, it is necessary to obtain the first pose information matrix, the second pose information matrix, and the camera sensor defined by the SLAM algorithm. The first transformation matrix from the coordinate system slam_c to the rendering engine camera coordinate system render_c, the second transformation matrix from the camera sensor coordinate system slam_c defined by the SLAM algorithm to the visual space coordinate system phy_c, and the rendering engine camera world coordinate system render_w to the visual space world coordinate system The third transformation matrix of phy_w. Convert the second pose information matrix to the visual space coordinate system phy_c corresponding to the first pose information matrix based on the first transformation matrix, the second transformation matrix and the third transformation matrix, and convert the second pose information matrix to The matrix information in the visual space coordinate system phy_c is calculated with the first pose information matrix, and the offset matrix Toffset 1 can be obtained.

Exemplarily, the calculation formula (1) is:

Among them, Tphy_w_phy_c_N is the first pose information matrix, Tslam_w_slam_c_N is the second pose information matrix, and Trotate_Y_UP is the first transformation matrix from the camera sensor coordinate system slam_c defined by the SLAM algorithm to the rendering engine camera coordinate system render_c. The first transformation matrix Trotate_Y_UP can Make the Y direction of the camera sensor coordinate system slam_c defined by the SLAM algorithm consistent with the UP direction of the rendering engine camera coordinate system render_c. Tphy_c_slam_c is the second conversion matrix from the camera sensor coordinate system slam_c defined by the SLAM algorithm to the visual space coordinate system phy_c, and Trend_w_phy_w is the third conversion matrix from the rendering engine camera world coordinate system render_w to the visual space world coordinate system phy_w. The display device can calculate through the first transformation matrix Trotate_Y_UP, the second transformation matrix Tphy_c_slam_c, the third transformation matrix Trend_w_phy_w, the first pose information matrix Tphy_w_phy_c_N and the second pose information matrix Tslam_w_slam_c_N, and realize the camera sensor coordinates slam_c, visual space coordinates The alignment between the system phy_c and the rendering engine camera coordinate system render_c, and then the offset matrix Toffset 1 can be calculated.

The display device multiplies the third transformation matrix and the first attitude information matrix to obtain a first intermediate matrix, then multiplies the first intermediate matrix and the second transformation matrix to obtain a second intermediate matrix, and then multiplies the first intermediate matrix by the second transformation matrix. The second intermediate matrix is multiplied by the first transformation matrix to obtain a third intermediate matrix. Then multiply the third intermediate matrix with the inverse matrix of the second pose information matrix to obtain the historical pose offset information, that is, the offset matrix Toffset 1.

In some possible implementations, referring to FIG. 11 , FIG. 11 is an optional schematic flowchart of the display method provided by an embodiment of the present disclosure. S1021 shown in FIG. 8 can be implemented through S301 to S304 , which will be described in conjunction with each step. .

S301. Acquire first pose information of a historical video frame in a visual space coordinate system.

In the embodiment of the present disclosure, after the acquisition part acquires the historical video frame, the processing part can determine the pose information of the historical video frame in the visual space coordinate system phy_c, so as to obtain the first position of the historical video frame in the visual space coordinate system phy_c pose information.

S302: Acquire third sensing data when the collection part of the display device collects historical video frames.

In the embodiment of the present disclosure, after the display device is turned on, the display device is not started at the initial position. At this time, the third sensing data of the historical video frames of the display device is acquired through the acquisition part (which may be an inertial sensor) provided on the display device. The third sensory data includes acceleration data and orientation data of historical video frames.

S303. Determine historical sensing pose information based on the offset between the first sensing data and the third sensing data.

In the embodiment of the present disclosure, the processing part of the display device calculates and obtains the offset between the first sensing data and the third sensing data. The offset can be obtained by calculating and transforming the historical sensing pose information of the historical video frame in the pre-established sensing coordinate system imu_c. Wherein, the first sensing data includes the data of the initial video frame collected by the collecting part when the display device is started.

In the embodiment of the present disclosure, the display device can obtain the transformation matrix of the historical sensing pose information from the sensing coordinate system imu_c to the sensing world coordinate system imu_w through algorithm calculation, that is, the historical sensing pose information matrix. In the embodiment of the present disclosure, the historical sensing pose information matrix is: Timu_w_imu_c_N. Among them, Timu_w_imu_c_N represents the transformation matrix of the pose information of the Nth historical video frame from the sensing coordinate system imu_c to the sensing world coordinate system imu_w.

S304. Determine historical pose offset information based on the offset between the first pose information and the historical sensed pose information.

In this embodiment of the present disclosure, the historical pose offset information may be an offset matrix Toffset 2 . When the display device calculates and obtains the offset matrix Toffset 2, it needs to obtain the first attitude information matrix, the historical sensing pose information matrix, and the first transformation matrix from the sensing coordinate system imu_c defined by the inertial sensor positioning algorithm to the rendering engine camera coordinate system render_c , the second transformation matrix of the sensor coordinate system imu_c defined by the inertial sensor positioning algorithm to the visual space coordinate system phy_c, and the third transformation matrix of the rendering engine camera world coordinate system render_w to the visual space world coordinate system phy_w. The historical sensing pose information matrix is transformed into the visual space coordinate system phy_c corresponding to the first attitude information matrix based on the first transformation matrix, the second transformation matrix and the third transformation matrix, and the historical sensing pose information is converted into The matrix information in the visual space coordinate system phy_c is calculated with the first pose information matrix, and the offset matrix Toffset 2 can be obtained.

Exemplarily, the calculation formula (2) is:

Among them, Tphy_w_phy_c_N is the first pose information matrix, Timu_w_imu_c_N is the historical sensor pose information matrix, Trotate_Y_UP is the first transformation matrix from the sensor coordinate system imu_c defined by the inertial sensor positioning algorithm to the rendering engine camera coordinate system render_c, the first transformation The matrix Trotate_Y_UP can make the Y direction of the sensor coordinate system imu_c consistent with the UP direction of the rendering engine camera coordinate system render_c. Tphy_c_imu_c is the second transformation matrix from the sensor coordinate system imu_c defined by the inertial sensor positioning algorithm to the visual space coordinate system phy_c, and Trend_w_phy_w is the third transformation matrix from the rendering engine camera world coordinate system render_w to the visual space world coordinate system phy_w. The display device can calculate through the first transformation matrix Trotate_Y_UP, the second transformation matrix Tphy_c_imu_c, the third transformation matrix Trend_w_phy_w, the first attitude information matrix Tphy_w_phy_c_N and the historical sensing pose information matrix Timu_w_imu_c_N, so as to realize the sensing coordinate system imu_c, visual The alignment between the space coordinate system phy_c and the rendering engine camera coordinate system render_c, and then the offset matrix Toffset 2 can be calculated.

The display device multiplies the third transformation matrix and the first attitude information matrix to obtain a first intermediate matrix, then multiplies the first intermediate matrix and the second transformation matrix to obtain a second intermediate matrix, and then multiplies the first intermediate matrix by the second transformation matrix. The second intermediate matrix is multiplied by the first transformation matrix to obtain a third intermediate matrix. The third intermediate matrix is then multiplied by the inverse matrix of the historical sensing pose matrix to obtain the historical pose offset information, that is, the offset matrix Toffset 2.

S1022 , obtaining virtual object pose data based on the historical pose offset information and the current pose information.

In the embodiment of the present disclosure, the display device may obtain virtual object pose data by fusing current pose information and historical pose offset information.

In the embodiment of the present disclosure, the pose data view1 of the virtual object can be obtained by calculation through formula (3). Formula (3) is:

view1=Toffset 1×(Tslam_w_slam_c_X) ^-1 (3);

Wherein, Toffset 1 is the offset matrix calculated from S201 to S202, and (Tslam_w_slam_c_X) ^-1 is the inverse matrix of the current pose information matrix calculated from S1011 to S1012. The display device can calculate the pose data of the virtual object by multiplying Toffset 1 and (Tslam_w_slam_c_X) ^-1 , which is denoted as matrix view1.

In the embodiment of the present disclosure, the pose data view2 of the virtual object can be obtained by calculation through formula (4). Formula (4) is:

view2=Toffset 2×(Timu_w_imu_c_X) ^-1 (4);

Wherein, Toffset 2 is the offset matrix calculated from S301 to S304 , and (Tslam_w_slam_c_X) ⁻¹ is the inverse matrix of the current pose information matrix calculated from S1013 to S1014 . The display device can calculate the virtual object pose data by multiplying Toffset 2 and (Tslam_w_slam_c_X) ^-1 , which is denoted as matrix view2.

In some possible implementations, referring to FIG. 10 , FIG. 10 is an optional schematic flowchart of the display method provided by the embodiment of the present disclosure. S103 shown in FIG. 4 can be implemented through S204 to S205 , which will be described in conjunction with each step. .

S204: Map the coordinate position of each pixel in the virtual object to the rendering engine coordinate system to obtain the target coordinate position of each pixel.

In the embodiment of the present disclosure, the pose data of the virtual object includes the coordinate position of each pixel of the virtual object. The coordinate 1 of one of the marked pixel positions of the virtual object is (X3, Y3, Z3), and the coordinate 1 of the marked pixel position (X3, Y3, Z3) is the coordinate of the marked pixel in the camera sensor coordinate system slam_c, and X3 is the mark The coordinate of the pixel on the X axis of the camera sensor coordinate system slam_c, Y3 is the coordinate of the marked pixel on the Y axis of the camera sensor coordinate system slam_c, and Z3 is the coordinate of the marked pixel on the Z axis of the camera sensor coordinate system slam_c. After the display device obtains the virtual object pose data, the coordinate position of each pixel of the virtual object to be displayed is mapped to the rendering engine coordinate system render_c according to the coordinate position of the virtual object pose data, and each pixel of the virtual object to be displayed is obtained. The target coordinate position in pixels. In this embodiment of the present disclosure, since the origins of the camera sensor coordinate system slam_c and the rendering engine coordinate system render_c are coincident, the display device can map the coordinate 1 of the marked pixel to the target coordinate position in the rendering engine coordinate system render_c through coordinate transformation. , the target coordinate position is coordinate 2 (X4, Y4, Z4). Among them, the coordinate 2 (X4, Y4, Z4) of the target coordinate position of the marked pixel is the coordinate of the target coordinate position in the rendering engine coordinate system render_c, and X4 is the coordinate of the target coordinate position on the X axis of the rendering engine coordinate system render_c, Y4 is the coordinate of the target coordinate position on the Y axis of the rendering engine coordinate system render_c, and Z4 is the coordinate of the marker pixel on the Z axis of the rendering engine coordinate system render_c.

S205, using a rendering engine to render the virtual object at the target coordinate position in the current video frame.

In the embodiment of the present disclosure, the rendering engine of the display device obtains the target coordinate position of each pixel of the virtual object that needs to be displayed in the previous step, and the target coordinate position of each pixel of the virtual object to be displayed according to the preset Predetermined programs render virtual objects in real scenes. Exemplarily, the marked pixel is rendered by the rendering engine at a position with coordinates (X4, Y4, Z4).

Referring to FIG. 11 , FIG. 11 is an optional schematic flowchart of a display method provided by an embodiment of the present disclosure. Based on FIG. 4, after S104, S105-S106 may also be executed, which will be described in conjunction with each step.

S105. During the movement of the display device, update the collected real scene image, and obtain an updated virtual object based on the updated real scene image.

In the embodiments of the present disclosure, during the movement of the display device, the real scene image collected by the collecting part will also change. The sand table model can be scanned through the camera on the back. The real scene image collected by the camera will be updated in real time with the scanned position, and the display objects contained in the real scene image will also be updated in real time. Therefore, during the movement of the display device, The display device will update the collected real scene image, and obtain an updated virtual object based on the updated real scene image.

S106 , displaying an augmented reality effect in which the updated real scene image and the updated virtual object are superimposed in real time on the display device.

In the embodiment of the present disclosure, the display device will display the augmented reality AR effect in which the updated real scene image and the updated virtual object are superimposed in real time, so that during the movement of the display device, the display object can be displayed according to the collected display objects at each different moving position. Different display parts display the smooth display effect of different AR images combining virtual and real in real time.

In some embodiments of the present disclosure, at least one display device may be arranged around the display object, and each display device in the at least one display device is configured to collect and display the display object in real time at the respective current position according to the respective collection direction of the display object. The real scene image of the object is obtained, and the corresponding virtual object is obtained based on the real scene image collected respectively, and the augmented reality AR effect in which the corresponding real scene image and the virtual object are superimposed is displayed. When the display device is not started at the initial position, the virtual object pose data is determined through the current pose information and historical pose information of the display device, and the virtual object is displayed at the corresponding position. In some embodiments, in the scene where the construction site is presented with an AR effect, a four-sided transparent glass room may be set up on the construction site, and a display screen that slides along a preset track may be set behind each glass wall , which is used for a comprehensive display of buildings at various locations on the construction site.

Below, an exemplary application of the embodiments of the present disclosure in a practical application scenario will be described.

In some embodiments, for a real estate sand table display scene, a sliding track may be set beside the booth where the sand table model is located, and a slidable display screen may be set on the sliding track as a display device, wherein the display screen includes a preset sliding track And a display screen with a camera, the front of the display screen is the screen part, facing the viewer, for the display of the final AR effect, and the back of the display screen is a camera for image acquisition of the sand table model. Since the sand table model occupies a large area, the display range of the display screen can be only a part of the sand table model, and the whole sand table model can be scanned by moving on the preset sliding track.

In some embodiments, as shown in Fig. 12a, when the display screen slides to the left side of the sand table, the camera behind the camera captures the image on the left side of the sand table model as the real scene image. In the preset three-dimensional virtual scene in the internal storage space of the display device, the virtual left sand table model matching the left sand table model is determined as the virtual object model, and the rendering data of the virtual sand table model associated with the virtual left sand table model is used as the virtual object model. Object data, so that the completed building diagram corresponding to the left sand table model can be rendered through the virtual sand table model rendering data, and the completed building diagram can be superimposed on the real left sand table model image and displayed on the display screen. display, thus showing the finished effect corresponding to the sand table model on the left. When the display screen moves to the right side of the sand table model, as shown in Figure 12b, the virtual effect corresponding to the sand table model on the right side is obtained according to the same process, and superimposed on the real image of the sand table model on the right side for display, showing the right side The finished effect corresponding to the sand table model. In some embodiments, the virtual sand table model rendering data can also be set to two different types of virtual object data: daytime effect and nighttime effect. Rendering the virtual object according to the daytime effect can present the AR effect shown in FIG. 12c. Rendering virtual objects with night effect can present the AR effect as shown in Figure 12d.

In some embodiments, as shown in FIG. 12e , display theme controls may also be correspondingly set on the display screen. In this way, when different display theme controls are triggered by a click event, the display device can display the corresponding For different virtual object data, virtual objects with different theme types are superimposed on the real sand table model image. For example, based on the same sand table model, various virtual effect themes such as traffic analysis, regional planning, time display, enterprise introduction, etc. can be displayed.

It can be understood that, in the embodiment of the present disclosure, for the same display object, the display device can present different AR effects combining virtual and real through virtual objects with different rendering effects, thereby enriching the display mode and improving the display effect.

In some embodiments, as shown in FIG. 13a, the display methods provided by the embodiments of the present disclosure are as follows:

In the embodiment of the present disclosure, when the display device 101 is started at a non-initial position of the sliding track 105, the binocular camera 123 on the display device acquires an image of a real scene in real time. The display device 101 acquires the first pose information of the Nth video frame in the visual space coordinate system phy_c after the display device 101 is started through the binocular camera 123 provided on the display device 101 . Then, the second pose information of the Nth video frame in the camera sensor coordinate system slam_c is acquired through the binocular camera 123 provided on the display device 101 . The display device 101 can obtain the pose offset information of the Nth video frame by converting the second pose information into the visual space coordinate system phy_c of the first pose information, and performing calculation with the first pose information. The display device 101 acquires the current pose information of the N+X th video frame (current video frame) in the camera sensor coordinate system slam_c through the binocular camera 123 provided on the display device 101 . The display device 101 can obtain the virtual object pose data by multiplying the inverse matrix of the current pose information and the historical pose offset information. In the embodiment of the present disclosure, the virtual object may be the virtual car 122 . The rendering engine of the display device 101 obtains the virtual car 122 pose data, renders the virtual car 122 based on the virtual car 122 pose data, and superimposes the virtual car 122 on the display position associated with the building model 121 to obtain the building model 121 and the virtual car. 122 superimposed enhanced display AR effects. The fusion degree of the virtual car 122 and the building model 121 and the AR rendering effect are improved.

In some embodiments, as shown in FIG. 13b, the steps of another display method provided by an embodiment of the present disclosure may be as follows:

In this embodiment of the present disclosure, when the display device 101 starts at a non-initial position of the sliding track 125 , the display device 101 obtains the first sensor at the initial time T0 when the display device 101 starts up through the inertial sensor 124 provided on the display device 101 . The first sensing data includes acceleration data and orientation data of the display device 101 at the initial time T0. The display device 101 obtains the first pose information of the historical video frame corresponding to the historical time T1 before the current video frame in the visual space coordinate system phy_c through the binocular camera 123 provided on the display device 101 . The display device 101 acquires the second sensing data of the historical time T1 corresponding to the historical video frame through the inertial sensor 124 provided on the display device 101 , and the second sensing data includes acceleration data and direction data of the historical time T1 . Then, by calculating the offset between the first sensing data and the second sensing data, historical sensing pose information can be obtained. The display device 101 can obtain the historical pose offset information by converting the historical sensing pose information into the visual space coordinate system phy_c of the first pose information, and performing calculation with the first pose information. The display device acquires third sensing data of the current time T2 of the display device 101 through the inertial sensor 124 provided on the display device 101 , and the third sensing data includes acceleration data and direction data at the current time. The display device 101 can obtain the current pose information by calculating the offset between the first sensing data and the third sensing data. The display device 101 can obtain the virtual object pose data by multiplying the inverse matrix of the current pose information and the historical pose offset information. In the embodiment of the present disclosure, the virtual object may be the virtual car 122 . The rendering engine of the display device 101 obtains the virtual car 122 pose data, renders the virtual car 122 based on the virtual car 122 pose data, and superimposes the virtual car 122 on the display position associated with the building model 121 to obtain the building model 121 and the virtual car. 122 superimposed enhanced display AR effects. The fusion degree of the virtual car 122 and the building model 121 and the AR rendering effect are improved.

An embodiment of the present disclosure provides a display device. FIG. 14 is a schematic diagram of an optional composition structure of the display device provided by the embodiment of the present disclosure. As shown in FIG. 14 , the display device 455 includes:

The collection part 4551 is configured to obtain the current pose information of the current video frame collected from the real scene, and obtain the historical pose information of the historical video frame before the current video frame;

processing part 4552, configured to determine virtual object pose data based on historical pose information and current pose information;

The rendering part 4553 is configured to use the virtual object pose data to render the virtual object corresponding to the virtual object pose data in the current video frame displayed by the display device;

The presentation part 4554 is configured to display the augmented reality effect in which the real scene and the virtual object are superimposed through the display device.

After the display device 455 is started in a non-initial position, the acquisition part 4551 acquires images and videos in the real scene in real time. The image video includes the current video frame and the historical video frame. The processing part 4552 performs processing based on the current video frame and the historical video frame, and can obtain the current pose information and the historical pose information, respectively. The processing part 4552 calculates the historical pose information to obtain historical pose offset information, and superimposes the current pose information and the historical pose offset information to obtain virtual object pose data. The processing part 4552 sends the virtual object pose data to the rendering part 4553, and the rendering part 4553 renders the virtual object corresponding to the virtual object pose data in the current video frame corresponding to the display device. The presentation section 4554 displays an augmented reality effect in which a real scene is superimposed with a virtual object.

In some embodiments, the augmented reality effect includes one of the following:

The virtual object is rendered in the background area in the real scene.

A real object in the real scene is combined with a virtual object to present an AR effect.

A real object in the real scene is combined with a virtual object enlarged in a certain proportion to present an AR effect.

In some embodiments, the processing part 4552 is further configured to determine, based on the real scene image, a virtual object model corresponding to the display object in a preset three-dimensional virtual scene; the preset three-dimensional virtual scene is based on The virtual model obtained by modeling the real scene; obtaining the judgment result of whether the virtual object model has preset rendering data; in the case that the judgment result is characterized by the existence of preset rendering data in the virtual object model , using the preset rendering data as the virtual object data.

In some embodiments, the processing part 4552 is further configured to determine the current pose information of the display object in the real scene according to the real scene image; according to the preset mapping between the real coordinate system and the virtual coordinate system determine the virtual object model corresponding to the current pose information in the preset three-dimensional virtual scene; the real coordinate system is the coordinate system corresponding to the real scene; the virtual coordinate system is the preset three-dimensional virtual scene The coordinate system corresponding to the virtual scene.

In some embodiments, the processing part 4552 is further configured to determine the position area corresponding to the current pose information in the preset three-dimensional virtual scene according to the preset mapping relationship between the real coordinate system and the virtual coordinate system ; take the corresponding preset virtual model in the location area as the virtual object model.

In some embodiments, the preset three-dimensional virtual scene is a model reconstructed in real time, or a model pre-stored in the cloud.

In some embodiments, the acquisition part 4551 includes a binocular camera; the display device 455 further includes a modeling part, and the acquisition part 4551 is further configured to display objects included in the image based on the real scene , before determining the virtual object data matching the display object, obtain the image information and depth information of the real scene image through the binocular camera; the modeling part is further configured to be based on the real scene image 3D modeling of the display objects in the real scene image to obtain the preset 3D virtual scene.

In some embodiments, the display device 455 further includes an update part, and the update part is configured to display, on the display device, an augmented reality AR effect in which the real scene image and the virtual object are superimposed, During the movement of the display device, the collected real scene image is updated, and an updated virtual object is obtained based on the updated real scene image; the display part 4554 is also configured to display on the display device The augmented reality AR effect in which the updated real scene image and the updated virtual object are superimposed is displayed in real time.

In some embodiments, at least one display device is arranged around the display object, and each display device in the at least one display device is configured to collect all the displayed objects in real time at their respective current positions according to their respective collection directions for the display object. The real scene image of the display object is described, and the corresponding virtual object is obtained based on the real scene image collected respectively, and the augmented reality AR effect in which the corresponding real scene image and the virtual object are superimposed is displayed.

In the embodiment of the present disclosure, when the display device 455 is not started at the initial position, the current pose information of the current video frame and the historical pose information of the historical video frames are obtained respectively through the acquisition part 4551 provided on the display device. The processing part 4552 obtains the historical pose offset information when the display device 455 is not started at the initial position by processing the historical pose information. The processing part 4552 superimposes the current pose information and the historical pose offset information to determine virtual object pose data, where the virtual object pose data includes the coordinate position of each pixel constituting the virtual object. The rendering part 4553 further renders the display position associated with the display object in the real scene according to the pose data of the virtual object, so that the display device 455 can display the virtual object at the display position associated with the display object in the real scene no matter when it is started at any position, and then The display part 4554 displays the augmented reality AR effect by superimposing the virtual object and the real scene image, so that the virtual object displayed by the display device and the real scene are seamlessly integrated.

It should be noted that the descriptions of the above apparatus embodiments are similar to the descriptions of the above method embodiments, and have similar beneficial effects to the method embodiments. For technical details not disclosed in the device embodiments of the present disclosure, please refer to the descriptions of the method embodiments of the present disclosure for understanding.

It should be noted that, in the embodiment of the present disclosure, if the above-mentioned display method is implemented in the form of a software function part and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present disclosure essentially or the parts that make contributions to the prior art can be embodied in the form of a software product, and the computer software product is stored in a storage medium and includes several instructions for A computer device (which may be a terminal, a server, etc.) is caused to execute all or part of the methods described in the various embodiments of the present disclosure. The aforementioned computer-readable storage medium includes: U disk, mobile hard disk, read only memory (Read Only Memory, ROM), magnetic disk or optical disk and other media that can store program codes. As such, embodiments of the present disclosure are not limited to any particular combination of hardware and software.

Correspondingly, an embodiment of the present disclosure further provides a computer program product, where the computer program product includes computer-executable instructions, and the computer-executable instructions are used to implement the steps in the presentation method provided by the embodiment of the present disclosure.

Correspondingly, an embodiment of the present disclosure further provides a computer-readable storage medium, where computer-executable instructions are stored on the computer-readable storage medium, and the computer-executable instructions are used to implement the steps of the presentation method provided by the foregoing embodiments.

An embodiment of the present disclosure further provides a display device. FIG. 15 is a schematic diagram of an optional composition structure of the display device provided by the embodiment of the present disclosure. As shown in FIG. 15 , the display device 110 includes: a display screen 1101 ;

memory 1102 configured to store computer programs;

When the processor 1103 is configured to execute the computer program stored in the memory 1102, in conjunction with the display screen 1101, the steps of the display methods provided in the above embodiments are implemented.

The display device 110 also includes a communication bus 1104 . The communication bus 1104 is configured to enable connection communication between these components.

In the embodiment of the present disclosure, the display screen 1101 includes, but is not limited to, a liquid crystal display screen, an organic light-emitting diode display screen, a touch screen display screen, and the like, which is not limited in the embodiment of the present disclosure.

The memory 1102 is configured to store computer programs and applications by the processor 1101, and may also cache data to be processed or processed by the processor 1103 and various parts of the display device 110 (eg, image data, audio data, voice communication data, and video communication data). data), which can be implemented through flash memory (FLASH) or random access memory (Random Access Memory, RAM).

When the processor 1103 executes the program, the steps of any one of the above-mentioned methods are implemented. The processor 1103 generally controls the overall operation of the display device 110 .

The above-mentioned processor 1103 may be an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (Programmable Logic Device) At least one of a Logic Device (PLD), a Field Programmable Gate Array (FPGA), a Central Processing Unit (CPU), a controller, a microcontroller, and a microprocessor. It can be understood that the electronic device that implements the function of the above processor may also be other, which is not limited by the embodiment of the present disclosure.

The above-mentioned computer-readable storage medium/memory can be a read-only memory (Read Only Memory, ROM), a programmable read-only memory (Programmable Read-Only Memory, PROM), an erasable programmable read-only memory (Erasable Programmable Read-Only Memory) Memory, EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Magnetic Random Access Memory (FRAM), Flash Memory (Flash Memory), Magnetic Surface Memory, optical disk, or memory such as Compact Disc Read-Only Memory (CD-ROM); it can also be various terminals including one or any combination of the above memories, such as mobile phones, computers, tablet devices, personal digital Assistant etc.

It should be pointed out here that the descriptions of the above computer-readable storage medium and device embodiments are similar to the descriptions of the above method embodiments, and have similar beneficial effects to the method embodiments. For technical details that are not disclosed in the computer-readable storage medium and device embodiments in the present disclosure, please refer to the description of the method embodiments of the present disclosure for understanding.

It is to be understood that reference throughout the specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic associated with the embodiment is included in at least one embodiment of the present disclosure. Thus, appearances of "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily necessarily referring to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in various embodiments of the present disclosure, the size of the sequence numbers of the above-mentioned processes does not imply the sequence of execution, and the execution sequence of each process should be determined by its functions and internal logic, rather than the embodiments of the present disclosure. implementation constitutes any limitation. The above-mentioned serial numbers of the embodiments of the present disclosure are only for description, and do not represent the advantages or disadvantages of the embodiments.

It should be noted that, herein, the terms "comprising", "comprising" or any other variation thereof are intended to encompass non-exclusive inclusion, such that a process, method, article or device comprising a series of elements includes not only those elements, It also includes other elements not expressly listed or inherent to such a process, method, article or apparatus. Without further limitation, an element qualified by the phrase "comprising a..." does not preclude the presence of additional identical elements in a process, method, article or apparatus that includes the element.

In the several embodiments provided in the present disclosure, it should be understood that the disclosed apparatus and method may be implemented in other manners. The device embodiments described above are only illustrative. For example, the division of the units is only a logical function division. In actual implementation, there may be other division methods. For example, multiple units or components may be combined, or Can be integrated into another system, or some features can be ignored, or not implemented. In addition, the coupling, or direct coupling, or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be electrical, mechanical or other forms. of.

The unit described above as a separate component may or may not be physically separated, and the component displayed as a unit may or may not be a physical unit; it may be located in one place or distributed to multiple network units; Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments of the present disclosure.

In addition, each functional unit in each embodiment of the present disclosure may be all integrated into one processing unit, or each unit may be separately used as a unit, or two or more units may be integrated into one unit; the above integration The unit can be implemented either in the form of hardware or in the form of hardware plus software functional units.

Alternatively, if the above-mentioned integrated units of the embodiments of the present disclosure are implemented in the form of software functional parts and sold or used as independent products, they may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present disclosure may be embodied in the form of software products in essence or the parts that make contributions to related technologies. The computer software products are stored in a storage medium and include several instructions for making The device automated test line performs all or part of the methods described in various embodiments of the present disclosure. The aforementioned storage medium includes various media that can store program codes, such as a removable storage device, a ROM, a magnetic disk, or an optical disk.

The methods disclosed in the several method embodiments provided by the embodiments of the present disclosure can be arbitrarily combined under the condition of no conflict to obtain new method embodiments.

The features disclosed in the several method or device embodiments provided by the embodiments of the present disclosure can be combined arbitrarily under the condition of no conflict to obtain new method embodiments or device embodiments.

The above description is only the implementation of the embodiments of the present disclosure, but the protection scope of the embodiments of the present disclosure is not limited thereto. Any person skilled in the technical field can easily think of changes or Substitutions should be included within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure should be based on the protection scope of the claims.

Industrial Applicability

The embodiment of the present disclosure obtains the historical pose information of the historical video frame before the current video frame by acquiring the current pose information corresponding to the current video frame collected from the real scene, based on the historical pose information and the current pose information, determine the virtual object pose data, and use the virtual object pose data to render the virtual object corresponding to the virtual object pose data in the current video frame displayed by the display device, so that the initial position of the display device is no matter where it is. , the display device can adjust the display of the virtual object according to the offset of the current pose information and the historical pose information, and display the virtual object to an accurate position, which improves the fusion of the virtual object and the real scene and the AR rendering effect.

Claims

A display method, the method comprising:

Obtain the current pose information of the current video frame collected from the real scene;

Obtain the historical pose information of the historical video frame before the current video frame, and determine virtual object pose data based on the historical pose information and the current pose information;

Using the virtual object pose data, rendering the virtual object in the current video frame displayed by the display device;

The augmented reality effect in which the real scene and the virtual object are superimposed is displayed through the display device.
The display method according to claim 1, wherein the acquiring the current pose information of the current video frame collected from the real scene comprises:

The current video frame of the real scene is collected by the collection part of the display device;

The current video frame is processed using a positioning algorithm to obtain the current pose information of the current video frame in the camera sensor coordinate system.
The display method according to claim 1 or 2, wherein the acquiring the historical pose information of the historical video frame before the current video frame, determines the virtual pose based on the historical pose information and the current pose information Object pose data, including:

Obtain the historical pose information of the historical video frame before the current video frame, and determine the historical pose offset information based on the historical pose information;

The virtual object pose data is obtained based on the historical pose offset information and the current pose information.
The display method according to claim 3, wherein the determining the historical pose offset information based on the historical pose information comprises:

Obtain the first pose information of the historical video frame in the visual space coordinate system; and, use a positioning algorithm to process the historical video frame to obtain the second pose of the historical video frame in the camera sensor coordinate system information; wherein, the historical pose information includes the first pose information and the second pose information;

The historical pose offset information is determined based on an offset between the second pose information and the first pose information.
The display method according to claim 3, wherein the determining the historical pose offset information based on the historical pose information comprises:

Obtain the first pose information of the historical video frame in the visual space coordinate system;

acquiring the third sensing data when the acquisition part of the display device acquires the historical video frame;

Determine historical sensing pose information based on the offset between the first sensing data and the third sensing data; the historical pose information includes the first pose information and the historical sensing position attitude information; the first sensing data includes the data of the initial video frame collected by the collecting part when the display device is started;

The historical pose offset information is determined based on an offset between the first pose information and the historical sensed pose information.
The display method according to any one of claims 3 to 5, wherein the obtaining the virtual object pose data based on the historical pose offset information and the current pose information includes:

According to the historical pose offset information, the current pose information is offset to obtain corrected pose information;

The virtual object pose data corresponding to the corrected pose information is determined.
The display method according to any one of claims 1 to 4, wherein the acquiring the current pose information of the current video frame collected from the real scene comprises:

Acquiring first sensing data when the acquisition part of the display device collects the initial video frame, and second sensing data when the acquisition part of the display device collects the current video frame;

Based on the offset between the first sensing data and the second sensing data, the current pose information when the display device collects the current video frame is determined.
The display method according to any one of claims 1 to 7, wherein the virtual object pose data comprises: the coordinate position of each pixel constituting the virtual object;

The rendering of the virtual object in the current video frame displayed by the display device by using the virtual object pose data includes:

The coordinate position of each pixel in the virtual object is mapped to the rendering engine coordinate system to obtain the target coordinate position of each pixel;

Using a rendering engine, the virtual object is rendered at the target coordinate position in the current video frame.
The display method according to any one of claims 1 to 8, wherein the display device moves along a target trajectory.
A display device, used based on a display device, comprising:

The collection part is configured to obtain the current pose information of the current video frame collected from the real scene, and obtain the historical pose information of the historical video frame before the current video frame;

a processing part configured to determine virtual object pose data based on the historical pose information and the current pose information;

a rendering part, configured to use the virtual object pose data to render the virtual object corresponding to the virtual object pose data in the current video frame displayed by the display device;

The display part is configured to display, through the display device, an augmented reality effect in which the real scene and the virtual object are superimposed.
A display device that moves on a preset sliding track, comprising:

a display screen configured to display an augmented reality effect in which the real scene and the virtual object are superimposed on the display device;

a memory configured to store executable instructions;

The processor, when configured to execute the executable instructions stored in the memory, implements the method of any one of claims 1 to 9 in combination with the display screen.
A computer-readable storage medium storing executable instructions configured to, when executed by a processor, implement the method of any one of claims 1 to 9.