CN117455974A - Display method and device and electronic equipment
- Publication number: CN117455974A (application CN202210828381.7A)
- Authority: CN (China)
- Prior art keywords: determining, camera, point, current, map
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
- G06T7/593—Depth or shape recovery from multiple images from stereo images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/005—General purpose rendering architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating 3D models or images for computer graphics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/223—Analysis of motion using block-matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/269—Analysis of motion using gradient-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06T7/75—Determining position or orientation of objects or cameras using feature-based methods involving models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/80—Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
- G06T7/85—Stereo camera calibration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30244—Camera pose
Abstract
The disclosure provides a display method, a display device, and electronic equipment, relates to the technical field of display, and is used for solving the prior-art problem that, during virtual reality interaction, incomplete display of the current environment image causes users to experience motion sickness symptoms such as dizziness and nausea. The method comprises the following steps: acquiring a multi-view image captured by a first camera; determining degree of freedom information of the virtual reality device according to the multi-view image; converting camera coordinates of feature points in a camera coordinate system to determine at least one current map point; determining at least one vertex coordinate and at least one texture coordinate corresponding to a current depth map according to historical data, each current map point, and the degree of freedom information; rendering the multi-view image according to the at least one vertex coordinate and the at least one texture coordinate corresponding to the current depth map to determine a rendered image; and controlling the virtual reality device to display the rendered image.
Description
Technical Field
The disclosure relates to the technical field of display, and in particular relates to a display method, a display device and electronic equipment.
Background
At present, in a see-through (pass-through) scene of virtual reality, the current environment image needs to be displayed to the user in real time during virtual reality interaction. When the current environment image is displayed incompletely, the user may experience dizziness and motion sickness symptoms such as nausea, resulting in a poor user experience.
Disclosure of Invention
In view of the above, the present disclosure provides a display method, apparatus, and electronic device to solve the prior-art problem that, during virtual reality interaction, incomplete display of the current environment image causes users to experience motion sickness symptoms such as dizziness and nausea.
In order to achieve the above object, the present disclosure provides the following technical solutions:
in a first aspect, the present disclosure provides a display method applied to a virtual reality device including a first camera, the method comprising: acquiring a multi-view image captured by the first camera; determining degree of freedom information of the virtual reality device according to the multi-view image captured by the first camera, wherein the degree of freedom information includes degrees of freedom in at least six directions; determining feature points in the multi-view image and camera coordinates of each feature point in a camera coordinate system; converting the camera coordinates of the feature points in the camera coordinate system to determine at least one current map point, wherein each current map point corresponds to one world coordinate in the world coordinate system; determining at least one vertex coordinate and at least one texture coordinate corresponding to a current depth map according to historical data, each current map point, and the degree of freedom information, wherein one vertex coordinate corresponds to one texture coordinate, and the historical data includes either historical map points or a historical depth map; rendering the multi-view image according to the at least one vertex coordinate and the at least one texture coordinate corresponding to the current depth map to determine a rendered image; and controlling the virtual reality device to display the rendered image.
As an optional embodiment of the present disclosure, determining feature points in a multi-view image, and camera coordinates of each feature point under a camera coordinate system, includes: performing image preprocessing on the multi-view image, and determining the preprocessed multi-view image; wherein the image preprocessing includes one or more of distortion correction and stereo correction; extracting features of the preprocessed multi-view images, and determining at least one feature point; and carrying out feature matching on at least one feature point, and determining the camera coordinates of each feature point under a camera coordinate system.
As an optional embodiment of the present disclosure, performing feature extraction on the preprocessed multi-view image, determining at least one feature point includes: and inputting the preprocessed multi-view images into a preconfigured extraction model, and determining at least one characteristic point.
As an optional embodiment of the present disclosure, performing feature extraction on the preprocessed multi-view image, determining at least one feature point includes: and extracting the characteristics of the preprocessed multi-view images by adopting an optical flow method, and determining at least one characteristic point.
As an optional embodiment of the present disclosure, feature matching is performed on at least one feature point, and determining a camera coordinate of each feature point in a camera coordinate system includes: and carrying out feature matching on at least one feature point by adopting a block matching method, and determining the camera coordinates of each feature point under a camera coordinate system.
As an optional embodiment of the present disclosure, feature matching is performed on at least one feature point, and determining a camera coordinate of each feature point in a camera coordinate system includes: and carrying out feature matching according to the pixel coordinates corresponding to each feature point in the at least one feature point, and determining the camera coordinates of each feature point under a camera coordinate system.
As an alternative embodiment of the present disclosure, the historical data includes historical map points; determining at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to the historical data, each current map point, and the degree of freedom information comprises: projecting the historical map points and the current map points to determine a current depth map; and determining at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to the current depth map and the degree of freedom information.
As an alternative embodiment of the present disclosure, the historical data includes a historical depth map; determining at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to the historical data, each current map point, and the degree of freedom information comprises: projecting the historical depth map and the current map points to determine the current depth map; and determining at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to the current depth map and the degree of freedom information.
In a second aspect, the present disclosure provides a virtual reality device including a first camera, the device comprising: an acquisition unit configured to acquire the multi-view image captured by the first camera; a processing unit configured to determine degree of freedom information of the virtual reality device according to the multi-view image captured by the first camera and acquired by the acquisition unit, wherein the degree of freedom information includes degrees of freedom in at least six directions; the processing unit is further configured to determine feature points in the multi-view image acquired by the acquisition unit, and camera coordinates of each feature point in a camera coordinate system; the processing unit is further configured to convert the camera coordinates of the feature points in the camera coordinate system and determine at least one current map point, wherein each current map point corresponds to one world coordinate in the world coordinate system; the processing unit is further configured to determine at least one vertex coordinate and at least one texture coordinate corresponding to a current depth map according to historical data, each current map point, and the degree of freedom information, wherein one vertex coordinate corresponds to one texture coordinate, and the historical data includes either historical map points or a historical depth map; the processing unit is further configured to render the multi-view image according to the at least one vertex coordinate and the at least one texture coordinate corresponding to the current depth map and determine a rendered image; and the processing unit is further configured to control a display unit to display the rendered image.
As an optional implementation manner of the disclosure, the processing unit is specifically configured to perform image preprocessing on the multi-view image acquired by the acquisition unit and determine the preprocessed multi-view image, wherein the image preprocessing includes one or more of distortion correction and stereo correction; the processing unit is specifically configured to perform feature extraction on the preprocessed multi-view image and determine at least one feature point; and the processing unit is specifically configured to perform feature matching on the at least one feature point and determine the camera coordinates of each feature point in the camera coordinate system.
As an optional embodiment of the disclosure, the processing unit is specifically configured to input the preprocessed multi-view image into a preconfigured extraction model, and determine at least one feature point.
As an optional implementation manner of the disclosure, the processing unit is specifically configured to perform feature extraction on the preprocessed multi-view image by using an optical flow method, and determine at least one feature point.
As an optional implementation manner of the disclosure, the processing unit is specifically configured to perform feature matching on at least one feature point by using a block matching method, and determine a camera coordinate of each feature point under a camera coordinate system.
As an optional implementation manner of the disclosure, the processing unit is specifically configured to perform feature matching according to pixel coordinates corresponding to each feature point in at least one feature point, and determine camera coordinates of each feature point in a camera coordinate system.
As an alternative embodiment of the present disclosure, the history data includes history map points; the processing unit is specifically used for projecting the historical map points and the current map points and determining the current depth map; the processing unit is specifically configured to determine at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to the current depth map and the degree of freedom information.
As an alternative embodiment of the present disclosure, the historical data includes a historical depth map; the processing unit is specifically used for projecting the historical depth map and the current map point to determine the current depth map; the processing unit is specifically configured to determine at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to the current depth map and the degree of freedom information.
It should be noted that the above-mentioned computer instructions may be stored in whole or in part on the first computer readable storage medium. The first computer readable storage medium may be packaged together with the processor of the virtual reality device or may be packaged separately from the processor of the virtual reality device, which is not limited in this disclosure.
In a third aspect, the present disclosure provides an electronic device comprising: a memory and a processor, the memory for storing a computer program; the processor is configured to cause the electronic device to implement any one of the display methods as provided in the first aspect, when executing the computer program.
In a fourth aspect, the present disclosure provides a computer-readable storage medium having stored thereon a computer program which, when executed by a computing device, causes the computing device to implement any one of the display methods provided in the first aspect.
In a fifth aspect, the present disclosure provides a computer program product which, when run on a computer, causes the computer to implement any one of the display methods provided in the first aspect.
For descriptions of the second, third, fourth, and fifth aspects of the present disclosure, reference may be made to the detailed description of the first aspect; likewise, for the advantageous effects of the second, third, fourth, and fifth aspects, reference may be made to the analysis of the advantageous effects of the first aspect, which is not repeated here.
In the present disclosure, the names of the above virtual reality devices do not constitute limitations on the devices or functional modules themselves; in actual implementations, these devices or functional modules may appear under other names. Provided that the functions of the devices or functional modules are similar to those in the present disclosure, they fall within the scope of the claims of the present disclosure and their equivalents.
These and other aspects of the disclosure will be more readily apparent from the following description.
Compared with the prior art, the technical scheme provided by the disclosure has the following advantages:
the degree of freedom information of the virtual reality device can be determined from the multi-view image captured by the first camera. Feature points in the multi-view image, and the camera coordinates of each feature point in the camera coordinate system, are then determined. Next, the camera coordinates of the feature points are converted to determine at least one current map point. At least one vertex coordinate and at least one texture coordinate corresponding to the current depth map are then determined according to the historical data, each current map point, and the degree of freedom information. Because the current depth map incorporates map points from the historical data, it contains more map points, so the obtained vertex coordinates and texture coordinates corresponding to the current depth map are more accurate. Since these coordinates are more accurate, the rendered image obtained by rendering the multi-view image according to them is more complete. Therefore, when a user views the rendered image, motion sickness symptoms such as dizziness and nausea caused by incomplete display of the current environment image are avoided, which solves the prior-art problem that, during virtual reality interaction, incomplete display of the current environment image causes such symptoms.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
In order to more clearly illustrate the embodiments of the present disclosure or the solutions in the prior art, the drawings that are required for the description of the embodiments or the prior art will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a front view of a VR device to which a display method provided in an embodiment of the present disclosure is applied;
fig. 2 is a schematic flow chart of a display method according to an embodiment of the disclosure;
FIG. 3 is a second flow chart of a display method according to an embodiment of the disclosure;
FIG. 4 is a third flow chart of a display method according to an embodiment of the disclosure;
FIG. 5 is a flow chart of a display method according to an embodiment of the disclosure;
FIG. 6 is a flowchart of a display method according to an embodiment of the disclosure;
fig. 7 is a schematic structural diagram of a virtual reality device according to an embodiment of the disclosure;
Fig. 8 is a second schematic structural diagram of a VR device according to an embodiment of the present disclosure;
fig. 9 is a schematic structural diagram of a computer program product of a display method according to an embodiment of the disclosure.
Detailed Description
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, a further description of aspects of the present disclosure will be provided below. It should be noted that, without conflict, the embodiments of the present disclosure and features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced otherwise than as described herein; it will be apparent that the embodiments in the specification are only some, but not all, embodiments of the disclosure.
It should be noted that in this document, relational terms such as "first" and "second" and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises an element.
Fig. 1 is a front view of a virtual reality (VR) device 10, such as a head-mounted display (HMD), to which the display method provided by embodiments of the present disclosure is applied. The VR device 10 includes a first camera that provides services to a tracking system (e.g., a simultaneous localization and mapping (SLAM) tracking system) and acquires an environment image of the current environment.
Illustratively, the tracking system is a SLAM tracking system, and the first camera is a fisheye camera. As shown in fig. 1, the VR device 10 includes 4 fisheye cameras (a fisheye camera 1, a fisheye camera 2, a fisheye camera 3, and a fisheye camera 4, respectively). The SLAM tracking system may output 6 degrees of freedom (degree of freedom, DOF) information, and map points in the world coordinate system, based on information collected by the fisheye camera 1, the fisheye camera 2, the fisheye camera 3, and the fisheye camera 4. Meanwhile, the fish-eye camera 1, the fish-eye camera 2, the fish-eye camera 3 and the fish-eye camera 4 are also used for shooting an environment image of the current environment.
After wearing the VR device, the user may interact with the virtual reality provided by the VR device. During this time, the SLAM tracking system of the VR device 10 determines the current 6DOF information and the map points in the world coordinate system from the information acquired by the fisheye camera 1, the fisheye camera 2, the fisheye camera 3, and the fisheye camera 4. Meanwhile, the VR device 10 may capture an environment image of the current environment through these four fisheye cameras so as to provide a multi-view image of the current environment to the user, wherein the multi-view image includes at least one environment image.
In some examples, while interacting with the virtual reality provided by the VR device, a user may experience virtual scenes in multiple directions through a left-eye rendered image and a right-eye rendered image provided within the VR device. For example, the left-eye rendered image may be obtained by rendering the environment images acquired by any two of the fisheye cameras 1, 2, 3, and 4, and the right-eye rendered image may likewise be obtained by rendering the environment images acquired by any two of the fisheye cameras 1, 2, 3, and 4.
It should be noted that the above example describes the case where the VR device collects environment images of the current environment through any 2 fisheye cameras. In some other examples, the VR device may acquire environment images of the current environment through any 3 or more fisheye cameras. When determining the camera coordinates of each feature point in the camera coordinate system, a first distance needs to be determined according to the pixel coordinates corresponding to the feature point in the first image and the pixel coordinates corresponding to the feature point, under different candidate depth values (for example, 1 m to 2 m), in each image other than the first image. For example, the VR device collects environment images through the fisheye camera 1, the fisheye camera 2, and the fisheye camera 3; the environment image collected by the fisheye camera 1 is the first image, the environment image collected by the fisheye camera 2 is the second image, and the environment image collected by the fisheye camera 3 is the third image. When feature point matching is performed between the first image and the second image, if the second image contains no feature point matching a feature point in the first image, feature point matching can be performed between the first image and the third image. In this way, the lack of features caused by unmatched feature points when only two environment images are used can be avoided, and mismatches caused by similar textures can also be reduced.
Specifically, in order to ensure the integrity of the rendered image, one fisheye camera is generally selected on each side of the symmetry axis of the VR device. The left-eye rendered image and the right-eye rendered image can then be rendered according to the environment images captured by the two selected fisheye cameras.
It should be noted that, the above example is described by taking the VR device 10 including 4 fisheye cameras as an example, and specifically, the number of fisheye cameras in the VR device 10 may be set according to needs, which is not limited herein.
The following describes, by way of example, the display method provided in an embodiment of the present disclosure, taking a VR device including 4 fisheye cameras as the virtual reality device provided in the embodiment of the present disclosure.
Fig. 2 is a flowchart illustrating a display method according to an exemplary embodiment, including the following S11-S17, as shown in fig. 2.
S11, the VR equipment acquires the multi-view images acquired by the first camera. The multi-view image comprises at least one environment image, and one environment image corresponds to one first camera.
S12, the VR device determines the degree of freedom information of the VR device according to the multi-view image acquired by the first camera. The degree of freedom information includes degrees of freedom in at least six directions.
In some examples, the degree of freedom information includes depth information. For example, the degree of freedom information includes degrees of freedom in six directions, so the VR device can calculate the depth information of each feature point in the multi-view image according to the degree of freedom information, ensuring the user experience.
S13, the VR device determines feature points in the multi-view image, and camera coordinates of each feature point in a camera coordinate system.
S14, the VR device converts the camera coordinates of the feature points in the camera coordinate system to determine at least one current map point. Each current map point corresponds to one world coordinate in the world coordinate system.
And S15, the VR equipment determines at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to the historical data, each current map point and the degree of freedom information. Wherein one vertex coordinate corresponds to one texture coordinate, and the history data includes any one of a history map point and a history depth map.
S16, the VR device renders the multi-view image according to the at least one vertex coordinate and the at least one texture coordinate corresponding to the current depth map, and determines a rendered image.
In some examples, the VR device needs to provide a left-eye rendered image and a right-eye rendered image to the user. For example, with reference to the front view of the VR device shown in fig. 1, when generating the left-eye rendered image, the VR device may execute the above processes S11-S16 according to the multi-view images acquired by the fisheye camera 1, the fisheye camera 2, the fisheye camera 3, and the fisheye camera 4, so as to determine the left-eye rendered image. When generating the right-eye rendered image, the VR device may likewise execute the above processes S11-S16 according to the multi-view images acquired by these four fisheye cameras, so as to determine the right-eye rendered image.
S17, the VR device displays the rendered image.
As can be seen from the foregoing, in the display method provided by the embodiment of the present disclosure, by introducing the historical data, the current depth map includes map points from the historical data, so the current depth map contains more map points, and the at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map are therefore more accurate. Because these coordinates are more accurate, the rendered image obtained by rendering the multi-view image according to them is more complete. Therefore, when a user views the rendered image, motion sickness symptoms such as dizziness and nausea caused by incomplete display of the current environment image are avoided, which solves the prior-art problem that, during virtual reality interaction, incomplete display of the current environment image causes such symptoms.
As an alternative embodiment of the present disclosure, in connection with fig. 2, as shown in fig. 3, the above S13 may be specifically implemented by the following S130 to S132.
S130, the VR equipment performs image preprocessing on the multi-view images and determines the preprocessed multi-view images. Wherein the image preprocessing includes one or more of distortion correction and stereo correction.
In some examples, the multi-view image acquired by the first camera may be distorted. Therefore, in order to ensure that the rendered image can be displayed normally, the VR device needs to perform image preprocessing on the multi-view image to avoid a poor user experience caused by image distortion. For example, the VR device performs distortion correction on the multi-view image to obtain a corrected image, and then performs stereo correction on the corrected image to determine the rectified image, i.e., the preprocessed multi-view image.
The above example describes the case where the VR device performs both distortion correction and stereo correction on the multi-view image. In other examples, the VR device may perform only distortion correction, or only stereo correction, on the multi-view image, which is not limited in this disclosure.
In some other examples, before performing image preprocessing on the multi-view image, the VR device needs to determine whether the multi-view image is distorted. If the multi-view image is not distorted, the VR device takes the multi-view image itself as the preprocessed multi-view image; if the multi-view image is distorted, the VR device performs image preprocessing on the multi-view image and determines the preprocessed multi-view image. In this way, the occupation of the computing resources of the VR device can be reduced.
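For illustration only, the following Python sketch shows one common way of performing distortion correction and stereo rectification with OpenCV for a pair of cameras. The intrinsic parameters, distortion coefficients, and extrinsics below are made-up placeholder values, and the use of OpenCV (for true fisheye lenses the cv2.fisheye variants of these functions would normally be used) is an assumption of this sketch rather than something prescribed by the disclosure.

```python
import cv2
import numpy as np

# Placeholder intrinsics/extrinsics for a stereo pair (illustrative values only).
K1 = np.array([[300., 0., 320.], [0., 300., 240.], [0., 0., 1.]])
K2 = K1.copy()
D1 = np.array([-0.05, 0.01, 0.0, 0.0])   # distortion coefficients of camera 1
D2 = np.array([-0.04, 0.01, 0.0, 0.0])   # distortion coefficients of camera 2
R = np.eye(3)                             # rotation from camera 1 to camera 2
T = np.array([0.06, 0.0, 0.0])            # translation (6 cm baseline)
size = (640, 480)

# Stereo rectification: rectifying rotations and new projection matrices.
R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, D1, K2, D2, size, R, T)

# Per-camera remap tables that apply distortion correction and rectification together.
map1x, map1y = cv2.initUndistortRectifyMap(K1, D1, R1, P1, size, cv2.CV_32FC1)
map2x, map2y = cv2.initUndistortRectifyMap(K2, D2, R2, P2, size, cv2.CV_32FC1)

def preprocess(img1, img2):
    """Return the preprocessed (undistorted and rectified) image pair."""
    rect1 = cv2.remap(img1, map1x, map1y, cv2.INTER_LINEAR)
    rect2 = cv2.remap(img2, map2x, map2y, cv2.INTER_LINEAR)
    return rect1, rect2

# Example with synthetic frames standing in for the captured environment images.
left = np.random.randint(0, 255, (480, 640), np.uint8)
right = np.random.randint(0, 255, (480, 640), np.uint8)
rect_left, rect_right = preprocess(left, right)
```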
S131, the VR equipment performs feature extraction on the preprocessed multi-view image and determines at least one feature point.
And S132, the VR equipment performs feature matching on at least one feature point, and determines camera coordinates of each feature point under a camera coordinate system.
In some examples, each feature point in the multi-view image corresponds to a pixel coordinate. Because a pixel coordinate cannot characterize the depth information of the feature point, it is necessary to convert the pixel coordinates into camera coordinates. For example, a first conversion relationship between the planar coordinate system and the camera coordinate system is preconfigured in the VR device. The VR device then determines the camera coordinates of each feature point in the camera coordinate system by multiplying the first conversion relationship by the pixel coordinates corresponding to each feature point.
Specifically, the first conversion relationship includes an internal reference (intrinsic) matrix, which is determined by the focal length and the principal-point offset of the camera.
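A minimal sketch of this pixel-to-camera conversion, assuming a pinhole intrinsic matrix with placeholder focal length and offset (principal point) values; the depth value passed in is likewise an assumed placeholder, since a pixel coordinate alone carries no depth.

```python
import numpy as np

# Assumed internal reference (intrinsic) matrix: fx, fy are focal lengths, (cx, cy) the offset.
K = np.array([[300.0,   0.0, 320.0],
              [  0.0, 300.0, 240.0],
              [  0.0,   0.0,   1.0]])
K_inv = np.linalg.inv(K)

def pixel_to_camera(u, v, depth):
    """Convert a pixel coordinate (u, v) with a known/assumed depth into camera coordinates."""
    pixel_h = np.array([u, v, 1.0])   # homogeneous pixel coordinate
    ray = K_inv @ pixel_h             # normalized camera ray (z = 1)
    return ray * depth                # scale by depth to get (Xc, Yc, Zc)

# Example: a feature at pixel (400, 260) assumed to lie 1.5 m from the camera.
print(pixel_to_camera(400, 260, 1.5))
```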
As an alternative embodiment of the present disclosure, in connection with fig. 3, as shown in fig. 4, the above S131 may be specifically implemented by the following S1310.
S1310, the VR device inputs the preprocessed multi-view images into a preconfigured extraction model, and at least one feature point is determined.
In some examples, in order to better identify the feature points in the multi-view image, the display method provided by the embodiment of the present disclosure collects historical environment images and the feature points corresponding to each environment image, and learns from them with a deep learning model, thereby obtaining the preconfigured extraction model. Thus, when extracting feature points from the multi-view image, the VR device may input the multi-view image into the preconfigured extraction model to extract the feature points contained in the multi-view image.
Specifically, the training process of the preconfigured extraction model is as follows:
and obtaining a training sample image and a labeling result of the training sample image. The training sample image comprises a historical environment image and at least one characteristic point contained in each historical environment image.
The training sample image is input into a deep learning model.
And determining whether a prediction comparison result of the training sample image output by the deep learning model is matched with the labeling result or not based on the target loss function.
When the prediction result does not match the labeling result, the network parameters of the deep learning model are updated repeatedly until the model converges, so as to obtain the preconfigured extraction model.
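The following Python/PyTorch sketch is only one hypothetical realization of this training loop. The tiny heatmap network, the binary cross-entropy loss standing in for the target loss function, and the random tensors standing in for labeled historical environment images are all assumptions; the disclosure describes the procedure, not a specific model.

```python
import torch
import torch.nn as nn

# Tiny illustrative keypoint-heatmap network; the disclosure does not specify an architecture.
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 1),            # one logit per pixel: "is this pixel a feature point?"
)
loss_fn = nn.BCEWithLogitsLoss()    # stand-in for the target loss function
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-in training samples: grayscale historical environment images and labeled
# feature-point heatmaps (random tensors here, real labeled data in practice).
images = torch.rand(8, 1, 120, 160)
labels = (torch.rand(8, 1, 120, 160) > 0.99).float()

for epoch in range(20):                 # repeat until the model converges
    optimizer.zero_grad()
    prediction = model(images)          # predicted feature-point heatmap
    loss = loss_fn(prediction, labels)  # compare prediction with the labeling result
    loss.backward()
    optimizer.step()                    # update the network parameters
```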
As an alternative embodiment of the present disclosure, in connection with fig. 3, as shown in fig. 5, S131 may be specifically implemented by S1311 described below.
S1311, the VR device performs feature extraction on the preprocessed multi-view image by adopting an optical flow method, and determines at least one feature point.
In some examples, when the optical flow method is used to perform feature extraction on the preprocessed multi-view image, the display method provided by the embodiments of the present disclosure can reduce the computing resources the VR device needs to extract the feature points from the environment image.
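As an illustration, the sketch below uses pyramidal Lucas-Kanade optical flow (one common optical flow method, assumed here; the disclosure does not name a specific algorithm) to obtain feature points by tracking corners from a previous frame into the current preprocessed frame.

```python
import cv2
import numpy as np

def track_features_optical_flow(prev_gray, curr_gray):
    """Obtain feature points with sparse (pyramidal Lucas-Kanade) optical flow."""
    # Detect candidate corners in the previous frame.
    prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                       qualityLevel=0.01, minDistance=7)
    if prev_pts is None:
        return np.empty((0, 2), np.float32)
    # Track them into the current frame.
    curr_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, prev_pts, None)
    # Keep only successfully tracked points as the feature points of the current image.
    return curr_pts[status.flatten() == 1].reshape(-1, 2)

# Example with synthetic frames (real preprocessed fisheye frames in practice).
prev = np.random.randint(0, 255, (480, 640), np.uint8)
curr = np.roll(prev, 2, axis=1)   # simulate a small horizontal motion
print(track_features_optical_flow(prev, curr).shape)
```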
As an alternative embodiment of the present disclosure, in connection with fig. 3, as shown in fig. 4, S132 may be specifically implemented by S1320 described below.
S1320, the VR device performs feature matching on at least one feature point by adopting a block matching method, and determines the camera coordinates of each feature point under a camera coordinate system.
In some examples, when the VR device performs feature matching on the at least one feature point by using the block matching method, it searches neighboring blocks for the best-matching block, and the range and number of candidate blocks can be limited by certain constraints. For example, along a straight line segment, block matching yields the correspondence of feature points between the two images, from which the depth information of the feature points can be calculated (and converted into the world coordinate system). Then, according to the pixel positions of the feature points in a single image, their camera coordinates in the camera coordinate system can be obtained. In this way, feature matching efficiency can be greatly improved, ensuring the user experience.
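For a rectified image pair, OpenCV's StereoBM is one off-the-shelf block matcher that searches along image rows for the best-matching block; the sketch below uses it to obtain disparity and then depth. The matcher parameters, focal length, and baseline are placeholder values assumed for illustration, not values taken from the disclosure.

```python
import cv2
import numpy as np

def depth_from_block_matching(rect_left, rect_right, fx, baseline):
    """Dense block matching on a rectified pair; depth = fx * baseline / disparity."""
    matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    disparity = matcher.compute(rect_left, rect_right).astype(np.float32) / 16.0
    depth = np.zeros_like(disparity)
    valid = disparity > 0
    depth[valid] = fx * baseline / disparity[valid]   # convert disparity to metric depth
    return depth

# Example with synthetic rectified images (illustrative values: fx = 300 px, 6 cm baseline).
left = np.random.randint(0, 255, (480, 640), np.uint8)
right = np.roll(left, -4, axis=1)   # simulate a 4-pixel disparity
print(depth_from_block_matching(left, right, fx=300.0, baseline=0.06).shape)
```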
As an alternative embodiment of the present disclosure, in connection with fig. 3, as shown in fig. 4, the above S132 may be specifically implemented by the following S1321.
And S1321, the VR equipment performs feature matching according to the pixel coordinates corresponding to each feature point in the at least one feature point, and determines the camera coordinates of each feature point under the camera coordinate system.
In some examples, as can be seen from the front view of the VR device given in fig. 1, the VR device can simultaneously acquire 4 environment images of the current environment. Therefore, feature matching can be performed in each of the other 3 environment images based on the feature points contained in one of the environment images. Compared with performing feature extraction on the preprocessed environment image by the optical flow method, performing feature matching according to the pixel coordinates corresponding to each feature point can greatly improve the matching precision of the feature points.
Taking the environment image acquired by the VR device through the fisheye camera 1 as the first image and the environment image acquired through the fisheye camera 2 as the second image as an example, the process in which the VR device performs feature matching according to the pixel coordinates corresponding to each feature point and determines the camera coordinates of each feature point in the camera coordinate system is described as follows:
the VR device determines pixel coordinates corresponding to each feature point in the first image.
The VR device converts the pixel coordinates corresponding to each feature point in the first image into first camera coordinates according to the second conversion relationship between the planar coordinate system of the fisheye camera 1 (the pixel coordinate system of the captured image) and the camera coordinate system, together with preconfigured candidate depth values (e.g., 1 m to 2 m).
The VR device determines the pixel coordinates corresponding to each feature point in the second image under the different candidate depth values (e.g., 1 m to 2 m) according to the epipolar geometry constraint, the conversion relationship between the fisheye camera 1 and the fisheye camera 2 (including a rotation matrix R and a translation matrix t), and the first camera coordinates.
The VR device matches the pixel coordinates corresponding to each feature point in the first image with the pixel coordinates corresponding to each feature point in the second image under the different candidate depth values (e.g., 1 m to 2 m), and determines a first distance. The first distance is the distance between the pixel coordinates corresponding to a feature point in the first image and the pixel coordinates corresponding to the feature point in the second image under a given candidate depth value.
And the VR equipment determines the depth information corresponding to the minimum first distance as the depth information corresponding to the feature points in the first image.
The VR device determines the camera coordinates of the feature points in the first image according to the second conversion relationship between the planar coordinate system of the fisheye camera 2 and the camera coordinate system, and the depth information corresponding to the feature points.
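The Python sketch below is one plausible reading of this candidate-depth search: a feature point in the first image is back-projected at each candidate depth, transformed into the second camera with the rotation matrix R and translation t, projected into the second image, and the candidate depth whose projection lies closest to a detected feature point in the second image is kept. The intrinsics, extrinsics, candidate depth range, and feature positions are all placeholder values for illustration.

```python
import numpy as np

K1 = np.array([[300., 0., 320.], [0., 300., 240.], [0., 0., 1.]])   # intrinsics, camera 1
K2 = K1.copy()                                                        # intrinsics, camera 2
R = np.eye(3)                                                         # rotation camera 1 -> camera 2
t = np.array([0.06, 0.0, 0.0])                                        # translation camera 1 -> camera 2

def estimate_depth(p1, feats2, depths=np.linspace(1.0, 2.0, 21)):
    """Return the candidate depth whose reprojection into image 2 best matches a feature there."""
    best_depth, best_dist = None, np.inf
    ray = np.linalg.inv(K1) @ np.array([p1[0], p1[1], 1.0])   # normalized ray of p1 in camera 1
    for d in depths:
        X1 = ray * d                       # 3D point in camera-1 coordinates at this depth
        X2 = R @ X1 + t                    # same point in camera-2 coordinates
        proj = K2 @ X2
        p2 = proj[:2] / proj[2]            # predicted pixel coordinate in image 2
        dist = np.min(np.linalg.norm(feats2 - p2, axis=1))   # "first distance" to nearest feature
        if dist < best_dist:
            best_depth, best_dist = d, dist
    return best_depth

# Example: one feature in image 1 and a few detected features in image 2 (placeholder values).
features_img2 = np.array([[388.0, 260.0], [120.0, 40.0], [500.0, 310.0]])
print(estimate_depth((400.0, 260.0), features_img2))
```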
As an alternative embodiment of the present disclosure, the history data includes history map points; referring to fig. 2, as shown in fig. 3, S15 may be implemented by the following S150 and S151.
And S150, the VR equipment projects the historical map points and the current map points to determine the current depth map.
In some examples, with reference to the example given in S1321 above, the VR device applies a depth algorithm (e.g., a stereo matching algorithm) to the pixel coordinates of a feature point in the planar coordinate systems corresponding to different fisheye cameras, so that the depth information corresponding to the feature point can be determined. In this way, the VR device can determine, from the camera coordinates, the depth information, and the degree of freedom information, the map point corresponding to each feature point in the world coordinate system, where each map point corresponds to one world coordinate.
Specifically, the VR device stores in advance a fourth conversion relationship between the camera coordinate system corresponding to the different fisheye cameras and the world coordinate system. Such as: in combination with the front view of the VR device shown in fig. 1, the conversion relationship between the camera coordinate system of the fisheye camera 1 and the world coordinate system and the conversion relationship between the camera coordinate system of the fisheye camera 2 and the world coordinate system are stored in the VR device in advance.
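For illustration, the conversion from a camera coordinate system to the world coordinate system can be written as a rigid transform built from the camera pose (which, in the terms of this disclosure, comes from the degree of freedom information). The pose values in the sketch below are placeholders.

```python
import numpy as np

def camera_to_world(points_cam, R_wc, t_wc):
    """Map 3D points from a camera coordinate system into the world coordinate system.

    R_wc and t_wc form the camera-to-world rotation and translation of the pose.
    """
    points_cam = np.asarray(points_cam, dtype=float)
    return points_cam @ R_wc.T + t_wc

# Placeholder pose: camera 0.1 m above the world origin, rotated 10 degrees about the y axis.
angle = np.deg2rad(10.0)
R_wc = np.array([[ np.cos(angle), 0.0, np.sin(angle)],
                 [ 0.0,           1.0, 0.0          ],
                 [-np.sin(angle), 0.0, np.cos(angle)]])
t_wc = np.array([0.0, 0.1, 0.0])

feature_points_cam = np.array([[0.2, 0.0, 1.5], [-0.3, 0.1, 2.0]])
print(camera_to_world(feature_points_cam, R_wc, t_wc))   # current map points (world coordinates)
```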
In some examples, to ensure that the rendered images viewed by the user are all on the same plane, the display method provided by the embodiments of the present disclosure constructs a projection plane based on the gaze direction of the human eye of the user centered on the user's head. Wherein the eye gaze direction is perpendicular to the projection plane. Thus, after both the historical map points and the current map points are projected onto the projection plane, a depth map can be obtained. Because the depth map contains more characteristic points, the precision of vertex coordinates and texture coordinates corresponding to each map point can be improved.
Specifically, the depth values in the depth map determined by the VR device differ between scenes. For example, in an indoor scene, the distance from the VR device to objects is less than or equal to a distance threshold, so the VR device can accurately determine the depth information of the objects in the depth map. In an outdoor scene, the distance from the VR device to objects may be greater than the distance threshold, so the VR device may not be able to accurately determine the depth information of the objects in the depth map. Thus, the VR device may select the shape of the projection plane according to the current scene. For example, when the VR device determines that the current scene is an indoor scene, a spherical projection plane is selected so that the real scene can be displayed better; when the VR device determines that the current scene is an outdoor scene, a non-spherical projection plane is selected so that the depth information of objects whose distance from the VR device is greater than the distance threshold remains consistent in the depth map.
Specifically, projecting both the historical map points and the current map points onto a projection plane includes the steps of:
the projection plane is divided into a number of sparse grids.
The historical map points and the current map points are respectively projected into the corresponding sparse grid cells, thereby completing the process of projecting the historical map points and the current map points onto the projection plane.
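As a rough illustration of this projection step, the sketch below centers a spherical projection surface on the user's head, divides it into a sparse azimuth/elevation grid, and keeps, for each grid cell, the distance of the nearest projected map point as its depth value. The grid resolution, the sphere parametrization, and the nearest-point rule are assumptions made for illustration; the disclosure only states that the historical and current map points are projected into cells of a sparse grid.

```python
import numpy as np

def project_to_sphere_grid(map_points_world, head_center, grid=(32, 64)):
    """Project map points onto a head-centered spherical grid and keep a per-cell depth."""
    rows, cols = grid
    depth_map = np.full(grid, np.inf)
    rel = np.asarray(map_points_world, dtype=float) - head_center   # vectors from head to points
    r = np.linalg.norm(rel, axis=1)                                 # radial distance = depth value
    elev = np.arccos(np.clip(rel[:, 1] / np.maximum(r, 1e-9), -1, 1))   # 0..pi from the "up" axis
    azim = np.arctan2(rel[:, 2], rel[:, 0]) + np.pi                      # 0..2*pi
    i = np.clip((elev / np.pi * rows).astype(int), 0, rows - 1)
    j = np.clip((azim / (2 * np.pi) * cols).astype(int), 0, cols - 1)
    for ii, jj, rr in zip(i, j, r):
        depth_map[ii, jj] = min(depth_map[ii, jj], rr)              # nearest point per grid cell
    return depth_map

# Historical and current map points are projected into the same grid (placeholder points).
historical = np.random.uniform(-3, 3, (200, 3))
current = np.random.uniform(-3, 3, (100, 3))
depth_map = project_to_sphere_grid(np.vstack([historical, current]), head_center=np.zeros(3))
print(np.isfinite(depth_map).sum(), "grid cells received a depth value")
```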
And S151, the VR equipment determines at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to the current depth map and the degree of freedom information.
In some examples, when rendering the rendered image, the VR device needs to use a graphics processing unit (GPU) to calculate the vertex coordinate and texture coordinate of each map point in the depth map. For example, the depth map includes 100 map points and the image size of the environment image is 640×480. When the 100 map points are projected onto the environment image, 100 vertices can be obtained (the world coordinate of the map point corresponding to each vertex is its vertex coordinate), and the pixel information corresponding to each vertex gives its texture coordinate.
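Consistent with the 640×480 example above, the sketch below treats the world coordinate of each map point as a vertex coordinate and uses its projected pixel position, normalized by the image size, as the corresponding texture coordinate. The camera pose and intrinsics are placeholder values, and normalizing pixel positions to [0, 1] is a common GPU texture convention assumed here rather than something specified by the disclosure.

```python
import numpy as np

K = np.array([[300., 0., 320.], [0., 300., 240.], [0., 0., 1.]])   # placeholder intrinsics
R_cw = np.eye(3)                     # world-to-camera rotation (placeholder pose)
t_cw = np.zeros(3)                   # world-to-camera translation (placeholder pose)
width, height = 640, 480             # environment image size from the example above

def map_points_to_mesh(map_points_world):
    """Return (vertex_coords, texture_coords) for map points visible in the environment image."""
    vertices, texcoords = [], []
    for Xw in np.asarray(map_points_world, dtype=float):
        Xc = R_cw @ Xw + t_cw                        # world -> camera coordinates
        if Xc[2] <= 0:                               # behind the camera: not visible
            continue
        u, v, _ = (K @ Xc) / Xc[2]                   # pixel position in the environment image
        if 0 <= u < width and 0 <= v < height:
            vertices.append(Xw)                      # vertex coordinate = world coordinate
            texcoords.append((u / width, v / height))    # texture coordinate in [0, 1]
    return np.array(vertices), np.array(texcoords)

verts, uvs = map_points_to_mesh(np.random.uniform(-1, 1, (100, 3)) + [0, 0, 2])
print(len(verts), "vertices, each paired with one texture coordinate")
```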
As an alternative embodiment of the present disclosure, the historical data includes a historical depth map; referring to fig. 2, as shown in fig. 6, S15 may be implemented by the following S152 and S151.
And S152, the VR equipment projects the historical depth map and the current map point to determine the current depth map.
In some examples, at least one historical map point is included in the historical depth map. Therefore, when the history depth map is projected with the current map point, each history map point included in the history depth map is projected with the current map point, thereby determining the current depth map.
Specifically, the process of projecting each historical map point included in the historical depth map together with the current map points to determine the current depth map is similar to the process in which the VR device projects the historical map points and the current map points to determine the current depth map in S150, and is not described here again.
And S151, the VR equipment determines at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to the current depth map and the degree of freedom information.
The foregoing description of the solution provided by the embodiments of the present invention has been mainly presented in terms of a method. To achieve the above functions, it includes corresponding hardware structures and/or software modules that perform the respective functions. Those of skill in the art will readily appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as hardware or combinations of hardware and computer software. Whether a function is implemented as hardware or computer software driven hardware depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The embodiment of the invention can divide the functional modules of the virtual reality device according to the method example, for example, each functional module can be divided corresponding to each function, and two or more functions can be integrated in one processing module. The integrated modules may be implemented in hardware or in software functional modules. It should be noted that, in the embodiment of the present invention, the division of the modules is schematic, which is merely a logic function division, and other division manners may be implemented in actual implementation.
As shown in fig. 7, an embodiment of the present invention provides a schematic structural diagram of a virtual reality device 10. The virtual reality device 10 includes an acquisition unit 101, a processing unit 102, and a display unit 103.
An acquiring unit 101 is configured to acquire the multi-view image captured by the first camera. A processing unit 102 is configured to determine the degree of freedom information of the virtual reality device according to the multi-view image captured by the first camera and acquired by the acquiring unit 101, wherein the degree of freedom information includes degrees of freedom in at least six directions. The processing unit 102 is further configured to determine the feature points in the multi-view image acquired by the acquiring unit, and the camera coordinates of each feature point in the camera coordinate system. The processing unit 102 is further configured to convert the camera coordinates of the feature points in the camera coordinate system and determine at least one current map point, wherein each current map point corresponds to one world coordinate in the world coordinate system. The processing unit 102 is further configured to determine at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to the historical data, each current map point, and the degree of freedom information, wherein one vertex coordinate corresponds to one texture coordinate, and the historical data includes either historical map points or a historical depth map. The processing unit 102 is further configured to render the multi-view image according to the at least one vertex coordinate and the at least one texture coordinate corresponding to the current depth map, and determine a rendered image. The processing unit 102 is further configured to control the display unit 103 to display the rendered image.
As an optional embodiment of the present disclosure, the processing unit 102 is specifically configured to perform image preprocessing on the multi-view image acquired by the acquiring unit 101 and determine the preprocessed multi-view image, wherein the image preprocessing includes one or more of distortion correction and stereo correction; the processing unit 102 is specifically configured to perform feature extraction on the preprocessed multi-view image and determine at least one feature point; and the processing unit 102 is specifically configured to perform feature matching on the at least one feature point and determine the camera coordinates of each feature point in the camera coordinate system.
As an optional embodiment of the present disclosure, the processing unit 102 is specifically configured to input the preprocessed multi-view image into a preconfigured extraction model, and determine at least one feature point.
As an optional implementation manner of the present disclosure, the processing unit 102 is specifically configured to perform feature extraction on the preprocessed multi-view image by using an optical flow method, and determine at least one feature point.
As an optional implementation manner of the present disclosure, the processing unit 102 is specifically configured to perform feature matching on at least one feature point by using a block matching method, and determine a camera coordinate of each feature point under a camera coordinate system.
As an optional implementation manner of the present disclosure, the processing unit 102 is specifically configured to perform feature matching according to the pixel coordinates corresponding to each feature point in the at least one feature point, and determine the camera coordinates of each feature point in the camera coordinate system.
As an alternative embodiment of the present disclosure, the history data includes history map points; the processing unit 102 is specifically configured to project the historical map point and the current map point, and determine a current depth map; the processing unit 102 is specifically configured to determine at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to the current depth map and the degree of freedom information.
As an alternative embodiment of the present disclosure, the historical data includes a historical depth map; the processing unit 102 is specifically configured to project the historical depth map and the current map points, and determine the current depth map; the processing unit 102 is specifically configured to determine at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to the current depth map and the degree of freedom information. For all relevant details of each step in the above method embodiment, reference may be made to the functional descriptions of the corresponding functional modules, which are not repeated here.
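For illustration, a minimal sketch of deriving paired vertex and texture coordinates from the current depth map is given below, assuming a pinhole back-projection with intrinsics K and normalized pixel positions as texture coordinates; both choices are assumptions for the example rather than requirements of this disclosure.

```python
# Hedged sketch: one vertex coordinate and one paired texture coordinate per
# valid depth-map pixel.
import numpy as np

def depth_map_to_vertices_and_uvs(depth, K):
    h, w = depth.shape
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    v_idx, u_idx = np.nonzero(np.isfinite(depth) & (depth > 0))
    z = depth[v_idx, u_idx]
    # Back-project each valid pixel to a 3D vertex in the camera frame.
    vertices = np.stack([(u_idx - cx) * z / fx, (v_idx - cy) * z / fy, z], axis=1)
    # Texture coordinate paired with each vertex: pixel position normalized to [0, 1].
    uvs = np.stack([u_idx / (w - 1), v_idx / (h - 1)], axis=1)
    return vertices, uvs
```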
Of course, the virtual reality device 10 provided in the embodiment of the present invention includes, but is not limited to, the above modules, for example, the virtual reality device 10 may further include the storage unit 104. The storage unit 104 may be used for storing program code of the virtual reality device 10, and may also be used for storing data generated by the virtual reality device 10 during operation, such as data in a write request, etc.
Fig. 8 is a schematic structural diagram of a VR device according to an embodiment of the present invention, where, as shown in fig. 8, the VR device may include: at least one processor 51, a memory 52, a communication interface 53, a communication bus 54 and a display 55.
The following describes each component of the VR device in detail with reference to fig. 8:
the processor 51 is a control center of the VR device, and may be one processor or a collective name for a plurality of processing elements. For example, the processor 51 is a central processing unit (Central Processing Unit, CPU), but may also be an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits configured to implement embodiments of the present invention, such as one or more digital signal processors (DSPs), or one or more field programmable gate arrays (Field Programmable Gate Array, FPGAs).
In a particular implementation, as an example, the processor 51 may include one or more CPUs, such as CPU0 and CPU1 shown in fig. 8. Also, as an example, the VR device may include multiple processors, such as the processor 51 and the processor 56 shown in fig. 8. Each of these processors may be a single-core processor (Single-CPU) or a multi-core processor (Multi-CPU). A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
The memory 52 may be, but is not limited to, a read-only memory (Read-Only Memory, ROM) or other type of static storage device that can store static information and instructions, a random access memory (Random Access Memory, RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), a compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM) or other optical disk storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 52 may be stand-alone and coupled to the processor 51 via the communication bus 54. The memory 52 may also be integrated with the processor 51.
In a specific implementation, the memory 52 is used to store data of the present invention and software programs for executing the present invention. The processor 51 may perform various functions of the VR device by running or executing a software program stored in the memory 52 and calling data stored in the memory 52.
The communication interface 53 uses any transceiver-like means for communicating with other devices or communication networks, such as a radio access network (Radio Access Network, RAN), a wireless local area network (Wireless Local Area Networks, WLAN), a terminal, a cloud, etc. The communication interface 53 may include an acquisition unit 101 to implement an acquisition function.
The communication bus 54 may be an industry standard architecture (Industry Standard Architecture, ISA) bus, a peripheral component interconnect (Peripheral Component Interconnect, PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, among others. The bus may be classified as an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in fig. 8, but this does not mean that there is only one bus or only one type of bus.
The display screen 55 is used to display images, videos, and the like. The display screen 55 includes a display panel. The display panel may employ a liquid crystal display (liquid crystal display, LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (quantum dot light emitting diodes, QLED), or the like. In some embodiments, the VR device may include 1 or N display screens 55, N being a positive integer greater than 1.
As an example, in connection with fig. 7, the function realized by the acquisition unit 101 in the virtual reality device 10 is the same as the function of the communication interface 53 in fig. 8, the function realized by the processing unit 102 in the virtual reality device 10 is the same as the function of the processor 51 in fig. 8, the function realized by the display unit 103 in the virtual reality device 10 is the same as the function of the display screen 55 in fig. 8, and the function realized by the storage unit 104 in the virtual reality device 10 is the same as the function of the memory 52 in fig. 8.
Another embodiment of the present invention also provides a computer-readable storage medium having a computer program stored thereon, which when executed by a computing device, causes the computing device to perform the method shown in the above-described method embodiment.
In some embodiments, the disclosed methods may be implemented as computer program instructions encoded on a computer-readable storage medium in a machine-readable format or encoded on other non-transitory media or articles of manufacture.
Fig. 9 schematically illustrates a conceptual partial view of a computer program product provided by an embodiment of the invention, the computer program product comprising a computer program for executing a computer process on a computing device.
In one embodiment, a computer program product is provided using a signal bearing medium 410. The signal bearing medium 410 may include one or more program instructions that, when executed by one or more processors, may provide the functionality or portions of the functionality described above with respect to fig. 2. Thus, for example, referring to the embodiment shown in fig. 2, one or more features of S11-S17 may be carried by one or more instructions associated with the signal bearing medium 410. Further, the program instructions in fig. 9 are likewise described by way of example.
In some examples, the signal bearing medium 410 may comprise a computer readable medium 411 such as, but not limited to, a hard disk drive, a compact disc (CD), a digital video disc (DVD), a digital tape, memory, a read-only memory (ROM), or a random access memory (random access memory, RAM), among others.
In some implementations, the signal bearing medium 410 may include a computer recordable medium 412 such as, but not limited to, memory, read/write (R/W) CD, R/W DVD, and the like.
In some implementations, the signal bearing medium 410 may include a communication medium 413 such as, but not limited to, a digital and/or analog communication medium (e.g., fiber optic cable, waveguide, wired communications link, wireless communications link, etc.).
The signal bearing medium 410 may be conveyed by a communication medium 413 in wireless form (e.g., a wireless communication medium conforming to the IEEE 802.11 standard or other transmission protocol). The one or more program instructions may be, for example, computer-executable instructions or logic-implemented instructions.
In some examples, a virtual reality device such as described with respect to fig. 7 may be configured to provide various operations, functions, or actions in response to program instructions through one or more of computer readable medium 411, computer recordable medium 412, and/or communication medium 413.
From the foregoing description of the embodiments, it will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of functional modules is illustrated, and in practical application, the above-described functional allocation may be implemented by different functional modules according to needs, i.e. the internal structure of the apparatus is divided into different functional modules to implement all or part of the functions described above.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another apparatus, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and the parts displayed as units may be one physical unit or a plurality of physical units, may be located in one place, or may be distributed in a plurality of different places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, an optical disk, or the like.
The foregoing is merely a specific embodiment of the disclosure to enable one skilled in the art to understand or practice the disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown and described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (11)
1. A display method, applied to a virtual reality device, the virtual reality device including a first camera, comprising:
acquiring a multi-view image acquired by the first camera;
determining degree of freedom information of the virtual reality device according to the multi-view image; wherein the degree of freedom information includes degrees of freedom in at least six directions;
determining feature points in the multi-view image and camera coordinates of each feature point under a camera coordinate system;
converting camera coordinates of the feature points under a camera coordinate system, and determining at least one current map point; wherein each current map point corresponds to a world coordinate in the world coordinate system;
determining at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to historical data, each current map point and the degree of freedom information; wherein, a vertex coordinate corresponds to a texture coordinate, and the historical data comprises any one of historical map points and a historical depth map;
rendering the multi-view image according to at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map, and determining a rendered image;
and controlling the virtual reality device to display the rendered image.
2. The display method according to claim 1, wherein determining feature points in the multi-view image and camera coordinates of each of the feature points in a camera coordinate system includes:
performing image preprocessing on the multi-view image, and determining the preprocessed multi-view image; wherein the image preprocessing includes one or more of distortion correction and stereo correction;
extracting features of the preprocessed multi-view images, and determining at least one feature point;
and carrying out feature matching on the at least one feature point, and determining the camera coordinates of each feature point under a camera coordinate system.
3. The display method according to claim 2, wherein performing feature extraction on the preprocessed multi-view image to determine at least one feature point includes:
and inputting the preprocessed multi-view images into a preconfigured extraction model, and determining at least one characteristic point.
4. The display method according to claim 2, wherein performing feature extraction on the preprocessed multi-view image to determine at least one feature point includes:
and extracting the characteristics of the preprocessed multi-view images by adopting an optical flow method, and determining at least one characteristic point.
5. The display method according to claim 2, wherein the feature matching the at least one feature point, determining camera coordinates of each of the feature points in a camera coordinate system, includes:
and carrying out feature matching on the at least one feature point by adopting a block matching method, and determining the camera coordinates of each feature point under a camera coordinate system.
6. The display method according to claim 2, wherein the feature matching the at least one feature point, determining camera coordinates of each of the feature points in a camera coordinate system, includes:
and carrying out feature matching according to the pixel coordinates corresponding to each feature point in the at least one feature point, and determining the camera coordinates of each feature point under a camera coordinate system.
7. The display method according to any one of claims 1 to 6, wherein the historical data includes historical map points;
and determining at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to the historical data, each current map point and the degree of freedom information, wherein the determining comprises the following steps:
projecting the historical map points and the current map points, and determining a current depth map;
and determining at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to the current depth map and the degree of freedom information.
8. The display method of any one of claims 1-6, wherein the historical data comprises a historical depth map;
and determining at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to the historical data, each current map point and the degree of freedom information, wherein the determining comprises the following steps:
projecting the historical depth map and the current map points, and determining the current depth map;
and determining at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to the current depth map and the degree of freedom information.
9. A virtual reality device, the virtual reality device comprising a first camera, comprising:
the acquisition unit is used for acquiring the multi-view images acquired by the first camera;
the processing unit is used for determining the degree of freedom information of the virtual reality device according to the multi-view images acquired by the acquisition unit; wherein the degree of freedom information includes degrees of freedom in at least six directions;
the processing unit is further used for determining the characteristic points in the multi-view images acquired by the acquisition unit and camera coordinates of each characteristic point under a camera coordinate system;
the processing unit is further used for converting camera coordinates of the feature points in a camera coordinate system and determining at least one current map point; wherein each current map point corresponds to a world coordinate in the world coordinate system; the processing unit is further used for determining at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map according to historical data, each current map point and the degree of freedom information; wherein, a vertex coordinate corresponds to a texture coordinate, and the historical data comprises any one of historical map points and a historical depth map;
the processing unit is further used for rendering the multi-view image according to at least one vertex coordinate and at least one texture coordinate corresponding to the current depth map, and determining a rendered image;
the processing unit is further used for controlling the display unit to display the rendered image.
10. An electronic device, comprising: a memory and a processor, the memory for storing a computer program; the processor is configured to cause the electronic device to implement the display method of any one of claims 1-8 when executing the computer program.
11. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a computing device, causes the computing device to implement the display method of any of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210828381.7A CN117455974A (en) | 2022-07-13 | 2022-07-13 | Display method and device and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210828381.7A CN117455974A (en) | 2022-07-13 | 2022-07-13 | Display method and device and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117455974A true CN117455974A (en) | 2024-01-26 |
Family
ID=89591547
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210828381.7A Pending CN117455974A (en) | 2022-07-13 | 2022-07-13 | Display method and device and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117455974A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118195968A (en) * | 2024-03-22 | 2024-06-14 | 哈尔滨工业大学 | Structural response monitoring method based on deep learning and fisheye camera |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||