CN117689826A - Three-dimensional model construction and rendering method, device, equipment and medium


Info

Publication number: CN117689826A
Application number: CN202211103452.3A
Authority: CN (China)
Inventor: 范帝楷
Assignee (current and original): Beijing Zitiao Network Technology Co Ltd
Original language: Chinese (zh)
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00: Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 17/05: Geographic models
    • G06T 19/00: Manipulating 3D models or images for computer graphics
    • G06T 19/006: Mixed reality
    • G06T 7/00: Image analysis
    • G06T 7/70: Determining position or orientation of objects or cameras
    • G06T 7/73: Determining position or orientation of objects or cameras using feature-based methods


Abstract

The application provides a three-dimensional model construction and rendering method, device, equipment and medium. The method comprises the following steps: acquiring a plurality of pieces of pose information and a plurality of first environment images provided by a visual inertial odometer; performing, according to the pose information, feature point matching on the image feature points extracted from each first environment image to obtain matching points; determining three-dimensional point coordinates according to the matching points; constructing a point cloud map according to the three-dimensional point coordinates and the pose information; and constructing a three-dimensional model of the real world according to the point cloud map and the first environment images. The method and the device thereby construct a real-world three-dimensional model using the pose information and the first environment images provided by the visual inertial odometer.

Description

Three-dimensional model construction and rendering method, device, equipment and medium
Technical Field
The embodiment of the application relates to the technical field of data processing, in particular to a three-dimensional model construction and rendering method, device, equipment and medium.
Background
With the rapid development of Virtual Reality (VR) technology, public demand for Mixed Reality (MR) applications on VR devices keeps growing. An MR application requires that the user, while seeing the real world through the VR device, also sees the interaction between virtual objects and the real world. It is therefore desirable to construct a three-dimensional model of the real world through which the user can see the real world through the VR device, as well as the interactions between virtual objects and the real world.
Disclosure of Invention
The embodiment of the application provides a three-dimensional model construction and rendering method, device, equipment and medium, which can construct a real-world three-dimensional model.
In a first aspect, an embodiment of the present application provides a three-dimensional model building method, which is executed by a terminal device, where the method includes:
acquiring a plurality of pose information and a plurality of first environment images provided by a visual inertial odometer;
according to the pose information, performing feature point matching on the image feature points extracted from each first environment image to obtain matching points;
determining three-dimensional point coordinates according to the matching points;
constructing a point cloud map according to the three-dimensional point coordinates and the pose information;
and constructing a three-dimensional model of the real world according to the point cloud map and the first environment image.
In a second aspect, an embodiment of the present application provides a rendering method, which is performed by a virtual reality device, where a camera is installed on the virtual reality device, and the method includes:
acquiring a second environment image acquired by the camera;
determining current pose information of the virtual reality equipment according to the second environment image and a pre-constructed point cloud map;
selecting a target local area from a pre-constructed three-dimensional model according to the current pose information and the view cone range of the camera;
rendering according to the target local area and the second environment image;
wherein the pre-built point cloud map and the three-dimensional model are built based on the three-dimensional model building method according to the embodiments of the first aspect.
In a third aspect, an embodiment of the present application provides a three-dimensional model building apparatus configured in a terminal device, including:
the data acquisition module is used for acquiring a plurality of pose information and a plurality of first environment images provided by the visual inertial odometer;
the feature matching module is used for carrying out feature point matching on the image feature points extracted from each first environment image according to the pose information to obtain matching points;
the coordinate determining module is used for determining three-dimensional point coordinates according to the matching points;
the map construction module is used for constructing a point cloud map according to the three-dimensional point coordinates and the pose information;
and the model construction module is used for constructing a three-dimensional model of the real world according to the point cloud map and the first environment image.
In a fourth aspect, an embodiment of the present application provides a rendering apparatus configured in a virtual reality device, on which a camera is installed, including:
The image acquisition module is used for acquiring a second environment image acquired by the camera;
the pose determining module is used for determining current pose information of the virtual reality equipment according to the second environment image and a pre-constructed point cloud map;
the region determining module is used for selecting a target local region from a pre-constructed three-dimensional model according to the current pose information and the view cone range of the camera;
the rendering module is used for rendering according to the target local area and the second environment image;
wherein the pre-built point cloud map and the three-dimensional model are built based on the three-dimensional model building device according to the embodiment of the third aspect.
In a fifth aspect, embodiments of the present application provide an electronic device, including:
a processor and a memory for storing a computer program, the processor being adapted to invoke and run the computer program stored in the memory for performing the three-dimensional model building method as described in the embodiments of the first aspect or for performing the rendering method as described in the embodiments of the second aspect.
In a sixth aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program, where the computer program causes a computer to execute the three-dimensional model building method according to the embodiment of the first aspect or the rendering method according to the embodiment of the second aspect.
In a seventh aspect, embodiments of the present application provide a computer program product comprising program instructions which, when run on an electronic device, cause the electronic device to perform the three-dimensional model building method according to the embodiments of the first aspect, or to perform the rendering method according to the embodiments of the second aspect.
The technical scheme disclosed by the embodiment of the application has at least the following beneficial effects:
according to the three-dimensional model construction method, the plurality of pose information and the plurality of first environment images provided by the visual inertial odometer are obtained, feature point matching is conducted on image feature points extracted from each first environment image according to the pose information to obtain matching points, then a point cloud map is constructed according to three-dimensional point coordinates and pose information determined by the matching points, and then a three-dimensional model of the real world is constructed according to the point cloud map and the first environment images. The method and the device realize that the pose information and the first environment image provided by the visual inertial odometer are utilized to construct the three-dimensional model of the real world, so that when a user uses the mixed reality application on the virtual reality device, the user can see the real world based on the three-dimensional model and simultaneously see the interaction between the virtual object and the real world, and the use requirement of the user is met.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a three-dimensional model construction method according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of another three-dimensional model construction method according to an embodiment of the present application;
fig. 3 is a flow chart of a rendering method according to an embodiment of the present application;
fig. 4 is a schematic diagram of pose information respectively output by the point cloud map and the SLAM model provided in the embodiment of the present application;
FIG. 5 is a schematic block diagram of a three-dimensional model building apparatus provided in an embodiment of the present application;
FIG. 6 is a schematic block diagram of a rendering apparatus provided by an embodiment of the present application;
FIG. 7 is a schematic block diagram of an electronic device provided by an embodiment of the present application;
fig. 8 is a schematic block diagram of an electronic device provided in an embodiment of the present application as an HMD.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present application in light of the embodiments herein.
It should be noted that the terms "first," "second," and the like in the description and claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate, such that the embodiments of the present application described herein may be implemented in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The method and the device are suitable for constructing a three-dimensional model of the real world on Virtual Reality (VR) equipment. With the development of VR devices, the public's demand for using Mixed Reality (MR) applications on VR devices keeps growing. An MR application requires that the user sees the real world through the VR device while also seeing the interaction between virtual objects and the real world. It is thus desirable to construct, on the VR device, a three-dimensional model corresponding to the real world, so that through the three-dimensional model the user can see the real world through the VR device and see the interaction between virtual objects and the real world. Therefore, the three-dimensional model construction method of the present application is designed so that a three-dimensional model corresponding to the real world can be constructed for the VR device; when using an MR application on the VR device, the user can then see the real world through the three-dimensional model while also seeing the interaction between virtual objects and the real world, meeting the user's needs.
In order to facilitate understanding of embodiments of the present application, before describing various embodiments of the present application, some concepts related to all embodiments of the present application are first appropriately explained, specifically as follows:
1) Virtual Reality (VR) is a technology for creating and experiencing a virtual world. It generates a virtual environment on a computer: a fused, interactive, three-dimensional dynamic view with simulation of physical behavior, built from multi-source information (the virtual reality mentioned herein at least includes visual perception, and may also include auditory perception, tactile perception, motion perception, and even gustatory and olfactory perception), immersing the user in the simulated virtual reality environment. It supports applications in various virtual environments such as maps, games, videos, education, medical treatment, simulation, collaborative training, sales, assistance in manufacturing, and maintenance and repair.
2) A virtual reality device (VR device) may be provided in the form of glasses, a head mounted display (Head Mount Display, abbreviated as HMD), or a contact lens for realizing visual perception and other forms of perception, but the form of the virtual reality device is not limited thereto, and may be further miniaturized or enlarged according to actual needs.
Optionally, the virtual reality device described in the embodiments of the present application may include, but is not limited to, the following types:
2.1) Computer-side virtual reality (PCVR) equipment, which uses the PC side to perform the computation and data output related to the virtual reality function; the external computer-side virtual reality equipment uses the data output by the PC side to realize the virtual reality effect.
2.2) Mobile virtual reality equipment, which supports setting up a mobile terminal (e.g., a smartphone) in various ways (e.g., a head-mounted display provided with a dedicated card slot); connected to the mobile terminal by wire or wirelessly, the mobile terminal performs the computation related to the virtual reality function and outputs data to the mobile virtual reality equipment, e.g., viewing virtual reality video through an APP of the mobile terminal.
2.3) Integrated (all-in-one) virtual reality equipment, which has a processor for performing the computation related to virtual reality functions, and therefore has independent virtual reality input and output functions, does not need to be connected to a PC or a mobile terminal, and offers a high degree of freedom of use.
3) Mixed Reality (MR): creating new environments and visualizations by combining the real and virtual worlds, in which physical entities and digital objects coexist and can interact in real time; it mixes reality, augmented reality, augmented virtuality, and virtual reality technologies. MR is the combination of Virtual Reality (VR) and Augmented Reality (AR) and an extension of VR technology; by presenting virtual scenes within real scenes, it can increase the realism of the user experience. The MR field relates to computer vision, the science of how to make machines "see": cameras and computers replace human eyes to perform machine vision tasks such as recognition, tracking, and measurement of targets, followed by image processing, so that images are processed by the computer into forms more suitable for human eyes to observe or for transmission to instruments for detection.
That is, MR is a simulated scenery that integrates computer-created sensory input (e.g., virtual objects) with sensory input from a physical scenery or a representation thereof. In some MR sceneries, the computer-created sensory input may adapt to changes in the sensory input from the physical scenery. In addition, some electronic systems for rendering MR sceneries may monitor orientation and/or position relative to the physical scenery to enable virtual objects to interact with real objects (i.e., physical elements from the physical scenery or representations thereof). For example, the system may monitor movement such that a virtual plant appears stationary relative to a physical building.
4) A virtual scene is a virtual scene that an application program displays (or provides) when running on an electronic device. The virtual scene may be a simulation environment for the real world, a semi-simulation and semi-fictional virtual scene, or a pure fictional virtual scene. The virtual scene may be any one of a two-dimensional virtual scene, a 2.5-dimensional virtual scene or a three-dimensional virtual scene, and the dimension of the virtual scene is not limited in the embodiment of the present application. For example, a virtual scene may include sky, land, sea, etc., the land may include environmental elements of a desert, city, etc., and a user may control a virtual object to move in the virtual scene.
5) Virtual objects, objects that interact in a virtual scene, objects that are under the control of a user or a robot program (e.g., an artificial intelligence based robot program) are capable of being stationary, moving, and performing various actions in the virtual scene, such as various characters in a game, and the like.
Having described some of the concepts to which the present application relates, a detailed description of a three-dimensional model building method according to an embodiment of the present application is provided below with reference to the accompanying drawings. First, taking an execution body as a terminal device as an example, a three-dimensional model construction method in the embodiment of the present application will be described.
Fig. 1 is a schematic flow chart of a three-dimensional model construction method according to an embodiment of the present application. The embodiment of the application is suitable for constructing a real-world three-dimensional model scene, and the three-dimensional model constructing method can be executed by a three-dimensional model constructing device. The three-dimensional model building means may consist of hardware and/or software and may be integrated in the terminal device. In the embodiment of the present application, the terminal device may be any hardware device having a data processing function, such as a server and a virtual reality device.
As shown in fig. 1, the three-dimensional model construction method may include the steps of:
S101, acquiring a plurality of pose information and a plurality of first environment images provided by a visual inertial odometer.
A visual inertial odometer (Visual Inertial Odometry, abbreviated as VIO) is a visual sensing device composed of a camera and an inertial measurement unit (Inertial Measurement Unit, abbreviated as IMU), which implements a simultaneous localization and mapping (Simultaneous Localization and Mapping, abbreviated as SLAM) algorithm based on the data collected by the camera and the IMU.
In the embodiment of the present application, the first environment image refers to an environment image of the real world (real space) where the user is located when the virtual reality device is used.
Consider that a user needs to see the real space when using an MR application on a virtual reality device. Thus, when the user starts any MR application on the virtual reality device, the MR application invokes a model identification process to identify whether a three-dimensional model corresponding to the real space has been built on the virtual reality device. If the three-dimensional model exists, a positioning operation is entered to determine the pose information of the virtual reality device. If the three-dimensional model does not exist, a model construction mechanism is triggered to construct a three-dimensional model of the real space where the virtual reality equipment is currently located. Whether a three-dimensional model corresponding to the real space exists on the virtual reality device can be identified within a preset identification duration. In the embodiments of the present application, the preset identification duration may be set flexibly according to actual application needs, for example, 10 seconds (s). In other words, in the process of identifying whether the three-dimensional model exists on the virtual reality device, once the time spent on identification exceeds the preset identification duration, it is determined that the three-dimensional model does not exist on the virtual reality device, and the model construction mechanism is automatically triggered to carry out three-dimensional model construction.
Further, a user may move from one real space to another, such as from a living room to a bedroom, with a three-dimensional model of the living room built on the virtual reality device in advance. When the user then uses the MR application on the virtual reality device in the bedroom, although it can be recognized that a three-dimensional model exists on the virtual reality device, positioning based on that three-dimensional model cannot succeed. At this time, the model building mechanism is automatically triggered to perform a three-dimensional model building operation on the real space where the device currently is, such as the bedroom.
In order to avoid that the MR application cannot be normally used by a user due to the construction of a three-dimensional model of a real space, an initial real-time mapping mechanism is configured on the virtual reality device in advance. When the three-dimensional model of the current real space is not available on the virtual reality equipment, the mapping and positioning operation can be performed based on the initial real-time mapping mechanism, so that the user can normally use the MR application. And when the initial real-time map construction mechanism is utilized for map construction and positioning, the visual inertial odometer on the virtual reality equipment can output pose information of the virtual reality equipment and a corresponding first environment image in real time. And then the terminal equipment can construct a high-precision three-dimensional model corresponding to the current real space based on the pose information and the first environment image.
In this embodiment of the present application, the terminal device may be a virtual reality device, or may be a cloud server that establishes communication connection with the virtual reality device. Therefore, the method for constructing the three-dimensional model by acquiring the real-time pose information output by the visual inertial odometer and the real-time first environment image can comprise the following conditions:
case one
When the terminal equipment is the virtual reality equipment, after the virtual reality equipment receives the real-time pose information and the real-time first environment images output by the visual inertial odometer, it can store them; when the virtual reality equipment is determined to be in a non-working state, such as a standby state or a charging state, the stored pose information and first environment images are read to perform the three-dimensional model construction operation. After the three-dimensional model is built, it is stored so that the three-dimensional model can be directly used for positioning when the user subsequently uses the MR application on the virtual reality device in that real space again.
That is, by storing the acquired plurality of pose information and the plurality of first environment images and constructing the three-dimensional model based on the stored plurality of pose information and the plurality of first environment images in the non-working state, the virtual reality device can be ensured to have enough resources for MR application and other functions to normally operate, power consumption can be further reduced, and standby time of the virtual reality device can be prolonged.
Case two
When the terminal equipment is a cloud server, the cloud server may receive the real-time pose information and real-time first environment images output by the visual inertial odometer and sent by the virtual reality equipment, and construct a three-dimensional model corresponding to the real space based on the received real-time pose information and real-time first environment images. After the three-dimensional model is built, it is sent to the virtual reality device, so that when the user uses the MR application on the virtual reality device in that real space again, the three-dimensional model can be directly used for positioning. The three-dimensional model of the current real space is thereby built in real time without affecting the user's use of the MR application on the virtual reality equipment.
And S102, carrying out feature point matching on the image feature points extracted from each first environment image according to the pose information to obtain matching points.
The characteristic points of the image refer to more obvious points or representative points in the image, such as corner points, contour points, bright points in darker areas, dark points in lighter areas and the like in the image. The image feature point is composed of two parts, specifically, a key point and a descriptor (superpoint).
It should be noted that, in the present application, the image feature points extracted from each first environmental image are mainly descriptors in the extracted image feature points.
When the terminal equipment performs three-dimensional model construction based on the acquired pose information and the acquired first environment images, the image feature points can be extracted from each first environment image, and then feature point matching is performed on the extracted feature points according to the pose information so as to obtain matching points.
Specifically, a feature point extraction algorithm may be used to extract the image feature points from each first environment image. The feature point extraction algorithm may include: the SIFT (Scale-Invariant Feature Transform) extraction algorithm, the SURF (Speeded-Up Robust Features) extraction algorithm, the ORB (Oriented FAST and Rotated BRIEF) extraction algorithm, and the like, which are not particularly limited herein.
After the image feature points are extracted, two frames of first environment images close to a given data frame (the target first environment images) may be determined according to the data frame corresponding to each piece of pose information, for example, the frames on either side of a first environment image, the two adjacent frames on its left, or the two adjacent frames on its right. Feature point matching is then performed on the image feature points of the two determined frames of first environment images to obtain matching points.
When matching the image feature points, a feature point matching algorithm can be invoked to perform the feature point matching operation: the distances between the image feature points (descriptors) on the two frames of first environment images are calculated, and the pair of feature points with the minimum distance is selected as a matching point. The feature point matching algorithm may be the brute-force matching algorithm BFMatcher, or another feature point matching algorithm, which is not particularly limited herein.
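As an illustrative sketch only (not the claimed implementation), the extraction and matching steps described above can be prototyped with OpenCV; the image paths, the choice of ORB, and the BFMatcher parameters below are assumptions for illustration:

```python
import cv2

# Load two adjacent first environment images (paths are placeholders).
img1 = cv2.imread("env_frame_0.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("env_frame_1.png", cv2.IMREAD_GRAYSCALE)

# Extract image feature points: key points plus descriptors (ORB here;
# SIFT or SuperPoint descriptors would slot in the same way).
orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-force matching: for each descriptor in img1, keep the candidate
# in img2 with the smallest descriptor distance (crossCheck prunes
# asymmetric matches).
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)
print(f"{len(matches)} candidate matching points")
```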
That is, according to pose information, the present application performs feature point matching on image feature points extracted from each first environmental image to obtain matching points, which specifically includes: and selecting a target first environment image from the plurality of first environment images according to the pose information, and performing feature point matching on image feature points of the target first environment image to obtain matching points.
Considering that the matching points obtained by performing feature point matching on the image feature points extracted from each first environment image may contain erroneous matches, the present application performs error filtering on the matching points.
Specifically, a base (fundamental) matrix may be calculated based on the two frames of first environment images corresponding to the matching points, and then the base matrix and the positions of the two feature points of each matching pair are substituted into the formula $D = x'^{\top} F x$ to calculate a distance value, wherein D is the distance value, x' is the position of the first feature point of the matching pair, F is the base matrix, x is the position of the second feature point of the matching pair, and $\top$ denotes the transpose. The distance value is then compared with a preset distance value: if the distance value is larger than the preset distance value, the matching point is an incorrect match and is filtered out; if the distance value is smaller than or equal to the preset distance value, the matching point is a correct match and is retained. The preset distance value may be set flexibly according to the filtering precision and is not particularly limited herein; for example, it may be set to 3 or another value.
It should be noted that, in the present application, calculating a base matrix based on two frames of the first environmental images corresponding to the matching points is a conventional technology in the art, and will not be described herein in detail.
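A minimal sketch of the error filtering described above, assuming matched pixel positions from the two frames are available as NumPy arrays; the preset distance value of 3 follows the example value in the text:

```python
import cv2
import numpy as np

def filter_matches_epipolar(pts1, pts2, preset_distance=3.0):
    """Drop matches whose epipolar residual D = x'^T F x is too large.

    pts1, pts2: (N, 2) float arrays of matched pixel positions in the
    two frames of first environment images.
    """
    F, _ = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)
    # Homogeneous coordinates for each matched feature point.
    h1 = np.hstack([pts1, np.ones((len(pts1), 1))])
    h2 = np.hstack([pts2, np.ones((len(pts2), 1))])
    # Evaluate D = x'^T F x per match and keep small-residual pairs.
    D = np.abs(np.einsum("ni,ij,nj->n", h2, F, h1))
    keep = D <= preset_distance
    return pts1[keep], pts2[keep]
```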
And S103, determining three-dimensional point coordinates according to the matching points.
Considering that the same feature point is observed in a plurality of first environment images, one feature point may have a plurality of matches. Based on this, the multiple frames of first environment images corresponding to a matching point can be determined, together with the pose information corresponding to each determined frame; the parallax angles are then determined based on the determined pose information. The maximum parallax angle is then selected from the plurality of parallax angles, and the matching point corresponding to the maximum parallax angle is used as the target matching point for constructing the three-dimensional model. Selecting the matching point corresponding to the maximum parallax angle as the target matching point ensures the numerical stability of the subsequent three-dimensional model construction.
Since each feature point in the screened target matching points is a two-dimensional feature point, while three-dimensional feature points are needed for constructing the three-dimensional model, the two-dimensional feature points need to be converted into three-dimensional feature points. Specifically, each matching point may be triangulated to obtain the three-dimensional point coordinates corresponding to each matching point.
Triangulation of matching points is a conventional technique in the art, and is not described in detail herein.
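Under the assumption that the VIO poses are expressed as world-to-camera [R|t] matrices (a convention not fixed by the text), the triangulation step can be sketched with OpenCV as follows:

```python
import cv2
import numpy as np

def triangulate(K, pose1, pose2, pts1, pts2):
    """Triangulate matched 2D feature points into 3D point coordinates.

    K: (3, 3) camera intrinsics; pose1, pose2: (3, 4) world-to-camera
    extrinsics taken from the VIO pose information; pts1, pts2: (N, 2)
    matched pixel coordinates in the two frames.
    """
    P1 = K @ pose1
    P2 = K @ pose2
    # OpenCV expects 2xN arrays; the result is 4xN homogeneous points.
    X_h = cv2.triangulatePoints(P1, P2,
                                pts1.T.astype(float), pts2.T.astype(float))
    return (X_h[:3] / X_h[3]).T  # Nx3 three-dimensional point coordinates
```

The selection of the maximum-parallax-angle match described above would run before this step and is omitted here.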
And S104, constructing a point cloud map according to the three-dimensional point coordinates and the pose information.
Specifically, a bundle adjustment (BA) algorithm may be used to perform adjustment calculation on the three-dimensional point coordinates and the plurality of pieces of pose information provided by the visual inertial odometer, so as to obtain optimized three-dimensional point coordinates. A three-dimensional point cloud map of the real space where the virtual reality equipment is currently located is then constructed based on the optimized three-dimensional point coordinates. The three-dimensional point cloud map can further assist subsequent positioning operations of the virtual reality device.
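A compact sketch of how such a bundle adjustment could be set up with SciPy, minimizing reprojection error over camera poses and 3D points; the parameterization (rotation vector plus translation) and the observation layout are assumptions, not the patent's prescribed solver:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def reprojection_residuals(params, n_cams, n_pts, K, obs_cam, obs_pt, obs_uv):
    """Residuals for BA: 6 parameters per camera (rotation vector and
    translation) followed by 3 parameters per three-dimensional point."""
    cams = params[:n_cams * 6].reshape(n_cams, 6)
    pts = params[n_cams * 6:].reshape(n_pts, 3)
    R = Rotation.from_rotvec(cams[obs_cam, :3])
    p_cam = R.apply(pts[obs_pt]) + cams[obs_cam, 3:]  # into camera frames
    uv = (K @ p_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                       # pinhole projection
    return (uv - obs_uv).ravel()

def bundle_adjust(x0, n_cams, n_pts, K, obs_cam, obs_pt, obs_uv):
    # obs_cam[i], obs_pt[i]: which camera observed which point at obs_uv[i].
    res = least_squares(reprojection_residuals, x0, method="trf",
                        args=(n_cams, n_pts, K, obs_cam, obs_pt, obs_uv))
    return res.x  # optimized poses and three-dimensional point coordinates
```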
S105, constructing a real-world three-dimensional model according to the point cloud map and the first environment image.
Wherein the three-dimensional model is specifically an environmental grid model.
It should be understood that, by constructing an environment grid model, the present application can better approximate the surfaces of complex objects in real space, such as human bodies, furniture, or objects like teapots.
The optimized three-dimensional point coordinates used for constructing the point cloud map are sparse three-dimensional points, whereas three-dimensional reconstruction, that is, constructing a three-dimensional model of the real world, requires a large number of dense three-dimensional points. Therefore, the present application performs dense reconstruction on the optimized three-dimensional point coordinates to obtain denser three-dimensional points.
In addition, dense reconstruction of the optimized three-dimensional point coordinates relies on environment images. Therefore, the present application performs dense reconstruction on the optimized three-dimensional point coordinates together with the first environment images provided by the visual inertial odometer to obtain dense three-dimensional points; a three-dimensional model of the real world is then constructed based on the dense three-dimensional points.
Because the three-dimensional model constructed by the present application is an environment grid model, when the three-dimensional model of the real world is constructed based on the dense three-dimensional points, a triangular grid can be constructed by triangulating the dense three-dimensional points, and the environment grid model is then obtained based on the triangular grid. Image rendering operations are subsequently performed based on the environment grid model.
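As one hedged illustration of turning dense points into an environment grid model, the sketch below uses Open3D; Poisson surface reconstruction stands in here for the triangulation step named in the text, and the depth parameter is an assumed value:

```python
import numpy as np
import open3d as o3d

def dense_points_to_mesh(points):
    """Build a triangular environment mesh from dense 3D points."""
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(np.asarray(points, dtype=float))
    pcd.estimate_normals()  # normals are required by Poisson reconstruction
    mesh, _densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
        pcd, depth=9)
    return mesh  # the environment grid model used for rendering
```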
According to the three-dimensional model construction method provided by the embodiments of the application, a plurality of pieces of pose information and a plurality of first environment images provided by the visual inertial odometer are obtained; feature point matching is performed, according to the pose information, on the image feature points extracted from each first environment image to obtain matching points; a point cloud map is then constructed from the pose information and the three-dimensional point coordinates determined from the matching points; and a three-dimensional model of the real world is then constructed according to the point cloud map and the first environment images. A real-world three-dimensional model is thus constructed from the pose information and the first environment images provided by the visual inertial odometer, so that when a user uses a mixed reality application on a virtual reality device, the user can see the real world based on the three-dimensional model while also seeing the interaction between virtual objects and the real world, meeting the user's needs.
As can be seen from the above description, the embodiment of the present application constructs a three-dimensional model of the real world according to the plurality of pose information and the plurality of first environmental images by acquiring the plurality of pose information and the plurality of first environmental images provided by the visual inertial odometer.
On the basis of the foregoing embodiments, the construction of a real world three-dimensional model according to the point cloud map and the first environment image in the embodiments of the present application will be further explained, with particular reference to fig. 2.
As shown in fig. 2, the method may include the steps of:
s201, acquiring a plurality of pose information and a plurality of first environment images provided by a visual inertial odometer.
And S202, carrying out feature point matching on the image feature points extracted from each first environment image according to the pose information to obtain matching points.
And S203, determining three-dimensional point coordinates according to the matching points.
S204, constructing a point cloud map according to the three-dimensional point coordinates and the pose information.
S205, obtaining optimized three-dimensional point coordinates for constructing a point cloud map.
In the embodiment of the application, the optimized three-dimensional point coordinates are specifically three-dimensional point clouds.
And S206, performing dense reconstruction on the optimized three-dimensional point coordinates and the first environment images to obtain a depth map and a normal vector map corresponding to each first environment image.
Wherein the first environmental image refers to all the first environmental images provided by the visual inertial odometer.
Specifically, the present application may utilize a dense reconstruction algorithm, such as the MVSNet algorithm, dense optical flow, or PatchMatch, to perform dense reconstruction according to the optimized three-dimensional point coordinates and the first environment images, so as to generate a depth map and a normal vector map corresponding to each first environment image.
The depth map refers to Z-axis coordinates of points shot on the image under a camera coordinate system; the normal vector diagram refers to the normal vector of the object surface in three-dimensional space of points photographed on an image.
S207, constructing a real-world three-dimensional model according to the depth map and the normal vector map.
After the depth map and the normal vector map are obtained, the depth map and the normal vector map can be fused to obtain a fused image, and then a three-dimensional model of the real world is built based on three-dimensional point coordinates in the fused image. I.e. building an environmental mesh model.
When the environment grid model is constructed based on the three-dimensional point coordinates in the fused image, triangular grids can be constructed by triangulating the three-dimensional points, and then the environment grid model is obtained based on the triangular grids.
Considering that neighboring pixels in the normal vector map may have consistent normal vectors, the present application aggregates such three-dimensional points in the normal vector map to obtain a processed normal vector map.
Specifically, for each pixel in the normal vector map, it may be determined whether the normal vectors of its neighborhood pixels are consistent with the normal vector of the pixel. If they are consistent, the pixel and its neighborhood pixels are aggregated into one pixel, and the three-dimensional point coordinates corresponding to the aggregated pixel are calculated from the three-dimensional point coordinates of the pixel and its neighborhood pixels. In the embodiments of the application, the neighborhood pixels can be the eight-neighborhood or the four-neighborhood; the eight-neighborhood is preferred in the present application, so that the number of three-dimensional points can be reduced as much as possible and the speed of constructing the three-dimensional model improved.
The pixel and the neighborhood pixels are aggregated into one pixel, specifically, the pixel and all pixels adjacent to the pixel are combined into one pixel.
When calculating the three-dimensional point coordinates corresponding to the aggregated pixel from the three-dimensional point coordinates of the pixel and its neighborhood pixels, the three-dimensional point coordinates of the pixel and the neighborhood pixels may be added to obtain a sum; the sum is then divided by the number of pixels to obtain the three-dimensional point coordinates corresponding to the aggregated pixel.
For example, suppose the eight neighborhood pixels of pixel 5 in a certain normal vector map are determined to be: pixel 1, pixel 2, pixel 3, pixel 4, pixel 6, pixel 7, pixel 8, and pixel 9. If the normal vectors of pixel 1, pixel 2, pixel 3, pixel 4, pixel 6, pixel 7, pixel 8, and pixel 9 are consistent with that of pixel 5, then pixel 1 through pixel 9 can be merged into pixel 1'. The three-dimensional point coordinates of pixel 1' are then calculated from the merged pixels according to the formula:

$p = \frac{1}{9} \sum_{x \in \{PX\} \cup \delta} \left( R\, \pi^{-1}(x)\, d_x + t \right)$

wherein p is the three-dimensional point coordinate of pixel 1', R is a rotation matrix, $\pi^{-1}$ projects a pixel to the normalization plane (z = 1), PX is the central pixel (pixel 5 in this example), $\delta$ is the set of neighborhood pixels of PX, $d_x$ is the depth value of the pixel, and t is the translation vector.
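A small NumPy sketch of this aggregation, assuming an interior pixel, a pinhole model where K_inv plays the role of $\pi^{-1}$, and an assumed normal-consistency tolerance:

```python
import numpy as np

def aggregate_consistent_pixels(normals, depth, K_inv, R, t, u, v, tol=1e-3):
    """Merge pixel (u, v) with the eight-neighborhood pixels whose normal
    vectors agree with it, averaging their back-projected 3D coordinates.

    normals: (H, W, 3) normal vector map; depth: (H, W) depth map.
    """
    group = [(u, v)]
    for du in (-1, 0, 1):
        for dv in (-1, 0, 1):
            if (du, dv) == (0, 0):
                continue
            if np.allclose(normals[v + dv, u + du], normals[v, u], atol=tol):
                group.append((u + du, v + dv))
    # p = (1/N) * sum over merged pixels of R * pi^{-1}(x) * d_x + t.
    pts = [R @ (K_inv @ np.array([uu, vv, 1.0])) * depth[vv, uu] + t
           for uu, vv in group]
    return np.mean(pts, axis=0)  # coordinates of the aggregated pixel
```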
Further, there may be erroneous three-dimensional points in the depth map, such as three-dimensional points with inconsistent depth values and/or inconsistent colors across views. Therefore, the present application can eliminate the erroneous three-dimensional points in the depth map to reduce adverse interference in building the three-dimensional model, thereby improving the accuracy of the constructed three-dimensional model.
Specifically, the depth map of one view angle can be projected into another view angle based on the depth values and the pose information; the depth value and color value of each three-dimensional point in the two depth maps are then compared. If the depth values and/or the color values of the three-dimensional points at the same position in the two depth maps are inconsistent, the point is an erroneous three-dimensional point and is eliminated. If the depth values and color values are consistent, the point is a normal three-dimensional point and is retained.
It should be understood that comparing the color values of three-dimensional points across depth maps is reliable because the plurality of cameras on the virtual reality device keep their exposure consistent, which ensures that the brightness of the captured images remains consistent. The control of camera exposure consistency can be realized through a preset exposure control program, where the preset exposure control can be determined according to the performance of the cameras and various parameters of the equipment, which is not repeated here.
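The cross-view consistency check can be sketched as follows, assuming grayscale images, a (4, 4) relative pose T_ba from view A's camera frame to view B's, and assumed depth/gray tolerances:

```python
import numpy as np

def consistent_mask(depth_a, img_a, depth_b, img_b, K, T_ba,
                    d_tol=0.05, c_tol=10.0):
    """Keep view-A pixels whose depth and gray value agree when their
    3D points are re-observed in view B; the rest are erroneous points."""
    H, W = depth_a.shape
    us, vs = np.meshgrid(np.arange(W), np.arange(H))
    # Back-project every view-A pixel, move it into view B, re-project.
    rays = np.linalg.inv(K) @ np.stack([us, vs, np.ones_like(us)]).reshape(3, -1)
    p_a = rays * depth_a.reshape(1, -1)
    p_b = T_ba[:3, :3] @ p_a + T_ba[:3, 3:4]
    uv = K @ p_b
    z = p_b[2]
    with np.errstate(divide="ignore", invalid="ignore"):
        ub = uv[0] / z
        vb = uv[1] / z
    ok = (z > 0) & (ub >= 0) & (ub < W - 0.5) & (vb >= 0) & (vb < H - 0.5)
    idx = np.flatnonzero(ok)
    ubi = np.round(ub[idx]).astype(int)
    vbi = np.round(vb[idx]).astype(int)
    mask = np.zeros(H * W, dtype=bool)
    d_match = np.abs(depth_b[vbi, ubi] - z[idx]) < d_tol
    c_match = np.abs(img_b[vbi, ubi].astype(float)
                     - img_a.reshape(-1)[idx].astype(float)) < c_tol
    mask[idx] = d_match & c_match
    return mask.reshape(H, W)  # True = consistent, keep; False = eliminate
```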
Furthermore, the method and the device can fuse the processed normal vector image and the processed depth image to obtain a fused image; then, triangulating the three-dimensional point coordinates in the fused image to construct a triangular grid, and obtaining an environment grid model based on the triangular grid.
According to the three-dimensional model construction method provided by the embodiments of the application, a plurality of pieces of pose information and a plurality of first environment images provided by the visual inertial odometer are obtained; feature point matching is performed, according to the pose information, on the image feature points extracted from each first environment image to obtain matching points; a point cloud map is then constructed from the pose information and the three-dimensional point coordinates determined from the matching points; and a three-dimensional model of the real world is then constructed according to the point cloud map and the first environment images. A real-world three-dimensional model is thus constructed from the pose information and the first environment images provided by the visual inertial odometer, so that when a user uses a mixed reality application on a virtual reality device, the user can see the real world based on the three-dimensional model while also seeing the interaction between virtual objects and the real world, meeting the user's needs. In addition, by generating a depth map and a normal vector map from the optimized three-dimensional points of the point cloud map and the first environment images, processing the depth map and the normal vector map, and then building the environment grid model based on the processed depth map and normal vector map, the construction speed of the three-dimensional model is improved and the accuracy of the constructed three-dimensional model is ensured, further improving the user experience.
The following describes a rendering method according to an embodiment of the present application by taking an execution body as a virtual reality device as an example, and specifically refer to fig. 3.
Fig. 3 is a flow chart of a rendering method according to an embodiment of the present application. The rendering method may be performed by a rendering device. The rendering means may be composed of hardware and/or software and may be integrated in the virtual reality device.
As shown in fig. 3, the rendering method may include the steps of:
s301, acquiring a second environment image acquired by the camera.
Optionally, after the three-dimensional model of the real space is built, the virtual reality device may store the three-dimensional model in its own storage unit. When the user uses an MR application on the virtual reality device again, the three-dimensional model of the current real space can then be rapidly identified, and the positioning operation performed based on the three-dimensional model.
Specifically, while the user is using an MR application on the virtual reality device, the camera on the virtual reality device can acquire the surrounding environment image, namely the second environment image, in real time. The acquired second environment image is then sent to the processor of the virtual reality device, so that the processor performs positioning based on the second environment image.
S302, determining current pose information of the virtual reality equipment according to the second environment image and a pre-constructed point cloud map.
The pre-built point cloud map is built based on the three-dimensional model building method of the previous embodiment.
In addition to determining real-time pose information of the virtual reality device based on the pre-constructed point cloud map while the device is in use, the SLAM module on the virtual reality device can also locate the pose information of the virtual reality device in real time. There are therefore two sources of pose information, so the virtual reality device cannot accurately determine its current position and pose. For example, as shown in fig. 4, pose track 1 is the pose information output by the point cloud map, and pose track 2 is the pose information output by the SLAM module; the SLAM module outputs at a higher frame rate than the point cloud map.
In order to solve the above problem, before determining the current pose information of the virtual reality device using the pre-constructed point cloud map, the present application optionally acquires first pose information for a preset number of frames provided by the point cloud map, and second pose information for the same frames provided by the SLAM module. A pose conversion relationship is determined according to the first pose information and the second pose information; the pose information provided by the SLAM module is then converted into the coordinate system corresponding to the point cloud map, achieving the coordinate-system alignment between the SLAM module and the point cloud map. Unique pose information is thereby provided to the virtual reality equipment.
Exemplarily, assume that the preset number of frames is 3, the pose conversion relationship is $T_{mw}$, the first pose information provided by the point cloud map is $T_m^i$, and the second pose information provided by the SLAM module is $T_w^i$, where $i = 1, 2, 3$. The pose conversion relationship $T_{mw}$ can then be solved according to:

$T_{mw} = \arg\min_{T} \sum_{i=1}^{3} \left\| \mathrm{Log}\!\left( T_m^i \left( T\, T_w^i \right)^{-1} \right) \right\|^2$

Further, based on the pose conversion relationship $T_{mw}$, the pose information provided by the SLAM module may be converted into the coordinate system of the point cloud map. Here Log is the mapping from the Lie group into the Lie algebra.
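A hedged sketch of solving the pose conversion relationship from the three frame pairs; for brevity it averages the per-frame candidates $T_m^i (T_w^i)^{-1}$ (a chordal-mean stand-in for the Log-residual least squares above):

```python
import numpy as np
from scipy.spatial.transform import Rotation

def solve_alignment(T_m, T_w):
    """Estimate T_mw from pose pairs satisfying T_m[i] ~ T_mw @ T_w[i].

    T_m, T_w: lists of (4, 4) poses for the same preset frames from the
    point cloud map and the SLAM module respectively.
    """
    cands = [Tm @ np.linalg.inv(Tw) for Tm, Tw in zip(T_m, T_w)]
    R_mean = Rotation.from_matrix([c[:3, :3] for c in cands]).mean()
    T_mw = np.eye(4)
    T_mw[:3, :3] = R_mean.as_matrix()
    T_mw[:3, 3] = np.mean([c[:3, 3] for c in cands], axis=0)
    return T_mw  # converts SLAM poses into the point cloud map frame
```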
Furthermore, after the coordinate systems of the SLAM module and the point cloud map are aligned, the current pose information of the virtual reality device can be determined according to the second environment image and the pre-constructed point cloud map. The pre-constructed point cloud map here specifically refers to the point cloud map aligned with the SLAM module.
As an optional implementation manner, when determining the current pose information of the virtual reality device, the application may first extract image feature points from the second environment image, and match the image feature points with features of three-dimensional point coordinates in the point cloud map to obtain target matching points; and further, according to the target matching point, determining the current pose information of the virtual reality equipment.
The image feature points extracted from the second environment image are two-dimensional points, while the feature points in the point cloud map are three-dimensional points, so the target matching points obtained in the present application pair two-dimensional points with three-dimensional points. The PnP (Perspective-n-Point) algorithm, i.e., a pose estimation algorithm, can determine the pose information of the camera in a reference coordinate system from the two-dimensional pixel coordinates of feature points and the corresponding three-dimensional space coordinates. Therefore, the pose information of the virtual reality device in the point cloud map coordinate system can be determined from the two-dimensional and three-dimensional points of the target matching points using the PnP algorithm.
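Assuming the 2D-3D target matching points and camera intrinsics are available, the localization step maps directly onto OpenCV's PnP solver:

```python
import cv2
import numpy as np

def locate(pts2d, pts3d, K):
    """Estimate the device pose in the point cloud map coordinate system
    from 2D-3D target matching points using PnP (with RANSAC).

    pts2d: (N, 2) feature points from the second environment image;
    pts3d: (N, 3) matched three-dimensional point coordinates.
    """
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts3d.astype(np.float64), pts2d.astype(np.float64), K, None)
    if not ok:
        raise RuntimeError("PnP localization failed")
    R, _ = cv2.Rodrigues(rvec)  # rotation taking map points into the camera
    return R, tvec              # current pose information
```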
S303, selecting a target local area from a pre-constructed three-dimensional model according to the current pose information and the view cone range of the camera.
And S304, performing rendering operation according to the target local area and the second environment image.
Wherein the pre-built three-dimensional model is built based on the three-dimensional model building method of the foregoing embodiment.
The view angle range within which the user views the real space is determined by the view cone range of the camera, and when the camera is fixed, the view cone range of the camera is also fixed. Therefore, the present application can select the target local area from the pre-built three-dimensional model according to the current pose information of the virtual reality device and the view cone range of the camera. The selection process specifically obtains a target grid area from the environment grid model, so that local rendering of the environment grid model is performed based on the target grid area. This avoids the unnecessary occupation and waste of resources caused by rendering the whole three-dimensional model each time, saves resources, and improves the running performance of the virtual reality equipment.
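A sketch of selecting the target grid area by a view-cone test on face centroids; the pose convention, image size, and near/far values are assumptions:

```python
import numpy as np

def select_target_region(vertices, faces, K, T_cw, width, height,
                         near=0.1, far=10.0):
    """Select the environment-mesh faces inside the camera view cone.

    vertices: (V, 3) mesh vertices in map coordinates; faces: (F, 3)
    vertex indices; T_cw: (4, 4) current pose, map frame to camera frame.
    """
    centroids = vertices[faces].mean(axis=1)             # face centers
    pc = (T_cw[:3, :3] @ centroids.T + T_cw[:3, 3:4]).T  # to camera frame
    z = pc[:, 2]
    front = (z > near) & (z < far)
    safe_z = np.where(front, z, 1.0)                     # avoid divide-by-0
    u = pc[:, 0] * K[0, 0] / safe_z + K[0, 2]
    v = pc[:, 1] * K[1, 1] / safe_z + K[1, 2]
    inside = front & (u >= 0) & (u < width) & (v >= 0) & (v < height)
    return faces[inside]  # the target local area to render
```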
According to the rendering method provided by the embodiments of the application, the second environment image acquired by the camera is obtained, and the current pose information of the virtual reality equipment is determined according to the second environment image and the pre-constructed point cloud map; the target local area is then selected from the pre-constructed three-dimensional model according to the current pose information and the view cone range of the camera, and the rendering operation is performed according to the target local area and the second environment image. Accurate positioning is thus achieved from the environment images collected in real time by the camera and the pre-built point cloud map; the target local area is selected from the pre-built three-dimensional model according to the determined pose information and the view cone range of the camera, and local rendering of the environment grid model is performed based on the target grid area. This avoids the unnecessary resource occupation and waste caused by rendering the whole three-dimensional model each time, saves resources, and improves the running performance of the virtual reality equipment.
On the basis of the above embodiment, the present application further includes: and correcting the current pose information of the virtual reality equipment.
Considering that the current pose information determined from the second environment image and the pre-constructed point cloud map may contain errors, the present application corrects the current pose information to obtain accurate pose information.
Specifically, the second pose information provided by the SLAM module and the first pose information provided by the point cloud map may be aligned once every several frames to obtain the current pose information of the virtual reality device, and a covariance of the coordinate-system alignment is obtained based on the number of matched poses. Because the alignment between the SLAM coordinate system and the point cloud map coordinate system has already been performed, the current pose information can be corrected based on Kalman filtering to obtain corrected pose information.
Exemplarily, assume that the second pose information provided by the SLAM module corresponding to the current pose information is $T_{mc}$, the current pose information of the virtual reality device is $\hat{T}_{mw}$, the covariance of the coordinate-system alignment is $\Sigma_{mw}$, and the current covariance is $\Sigma_{mc}$. The current pose information can then be corrected as:

$\hat{T}_{mw}' = \mathrm{Exp}\!\left( K \,\mathrm{Log}\!\left( T_{mc}\, \hat{T}_{mw}^{-1} \right) \right) \hat{T}_{mw}, \qquad K = \Sigma_{mw} \left( \Sigma_{mw} + \Sigma_{mc} \right)^{-1}$

wherein $\hat{T}_{mw}'$ is the corrected current pose information, Exp maps the Lie algebra to the Lie group, Log maps the Lie group to the Lie algebra, and $(\cdot)^{-1}$ denotes the inverse.
And the updated covariance in this application is:

$\Sigma_{mw}' = \left( I - K \right) \Sigma_{mw}$
it is understood that the current pose information is corrected to obtain accurate pose information, so that the target local area can be obtained from the three-dimensional model more accurately based on the corrected current pose information, and further rendering operation is performed on the target local area, so that the use scene of a user is more fitted, and the use experience of the user is further improved.
A three-dimensional model building apparatus according to an embodiment of the present application will be described below with reference to fig. 5. It should be noted that the three-dimensional model building apparatus provided in the embodiments of the present application may be configured in a terminal device, where the terminal device may be any hardware device having a data processing function, such as a server or a virtual reality device.
Fig. 5 is a schematic block diagram of a three-dimensional model building apparatus provided in an embodiment of the present application. Wherein, the three-dimensional model construction apparatus 400 includes: a data acquisition module 410, a feature matching module 420, a coordinate determination module 430, a map construction module 440, and a model construction module 450.
Wherein, the data acquisition module 410 is configured to acquire a plurality of pose information and a plurality of first environmental images provided by the visual inertial odometer;
the feature matching module 420 is configured to perform feature point matching on the image feature points extracted from each first environmental image according to the pose information to obtain matching points;
a coordinate determining module 430, configured to determine three-dimensional point coordinates according to the matching points;
the map construction module 440 is configured to construct a point cloud map according to the three-dimensional point coordinates and the pose information;
The model building module 450 is configured to build a three-dimensional model of the real world according to the point cloud map and the first environment image.
In an optional implementation manner of the embodiment of the present application, the map building module 440 is specifically configured to:
performing adjustment calculation on the three-dimensional point coordinates and the pose information by using a beam method adjustment algorithm to obtain optimized three-dimensional point coordinates;
and constructing a point cloud map according to the optimized three-dimensional point coordinates.
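For illustration, a didactic Python sketch of this bundle (beam method) adjustment is shown below, posed as a dense reprojection-error least-squares problem; real systems use sparse solvers such as Ceres or g2o, and the [rotation vector | translation] camera parameterisation and all names here are assumptions for the example:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation as R

def reprojection_residuals(params, n_cams, n_pts, K, cam_idx, pt_idx, uv_obs):
    """Stack (projected - observed) pixel residuals over all observations."""
    cams = params[:n_cams * 6].reshape(n_cams, 6)   # per-camera [rotvec | t]
    pts = params[n_cams * 6:].reshape(n_pts, 3)     # 3D point coordinates
    res = []
    for c, p, uv in zip(cam_idx, pt_idx, uv_obs):
        rot = R.from_rotvec(cams[c, :3]).as_matrix()
        pc = rot @ pts[p] + cams[c, 3:]             # point in camera frame
        proj = K @ pc
        res.append(proj[:2] / proj[2] - uv)         # reprojection error
    return np.concatenate(res)

def bundle_adjust(cams0, pts0, K, cam_idx, pt_idx, uv_obs):
    x0 = np.hstack([cams0.ravel(), pts0.ravel()])
    sol = least_squares(reprojection_residuals, x0, method="trf",
                        args=(len(cams0), len(pts0), K, cam_idx, pt_idx, uv_obs))
    n = len(cams0) * 6
    return sol.x[:n].reshape(-1, 6), sol.x[n:].reshape(-1, 3)
```

The optimized three-dimensional point coordinates returned by bundle_adjust are what the point cloud map is then built from.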
In an optional implementation manner of the embodiment of the present application, the model building module 450 includes:
the coordinate acquisition unit is used for acquiring the optimized three-dimensional point coordinates used for constructing the point cloud map;
the reconstruction unit is used for performing dense reconstruction on the optimized three-dimensional point coordinates and the first environment images to obtain a depth map and a normal vector map corresponding to each first environment image;
and the model construction unit is used for constructing a real-world three-dimensional model according to the depth map and the normal vector map.
In an optional implementation manner of the embodiment of the present application, the apparatus 400 further includes:
the image processing module is used for carrying out aggregation processing on the three-dimensional point coordinates in the normal vector map to obtain a processed normal vector map;
and removing erroneous three-dimensional point coordinates from the depth map to obtain a processed depth map;
correspondingly, the model construction unit is specifically used for:
fusing the processed normal vector map and the processed depth map to obtain a fused image;
and constructing a real-world three-dimensional model according to the three-dimensional point coordinates in the fused image.
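As a hedged illustration of this per-view processing, the Python sketch below filters a depth map using its normal map and back-projects the surviving pixels to world-space points; the thresholds, the camera-space normal convention, and the function name are all assumptions for the example:

```python
import numpy as np

def fuse_depth_normal(depth, normals, K, T_wc, max_depth=10.0, min_dot=0.2):
    """Filter a depth map with its normal map, then lift pixels to 3D.

    depth:   (H, W) depth in metres, 0 marks invalid pixels
    normals: (H, W, 3) camera-space unit normals
    T_wc:    4x4 camera-to-world pose
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    rays = np.linalg.inv(K) @ np.stack([u, v, np.ones_like(u)], 0).reshape(3, -1)
    rays = rays.reshape(3, H, W)
    # Reject depths that are out of range or whose normal faces away from
    # the viewing ray, a common symptom of a wrong stereo match.
    view = -rays / np.linalg.norm(rays, axis=0, keepdims=True)
    facing = np.einsum('chw,hwc->hw', view, normals)
    valid = (depth > 0) & (depth < max_depth) & (facing > min_dot)
    pts_c = rays * depth[None]                       # back-project per pixel
    pts_w = (T_wc[:3, :3] @ pts_c.reshape(3, -1) + T_wc[:3, 3:4]).T
    return pts_w[valid.ravel()]
```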
In an optional implementation manner of the embodiment of the present application, the feature matching module 420 is specifically configured to:
selecting a target first environment image from a plurality of first environment images according to the pose information;
and performing feature point matching on the image feature points of the target first environment image to obtain matching points.
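A sketch of this pose-guided matching in Python (with OpenCV) follows; the angle and distance thresholds are assumptions, and ORB with brute-force Hamming matching merely stands in for whichever feature the application actually extracts:

```python
import numpy as np
import cv2

def select_and_match(images, poses, i, max_angle_deg=30.0, max_dist=2.0):
    """Pick target frames whose poses are close to frame i, then match features."""
    orb = cv2.ORB_create(2000)
    bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    kp_i, des_i = orb.detectAndCompute(images[i], None)
    results = {}
    for j, (img, T_j) in enumerate(zip(images, poses)):
        if j == i:
            continue
        rel = np.linalg.inv(poses[i]) @ T_j          # relative VIO pose
        angle = np.degrees(np.linalg.norm(cv2.Rodrigues(rel[:3, :3])[0]))
        dist = np.linalg.norm(rel[:3, 3])
        if angle > max_angle_deg or dist > max_dist:
            continue                                 # too little view overlap
        kp_j, des_j = orb.detectAndCompute(img, None)
        matches = bf.match(des_i, des_j)
        results[j] = [(kp_i[m.queryIdx].pt, kp_j[m.trainIdx].pt) for m in matches]
    return results
```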
In an optional implementation manner of the embodiment of the present application, the coordinate determining module 430 is specifically configured to:
and carrying out triangulation on the matching points, and determining three-dimensional point coordinates corresponding to the matching points.
In an optional implementation manner of the embodiment of the present application, the apparatus 400 further includes:
and the rejecting module is used for rejecting the error matching points in the matching points.
In an optional implementation manner of this embodiment of the present application, if the terminal device is a virtual reality device, the apparatus 400 further includes:
The data storage module is used for storing the pose information and the first environment image;
and the control module is used for executing three-dimensional model construction operation according to the pose information and the first environment image when the device is in a non-working state.
According to the three-dimensional model construction device, a plurality of pose information and a plurality of first environment images provided by the visual inertial odometer are obtained; feature point matching is performed on the image feature points extracted from each first environment image according to the pose information to obtain matching points; a point cloud map is then constructed from the pose information and the three-dimensional point coordinates determined from the matching points; and a three-dimensional model of the real world is constructed according to the point cloud map and the first environment images. The device thus uses the pose information and first environment images provided by the visual inertial odometer to construct a three-dimensional model of the real world, so that when a user runs a mixed reality application on a virtual reality device, the user can see the real world based on the three-dimensional model while seeing the interaction between virtual objects and the real world, meeting the user's needs.
It should be understood that apparatus embodiments and the foregoing method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments. To avoid repetition, no further description is provided here. Specifically, the apparatus 400 shown in fig. 5 may perform the method embodiment corresponding to fig. 1, and the foregoing and other operations and/or functions of each module in the apparatus 400 are respectively for implementing the corresponding flow in each method in fig. 1, and are not further described herein for brevity.
The apparatus 400 of the embodiments of the present application is described above in terms of functional modules in connection with the accompanying drawings. It should be understood that the functional modules may be implemented in hardware, by instructions in software, or by a combination of hardware and software modules. Specifically, each step of the method embodiments of the first aspect may be completed by an integrated logic circuit of hardware in a processor and/or instructions in the form of software, and the steps of the methods of the first aspect disclosed in connection with the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. Alternatively, the software modules may be located in a storage medium well established in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the method embodiments of the first aspect in combination with its hardware.
A rendering device according to an embodiment of the present application will be described below with reference to fig. 6. It should be noted that the rendering apparatus provided in the embodiments of the present application may be configured in a virtual reality device on which a camera is installed.
Fig. 6 is a schematic block diagram of a rendering apparatus provided in an embodiment of the present application. Wherein, the rendering device 500 includes: an image acquisition module 510, a pose determination module 520, a region determination module 530, and a rendering module 540.
The image obtaining module 510 is configured to obtain a second environmental image collected by the camera;
the pose determining module 520 is configured to determine current pose information of the virtual reality device according to the second environment image and a pre-constructed point cloud map;
the region determining module 530 is configured to select a target local region from a pre-constructed three-dimensional model according to the current pose information and the view cone range of the camera;
a rendering module 540, configured to perform a rendering operation according to the target local area and the second environmental image;
wherein the pre-built point cloud map and the three-dimensional model are built based on the three-dimensional model building device according to the embodiment of the third aspect.
In an optional implementation manner of the embodiment of the present application, the pose determining module 520 is specifically configured to:
extracting image feature points from the second environment image;
matching the image feature points against features of the three-dimensional points in the point cloud map to obtain target matching points;
And determining the current pose information of the virtual reality equipment according to the target matching point.
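A minimal sketch of this localisation step using OpenCV's RANSAC PnP is shown below; the 2D-3D correspondences are the target matching points described above, while the function name and thresholds are assumptions for the example:

```python
import numpy as np
import cv2

def localize(pts3d, pts2d, K):
    """Estimate the device pose from 2D-3D target matching points via PnP.

    pts3d: (N, 3) point cloud map points; pts2d: (N, 2) matched pixels
    """
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts3d.astype(np.float64), pts2d.astype(np.float64), K, None,
        reprojectionError=3.0, iterationsCount=100)
    if not ok:
        return None
    R_cw, _ = cv2.Rodrigues(rvec)           # world -> camera rotation
    T_wc = np.eye(4)                        # camera -> world = device pose
    T_wc[:3, :3] = R_cw.T
    T_wc[:3, 3] = (-R_cw.T @ tvec).ravel()
    return T_wc
```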
In an optional implementation manner of the embodiment of the present application, the apparatus 500 further includes:
and the correction module is used for correcting the current pose information of the virtual reality equipment.
According to the rendering device, the second environment image acquired by the camera is obtained; the current pose information of the virtual reality device is determined according to the second environment image and the pre-constructed point cloud map; a target local area is selected from the pre-constructed three-dimensional model according to the current pose information and the view cone range of the camera; and a rendering operation is performed according to the target local area and the second environment image. In this way, accurate positioning is achieved from the environment image collected in real time by the camera and the pre-built point cloud map, the target local area is selected from the pre-built three-dimensional model according to the determined pose information and the view cone range of the camera, and local rendering of the environment grid model is then performed based on the target grid area. This avoids the unnecessary resource occupation and waste caused by rendering the whole three-dimensional model each time, saves resources, and improves the running performance of the virtual reality device.
It should be understood that apparatus embodiments and the foregoing method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments. To avoid repetition, no further description is provided here. Specifically, the apparatus 500 shown in fig. 6 may perform the method embodiment corresponding to fig. 3, and the foregoing and other operations and/or functions of each module in the apparatus 500 are respectively for implementing the corresponding flow in each method in fig. 3, and are not further described herein for brevity.
The apparatus 500 of the embodiments of the present application is described above in terms of functional modules in connection with the accompanying drawings. It should be understood that the functional modules may be implemented in hardware, by instructions in software, or by a combination of hardware and software modules. Specifically, each step of the method embodiments of the second aspect may be completed by an integrated logic circuit of hardware in a processor and/or instructions in the form of software, and the steps of the methods of the second aspect disclosed in connection with the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. Alternatively, the software modules may be located in a storage medium well established in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the method embodiments of the second aspect in combination with its hardware.
Fig. 7 is a schematic block diagram of an electronic device provided in an embodiment of the present application. As shown in fig. 7, the electronic device 600 may include:
a memory 610 and a processor 620, the memory 610 being adapted to store a computer program and to transfer the program code to the processor 620. In other words, the processor 620 may call and run a computer program from the memory 610 to implement the three-dimensional model building method, or rendering method in the embodiments of the present application.
For example, the processor 620 may be configured to perform the above-described three-dimensional model construction method or rendering method embodiments according to instructions in the computer program.
In some embodiments of the present application, the processor 620 may include, but is not limited to:
a general purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like.
In some embodiments of the present application, the memory 610 includes, but is not limited to:
volatile memory and/or nonvolatile memory. The nonvolatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable ROM (Programmable ROM, PROM), an erasable PROM (Erasable PROM, EPROM), an electrically erasable PROM (Electrically Erasable EPROM, EEPROM), or a flash memory. The volatile memory may be a random access memory (Random Access Memory, RAM), which acts as an external cache. By way of example, and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct Rambus RAM (DR RAM).
In some embodiments of the present application, the computer program may be partitioned into one or more modules that are stored in the memory 610 and executed by the processor 620 to complete the three-dimensional model building method, or rendering method, provided herein. The one or more modules may be a series of computer program instruction segments capable of performing the specified functions, which are used to describe the execution of the computer program in the electronic device.
As shown in fig. 7, the electronic device may further include:
a transceiver 630, the transceiver 630 being connectable to the processor 620 or the memory 610.
The processor 620 may control the transceiver 630 to communicate with other devices, and in particular, may send information or data to other devices or receive information or data sent by other devices. Transceiver 630 may include a transmitter and a receiver. Transceiver 630 may further include antennas, the number of which may be one or more.
It will be appreciated that the various components in the electronic device are connected by a bus system that includes, in addition to a data bus, a power bus, a control bus, and a status signal bus.
In an embodiment of the present application, when the electronic device is an HMD, the embodiment of the present application provides a schematic block diagram of the HMD, as shown in fig. 8.
As shown in fig. 8, the main functional modules of the HMD 700 may include, but are not limited to, the following: a detection module 710, a feedback module 720, a sensor 730, a control module 740, and a modeling module 750.
The detection module 710 is configured to detect operation commands of the user by using various sensors and act on the virtual environment, for example by continuously updating the image displayed on the display screen to follow the user's line of sight, so as to realize interaction between the user and the virtual scene.
The feedback module 720 is configured to receive data from the sensors and provide real-time feedback to the user. For example, the feedback module 720 may generate a feedback instruction according to the user operation data and output the feedback instruction.
The sensor 730 is configured, on the one hand, to accept operation commands from the user and apply them to the virtual environment; and, on the other hand, to provide the results generated by the operation to the user in the form of various feedback.
The control module 740 is configured to control sensors and various input/output devices, including obtaining user data such as motion, voice, etc., and outputting sensory data such as images, vibrations, temperature, sounds, etc., to affect the user, virtual environment, and the real world. For example, the control module 740 may obtain user gestures, voice, and the like.
The modeling module 750 is configured to construct a three-dimensional model of the virtual environment and may also include various feedback mechanisms of sound, touch, etc. in the three-dimensional model.
It should be appreciated that the various functional modules in the HMD 700 are connected by a bus system that includes, in addition to a data bus, a power bus, a control bus, a status signal bus, and the like.
The present application also provides a computer storage medium having stored thereon a computer program which, when executed by a computer, enables the computer to perform the method of the above-described method embodiments.
Embodiments of the present application also provide a computer program product comprising program instructions which, when run on an electronic device, cause the electronic device to perform the method of the method embodiments described above.
When the above embodiments are implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital video disc (DVD)), or a semiconductor medium (e.g., a solid state disk (SSD)), or the like.
Those of ordinary skill in the art will appreciate that the various illustrative modules and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be additional divisions when actually implemented, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or modules, which may be in electrical, mechanical, or other forms.
The modules illustrated as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules, i.e., may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. For example, functional modules in the embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (16)

1. A three-dimensional model construction method, characterized by being executed by a terminal device, the method comprising:
acquiring a plurality of pose information and a plurality of first environment images provided by a visual inertial odometer;
According to the pose information, performing feature point matching on the image feature points extracted from each first environment image to obtain matching points;
determining three-dimensional point coordinates according to the matching points;
constructing a point cloud map according to the three-dimensional point coordinates and the pose information;
and constructing a three-dimensional model of the real world according to the point cloud map and the first environment image.
2. The method of claim 1, wherein constructing a point cloud map from the three-dimensional point coordinates and the pose information comprises:
performing adjustment calculation on the three-dimensional point coordinates and the pose information by using a beam method adjustment algorithm to obtain optimized three-dimensional point coordinates;
and constructing a point cloud map according to the optimized three-dimensional point coordinates.
3. The method of claim 2, wherein constructing a three-dimensional model of the real world from the point cloud map and the first environmental image comprises:
acquiring the optimized three-dimensional point coordinates for constructing a point cloud map;
performing dense reconstruction on the optimized three-dimensional point coordinates and the first environment images to obtain a depth map and a normal vector map corresponding to each first environment image;
And constructing a real-world three-dimensional model according to the depth map and the normal vector map.
4. A method according to claim 3, further comprising, after obtaining the depth map and the normal vector map corresponding to each of the first environmental images:
carrying out aggregation processing on the three-dimensional point coordinates in the normal vector map to obtain a processed normal vector map;
removing erroneous three-dimensional point coordinates from the depth map to obtain a processed depth map;
correspondingly, constructing a real-world three-dimensional model according to the depth map and the normal vector map comprises:
fusing the processed normal vector map and the processed depth map to obtain a fused image;
and constructing a real-world three-dimensional model according to the three-dimensional point coordinates in the fused image.
5. The method according to any one of claims 1 to 4, wherein performing feature point matching on the image feature points extracted from each first environmental image to obtain matching points according to the pose information, comprises:
selecting a target first environment image from a plurality of first environment images according to the pose information;
and performing feature point matching on the image feature points of the target first environment image to obtain matching points.
6. The method of any of claims 1-4, wherein determining three-dimensional point coordinates from the matching points comprises:
and carrying out triangulation on the matching points, and determining three-dimensional point coordinates corresponding to the matching points.
7. The method according to any one of claims 1 to 4, further comprising, after performing feature point matching on the image feature points extracted from each of the first environmental images to obtain matching points:
and eliminating erroneous matching points from the matching points.
8. The method according to any one of claims 1-4, wherein, if the terminal device is a virtual reality device, the method further comprises, after acquiring the plurality of pose information and the plurality of first environment images provided by the visual inertial odometer:
storing the pose information and the first environment image;
and executing, when the virtual reality device is in a non-working state, a three-dimensional model construction operation according to the pose information and the first environment image.
9. A rendering method performed by a virtual reality device having a camera mounted thereon, the method comprising:
acquiring a second environment image acquired by the camera;
Determining current pose information of the virtual reality equipment according to the second environment image and a pre-constructed point cloud map;
selecting a target local area from a pre-constructed three-dimensional model according to the current pose information and the view cone range of the camera;
rendering operation is carried out according to the target local area and the second environment image;
wherein the pre-built point cloud map and the three-dimensional model are built based on the three-dimensional model building method according to any one of claims 1 to 8.
10. The method of claim 9, wherein determining current pose information of the virtual reality device from the second environment image and a pre-constructed point cloud map comprises:
extracting image feature points from the second environment image;
matching the image feature points against features of the three-dimensional points in the point cloud map to obtain target matching points;
and determining the current pose information of the virtual reality equipment according to the target matching point.
11. The method as recited in claim 9, further comprising:
and correcting the current pose information of the virtual reality equipment.
12. A three-dimensional model construction apparatus, characterized by being configured in a terminal device, comprising:
the data acquisition module is used for acquiring a plurality of pose information and a plurality of first environment images provided by the visual inertial odometer;
the feature matching module is used for carrying out feature point matching on the image feature points extracted from each first environment image according to the pose information to obtain matching points;
the coordinate determining module is used for determining three-dimensional point coordinates according to the matching points;
the map construction module is used for constructing a point cloud map according to the three-dimensional point coordinates and the pose information;
and the model construction module is used for constructing a three-dimensional model of the real world according to the point cloud map and the first environment image.
13. A rendering apparatus, characterized by being configured in a virtual reality device on which a camera is mounted, comprising:
the image acquisition module is used for acquiring a second environment image acquired by the camera;
the pose determining module is used for determining current pose information of the virtual reality equipment according to the second environment image and a pre-constructed point cloud map;
the region determining module is used for selecting a target local region from a pre-constructed three-dimensional model according to the current pose information and the view cone range of the camera;
The rendering module is used for performing rendering operation according to the target local area and the second environment image;
wherein the pre-built point cloud map and the three-dimensional model are built based on the three-dimensional model building apparatus of claim 12.
14. An electronic device, comprising:
a processor and a memory for storing a computer program, the processor being adapted to call and run the computer program stored in the memory to perform the three-dimensional model building method according to any one of claims 1 to 8 or to perform the rendering method according to any one of claims 9 to 11.
15. A computer-readable storage medium storing a computer program for causing a computer to execute the three-dimensional model construction method according to any one of claims 1 to 8 or the rendering method according to any one of claims 9 to 11.
16. A computer program product comprising program instructions which, when run on an electronic device, cause the electronic device to perform the three-dimensional model building method of any one of claims 1 to 8 or the rendering method of any one of claims 9 to 11.
CN202211103452.3A 2022-09-09 2022-09-09 Three-dimensional model construction and rendering method, device, equipment and medium Pending CN117689826A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211103452.3A CN117689826A (en) 2022-09-09 2022-09-09 Three-dimensional model construction and rendering method, device, equipment and medium


Publications (1)

Publication Number Publication Date
CN117689826A true CN117689826A (en) 2024-03-12

Family

ID=90127154

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211103452.3A Pending CN117689826A (en) 2022-09-09 2022-09-09 Three-dimensional model construction and rendering method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN117689826A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118429169A (en) * 2024-07-02 2024-08-02 中国地质环境监测院(自然资源部地质灾害技术指导中心) Smart city underground well monitoring method and system



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination