CN114332207A - Distance determination method and device, computer equipment and storage medium

Distance determination method and device, computer equipment and storage medium

Info

Publication number
CN114332207A
CN114332207A (application CN202111638468.XA)
Authority
CN
China
Prior art keywords
vertex
target
distance
depth value
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111638468.XA
Other languages
Chinese (zh)
Inventor
周玉杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sensetime Technology Development Co Ltd
Original Assignee
Beijing Sensetime Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sensetime Technology Development Co Ltd filed Critical Beijing Sensetime Technology Development Co Ltd
Priority to CN202111638468.XA
Publication of CN114332207A
Legal status: Pending


Abstract

The present disclosure provides a distance determination method and apparatus, a computer device and a storage medium, wherein the method includes: rendering depth information corresponding to target vertices in a three-dimensional model of a target scene into a preset cache, and generating a rendered image of the three-dimensional model, wherein the three-dimensional model includes a plurality of meshes and the vertices respectively corresponding to the meshes, any mesh shares at least one vertex with at least one other mesh, and the target vertices are located within the shooting field of view of a virtual camera; in response to a distance measurement operation on a first pixel point and a second pixel point in the rendered image, reading, from the preset cache, a first depth value of a first vertex corresponding to the first pixel point and a second depth value of a second vertex corresponding to the second pixel point; and determining a target distance between the first vertex and the second vertex based on the first depth value and the second depth value.

Description

Distance determination method and device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a method and an apparatus for determining a distance, a computer device, and a storage medium.
Background
In many scenarios, the distance between real objects needs to be determined through a three-dimensional model constructed for the real scene in which they are located; however, existing distance measurement approaches cannot achieve both measurement accuracy and measurement efficiency.
Disclosure of Invention
The embodiment of the disclosure at least provides a method and a device for determining distance, computer equipment and a storage medium.
In a first aspect, an embodiment of the present disclosure provides a method for determining a distance, including:
rendering depth information corresponding to target vertices in a three-dimensional model of a target scene into a preset cache, and generating a rendered image of the three-dimensional model; wherein the three-dimensional model includes: a plurality of meshes and the vertices respectively corresponding to the meshes; any mesh shares at least one vertex with at least one other mesh; and the target vertices are located within the shooting field of view of a virtual camera; in response to a distance measurement operation on a first pixel point and a second pixel point in the rendered image, reading, from the preset cache, a first depth value of a first vertex corresponding to the first pixel point and a second depth value of a second vertex corresponding to the second pixel point; and determining a target distance between the first vertex and the second vertex based on the first depth value and the second depth value.
In this way, by rendering the depth information corresponding to the target vertices of the three-dimensional model of the target scene that lie within the shooting field of view of the virtual camera into the preset cache and generating the rendered image of the three-dimensional model, the depth values of the first vertex corresponding to the first pixel point and of the second vertex corresponding to the second pixel point can, in response to the distance measurement operation on those two pixel points, be conveniently read from the preset cache and used for ranging. No bounding box needs to be added to the three-dimensional model, the distance between different vertices can be measured accurately, and the ranging process requires no complex operations, so measurement efficiency is high; both measurement accuracy and measurement efficiency are thus achieved.
In an optional embodiment, the rendering depth information corresponding to a target vertex in a three-dimensional model of a target scene to a preset cache, and generating a rendered image of the three-dimensional model includes: determining the target vertex from the three-dimensional model based on the pose of a virtual camera in a model coordinate system corresponding to the three-dimensional model in response to an image rendering event for the three-dimensional model; rendering the pixel value corresponding to the target vertex and the depth value corresponding to the target vertex to the preset cache; and reading pixel values corresponding to all target vertexes from the preset cache, and generating a rendering image of the three-dimensional model based on the read pixel values.
In this way, a plurality of vertices in the three-dimensional model that are located within the shooting range of the virtual camera and close to the virtual camera are determined as target vertices based on the pose of the virtual camera in the model coordinate system; the pixel values and depth values corresponding to the target vertices are rendered into the preset cache, and the pixel values are read from the preset cache to obtain the rendered image. When a ranging operation occurs for any two pixel points in the rendered image, the corresponding depth values can be read from the preset cache, and the distance between the vertices corresponding to those two pixel points can be obtained from the depth values. By exploiting the fact that the relevant data of the three-dimensional model are rendered into the preset cache when an image is rendered, the distance between two vertices can be determined efficiently and accurately.
In an optional embodiment, the rendering the pixel value corresponding to the target vertex and the depth value corresponding to the target vertex into the preset cache includes: determining projection relationship information between the virtual camera and the three-dimensional model based on the pose of the virtual camera in the model coordinate system; determining the projection position of each target vertex in the rendered image based on the projection relationship information; determining the cache position of each target vertex in the preset cache based on the projection position; and rendering the pixel value corresponding to the target vertex and the depth value corresponding to the target vertex into the preset cache based on the cache position of each target vertex in the preset cache.
In this way, the storage positions, in the preset cache, of the pixel values and depth values of the target vertices corresponding to the pixel points in the rendered image can be accurately determined, and the pixel value and depth value of each target vertex are rendered to the corresponding cache position. In the subsequent ranging process, the depth values of the target vertices corresponding to any two pixel points in the rendered image can then be conveniently and accurately read from the corresponding cache positions, which improves the ranging accuracy.
In an optional embodiment, the reading, from the preset cache in response to a distance measurement operation on a first pixel point and a second pixel point in the rendered image, a first depth value of a first vertex corresponding to the first pixel point and a second depth value of a second vertex corresponding to the second pixel point includes: in response to the distance measurement operation on the first pixel point and the second pixel point in the rendered image, determining a first storage position of the first depth value of the first vertex in the preset cache based on a first position of the first pixel point in the rendered image; reading the first depth value from the preset cache based on the first storage position; and determining a second storage position of the second depth value of the second vertex in the preset cache based on a second position of the second pixel point in the rendered image, and reading the second depth value from the preset cache based on the second storage position.
In this way, when measuring the distance between the first vertex corresponding to the first pixel point and the second vertex corresponding to the second pixel point in the rendered image, the first depth value of the first vertex and the second depth value of the second vertex can be read more accurately from their corresponding storage positions, which improves the ranging precision.
In an alternative embodiment, the determining the target distance between the first vertex and the second vertex based on the first depth value and the second depth value includes: determining a depth difference for the first vertex and the second vertex based on the first depth value and the second depth value; determining a second distance after the first vertex and the second vertex are projected to a preset plane based on the distance between the first pixel point and the second pixel point and a camera projection principle; wherein the preset plane is parallel to a projection plane of the virtual camera; determining a target distance between the first vertex and the second vertex based on the depth difference and the second distance.
In this way, the depth difference between the first vertex and the second vertex, and the second distance between the first vertex and the second vertex after they are projected onto the preset plane, are accurately determined; based on the depth difference and the second distance, the distance between the first vertex and the second vertex can be measured accurately and efficiently, which solves the problem that existing distance measurement approaches cannot achieve both measurement accuracy and measurement efficiency.
In an alternative embodiment, determining the target distance between the first vertex and the second vertex based on the depth difference and the second distance comprises: determining a third distance of the first vertex and the second vertex in a model coordinate system based on the depth difference and the second distance; determining the target distance based on a proportional relationship between the model coordinate system and real space, and the third distance.
In this way, based on the depth difference, the second distance, and the proportional relationship between the model coordinate system and real space, the distance between the first vertex and the second vertex in the target scene in real space can be measured accurately and efficiently, which solves the problem that existing distance measurement approaches cannot achieve both measurement accuracy and measurement efficiency.
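The computation described in the two preceding embodiments can be illustrated with a minimal sketch. It is a sketch only: the disclosure does not spell out the exact formulas, so the use of the smaller depth value as the depth of the preset plane, the pinhole-style similar-triangles conversion, and the Pythagorean combination of the two orthogonal components are assumptions, and all names (`target_distance`, `focal_length_px`, `scale_model_to_real`) are illustrative.

```python
import math

def target_distance(first_depth, second_depth, pixel_distance, focal_length_px, scale_model_to_real):
    """Sketch of the ranging computation described above (assumptions noted in the text)."""
    depth_difference = abs(first_depth - second_depth)
    # Camera projection principle (similar triangles): a distance of pixel_distance on the
    # image plane corresponds to pixel_distance * depth / focal_length on a preset plane
    # parallel to the projection plane; the smaller depth is used here as an assumption.
    second_distance = pixel_distance * min(first_depth, second_depth) / focal_length_px
    # Third distance in the model coordinate system: the two components are orthogonal
    # (one along the optical axis, one parallel to the projection plane).
    third_distance = math.hypot(depth_difference, second_distance)
    # Target distance in real space via the model-to-real proportional relationship.
    return third_distance * scale_model_to_real
```

For example, with depth values of 3.0 and 4.0 model units, a pixel distance of 100 pixels, a focal length of 500 pixels and a scale of 1 metre per model unit, the sketch yields approximately 1.17 metres.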
In an alternative embodiment, the first vertex belongs to a first object in the target scene; the second vertex belongs to a second object in the target scene; after the determining the target distance between the first vertex and the second vertex, the method further comprises: determining a distance between the first object and the second object based on a target distance between the first vertex and the second vertex.
In an alternative embodiment, the first vertex and the second vertex belong to a first object in the target scene; after the determining the target distance between the first vertex and the second vertex, the method further comprises: determining size information of the first object based on the target distance between the first vertex and the second vertex.
In a second aspect, an embodiment of the present disclosure further provides an apparatus for distance determination, including: the processing module is used for rendering depth information corresponding to a target vertex in a three-dimensional model of a target scene into a preset cache and generating a rendered image of the three-dimensional model; wherein the three-dimensional model comprises: the plurality of grids and the vertexes corresponding to the plurality of grids respectively; any mesh has at least one same vertex with at least one other mesh; the target vertex is located in the shooting field of view of the virtual camera; a reading module, configured to read, from the preset cache, a first depth value of a first vertex corresponding to a first pixel point and a second depth value of a second vertex corresponding to a second pixel point in response to a distance measurement operation on the first pixel point and the second pixel point in the rendered image; a determination module to determine a target distance between the first vertex and the second vertex based on the first depth value and the second depth value.
In an optional implementation manner, when the processing module performs the rendering of the depth information corresponding to the target vertex in the three-dimensional model of the target scene into a preset cache, and generates the rendered image of the three-dimensional model, the processing module is specifically configured to: determining the target vertex from the three-dimensional model based on the pose of a virtual camera in a model coordinate system corresponding to the three-dimensional model in response to an image rendering event for the three-dimensional model; rendering the pixel value corresponding to the target vertex and the depth value corresponding to the target vertex to the preset cache; and reading pixel values corresponding to all target vertexes from the preset cache, and generating a rendering image of the three-dimensional model based on the read pixel values.
In an optional implementation manner, when the processing module performs rendering the pixel value corresponding to the target vertex and the depth value corresponding to the target vertex to the preset cache, the processing module is specifically configured to: determining projection relationship information between the virtual camera and the three-dimensional model based on the pose of the virtual camera in the model coordinate system; determining the projection position of each target vertex in the rendered image based on the projection relation information; determining the cache position of each target vertex in the preset cache based on the projection position; and rendering the pixel value corresponding to the target vertex and the depth value corresponding to the target vertex into the preset cache based on the cache position of each target vertex in the preset cache.
In an optional embodiment, when reading, from the preset cache in response to a ranging operation on a first pixel point and a second pixel point in the rendered image, a first depth value of a first vertex corresponding to the first pixel point and a second depth value of a second vertex corresponding to the second pixel point, the reading module is specifically configured to: responding to the distance measurement operation on the first pixel point and the second pixel point in the rendered image, and determining a first storage position of the first depth value of the first vertex from the preset cache based on a first position of the first pixel point in the rendered image; reading the first depth value from the preset cache based on the first storage position; and determining a second storage position of the second depth value of the second vertex from the preset cache based on a second position of the second pixel point in the rendered image, and reading the second depth value from the preset cache based on the second storage position.
In an optional embodiment, the determining module, when performing the determining of the target distance between the first vertex and the second vertex based on the first depth value and the second depth value, is specifically configured to: determining a depth difference for the first vertex and the second vertex based on the first depth value and the second depth value; determining a second distance after the first vertex and the second vertex are projected to a preset plane based on the distance between the first pixel point and the second pixel point and a camera projection principle; wherein the preset plane is parallel to a projection plane of the virtual camera; determining a target distance between the first vertex and the second vertex based on the depth difference and the second distance.
In an optional embodiment, the determining module, when performing the determining the target distance between the first vertex and the second vertex based on the depth difference and the second distance, is specifically configured to: determining a third distance of the first vertex and the second vertex in a model coordinate system based on the depth difference and the second distance; determining the target distance based on a proportional relationship between the model coordinate system and real space, and the third distance.
In an alternative embodiment, the first vertex belongs to a first object in the target scene; the second vertex belongs to a second object in the target scene; the determination module, after performing the determining the target distance between the first vertex and the second vertex, is further to: determining a distance between the first object and the second object based on a target distance between the first vertex and the second vertex.
In an alternative embodiment, the first vertex and the second vertex belong to a first object in the target scene; the determination module, after performing the determining the target distance between the first vertex and the second vertex, is further to: determining size information of the first object based on the target distance between the first vertex and the second vertex.
In a third aspect, the present disclosure further provides a computer device, including a processor and a memory, where the memory stores machine-readable instructions executable by the processor, and the processor is configured to execute the machine-readable instructions stored in the memory; when the machine-readable instructions are executed by the processor, the steps in the first aspect, or in any possible implementation of the first aspect, are performed.
In a fourth aspect, the present disclosure further provides a computer-readable storage medium having a computer program stored thereon, where the computer program, when executed, performs the steps in the first aspect, or in any possible implementation of the first aspect.
For the description of the effects of the above distance determining apparatus, computer device, and computer readable storage medium, reference is made to the description of the above distance determining method, which is not repeated here.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required for use in the embodiments are briefly described below. The drawings herein are incorporated in and form a part of the specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure. It should be appreciated that the following drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope; those skilled in the art may derive additional related drawings from them without inventive effort.
FIG. 1 illustrates a flow chart of a method of distance determination provided by an embodiment of the present disclosure;
FIG. 2 is a flow chart illustrating one particular manner of generating a three-dimensional dense model in a method of distance determination provided by an embodiment of the present disclosure;
fig. 3a is a schematic diagram illustrating an arrangement order between a pixel point a and a pixel point b included in a rendered image in the distance determination method according to the embodiment of the present disclosure;
fig. 3b is a schematic diagram illustrating storage positions of pixel values and depth values respectively corresponding to a target vertex a corresponding to a pixel point a and a target vertex b corresponding to a pixel point b in a preset cache in the distance determination method according to the embodiment of the disclosure;
FIG. 4a is a schematic diagram illustrating a display interface displaying a rendered image of a three-dimensional model of a target scene in a method for distance determination provided by an embodiment of the present disclosure;
fig. 4b is a schematic diagram illustrating a display interface displaying a first pixel point and a second pixel point to be processed by distance measurement in the distance determining method according to the embodiment of the disclosure;
FIG. 5 is a schematic diagram illustrating a third distance between the first vertex and the second vertex in the model coordinate system in the distance determination method provided by the embodiment of the disclosure;
FIG. 6 is a schematic diagram of an apparatus for distance determination provided by an embodiment of the present disclosure;
fig. 7 shows a schematic diagram of a computer device provided by an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, not all of the embodiments. The components of embodiments of the present disclosure, as generally described and illustrated herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure is not intended to limit the scope of the disclosure, as claimed, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making creative efforts, shall fall within the protection scope of the disclosure.
In many scenarios, it is necessary to determine the distance between real objects corresponding to a three-dimensional model through the three-dimensional model constructed for the real scene.
Current ranging approaches include the following two. The first: bounding boxes are generated for the three-dimensional models corresponding to the respective real objects, with each three-dimensional model located inside its bounding box, and the distance between real objects is determined by determining the distance between the bounding boxes. Although this measurement is simple, in practice a large gap may exist between a bounding box and the three-dimensional model of the corresponding real object, which causes a large error in the distance determined based on the bounding boxes. The second: the three-dimensional model is composed of point cloud points or meshes; taking meshes as an example, each mesh in the three-dimensional model needs to be traversed, the distance corresponding to each mesh determined, and the distance between objects determined based on those per-mesh distances. Although the distance determined in this way has high precision, the number of meshes constituting the three-dimensional model is large and traversing each mesh in sequence takes a long time, so this approach requires a large amount of computation and processing time, has low measurement efficiency, and consumes considerable computing resources. Therefore, current distance measurement approaches cannot achieve both measurement accuracy and measurement efficiency.
Based on this research, the present disclosure provides a distance determination method, which renders depth information corresponding to target vertices located within the shooting field of view of a virtual camera in a three-dimensional model of a target scene into a preset cache and generates a rendered image of the three-dimensional model; then, in response to a distance measurement operation on a first pixel point and a second pixel point in the rendered image, the depth values of the first vertex corresponding to the first pixel point and of the second vertex corresponding to the second pixel point can be conveniently read from the preset cache and used for ranging. This eliminates the need to add a bounding box to the three-dimensional model, allows distances between different vertices to be measured accurately, and requires no complex operations during ranging, giving high measurement efficiency and achieving both measurement accuracy and measurement efficiency.
The defects in the above prior art, and the solutions proposed below, are the results of the inventor's practice and careful study; therefore, the process of discovering the above problems and the solutions proposed herein for those problems should both be regarded as the inventor's contributions to the present disclosure.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
To facilitate understanding of the present embodiment, a detailed description is first given of a distance determining method disclosed in the embodiments of the present disclosure, and an execution subject of the distance determining method provided in the embodiments of the present disclosure is generally a computer device with certain computing power, where the computer device includes, for example: a terminal device, which may be a User Equipment (UE), a mobile device, a User terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle mounted device, a wearable device, or a server or other processing device. In some possible implementations, the method of distance determination may be implemented by a processor calling computer readable instructions stored in a memory.
The following describes a method for determining a distance provided by an embodiment of the present disclosure.
Referring to fig. 1, a flowchart of a method for determining a distance provided in an embodiment of the present disclosure is shown, where the method includes steps S101 to S103, where:
s101, rendering depth information corresponding to a target vertex in a three-dimensional model of a target scene to a preset cache, and generating a rendered image of the three-dimensional model.
Wherein, the three-dimensional model includes: a plurality of meshes and the vertices respectively corresponding to the meshes; any mesh shares at least one vertex with at least one other mesh. The mesh provided in the embodiment of the present disclosure may include, for example, at least one of a triangular mesh or a quadrangular mesh, which is not particularly limited. The target vertices include the vertices of those meshes of the three-dimensional model that are located within the shooting range of the virtual camera and close to the virtual camera's lens, that is, the vertices of the meshes that can be "seen" by the virtual camera. Other meshes that are located within the shooting range of the virtual camera but are occluded by meshes closer to the virtual camera, that is, meshes that cannot be "seen" by the virtual camera, are considered not to be located within the shooting field of view of the virtual camera, and their vertices are not target vertices.
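A minimal sketch of this "nearest vertex per pixel" selection is given below. It is illustrative only: the disclosure does not prescribe a data structure, the input is assumed to already contain each vertex's projected pixel and depth, and the names (`select_target_vertices`, `projected`) are hypothetical.

```python
def select_target_vertices(projected, width, height):
    """Keep, per pixel, only the nearest vertex, i.e. the one the virtual camera can "see".

    projected: iterable of (vertex_index, u, v, depth) tuples, where (u, v) is the pixel
    column/row the vertex projects to and depth is its distance along the optical axis.
    Returns a dict mapping (row, col) -> vertex_index of the visible vertex; vertices
    occluded by a nearer vertex at the same pixel, or falling outside the image, are dropped.
    """
    nearest = {}  # (row, col) -> (depth, vertex_index)
    for idx, u, v, depth in projected:
        if depth <= 0 or not (0 <= u < width and 0 <= v < height):
            continue  # behind the camera or outside the shooting range
        key = (v, u)
        if key not in nearest or depth < nearest[key][0]:
            nearest[key] = (depth, idx)
    return {key: entry[1] for key, entry in nearest.items()}
```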
In specific implementation, the three-dimensional model of the target scene may be generated based on a panoramic video obtained by image acquisition of the target scene by the image acquisition device and a pose when the panoramic video is acquired by the image acquisition device.
Wherein the image capture device may include, for example, but not limited to, at least one of a cell phone, a camera, a panoramic camera, an unmanned aerial vehicle, a drone, and the like. Specifically, since the image capturing device can obtain a video or a plurality of video frame images when capturing a target scene, it can be applied to capturing a target scene with a large space, such as a machine room (or a station), a factory building, and the like, in an all-around manner. Taking a computer room as an example, a computing device, a data storage device, a signal receiving device, and the like may be stored therein; the plant may house, for example, production facilities, handling facilities, transportation facilities, and the like. Target scenes such as machine rooms, factory buildings and the like are all solid spaces.
Illustratively, the target scenario may include, for example, a machine room having a large floor space, such as a machine room having a floor space of 20 square meters, 30 square meters, or 50 square meters. In the case of taking a machine room as a target scene, the scene in the machine room can be shot by using the image acquisition equipment.
In addition, the target scene may also be an outdoor scene, for example, in order to monitor the surrounding environment of the tower used for communication or for transmitting electric power, so as to prevent vegetation around the tower from affecting normal application of the tower during growth, the tower and the surrounding environment may be used as the target scene, the video frame image may be acquired, and modeling may be performed on the tower, vegetation near the tower, and buildings and the like that may exist near the tower.
In one possible case, the target scene for data acquisition may include multiple regions, for example, multiple rooms may be included in a large target scene. Additionally, at least one object is also included in the target scene, which may include, for example, but is not limited to: at least one of a building located within the target scene, equipment deployed within the target scene, and vegetation located within the target scene; for example, in the case that the target scene includes a machine room, the buildings located in the target scene may include, but are not limited to: at least one of a machine room ceiling, a machine room floor, a machine room wall, a machine room column, etc.; devices deployed within the target scene may include, for example, but are not limited to: the tower and the outdoor cabinet are arranged on the ceiling of the machine room, the cabling rack connected with the tower is arranged, and the indoor cabinet is arranged in the machine room.
Specifically, when the image acquisition device is controlled to acquire an image of a target scene, the robot carrying the image acquisition device is controlled to walk in the target scene to acquire a panoramic video corresponding to the target scene; or, image acquisition can be performed on the target scene in a manner that workers such as survey personnel hold the image acquisition equipment, so as to obtain a panoramic video corresponding to the target scene; or, the unmanned aerial vehicle provided with the image acquisition equipment can be controlled to fly in the target scene so as to acquire the panoramic video of the target scene.
When the image acquisition is performed on the target scene, in order to complete modeling of the target scene, the image acquisition device can be controlled to perform image acquisition at different poses so as to form a panoramic video corresponding to the target scene.
When data processing is performed on a video frame image acquired by image acquisition equipment, for example, the video frame image is used for three-dimensional model reconstruction, so that the pose of the image acquisition equipment in a target scene needs to be determined. In this case, for example, before the image capturing device captures an image of the target scene, the gyroscope of the image capturing device may be calibrated to determine the pose of the image capturing device in the target scene; illustratively, for example, the optical axis of the image capture device may be adjusted to be parallel to the ground of the target scene.
After the gyroscope of the image acquisition equipment is calibrated, image acquisition can be carried out by selecting a video mode of the image acquisition equipment, and a panoramic video corresponding to a target scene is obtained.
For example, when the three-dimensional model of the target scene includes the three-dimensional dense model, after the panoramic video corresponding to the target scene is acquired, the three-dimensional dense reconstruction of the target scene may be performed based on the panoramic video and the pose of the image acquisition device when acquiring the panoramic video, so as to generate the three-dimensional dense model of the target scene.
For example, but not limited to, at least one of the following A1-A2 may be used to generate a three-dimensional dense model of the target scene:
and A1, the image acquisition equipment only undertakes the task of image acquisition, and transmits the acquired panoramic video and the pose of the panoramic camera during the acquisition of the panoramic video to the data processing equipment by depending on network connection, so that the data processing equipment establishes a three-dimensional dense model corresponding to the target scene.
Network connections that may be relied upon may include, but are not limited to, Fiber Ethernet adapters, mobile communication technologies (e.g., fourth generation mobile communication technology (4G), or fifth generation mobile communication technology (5G)), and Wireless Fidelity (WiFi), among others; the data processing device may for example comprise, but is not limited to, the computer device described above.
When the data processing device processes the panoramic video, for example, the three-dimensional dense reconstruction of the target scene can be performed according to the panoramic video and the pose of the image acquisition device when acquiring the panoramic video (i.e. the pose of the image acquisition device in the target scene), so as to obtain three-dimensional dense data of the target scene; and generating a three-dimensional dense model based on the three-dimensional dense data.
The three-dimensional dense data may include, but is not limited to, a plurality of dense points of the object surface within the target scene, and position information of each dense point within the target scene, respectively.
When generating a three-dimensional dense model of a target scene based on a video, for example, at least one of a Simultaneous Localization and Mapping (SLAM) algorithm and a real-time dense reconstruction algorithm may be used.
For example, when the image acquisition device acquires a panoramic video, a three-dimensional dense model covering a target scene may be gradually generated when the image acquisition device gradually moves to acquire the panoramic video; or after the image acquisition equipment finishes acquiring the panoramic video, generating a three-dimensional dense model corresponding to the target scene by using the obtained complete panoramic video.
In another embodiment of the present disclosure, a specific way of generating a three-dimensional dense model of a target scene by using a SLAM algorithm together with a real-time dense reconstruction algorithm is further provided. In this embodiment, the panoramic camera consists of two fisheye cameras mounted at front and rear positions on a scanner; the fisheye cameras are mounted on the scanner in preset poses so as to acquire panoramic videos covering the complete target scene.
Referring to fig. 2, a flowchart of a specific manner of generating a three-dimensional dense model according to an embodiment of the present disclosure is provided, where:
s201, the data processing equipment acquires two panoramic videos which are acquired by the front fisheye camera and the rear fisheye camera of the scanner in real time and are synchronous in time.
Wherein, the two panoramic videos respectively comprise a plurality of frames of video frame images. Because the two fisheye cameras collect two panoramic videos with synchronous time in real time, timestamps of multi-frame video frame images respectively included in the two panoramic videos respectively correspond to each other.
In addition, the precision of the time stamp and the acquisition frequency when acquiring the video frame images in the panoramic video can be determined according to the specific instrument parameters of the two fisheye cameras. For example, setting the time stamp of the video frame image to be accurate to nanosecond; and when the video frame images in the panoramic video are acquired, the acquisition frequency is not lower than 30 hertz (Hz).
S202, the data processing equipment determines relevant data of the inertial measurement unit IMU when the two fisheye cameras respectively acquire the panoramic video.
Taking any one of the two fisheye cameras as an example, when the fisheye camera captures a video frame image in a panoramic video, the relevant data of the inertial measurement unit IMU between two adjacent frames of video frames and the timestamp when the relevant data is acquired can be correspondingly observed and acquired. In particular, a corresponding scanner coordinate system (which may be constituted by, for example, an X-axis, a Y-axis, and a Z-axis) may also be determined for the fisheye camera to determine relevant data of the inertial measurement unit IMU on the scanner coordinate system, such as accelerations and angular velocities under the X-axis, the Y-axis, and the Z-axis of the scanner coordinate system.
In addition, the time stamp for acquiring the relevant data of the inertial measurement unit IMU can be determined according to the specific instrument parameters of the two fisheye cameras. For example, it may be determined that the observation frequency for acquiring the relevant data of the inertial measurement unit IMU is not lower than 400 Hz.
S203, the data processing equipment determines the poses of the two fisheye cameras in the world coordinate system based on the relevant data of the inertial measurement unit IMU.
Specifically, since the coordinate-system transformation relationship between the scanner coordinate system and the world coordinate system can be determined, after the relevant data of the inertial measurement unit IMU are acquired, the poses of the two fisheye cameras in the world coordinate system can be determined according to that transformation relationship; the poses can, for example, be expressed as 6-degree-of-freedom (6DoF) poses. The poses of the two fisheye cameras in the world coordinate system are determined from the transformation relationship between the scanner coordinate system and the world coordinate system using an existing coordinate-system transformation method, and details are not repeated here.
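Since the disclosure refers to an existing coordinate-system transformation method without spelling it out, the following is only a minimal sketch under the assumption that the transformation is a rigid transform (a rotation plus a translation, which together carry the six degrees of freedom); the names `R_ws` and `t_ws` are illustrative.

```python
import numpy as np

def scanner_to_world(points_scanner, R_ws, t_ws):
    """Rigidly transform points from the scanner coordinate system to the world
    coordinate system: p_world = R_ws @ p_scanner + t_ws.

    points_scanner: (N, 3) array of points in scanner coordinates.
    R_ws: (3, 3) rotation matrix (3 rotational degrees of freedom).
    t_ws: (3,) translation vector (3 translational degrees of freedom).
    """
    return np.asarray(points_scanner) @ R_ws.T + np.asarray(t_ws)
```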
For the above S201 to S203, when the SLAM algorithm is adopted, since the video frame images in the panoramic video are all panoramic images, the 6DOF pose of the image acquisition device can be accurately solved by the processing steps of image processing, key point extraction, key point tracking, and establishment of the association relationship between the key points, that is, the acquisition and calculation of the 6DOF pose of the image acquisition device in real time are realized; and moreover, the coordinates of dense point cloud points in the target scene can be obtained.
When the video frame images in the panoramic video are processed, the key frame images can be determined in the corresponding multi-frame video frame images in the panoramic video, so that the SLAM algorithm is ensured to have enough processing data, the calculation amount is reduced, and the efficiency is improved.
Specifically, the manner of determining the key frame image from the panoramic video may be, for example, but not limited to, at least one of the following manners B1 to B4:
and B1, extracting at least one frame of video frame image from the panoramic video as a key frame image by using an alternate frame sampling method.
And B2, extracting video frame images from the panoramic video at a preset frequency, that is, a preset number of video frame images per preset time, and using at least one extracted video frame image as a key frame image.
The preset number of video frame images extracted per preset time may include, but is not limited to, two frames per second; a sketch illustrating the sampling in manners B1 and B2 follows this list.
B3, recognizing the content of each frame of video frame image in the panoramic video by using the image Processing algorithm, the image analysis algorithm, the Natural Language Processing (NLP) and other technologies, determining the semantic information corresponding to each frame of video frame image, and extracting the video frame image at least including the target object in the target scene as the key frame image based on the semantic information corresponding to each frame of video frame image.
And B4, responding to the selection of the video frame image in the panoramic video, and determining the key frame image in the panoramic video.
In a specific implementation, a panoramic video of a target scene can be presented to a user, and when the panoramic video is presented, a part of selected video frames in the panoramic video can be used as key frame images in the panoramic video in response to a user selecting operation on the part of selected video frames.
Illustratively, when the panoramic video is presented to the user, a prompt for selecting key frame images may, for example, be displayed to the user. Specifically, a video frame image in the panoramic video may be selected in response to a specific operation by the user, such as a long press or a double click, and the selected video frame image is used as a key frame image. In addition, prompt information may be displayed, for example a message containing the text "please long-press a video frame image to select it"; when a long-press operation by the user on any video frame image in the panoramic video is received, that video frame image is used as a key frame image.
In one possible situation, if the devices in the target scene are not distributed in a concentrated manner, a stretch of consecutive video frame images may contain no device at all while the devices appear only in other video frame images of the panoramic video; manually selecting video frame images avoids using device-free video frame images as key frame images, so that the three-dimensional dense model of the target scene generated based on the key frame images reflects the real target scene more accurately. In another possible situation, if some video frame images are unclear or their data is damaged, manually selecting video frame images likewise avoids using those video frame images as key frame images.
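A minimal sketch of the automatic sampling manners B1 and B2 is given below; it is illustrative only, the function names are hypothetical, and the two-frames-per-second rate is merely the example rate mentioned above.

```python
def keyframes_alternate(frames):
    """B1: alternate-frame sampling - keep every second video frame image."""
    return frames[::2]

def keyframes_fixed_rate(frames, timestamps_s, rate_hz=2.0):
    """B2: keep at most `rate_hz` video frame images per second (e.g. two frames per second)."""
    selected, next_time = [], None
    for frame, t in zip(frames, timestamps_s):
        if next_time is None or t >= next_time:
            selected.append(frame)
            next_time = t + 1.0 / rate_hz
    return selected
```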
After the key frame images in the panoramic video are determined, a key frame image map is stored in the background of the SLAM algorithm, so that after the image acquisition device is controlled to return to a previously acquired position, the two video frame images taken at that position can be compared to perform loop closure detection for the image acquisition device, thereby correcting the accumulated positioning error of the image acquisition device under long-duration, long-distance operation.
S204, the data processing equipment takes the key frame images determined from the panoramic videos respectively acquired by the two fisheye cameras, together with the corresponding poses of the fisheye cameras, as input data of the real-time dense reconstruction algorithm.
For example, for a panoramic video acquired by any fisheye camera, after determining a new key frame image in the panoramic video by using the above S201 to S203, all currently obtained key frame images and the poses of the fisheye cameras corresponding to the new key frame image are used as input data of the real-time dense reconstruction algorithm.
Key frame images transmitted before the new key frame image was obtained have already been input, together with the poses of the corresponding fisheye cameras, into the real-time dense reconstruction algorithm as input data, so they do not need to be input repeatedly when the new key frame image is input.
S205, the data processing equipment processes the input data by using a real-time dense reconstruction algorithm to obtain a three-dimensional dense model corresponding to the target scene.
Exemplary, resulting three-dimensional dense models may include, for example, but are not limited to: a plurality of dense points located on the surface of an object within the target scene. In generating the three-dimensional dense model, the dense points may be updated, for example, but not limited to, as the process of acquiring the panoramic video continues to scale up. The updating frequency can be determined according to the input frequency of the key frame images and the pose of the fisheye camera when the real-time dense reconstruction algorithm is input.
For the above S204 to S205, when the real-time dense reconstruction algorithm is adopted, a dense stereo matching technique may be used to estimate the corresponding dense depth maps of the keyframe images in the target scene, and the corresponding poses of the fisheye cameras are used to fuse the dense depth maps into a three-dimensional dense model, so as to obtain the three-dimensional model of the target scene after the target scene is completely acquired.
A dense depth map is also called a range image. Unlike a gray image, in which each pixel stores a brightness value, each pixel of a dense depth map stores the distance between that point and the image acquisition device, i.e., a depth value. Because the depth value is related only to distance and not to factors such as environment, lighting or viewing direction, the dense depth map can truly and accurately represent the geometric depth information of the scene, and a three-dimensional dense model representing the real target scene can therefore be generated based on the dense depth map. In addition, considering the limitation of device resolution, image enhancement such as denoising or inpainting may be performed on the dense depth map to provide a high-quality dense depth map for three-dimensional reconstruction.
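The fusion of dense depth maps into a dense model described above can be illustrated with a minimal sketch that back-projects each depth-map pixel into the model coordinate system using the corresponding camera pose. It assumes a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, which is a simplification of the fisheye/panoramic imaging actually used, and the function and parameter names are hypothetical.

```python
import numpy as np

def backproject_depth_map(depth, fx, fy, cx, cy, R_wc, t_wc):
    """Turn one dense depth map plus the corresponding camera pose into dense 3D points.

    depth: (H, W) array of depth values (distance along the optical axis).
    R_wc (3x3), t_wc (3,): pose of the camera in the world/model coordinate system.
    Returns an (H*W, 3) array of points that can be fused into the dense model.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]               # pixel rows and columns
    z = depth.reshape(-1)
    x = (u.reshape(-1) - cx) / fx * z       # back-projection through the pinhole model
    y = (v.reshape(-1) - cy) / fy * z
    points_cam = np.stack([x, y, z], axis=1)
    return points_cam @ R_wc.T + t_wc       # express the points in the model coordinate system
```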
In a possible case, by using the pose of the image capturing device corresponding to the key frame image and the pose of the image capturing device corresponding to the new key frame image adjacent to the key frame image, whether the pose of the image capturing device when capturing the target scene is adjusted can be determined. If the pose is not adjusted, continuously performing real-time three-dimensional dense reconstruction on the target scene to obtain a three-dimensional dense model; and if the pose is adjusted, correspondingly adjusting the dense depth map according to the pose adjustment, and performing real-time three-dimensional dense reconstruction on the target scene based on the adjusted dense depth map so as to obtain an accurate three-dimensional dense model.
A2, the image acquisition equipment has the calculation power capable of processing the data of the panoramic video, and after the panoramic video is acquired, the data of the panoramic video is processed by utilizing the calculation power of the image acquisition equipment, so that the three-dimensional dense model corresponding to the target scene is obtained.
Here, the specific manner in which the image capturing device generates the three-dimensional dense model of the target scene based on the panoramic video may refer to the description of a1, and is not described herein again.
After the three-dimensional model of the target scene is generated, because vertices (i.e., dense points) forming each mesh in the three-dimensional model of the target scene have depth information, in the rendering stage, depth information corresponding to the target vertices in the target scene may be rendered into a preset cache, and a rendered image of the three-dimensional model is generated, specifically: in response to an image rendering event for the three-dimensional model, determining a target vertex from the three-dimensional model based on a pose of the virtual camera in a model coordinate system corresponding to the three-dimensional model; rendering the pixel value corresponding to the target vertex and the depth value corresponding to the target vertex to a preset cache; and reading the pixel value corresponding to each target vertex from a preset cache, and generating a rendering image of the three-dimensional model based on the read pixel values.
The preset cache is, for example, a cache corresponding to the graphics card, and includes a buffer space for storing the pixel value and the depth value of each pixel point, such as a GBuffer; the GBuffer is where the depth values of the pixel points are stored on the graphics card. For the geometry rendering stage, a frame buffer object, i.e., the gBuffer, is first initialized; it contains multiple color buffers and a single depth render buffer object.
For example, in response to a triggering operation for presenting the three-dimensional model, a plurality of target vertices that are located within the shooting angle of the virtual camera and close to the virtual camera may be determined, from the vertices constituting the meshes of the three-dimensional model, based on the pose of the virtual camera in the model coordinate system corresponding to the three-dimensional model.
After the target vertex is determined, determining projection relation information between the virtual camera and the three-dimensional model based on the pose of the virtual camera in the model coordinate system; determining the projection position of each target vertex in the rendered image based on the projection relation information; determining the cache position of each target vertex in a preset cache based on the projection position; and rendering the pixel value corresponding to the target vertex and the depth value corresponding to the target vertex to a preset cache based on the cache position of each target vertex in the preset cache.
The projection relationship information between the virtual camera and the three-dimensional model refers to a mapping relationship when the vertexes of each mesh in the three-dimensional model are projected to the projection plane corresponding to the virtual camera.
In a specific implementation, the view angle range of the virtual camera can be determined based on the pose of the virtual camera in the model coordinate system; the projection plane of the virtual camera is determined based on that view angle range; the projection relationship information between the virtual camera and the three-dimensional model is determined based on the pose of the projection plane of the virtual camera in the model coordinate system of the three-dimensional model and the poses of the meshes in the three-dimensional model; and after the projection relationship information between the virtual camera and the three-dimensional model has been determined, the projection position of each target vertex when projected onto the projection plane of the virtual camera is determined based on that information and is taken as the projection position in the rendered image.
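A minimal sketch of such a projection of a target vertex onto the rendered image is given below, assuming a pinhole-style projection with illustrative intrinsics `fx`, `fy`, `cx`, `cy`; the disclosure does not fix a specific projection model, so this is an assumption, and the function name is hypothetical.

```python
import numpy as np

def project_to_rendered_image(model_vertex, cam_rotation, cam_position, fx, fy, cx, cy):
    """Map a vertex from the model coordinate system to its projection position in the
    rendered image, given the virtual camera's pose (rotation and position) in that
    same model coordinate system. Returns (u, v, depth)."""
    # Transform the vertex into the virtual camera's coordinate frame.
    p_cam = cam_rotation.T @ (np.asarray(model_vertex, dtype=float) - np.asarray(cam_position, dtype=float))
    x, y, z = p_cam
    # Perspective projection onto the virtual camera's projection plane.
    u = fx * x / z + cx   # column in the rendered image
    v = fy * y / z + cy   # row in the rendered image
    return u, v, z        # z is the depth value associated with this target vertex
```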
After the projection position of each target vertex in the rendered image has been determined, in order to ensure that the pixel values stored in the preset cache correspond one-to-one to the pixel values of the pixel points in the rendered image, the pixel value and the depth value of the target vertex corresponding to each pixel point may, according to the projection positions, be stored in the preset cache sequentially in the arrangement order of the pixel points in the rendered image.
Exemplarily, fig. 3a is a schematic diagram of the arrangement order between a pixel point a and a pixel point b included in the rendered image. Because the pixel value of each pixel point needs 3 bytes of storage space and the depth value of the target vertex corresponding to the pixel point needs 1 byte of storage space, the cache positions are determined according to the projection position of the target vertex corresponding to each pixel point in the rendered image. The cache position of the target vertex a corresponding to pixel point a in the preset cache comprises the cache spaces at addresses 00000000, 00000001, 00000010 and 00000011: exemplarily, the pixel value of target vertex a is stored in the cache spaces at addresses 00000000, 00000001 and 00000010, and the depth value of target vertex a is stored in the cache space at address 00000011. The cache position of the target vertex b corresponding to pixel point b in the preset cache comprises the cache spaces at addresses 00000100, 00000101, 00000110 and 00000111: exemplarily, the pixel value of target vertex b is stored in the cache spaces at addresses 00000100, 00000101 and 00000110, and the depth value of target vertex b is stored in the cache space at address 00000111. A schematic diagram of the storage positions, in the preset cache, of the pixel values and depth values respectively corresponding to target vertex a (corresponding to pixel point a) and target vertex b (corresponding to pixel point b) is shown in fig. 3 b.
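The 4-bytes-per-pixel layout of fig. 3a/3b (3 bytes of pixel value followed by 1 byte of depth value, in raster order) can be sketched as follows. The byte widths simply mirror the example above, the depth value is assumed to be quantized to a single byte as in the figure, and the function names are illustrative.

```python
BYTES_PER_PIXEL = 4  # 3 bytes of pixel value + 1 byte of depth value, as in fig. 3a/3b

def cache_offset(row, col, image_width):
    """Start address of a pixel point's entry in the preset cache (raster order)."""
    return (row * image_width + col) * BYTES_PER_PIXEL

def write_cache_entry(cache, row, col, image_width, rgb, depth_byte):
    """Render one target vertex's pixel value and depth value into the preset cache."""
    base = cache_offset(row, col, image_width)
    cache[base:base + 3] = bytes(rgb)   # pixel value, e.g. addresses 00000000..00000010 for vertex a
    cache[base + 3] = depth_byte        # depth value, e.g. address 00000011 for vertex a

# Usage sketch: pixel point a at (row 0, col 0) occupies addresses 0..3 and the
# neighbouring pixel point b at (row 0, col 1) occupies addresses 4..7, matching fig. 3b.
# cache = bytearray(image_width * image_height * BYTES_PER_PIXEL)
# write_cache_entry(cache, 0, 0, image_width, (120, 80, 200), 37)
```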
After rendering the pixel values corresponding to the target vertices and the depth values corresponding to the target vertices into the preset cache based on the storage locations of the target vertices in the preset cache, the pixel values of the target vertices may be sequentially read from the preset cache according to the storage locations of the target vertices in the preset cache, so as to generate a rendered image of the three-dimensional model based on the pixel values of the target vertices.
Following the above S101, the method for determining a distance provided in the embodiment of the present disclosure further includes:
S102, in response to a distance measurement operation on a first pixel point and a second pixel point in the rendered image, reading a first depth value of a first vertex corresponding to the first pixel point and a second depth value of a second vertex corresponding to the second pixel point from the preset cache.
For example, the first pixel point and the second pixel point may be, but are not limited to, any two pixel points in the rendered image.
After the rendered image of the three-dimensional model is generated based on S101, the rendered image of the three-dimensional model may be displayed. After the rendered image is displayed, in response to the distance measurement operation on the first pixel point and the second pixel point in the rendered image, a first storage position of the first depth value of the first vertex is determined from the preset cache based on a first position of the first pixel point in the rendered image, and the first depth value is read from the preset cache based on the first storage position; a second storage position of the second depth value of the second vertex is determined from the preset cache based on a second position of the second pixel point in the rendered image, and the second depth value is read from the preset cache based on the second storage position.
For example, if the target scene includes a machine room, a display interface displaying the rendered image of the three-dimensional model of the target scene may be as shown in fig. 4a; a "ranging" touch button, through which the user selects the ranging operation, is also illustrated in fig. 4a. After the "ranging" touch button shown in fig. 4a is triggered, the rendered image of the three-dimensional model of the target scene is switched to an editable mode; in response to the user's selection operation on the first pixel point and the second pixel point in the rendered image, the first pixel point and the second pixel point are taken as the pixel points to be subjected to ranging processing. A display interface displaying the first pixel point and the second pixel point to be subjected to ranging processing may be as shown in fig. 4b, in which a "confirm" touch button for confirming that the ranging processing is to be performed on the first pixel point and the second pixel point and a "cancel" touch button for canceling the ranging processing on the first pixel point and the second pixel point are also shown.
In response to the trigger operation on the "confirm" touch button, it is determined that the ranging processing is to be performed on the first pixel point and the second pixel point. Since the depth value and the pixel value of the target vertex corresponding to each pixel point in the rendered image are stored according to the projection position of each target vertex in the rendered image, the first storage position of the first depth value of the first vertex, whose projection generates the first pixel point, may be determined in the preset cache based on the first position of the first pixel point in the rendered image and a preset storage rule of pixel values and depth values (for example, but not limited to: a cache space of 1 byte for storing the depth value and a cache space of 3 bytes for storing the pixel value); the first depth value of the first vertex is read from the first storage position. The second storage position of the second depth value of the second vertex, whose projection generates the second pixel point, is determined in the preset cache based on the second position of the second pixel point in the rendered image and the preset storage rule of pixel values and depth values; the second depth value of the second vertex is read from the second storage position. The preset storage rule of pixel values and depth values may be set according to actual requirements, and is not limited herein.
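The following sketch illustrates how the first and second depth values could be read back from such a buffer given the pixel positions and the 3-byte-pixel-value / 1-byte-depth-value storage rule mentioned above; the row-major addressing and the function name read_depth are illustrative assumptions.

    def read_depth(cache, image_width, x, y, bytes_per_pixel=4):
        """Read the depth value of the vertex projected to pixel (x, y) from the preset cache."""
        offset = (y * image_width + x) * bytes_per_pixel + 3   # skip the 3-byte pixel value
        return cache[offset]

    # In response to the ranging operation on the two selected pixel points:
    # h1 = read_depth(cache, image_width, x1, y1)   # first depth value of the first vertex
    # h2 = read_depth(cache, image_width, x2, y2)   # second depth value of the second vertex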
Continuing from the foregoing S102, after the first depth value of the first vertex corresponding to the first pixel point and the second depth value of the second vertex corresponding to the second pixel point are obtained, the embodiment of the present disclosure further includes:
s103, determining a target distance between the first vertex and the second vertex based on the first depth value and the second depth value.
In one possible implementation, a depth difference between the first depth value and the second depth value may be determined as a target distance between the first vertex and the second vertex.
In another possible implementation, a depth difference for the first vertex and the second vertex may be determined based on the first depth value and the second depth value; determining a second distance after the first vertex and the second vertex are projected to a preset plane based on the distance between the first pixel point and the second pixel point and a camera projection principle; based on the depth difference and the second distance, a target distance between the first vertex and the second vertex is determined.
Wherein the preset plane is parallel to the projection plane of the virtual camera.
Specifically, the absolute value of the difference between the first depth value and the second depth value is determined as the depth difference between the first vertex and the second vertex; and the second distance after the first vertex and the second vertex are projected onto the plane parallel to the projection plane of the virtual camera is calculated based on the coordinate system conversion relationship in the camera projection principle and the distance between the first pixel point and the second pixel point in the rendered image.
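The disclosure does not spell out the exact coordinate-system conversion, so the sketch below assumes an ordinary pinhole back-projection in which the pixel distance is scaled by a representative depth over the focal length; the focal_px parameter and the use of the mean depth are assumptions for illustration only.

    import math

    def depth_difference(h1, h2):
        """Depth difference between the first vertex and the second vertex."""
        return abs(h1 - h2)

    def second_distance(px1, px2, h1, h2, focal_px):
        """Second distance after projecting both vertices onto a plane parallel to the
        virtual camera's projection plane (pinhole back-projection assumed)."""
        pixel_dist = math.hypot(px2[0] - px1[0], px2[1] - px1[1])  # distance in the rendered image
        mean_depth = 0.5 * (h1 + h2)                               # representative depth (assumption)
        return pixel_dist * mean_depth / focal_px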
In a specific implementation, after the depth difference and the second distance are obtained through calculation, a third distance of the first vertex and the second vertex in the model coordinate system can be determined based on the depth difference and the second distance; and determining the target distance between the first vertex and the second vertex based on the proportional relation between the model coordinate system and the real space and the third distance.
Specifically, after the depth difference and the second distance are obtained through calculation, a third distance between the first vertex and the second vertex in the three-dimensional model is determined based on the pythagorean theorem, the depth difference and the second distance. For example, the formula specifically used for calculating the third distance of the first vertex and the second vertex in the model coordinate system may be as shown in formula one:
l₁ = √(|h₁ − h₂|² + c²)

wherein l₁ represents the third distance between the first vertex and the second vertex in the model coordinate system; h₁ represents the first depth value of the first vertex; h₂ represents the second depth value of the second vertex; |h₁ − h₂| represents the depth difference between the first vertex and the second vertex; and c represents the second distance after the first vertex and the second vertex are projected onto the plane parallel to the projection plane of the virtual camera.

For example, a schematic diagram of the third distance of the first vertex and the second vertex in the model coordinate system may be as shown in fig. 5, where fig. 5 shows the first depth value h₁ of the first vertex, the second depth value h₂ of the second vertex, the second distance c after the first vertex and the second vertex are projected onto the plane parallel to the projection plane of the virtual camera, and the third distance l₁ between the first vertex and the second vertex in the model coordinate system.
After determining the third distance between the first vertex and the second vertex in the model coordinate system, the target distance between the first vertex and the second vertex may be determined based on the proportional relationship between the model coordinate system and the real space and the third distance.
The proportional relation between the model coordinate system and the real space can be set according to actual requirements, and is not particularly limited; the target distance includes a distance of the first vertex and the second vertex in the target scene in real space.
Illustratively, if the proportional relationship between the model coordinate system and the real space is 1:10, after the third distance l₁ of the first vertex and the second vertex in the model coordinate system is determined, the target distance l₂ between the first vertex and the second vertex may be calculated according to l₂ = 10 l₁.
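Putting formula one and the proportional relationship together, a minimal sketch of the final computation might look as follows; the helper name target_distance and the default 1:10 scale are illustrative assumptions.

    import math

    def target_distance(h1, h2, c, scale=10.0):
        """Target distance in real space from the depth difference and the second distance.

        h1, h2 : first and second depth values
        c      : second distance on the plane parallel to the projection plane
        scale  : proportional relationship between the model coordinate system and real
                 space (10.0 corresponds to the 1:10 example above)
        """
        l1 = math.sqrt((h1 - h2) ** 2 + c ** 2)   # formula one: third distance in the model coordinate system
        return scale * l1                          # l2 = 10 * l1 for a 1:10 proportional relationship

    # e.g. target_distance(h1=3.0, h2=1.0, c=1.5) == 10 * math.sqrt(4.0 + 2.25) == 25.0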
In a specific implementation, after the target distance between the first vertex and the second vertex is calculated, the target distance may be applied in at least one of, but not limited to, the following manners C1-C2:
C1, in the case where the first vertex belongs to a first object in the target scene and the second vertex belongs to a second object in the target scene, a distance between the first object and the second object may be determined based on the target distance between the first vertex and the second vertex.
The first object may include, for example, any object in the target scene; the second object may for example comprise any object in the target scene other than the first object.
Illustratively, the target distance l₂ between the first vertex and the second vertex may be determined as the distance between the first object and the second object.
C2, in the case where the first vertex and the second vertex both belong to a first object in the target scene, size information of the first object may be determined based on the target distance between the first vertex and the second vertex.

Illustratively, the target distance l₂ between the first vertex and the second vertex may be determined as the size information of the first object.

Similarly, in the case where the first vertex and the second vertex both belong to a second object in the target scene, size information of the second object may be determined based on the target distance between the first vertex and the second vertex; illustratively, the target distance l₂ between the first vertex and the second vertex may be determined as the size information of the second object.
In the embodiment of the present disclosure, depth information corresponding to the target vertices located within the shooting field of view of the virtual camera in the three-dimensional model of the target scene is rendered into the preset cache, and the rendered image of the three-dimensional model is generated. Then, in response to the distance measurement operation on the first pixel point and the second pixel point in the rendered image, the depth values of the first vertex corresponding to the first pixel point and of the second vertex corresponding to the second pixel point can be conveniently read from the preset cache, and the distance is measured using the read depth values. In this way, no bounding box needs to be added to the three-dimensional model, the distances between different vertices can be measured accurately, and no complex computation is required in the distance measurement process, so that the method has higher measurement efficiency and achieves the effect of taking both measurement precision and measurement efficiency into account.
It will be understood by those skilled in the art that, in the methods of the specific embodiments described above, the order in which the steps are written does not imply a strict order of execution or constitute any limitation on the implementation; the specific order of execution of the steps should be determined by their functions and possible inherent logic.
Based on the same inventive concept, a distance determining apparatus corresponding to the distance determining method is also provided in the embodiments of the present disclosure, and since the principle of solving the problem of the apparatus in the embodiments of the present disclosure is similar to the distance determining method in the embodiments of the present disclosure, the implementation of the apparatus may refer to the implementation of the method, and repeated details are not repeated.
Referring to fig. 6, a schematic diagram of an apparatus for determining a distance according to an embodiment of the present disclosure is shown, where the apparatus includes: a processing module 601, a reading module 602 and a determining module 603; wherein:
the processing module 601 is configured to render depth information corresponding to a target vertex in a three-dimensional model of a target scene into a preset cache, and generate a rendered image of the three-dimensional model; wherein the three-dimensional model comprises: the plurality of grids and the vertexes corresponding to the plurality of grids respectively; any mesh has at least one same vertex with at least one other mesh; the target vertex is located in the shooting field of view of the virtual camera;

the reading module 602 is configured to, in response to a distance measurement operation on a first pixel point and a second pixel point in the rendered image, read, from the preset cache, a first depth value of a first vertex corresponding to the first pixel point and a second depth value of a second vertex corresponding to the second pixel point;

the determining module 603 is configured to determine a target distance between the first vertex and the second vertex based on the first depth value and the second depth value.
In an optional implementation manner, when rendering the depth information corresponding to the target vertex in the three-dimensional model of the target scene into the preset cache and generating the rendered image of the three-dimensional model, the processing module 601 is specifically configured to: determine the target vertex from the three-dimensional model based on the pose of a virtual camera in a model coordinate system corresponding to the three-dimensional model in response to an image rendering event for the three-dimensional model; render the pixel value corresponding to the target vertex and the depth value corresponding to the target vertex into the preset cache; and read the pixel values corresponding to the target vertices from the preset cache, and generate the rendered image of the three-dimensional model based on the read pixel values.
In an optional implementation manner, when the processing module 601 performs rendering the pixel value corresponding to the target vertex and the depth value corresponding to the target vertex to the preset buffer, specifically, the processing module is configured to: determining projection relationship information between the virtual camera and the three-dimensional model based on the pose of the virtual camera in the model coordinate system; determining the projection position of each target vertex in the rendered image based on the projection relation information; determining the cache position of each target vertex in the preset cache based on the projection position; and rendering the pixel value corresponding to the target vertex and the depth value corresponding to the target vertex into the preset cache based on the cache position of each target vertex in the preset cache.
In an optional implementation manner, when reading, from the preset cache, a first depth value of a first vertex corresponding to a first pixel point and a second depth value of a second vertex corresponding to a second pixel point in response to a ranging operation on the first pixel point and the second pixel point in the rendered image, the reading module 602 is specifically configured to: in response to a distance measurement operation on the first pixel point and the second pixel point in the rendered image, determine a first storage position of the first depth value of the first vertex from the preset cache based on a first position of the first pixel point in the rendered image; read the first depth value from the preset cache based on the first storage position; and determine a second storage position of the second depth value of the second vertex from the preset cache based on a second position of the second pixel point in the rendered image, and read the second depth value from the preset cache based on the second storage position.
In an optional implementation, the determining module 603, when performing the determining of the target distance between the first vertex and the second vertex based on the first depth value and the second depth value, is specifically configured to: determining a depth difference for the first vertex and the second vertex based on the first depth value and the second depth value; determining a second distance after the first vertex and the second vertex are projected to a preset plane based on the distance between the first pixel point and the second pixel point and a camera projection principle; wherein the preset plane is parallel to a projection plane of the virtual camera; determining a target distance between the first vertex and the second vertex based on the depth difference and the second distance.
In an optional implementation, the determining module 603, when performing the determining the target distance between the first vertex and the second vertex based on the depth difference and the second distance, is specifically configured to: determining a third distance of the first vertex and the second vertex in a model coordinate system based on the depth difference and the second distance; determining the target distance based on a proportional relationship between the model coordinate system and real space, and the third distance.
In an alternative embodiment, the first vertex belongs to a first object in the target scene; the second vertex belongs to a second object in the target scene; the determining module 603, after performing the determining the target distance between the first vertex and the second vertex, is further configured to: determining a distance between the first object and the second object based on a target distance between the first vertex and the second vertex.
In an alternative embodiment, the first vertex and the second vertex belong to a first object in the target scene; the determining module 603, after determining the target distance between the first vertex and the second vertex, is further configured to: determine size information of the first object based on the target distance between the first vertex and the second vertex.
The description of the processing flow of each module in the device and the interaction flow between the modules may refer to the related description in the above method embodiments, and will not be described in detail here.
Based on the same technical concept, the embodiment of the present application further provides a computer device. Referring to fig. 7, a schematic structural diagram of a computer device 700 provided in the embodiment of the present application includes a processor 701, a memory 702, and a bus 703. The memory 702 is used for storing execution instructions and includes an internal memory 7021 and an external memory 7022; the internal memory 7021 is used to temporarily store operation data in the processor 701 and data exchanged with the external memory 7022, such as a hard disk, and the processor 701 exchanges data with the external memory 7022 through the internal memory 7021. When the computer device 700 operates, the processor 701 and the memory 702 communicate with each other through the bus 703, so that the processor 701 executes the following instructions:
rendering depth information corresponding to a target vertex in a three-dimensional model of a target scene into a preset cache, and generating a rendered image of the three-dimensional model; wherein the three-dimensional model comprises: the plurality of grids and the vertexes corresponding to the plurality of grids respectively; any mesh has at least one same vertex with at least one other mesh; the target vertex is located in the shooting field of view of the virtual camera; reading a first depth value of a first vertex corresponding to the first pixel point and a second depth value of a second vertex corresponding to the second pixel point from the preset cache in response to the distance measurement operation of the first pixel point and the second pixel point in the rendered image; determining a target distance between the first vertex and the second vertex based on the first depth value and the second depth value.
The specific processing flow of the processor 701 may refer to the description of the above method embodiment, and is not described herein again.
Embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, performs the steps of the distance determining method described in the above method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The embodiments of the present disclosure also provide a computer program product, where the computer program product carries a program code, and instructions included in the program code may be used to execute the steps of the distance determining method in the foregoing method embodiments, which may be referred to specifically for the foregoing method embodiments, and are not described herein again.
The computer program product may be implemented by hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied as a computer storage medium; in another alternative embodiment, the computer program product is embodied as a software product, such as a Software Development Kit (SDK) or the like.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above-mentioned embodiments are merely specific embodiments of the present disclosure, which are used for illustrating the technical solutions of the present disclosure and not for limiting the same, and the scope of the present disclosure is not limited thereto, and although the present disclosure is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive of the technical solutions described in the foregoing embodiments or equivalent technical features thereof within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present disclosure, and should be construed as being included therein. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (11)

1. A method of distance determination, comprising:
rendering depth information corresponding to a target vertex in a three-dimensional model of a target scene into a preset cache, and generating a rendered image of the three-dimensional model; wherein the three-dimensional model comprises: the plurality of grids and the vertexes corresponding to the plurality of grids respectively; any mesh has at least one same vertex with at least one other mesh; the target vertex is located in the shooting field of view of the virtual camera;
reading a first depth value of a first vertex corresponding to the first pixel point and a second depth value of a second vertex corresponding to the second pixel point from the preset cache in response to the distance measurement operation of the first pixel point and the second pixel point in the rendered image;
determining a target distance between the first vertex and the second vertex based on the first depth value and the second depth value.
2. The method of claim 1, wherein the rendering depth information corresponding to a target vertex in a three-dimensional model of a target scene into a preset cache and generating a rendered image of the three-dimensional model comprises:
determining the target vertex from the three-dimensional model based on the pose of a virtual camera in a model coordinate system corresponding to the three-dimensional model in response to an image rendering event for the three-dimensional model;
rendering the pixel value corresponding to the target vertex and the depth value corresponding to the target vertex to the preset cache;
and reading pixel values corresponding to all target vertexes from the preset cache, and generating a rendering image of the three-dimensional model based on the read pixel values.
3. The method of claim 2, wherein rendering the pixel value corresponding to the target vertex and the depth value corresponding to the target vertex into the preset cache comprises:
determining projection relationship information between the virtual camera and the three-dimensional model based on the pose of the virtual camera in the model coordinate system;
determining the projection position of each target vertex in the rendered image based on the projection relation information;
determining the cache position of each target vertex in the preset cache based on the projection position;
and rendering the pixel value corresponding to the target vertex and the depth value corresponding to the target vertex into the preset cache based on the cache position of each target vertex in the preset cache.
4. The method of any one of claims 1-3, wherein reading, from the preset cache, a first depth value of a first vertex corresponding to the first pixel point and a second depth value of a second vertex corresponding to the second pixel point in response to a ranging operation for the first pixel point and the second pixel point in the rendered image, comprises:
responding to a distance measurement operation of a first pixel point and a second pixel point in the rendered image, and determining a first storage position of a first depth value of the first vertex from the preset cache based on a first position of the first pixel point in the rendered image; reading the first depth value from the preset cache based on the first storage position;
and determining a second storage position of a second depth value of the second vertex from the preset cache based on a second position of the second pixel point in the rendered image, and reading the second depth value from the preset cache based on the second storage position.
5. The method of any of claims 1-4, wherein determining the target distance between the first vertex and the second vertex based on the first depth value and the second depth value comprises:
determining a depth difference for the first vertex and the second vertex based on the first depth value and the second depth value;
determining a second distance after the first vertex and the second vertex are projected to a preset plane based on the distance between the first pixel point and the second pixel point and a camera projection principle; wherein the preset plane is parallel to a projection plane of the virtual camera;
determining a target distance between the first vertex and the second vertex based on the depth difference and the second distance.
6. The method of claim 5, wherein determining the target distance between the first vertex and the second vertex based on the depth difference and the second distance comprises:
determining a third distance of the first vertex and the second vertex in a model coordinate system based on the depth difference and the second distance;
determining the target distance based on a proportional relationship between the model coordinate system and real space, and the third distance.
7. The method of any of claims 1-6, wherein the first vertex belongs to a first object in the target scene; the second vertex belongs to a second object in the target scene;
after the determining the target distance between the first vertex and the second vertex, the method further comprises:
determining a distance between the first object and the second object based on a target distance between the first vertex and the second vertex.
8. The method of any of claims 1-7, wherein the first vertex and the second vertex belong to a first object in the target scene;
after the determining the target distance between the first vertex and the second vertex, the method further comprises:
determining size information of the first object based on the target distance between the first vertex and the second vertex.
9. An apparatus for distance determination, comprising:
the processing module is used for rendering depth information corresponding to a target vertex in a three-dimensional model of a target scene into a preset cache and generating a rendered image of the three-dimensional model; wherein the three-dimensional model comprises: the plurality of grids and the vertexes corresponding to the plurality of grids respectively; any mesh has at least one same vertex with at least one other mesh; the target vertex is located in the shooting field of view of the virtual camera;
a reading module, configured to read, from the preset cache, a first depth value of a first vertex corresponding to a first pixel point and a second depth value of a second vertex corresponding to a second pixel point in response to a distance measurement operation on the first pixel point and the second pixel point in the rendered image;
a determination module to determine a target distance between the first vertex and the second vertex based on the first depth value and the second depth value.
10. A computer device, comprising: a processor, a memory storing machine-readable instructions executable by the processor, the processor for executing the machine-readable instructions stored in the memory, the processor performing the steps of the method of distance determination as claimed in any one of claims 1 to 8 when the machine-readable instructions are executed by the processor.
11. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a computer device, performs the steps of the method for distance determination according to any one of claims 1 to 8.
CN202111638468.XA 2021-12-29 2021-12-29 Distance determination method and device, computer equipment and storage medium Pending CN114332207A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111638468.XA CN114332207A (en) 2021-12-29 2021-12-29 Distance determination method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111638468.XA CN114332207A (en) 2021-12-29 2021-12-29 Distance determination method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114332207A true CN114332207A (en) 2022-04-12

Family

ID=81017076

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111638468.XA Pending CN114332207A (en) 2021-12-29 2021-12-29 Distance determination method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114332207A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116953680A (en) * 2023-09-15 2023-10-27 成都中轨轨道设备有限公司 Image-based real-time ranging method and system for target object
CN116953680B (en) * 2023-09-15 2023-11-24 成都中轨轨道设备有限公司 Image-based real-time ranging method and system for target object

Similar Documents

Publication Publication Date Title
JP6088094B1 (en) System for creating a mixed reality environment
US10896497B2 (en) Inconsistency detecting system, mixed-reality system, program, and inconsistency detecting method
CN107223269B (en) Three-dimensional scene positioning method and device
JP6687204B2 (en) Projection image generation method and apparatus, and mapping method between image pixels and depth values
CN114140528A (en) Data annotation method and device, computer equipment and storage medium
AU2011312140B2 (en) Rapid 3D modeling
US11557083B2 (en) Photography-based 3D modeling system and method, and automatic 3D modeling apparatus and method
CN112686877B (en) Binocular camera-based three-dimensional house damage model construction and measurement method and system
CN113048980B (en) Pose optimization method and device, electronic equipment and storage medium
CN114283243A (en) Data processing method and device, computer equipment and storage medium
CN110062916A (en) For simulating the visual simulation system of the operation of moveable platform
CN112270702A (en) Volume measurement method and device, computer readable medium and electronic equipment
CN115641401A (en) Construction method and related device of three-dimensional live-action model
KR102566300B1 (en) Method for indoor localization and electronic device
CN112348886A (en) Visual positioning method, terminal and server
JP2023546739A (en) Methods, apparatus, and systems for generating three-dimensional models of scenes
CN114092646A (en) Model generation method and device, computer equipment and storage medium
CN114494388A (en) Three-dimensional image reconstruction method, device, equipment and medium in large-view-field environment
CN114332207A (en) Distance determination method and device, computer equipment and storage medium
Koeva 3D modelling and interactive web-based visualization of cultural heritage objects
CN113610702A (en) Picture construction method and device, electronic equipment and storage medium
JP6928217B1 (en) Measurement processing equipment, methods and programs
CN113822936A (en) Data processing method and device, computer equipment and storage medium
CN114266876B (en) Positioning method, visual map generation method and device
CN113838193A (en) Data processing method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination