CN118015197B - Live-action three-dimensional logic singulation method and device and electronic equipment

Info

Publication number: CN118015197B
Authority: CN (China)
Prior art keywords: dimensional, monomer, target, model, dimensional model
Legal status: Active
Application number: CN202410411161.3A
Other languages: Chinese (zh)
Other versions: CN118015197A
Inventors: 鄂超, 姜璐, 谢潇, 林欢, 廖小罕, 张向前
Assignee: Beijing Digsur Science And Technology Co ltd; Zhuhai Campus Of Beijing Normal University; Institute of Geographic Sciences and Natural Resources of CAS
Application filed by Beijing Digsur Science And Technology Co ltd, Zhuhai Campus Of Beijing Normal University, and Institute of Geographic Sciences and Natural Resources of CAS
Priority to CN202410411161.3A
Published as CN118015197A; granted as CN118015197B

Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The invention provides a live-action three-dimensional logic singulation method and device, and an electronic device. The method comprises: obtaining viewpoint information of a three-dimensional scene, calculating a view cone according to the viewpoint information, and obtaining a target three-dimensional model within the visible range according to the view cone; calculating the vector contour surface of each monomer based on the target three-dimensional model, and establishing a monomer three-dimensional bounding box according to the height information of each monomer; and projecting the monomer three-dimensional bounding box onto a two-dimensional screen. In this way, automated extraction is achieved by defining quantitative rules for the singulated geometric features, and the efficiency of logic singulation is optimized: feature information of the entity objects in the three-dimensional scene is extracted, the geometric topology is calculated, the result is projected onto a two-dimensional screen for real-time display, and the singulated geometry of the model is drawn on the screen, achieving a balance between accuracy and efficiency.

Description

Live-action three-dimensional logic singulation method and device and electronic equipment
Technical Field
The present invention relates generally to the field of computer vision and image processing, and more particularly, to a live-action three-dimensional logical singulation method, apparatus, and electronic device.
Background
With the acceleration of urbanization, the demand for basic three-dimensional spatial information is growing rapidly, and live-action three-dimensional modeling is widely applied in various fields owing to its realism, intuitiveness, and high precision. In the construction of a live-action three-dimensional model, singulation is a very important step: only when each entity is correctly represented as an independent individual can the accuracy and reliability of the model be ensured, so that the model can play an important role in practical applications.
The singulation of a live-action three-dimensional model is divided into logical singulation and real singulation.
Logical singulation: the live-action three-dimensional model is divided logically, and the functional units or logical parts within it are extracted independently to form independent logical monomer models.
Real singulation: the live-action three-dimensional model is divided physically, and each object or building within it is extracted independently to form independent real monomer models.
To analyze the spatial relationships, geometric features, and interactions of the different monomer parts in a scene, and to make corresponding applications and decisions, logical singulation is generally employed. However, logical singulation is usually performed by geometric partitioning, which is not accurate enough; objects are spatially complex, and especially at the component-level live-action three-dimensional model scale, data are updated frequently, so the geometric modeling and manual interaction required by logical singulation are costly and time-consuming.
Disclosure of Invention
According to embodiments of the invention, a live-action three-dimensional logic singulation scheme is provided. In this scheme, automated extraction is achieved by defining quantitative rules for the singulated geometric features, and the efficiency of logical singulation is optimized: feature information of the entity objects in the three-dimensional scene is extracted, the geometric topology is calculated, the result is projected onto a two-dimensional screen for real-time display, and the singulated geometry of the model is drawn on the screen, achieving a balance between accuracy and efficiency.
In a first aspect of the invention, a realistic three-dimensional logical singulation method is provided. The method comprises the following steps:
obtaining viewpoint information of a three-dimensional scene, calculating a view cone according to the viewpoint information, and obtaining a target three-dimensional model in a visual range according to the view cone;
calculating vector outline surfaces of all monomers based on the target three-dimensional model, and establishing a monomer three-dimensional bounding box according to the height information of all monomers;
And projecting the monomer three-dimensional bounding box to a two-dimensional screen, so that the space positions of all the monomers are dynamically displayed on the basis of three-dimensional scene rendering, and the subsequent semantic analysis is supported.
Further, the obtaining viewpoint information of the three-dimensional scene, calculating a view cone according to the viewpoint information, and obtaining a three-dimensional model in a visual range according to the view cone, includes:
acquiring a coordinate range of a view cone according to the position and the direction of the view point;
And carrying out intersection operation according to the coordinate range of the view cone and the coordinate range of the model, judging whether the model is in the view cone, and taking the model in the view cone as a target three-dimensional model.
Further, the acquiring the coordinate range of the view cone according to the position, viewing angle, and screen range of the viewpoint includes:
converting a standardized equipment coordinate system where the viewpoint is located into a camera local coordinate system;
Converting the local coordinate system of the camera into a world coordinate system;
And converting the viewpoint coordinates in the world coordinate system into longitude and latitude coordinates, and taking the longitude and latitude coordinates as the coordinate range of the view cone.
Further, the calculating the vector contour surface of each monomer based on the target three-dimensional model includes:
generating DSM data of the target three-dimensional model, and extracting edge contour information of each monomer by using the DSM data;
And carrying out fitting treatment on the edge contour of each monomer by combining the edge contour information to obtain a vector contour surface of each monomer.
Further, the generating DSM data for the target three-dimensional model includes:
constructing an empty grid and a DSM image according to the planar area range of the target three-dimensional model and the preset spatial resolution;
projecting all triangular patch vertices in the target three-dimensional model into a regular grid from a top-down view directly above;
calculating DSM image pixel values of corresponding positions according to the vertexes projected in the grids;
Performing interpolation calculation on the DSM image to obtain an interpolated DSM image;
And combining the plane area range coordinates of the target three-dimensional model according to the spatial resolution of the DSM image after interpolation to obtain the DSM image with the spatial geographic coordinates.
Further, the fitting processing is performed on the edge profile of each monomer by combining the edge profile information to obtain a vector profile surface of each monomer, including:
carrying out geometric shape regularization treatment on the single edge profile according to the geometric shape and the edge profile characteristics of the single body;
Carrying out geometric corner searching on the edge contour subjected to geometric shape regularization treatment to obtain corner information of geometric figures;
screening characteristic corner points of the corner point information of the geometric figure to obtain a characteristic corner point set;
And performing graph approximation fitting on the characteristic angle point set according to the geometrical shape and the edge contour characteristics of the monomer to obtain a vector contour surface of the monomer.
Further, the building a monomer stereoscopic bounding box according to the height information of each monomer includes:
extracting height information from corresponding DSM data based on the vector profile surface of the monomer to obtain the elevation of each corner point on the vector profile;
and obtaining the top surface elevation by using a least square method according to the principle that the top surface elevations are the same, and generating the monomer three-dimensional bounding box.
Further, the projecting the monomer stereoscopic bounding box onto a two-dimensional screen includes:
converting the monomeric stereoscopic bounding box from a local coordinate system to a world coordinate system;
converting the monomer solid bounding box under the world coordinate system into relative coordinates under the camera space;
creating a view volume, and performing projective transformation on relative coordinates in the camera space through orthographic projective transformation and/or perspective projective transformation;
and displaying the object projected in the view body after projection transformation on a two-dimensional view port plane.
In a second aspect of the invention, a realistic three-dimensional logical singulation apparatus is provided. The device comprises:
the acquisition module is used for acquiring viewpoint information of the three-dimensional scene, calculating a view cone according to the viewpoint information and acquiring a target three-dimensional model in a visual range according to the view cone;
The building module is used for calculating the vector outline surface of each monomer based on the target three-dimensional model and building a monomer three-dimensional bounding box according to the height information of each monomer;
and the projection module is used for projecting the monomer stereoscopic bounding box to a two-dimensional screen.
In a third aspect of the invention, an electronic device is provided. The electronic device includes at least one processor and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect of the invention.
It should be understood that the description in this summary is not intended to limit the critical or essential features of the embodiments of the invention, nor is it intended to limit the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
The above and other features, advantages and aspects of embodiments of the present invention will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. In the drawings, wherein like or similar reference numerals denote like or similar elements, in which:
FIG. 1 shows a flow chart of a realistic three-dimensional logical singulation method in accordance with an embodiment of the present invention;
FIG. 2 shows a flow chart for acquiring a three-dimensional model according to an embodiment of the invention;
FIG. 3 illustrates a flow chart for acquiring a coordinate range of a view cone according to an embodiment of the invention;
FIG. 4 shows a flow chart of the calculation of vector profile facets for individual monomers according to an embodiment of the present invention;
FIG. 5 shows a flow chart of generating DSM data according to an embodiment of the present invention;
FIG. 6 shows a flow chart of an edge profile refinement process for each cell in accordance with an embodiment of the invention;
FIG. 7 shows a flow chart for creating a singulated stereoscopic bounding box in accordance with an embodiment of the invention;
FIG. 8 illustrates a flowchart of projecting a singulated stereoscopic bounding box onto a two dimensional screen in accordance with an embodiment of the invention;
FIG. 9 shows a schematic diagram of a building edge profile extraction process according to an embodiment of the invention;
FIG. 10 shows a comparison schematic before and after a geometric regularization process according to an embodiment of the invention;
FIG. 11 shows a schematic diagram of the FAST corner detection algorithm according to an embodiment of the invention;
FIG. 12 shows a schematic representation of the edge profile of a building with redundant corner points according to an embodiment of the invention;
FIG. 13 shows a schematic diagram of a redundant corner culling before and after comparison according to an embodiment of the present invention;
FIG. 14 shows a vector profile schematic of a fitted monomer according to an embodiment of the invention;
FIG. 15 shows a corner sort coordinate schematic according to an embodiment of the present invention;
FIG. 16 shows a monolithic stereogram effect diagram according to an embodiment of the present invention;
FIG. 17 shows a block diagram of a live three-dimensional logical monomer apparatus based on three-dimensional scene and two-dimensional screen linkage, in accordance with an embodiment of the present invention;
FIG. 18 illustrates a block diagram of an exemplary electronic device capable of implementing embodiments of the invention.
1800 is an electronic device, 1801 is a computing unit, 1802 is a ROM, 1803 is a RAM, 1804 is a bus, 1805 is an I/O interface, 1806 is an input unit, 1807 is an output unit, 1808 is a storage unit, and 1809 is a communication unit.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In addition, the term "and/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist together, or B exists alone. The character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship.
Among live-action three-dimensional models, represented by the oblique photography model, multi-view ground images are rapidly acquired through oblique photogrammetry to produce an oblique model reflecting real ground conditions. This approach has low modeling cost, high production efficiency, and large model scale, and meets the demand for rapid, large-scale production of three-dimensional models. However, owing to the automatic modeling mechanism, the produced model is a single integrated body, essentially a mesh model. Such an integrated skin model cannot distinguish geographic elements such as buildings, roads, and vegetation, and cannot support individual operations such as selecting, attribute assignment, and querying on those elements; that is, model "singulation" cannot be achieved.
A singulated three-dimensional model is one in which each entity object can be individually selected and managed in an organized way: when an entity object is clicked with the mouse, it can be displayed in a different color (known as highlighting), additional attributes can be attached to it, and query statistics can be performed on it, realizing basic functions and common operations such as individual three-dimensional display, query, and analysis.
Current three-dimensional model singulation methods can be broadly divided into two categories, depending on whether actual separation is achieved: real (physical) singulation and logical (vector) singulation.
Real singulation physically cuts the model according to the literal definition of singulation, enabling a single target model to be separated from the overall model. Real singulation can be divided into model-reconstruction singulation and cutting singulation: model-reconstruction singulation semi-automatically reconstructs the target model (the model to be singulated) on the basis of the modeling result, while cutting singulation physically divides the target model after the three-dimensional model is completed.
Model-reconstruction singulation imports data such as the aerial triangulation result and textures produced by automatic modeling into third-party software, extracts the boundary of the target model, maps textures automatically, and rebuilds the target model with manual intervention, thereby separating the target model from the overall model and realizing three-dimensional model singulation.
Cutting singulation physically cuts the three-dimensional model using the two-dimensional vector surface data of the target model, that is, it divides the continuous triangular patches to achieve singulation. This method can separate the target model from the overall three-dimensional model, and the separated target model supports operations such as model browsing, attribute assignment, and attribute query.
Logical singulation is not real singulation; instead, the target model in the three-dimensional model is overlaid with a vector surface, or the target model and the vector surface store the same ID value, to achieve the effect of "singulation". Overlaying a vector surface on the target model achieves dynamic singulation; storing the same ID value in the target model and the vector surface achieves ID singulation.
ID singulation fuses the target model with its corresponding two-dimensional vector surface and assigns the ID value stored in the two-dimensional vector surface as an attribute to each vertex in the triangular patches of the corresponding target model; that is, the same target model stores the same ID value. When a target model is selected, it is highlighted, realizing the singulation effect.
Dynamic singulation superimposes a vector surface on the target model during model rendering, and uses the vector surface corresponding to the target model to provide the user with the expression and operation of "singulation". The method is similar to superimposing a semi-transparent film over the target model so that it can be selected, and it is called dynamic singulation because it is presented dynamically at rendering time. Because the two-dimensional vector surface is dynamically superimposed on the oblique model when it is rendered, the two-dimensional vector surface and the oblique model can be managed separately, the oblique model data need not be preprocessed, and the two-dimensional vector surface on the target model can be updated in real time.
The purpose of singulation is to give the target model in the three-dimensional model functions such as selection, attribute query, and spatial analysis. Although physical singulation is genuine, it cuts or reconstructs the target model in order to manage and operate on it individually; the cut or reconstructed target model lacks a topological relationship with the overall oblique model, which is not conducive to spatial query and analysis of the target model, and the approach focuses on delivering individual products rather than extending the application of the three-dimensional model. ID singulation and dynamic singulation manage the three-dimensional model with vector data, and the target model on the oblique model obtains the singulation effect through the vector data. This does not damage the structural integrity of the oblique model, and since the attribute-query, spatial-query, and analysis capabilities of GIS on two-dimensional surfaces are quite mature, singulation with vector data fully meets the application requirements of oblique models in GIS.
Compared with ID singulation, dynamic singulation separates the two-dimensional vector data from the target model, avoids redundancy in the oblique model data, and allows the two-dimensional vector data to be updated in a timely manner, so its advantages are obvious.
The logical singulation method herein adopts dynamic singulation and optimizes the dynamic singulation process: the vector surface of the object to be singulated is generated in real time through automatic extraction and analysis, without manual interaction. The dynamic singulation mode is as follows: based on the three-dimensional models in the scene, their vector contour surfaces are automatically acquired, transparent bounding boxes are established according to their height information, and dynamic singulation is performed.
FIG. 1 shows a flow chart of a realistic three-dimensional logical singulation method in accordance with an embodiment of the present invention.
The method comprises the following steps:
S101, obtaining viewpoint information of a three-dimensional scene, calculating a view cone according to the viewpoint information, and obtaining a target three-dimensional model in a visual range according to the view cone.
As shown in fig. 2, in this embodiment, the obtaining viewpoint information of a three-dimensional scene, calculating a view cone according to the viewpoint information, and obtaining a three-dimensional model in a visual range according to the view cone includes:
S201, acquiring a coordinate range of a view cone according to the position, the view angle and the screen range of the view point.
Specifically, as shown in fig. 3, in this embodiment, the obtaining the coordinate range of the view cone according to the position, the viewing angle, and the screen range of the viewpoint includes:
s301, converting a standardized device coordinate system where the viewpoint is located into a camera local coordinate system.
In some embodiments, the 8 vertices of the normalized device coordinate cube, for example the cube from (-1, -1, -1) to (1, 1, 1), are converted back to the camera local coordinate system using the inverse of the projection matrix. The local coordinate system is the coordinate system in which each individual model resides.
S302, converting the local coordinate system of the camera into a world coordinate system.
Specifically, eight vertices of the camera local coordinate system in the above embodiment are transformed into the world coordinate system, using the inverse of the view matrix. That is, coordinates in the world coordinate system are calculated reversely from the two-dimensional screen range, the viewpoint, and the like, in order to judge the current viewpoint and the data that can be seen in the screen range. World coordinates are the world coordinate system range of the screen (viewport).
S303, converting viewpoint coordinates in a world coordinate system into longitude and latitude coordinates, and taking the longitude and latitude coordinates as a coordinate range of a view cone.
Specifically, the coordinates of the world coordinate system are converted into latitude and longitude coordinates. This requires a geographical coordinate transformation. The specific conversion mode depends on the map projection mode used, and common map projection modes include longitude and latitude coordinates, mercator projection, UTM projection and the like.
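As an illustration of steps S301-S303, the following C++ sketch unprojects the eight NDC cube corners through the inverse projection and view matrices and converts the resulting world coordinates to longitude and latitude. The use of the GLM mathematics library and the spherical longitude/latitude conversion are assumptions of this example; a production system would apply the conversion matching its actual map projection.

```cpp
#include <glm/glm.hpp>
#include <cmath>
#include <vector>

// Unproject the 8 corners of the NDC cube (-1,-1,-1)..(1,1,1) back to world
// space, then convert each to longitude/latitude in degrees.
std::vector<glm::dvec2> frustumCornersLonLat(const glm::dmat4& proj, const glm::dmat4& view) {
    const glm::dmat4 invProj = glm::inverse(proj);  // NDC -> camera local (S301)
    const glm::dmat4 invView = glm::inverse(view);  // camera local -> world (S302)
    std::vector<glm::dvec2> lonLat;
    for (int x = -1; x <= 1; x += 2)
        for (int y = -1; y <= 1; y += 2)
            for (int z = -1; z <= 1; z += 2) {
                glm::dvec4 cam = invProj * glm::dvec4(x, y, z, 1.0);
                cam /= cam.w;                       // perspective divide
                const glm::dvec4 world = invView * cam;
                // S303: world -> lon/lat, spherical approximation (assumed)
                const double lon = std::atan2(world.y, world.x) * 180.0 / M_PI;
                const double hyp = std::sqrt(world.x * world.x + world.y * world.y);
                const double lat = std::atan2(world.z, hyp) * 180.0 / M_PI;
                lonLat.push_back(glm::dvec2(lon, lat));
            }
    return lonLat;  // the bounding rectangle of these points is the cone's range
}
```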
S202, according to the intersection operation of the coordinate range of the view cone and the coordinate range of the model, judging whether the model is in the view cone, and taking the model in the view cone as a target three-dimensional model.
Specifically, if a model exists in the view cone, taking the model as a target three-dimensional model, and acquiring model information of the model; if no model exists in the view cone, the judging process is exited.
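A minimal sketch of the S202 intersection test, assuming both the view cone range and each model's range have been reduced to axis-aligned longitude/latitude rectangles (the struct and function names are illustrative):

```cpp
struct LonLatRange { double minLon, minLat, maxLon, maxLat; };

// True when the model's range overlaps the view cone's range; overlapping
// models become target three-dimensional models, others are skipped.
bool intersects(const LonLatRange& frustum, const LonLatRange& model) {
    return model.minLon <= frustum.maxLon && model.maxLon >= frustum.minLon &&
           model.minLat <= frustum.maxLat && model.maxLat >= frustum.minLat;
}
```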
S102, calculating vector contour surfaces of all the monomers based on the target three-dimensional model, and building a monomer three-dimensional bounding box according to the height information of all the monomers.
As shown in fig. 4, in this embodiment, the calculating the vector profile of each monomer based on the target three-dimensional model includes:
s401, generating DSM data of the target three-dimensional model, and extracting edge contour information of each monomer by using the DSM data.
A DSM (Digital Surface Model) is a ground elevation model that includes the heights of buildings, bridges, trees, and other surface features, and most faithfully expresses the actual ground relief.
In a live-action three-dimensional model, a building monomer is generally taller than the surrounding ground features; this larger elevation value is its most distinctive characteristic, and buildings therefore show clear edges against other ground features in a DSM image. Most buildings are rectangular and symmetrically distributed, with contour lines that are mainly straight, so they are highly recognizable in a DSM image. By combining these elevation and geometric differences between buildings and surrounding ground features, the edge contour information of a building can be extracted from the DSM data corresponding to the three-dimensional model.
As shown in fig. 5, in this embodiment, the generating DSM data of the target three-dimensional model includes:
S501, constructing an empty grid and a DSM image according to the planar area range of the target three-dimensional model and the preset spatial resolution.
Specifically, an empty grid is constructed according to the planar area range of the input three-dimensional model and the preset spatial resolution. And simultaneously generating a single-band 8-bit image with the same width and height as those of the grid.
In some embodiments, to ensure accuracy, the grid resolution is not less than 0.30 meters.
S502, projecting all triangular patch vertices in the target three-dimensional model into the regular grid from a top-down view directly above.
In some embodiments, the vertices of the triangular patches that make up the three-dimensional model are discrete and unevenly distributed when projected into the planar grid, so each grid cell may contain zero, one, or multiple vertices.
S503, calculating the DSM image pixel value of the corresponding position according to the projected vertex in the grid.
Specifically, each cell in the grid is traversed in turn. If a cell contains one vertex, its elevation value is mapped to 0-255, rounded, and assigned as the pixel value at the corresponding image position; if a cell contains multiple vertices, their elevations are mapped to 0-255 in turn, the mapped values are averaged, and the rounded average is assigned as the pixel value at the corresponding image position; if a cell contains no vertex, the corresponding image pixel value is temporarily set to 0.
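The following sketch illustrates S501-S503 under stated assumptions: vertices have already been projected to integer grid cells, and elevations are mapped linearly onto 0-255; the names and the mapping convention are illustrative.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

struct GridVertex { int row, col; double z; };  // vertex already projected to a cell

std::vector<uint8_t> rasterizeDsm(const std::vector<GridVertex>& verts,
                                  int rows, int cols, double zMin, double zMax) {
    std::vector<double> sum(static_cast<size_t>(rows) * cols, 0.0);
    std::vector<int> count(static_cast<size_t>(rows) * cols, 0);
    for (const GridVertex& v : verts) {           // accumulate elevations per cell
        const size_t idx = static_cast<size_t>(v.row) * cols + v.col;
        sum[idx] += (v.z - zMin) / (zMax - zMin) * 255.0;  // map elevation to 0-255
        ++count[idx];
    }
    std::vector<uint8_t> dsm(static_cast<size_t>(rows) * cols, 0);  // empty cells stay 0
    for (size_t i = 0; i < dsm.size(); ++i)
        if (count[i] > 0)                         // average, round, assign
            dsm[i] = static_cast<uint8_t>(std::lround(sum[i] / count[i]));
    return dsm;
}
```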
S504, performing interpolation calculation on the DSM image to obtain the DSM image after interpolation.
Interpolating the DSM image effectively resolves the zero-valued pixels caused by the uneven distribution of vertices; at the same time, interpolation improves the spatial resolution and elevation accuracy of the DSM image, providing more reliable basic data for subsequent processing.
In some embodiments, the DSM image may be interpolated using a cubic convolution interpolation algorithm. The algorithm takes the 16 neighboring pixels around the point (x, y) being computed, and the value at position (x, y) in the final interpolated image is the weighted convolution sum of those 16 pixels. Interpolation proceeds direction by direction: for example, interpolating along the x direction within each of the four rows yields values at (x, y-1), (x, y), (x, y+1), and (x, y+2), and these four results are then interpolated along the y direction to obtain the pixel value at (x, y). Because pixels with no projected vertex were assigned 0 in the DSM image formed in the previous step, any zero-valued neighbors of the current interpolation point would otherwise participate in the calculation and degrade the interpolation accuracy. The interpolation algorithm is therefore improved: before interpolating each pixel, zero-valued pixels are removed so that they do not participate, and only valid elevation pixels are interpolated, so the elevation information is not disturbed by zero-valued pixels.
Compared with nearest-neighbor interpolation and bilinear interpolation, cubic convolution interpolation has higher accuracy; its main idea is to increase the number of neighboring pixels participating in the interpolation so as to improve precision.
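A compact sketch of the improved interpolation described above, using the standard cubic convolution (Keys) kernel; the kernel parameter a = -0.5 and the renormalization of weights over valid pixels are assumptions of this example.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Keys cubic convolution kernel with a = -0.5.
static double cubicKernel(double d) {
    const double a = -0.5;
    d = std::fabs(d);
    if (d < 1.0) return (a + 2.0) * d * d * d - (a + 3.0) * d * d + 1.0;
    if (d < 2.0) return a * (d * d * d - 5.0 * d * d + 8.0 * d - 4.0);
    return 0.0;
}

// Interpolate at fractional position (x, y) from the surrounding 4x4 pixels,
// skipping zero-valued hole pixels and renormalizing the remaining weights.
double interpolateDsm(const std::vector<uint8_t>& img, int w, int h, double x, double y) {
    const int x0 = static_cast<int>(std::floor(x));
    const int y0 = static_cast<int>(std::floor(y));
    double num = 0.0, den = 0.0;
    for (int j = -1; j <= 2; ++j) {
        for (int i = -1; i <= 2; ++i) {
            const int px = x0 + i, py = y0 + j;
            if (px < 0 || px >= w || py < 0 || py >= h) continue;
            const uint8_t v = img[py * w + px];
            if (v == 0) continue;                 // exclude hole pixels
            const double wgt = cubicKernel(x - px) * cubicKernel(y - py);
            num += wgt * v;
            den += wgt;
        }
    }
    return den != 0.0 ? num / den : 0.0;
}
```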
S505, according to the spatial resolution of the interpolated DSM image and the spatial coordinate system of the target three-dimensional model, a spatial coordinate system is assigned to the interpolated DSM image, yielding a DSM image with spatial geographic coordinates.
In this embodiment, the extracting edge profile information of each monomer using the DSM data includes:
In DSM image data, building monomers typically have a higher elevation than their surrounding terrain, which is the most prominent feature of building monomers; this feature can be used to effect extraction of building edge profile information.
Among image edge extraction algorithms, the Canny algorithm has the following advantages: ① a low false-detection rate, detecting only edges that truly exist without misclassifying non-edge pixels; ② high positioning accuracy, locating edges precisely on the pixels where the gray value changes most, with a small distance error between the detection result and the true edge in the image; ③ no repeated detection, returning exactly one result per edge; ④ strong noise resistance, effectively suppressing noise through double-threshold processing and non-maximum suppression.
The Canny algorithm mainly comprises five steps:
(1) Image Gaussian filtering. The image is first Gaussian filtered. Noise is concentrated in the high-frequency signal, and edge information also belongs to the high-frequency content of the image, so noise is easily misidentified as pseudo edges; applying a Gaussian filter to remove the noise reduces the detection of pseudo edges.
(2) Calculate the gradient magnitude and gradient direction. A gray image contains brightness changes, and where the intensity change is relatively severe (the pixel gradient is large), an edge is formed. The Sobel operator is used to calculate the magnitude and direction of the pixel gradient. The Sobel operator consists of two 3×3 matrices, Sx and Sy; the former calculates the x-direction pixel gradient matrix Gx, the latter the y-direction pixel gradient matrix Gy:

Sx = [-1 0 +1; -2 0 +2; -1 0 +1],  Sy = [-1 -2 -1; 0 0 0; +1 +2 +1]

Gx = Sx * A,  Gy = Sy * A

where A is the grayscale image matrix and * denotes the convolution operation.
The gradient magnitude matrix Gxy can be calculated as Gxy = sqrt(Gx² + Gy²), and the gradient direction θ as θ = atan2(Gy, Gx), where atan2 is the two-argument arctangent function.
(3) And performing non-maximum suppression on the gradient amplitude image. Noise interference is eliminated by suppressing non-maximum values, pixel points which are not edges are eliminated, and only real edge pixel points are reserved, so that the accuracy and the robustness of edge detection are improved. The specific algorithm is that the gradient intensity of the current pixel is compared with two pixels along the positive and negative gradient directions, if the gradient amplitude of the current pixel point is not a local maximum value, the gradient amplitude is set to 0, namely the pixel point is restrained; if the gradient amplitude of the current pixel point is a local maximum value, the gradient amplitude is reserved.
(4) Double-threshold processing. After non-maximum suppression, some pixels still have nonzero values but are not edges. A double-threshold screening process with a high threshold and a low threshold distinguishes the edge pixels: if an edge pixel's gradient value is greater than the high threshold, it is considered an edge and marked as a strong edge point; if its gradient value is between the low and high thresholds, it is considered a possible edge and marked as a weak edge point; points below the low threshold are considered definitely not edges and are suppressed.
(5) Edge connection. After the strong and weak edges are obtained, the pixels among the weak edges that are likely true edges are screened out and connected to the strong edges. It is generally held that weak edge points caused by real edges are connected to strong edge points, whereas weak edge points caused by noise or color changes are not. Following this principle, edge connectivity is analyzed: all strong edge pixels are traversed, the surroundings of each are checked for weak edge pixels, and any found are reclassified as strong edges.
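Since OpenCV's Canny implementation performs steps (2) through (5) internally, the extraction step can be sketched as follows; the Gaussian kernel size and the two thresholds are assumptions to be tuned for the DSM data.

```cpp
#include <opencv2/imgproc.hpp>

cv::Mat extractEdges(const cv::Mat& dsmImage) {
    cv::Mat blurred, edges;
    cv::GaussianBlur(dsmImage, blurred, cv::Size(5, 5), 1.4);  // step (1): denoise
    cv::Canny(blurred, edges, 50, 150);  // steps (2)-(5): low and high thresholds
    return edges;                        // binary edge map of the building contours
}
```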
The process of extracting the edge contour of the building is shown in fig. 9, and fig. 9 (a) is a top view of the original model; fig. 9 (b) is a view showing the DSM data corresponding to the model, and in fig. 9 (b), the portion with higher brightness is the area with higher elevation, so that the edge profile information of the building unit can be clearly judged by visual observation; fig. 9 (c) is a schematic diagram of the Canny algorithm edge extraction result, in which the edge profile of the building monomer has been extracted, but the edge profile of other non-building monomers are also included, so that the edge profile of the non-building monomer needs to be removed by combining the geometric features of the building monomer edge. The resulting building edge profile is shown in fig. 9 (d).
The building edge contour obtained from image edge extraction may deviate from the actual building contour, particularly when the building is occluded by surrounding trees or the three-dimensional model contains holes. Therefore, combining the characteristics of building edge contours, the edge contour must be further refined to obtain more accurate building contour information.
S402, fitting the edge contour of each monomer by combining the edge contour information to obtain a vector contour surface of each monomer.
As shown in fig. 6, in this embodiment, the fitting process is performed on the edge profile of each monomer by combining the edge profile information to obtain a vector profile surface of each monomer, which specifically includes the following refinement process:
S601, carrying out geometric shape regularization treatment on the single edge profile according to the geometric shape and the edge profile characteristics of the single body.
In some embodiments, the regularization of the building edge profile is performed based on the building geometry, which is mainly rectangular, diamond, H-shaped, L-shaped, T-shaped, U-shaped, circular, combined, etc., concave-convex polygonal shapes, which are relatively regular geometric primitives or combinations of regular basic primitives.
Fig. 10 shows a comparison diagram before and after the geometric regularization process, wherein fig. 10 (a) is a diagram before the geometric regularization process; fig. 10 (b) is a schematic diagram after geometric regularization treatment. It can be seen that the geometric regularization process can combine building geometry and edge profile features to remove larger protrusions or depressions in the edge profile pattern, giving it a simple regularity in geometry.
S602, carrying out geometric corner searching on the edge contour subjected to geometric shape regularization treatment to obtain corner information of the geometric figure.
The edge profile after geometric regularization still has some small protrusions, and has slight deviations from the actual profile. The corner points of the geometric figures are connected end to end according to a certain sequence, so that the smaller protrusions or recesses can be removed, and the geometric shapes of the edge contours are not changed.
In this embodiment, the FAST corner detection algorithm is used to detect and extract corners from the image after the building edge contour has been filled. The FAST (Features from Accelerated Segment Test) corner detection algorithm detects corners based on the characteristics of the neighborhood around each pixel; its greatest advantage is computational efficiency, being faster than other corner detection algorithms and thus well suited to real-time image processing.
Fig. 11 is a schematic diagram of the FAST corner detection algorithm. The algorithm examines the 16 pixels on a circular window in the neighborhood of a pixel p: if, among the 16 pixels on the surrounding ring centered at p, there exist n consecutive pixels whose gray values are all greater than or all less than the gray value of p, then p is considered a corner point, where n is generally 12.
The basic flow of the FAST corner detection algorithm is as follows:
and traversing all pixel points p in the image, and judging whether the pixel points p are corner points or not. Where l is the gray value of pixel p.
A circle of radius r is drawn, covering M pixels around the point p; typically r = 3, giving M = 16.
A threshold t is set; if, among the M pixels, the gray values of n consecutive pixels are all higher than l + t or all lower than l - t, the pixel p is considered a corner point, where n generally takes the value 12.
Improving the detection speed: the basic flow tests every pixel in the image, yet most pixels are not corners, so a method for quickly rejecting non-corners is applied first. The four points at 90-degree intervals around the candidate point are examined, i.e., the points numbered 1, 9, 5, and 13 in fig. 11 (points 1 and 9 are tested first; points 5 and 13 are tested only if those meet the threshold requirement). If p is a corner, at least 3 of these four points must meet the threshold condition; otherwise p is rejected directly. Only the remaining candidates are tested fully (whether n (12) or more consecutive points meet the threshold requirement).
FAST corner detection is implemented using the cv::FastFeatureDetector class of OpenCV (Open Source Computer Vision Library), an open-source computer vision and machine learning software library. Its parameters include: the gray-difference threshold between the center pixel and surrounding pixels; whether non-maximum suppression is enabled; and the neighborhood type selection.
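A short sketch of the detection call; the threshold value is an assumption, while the other two parameters correspond to the non-maximum-suppression switch and the neighborhood type mentioned above.

```cpp
#include <opencv2/features2d.hpp>
#include <vector>

std::vector<cv::KeyPoint> detectCorners(const cv::Mat& filledContourImage) {
    cv::Ptr<cv::FastFeatureDetector> fast = cv::FastFeatureDetector::create(
        30,                                    // gray-difference threshold t (assumed)
        true,                                  // enable non-maximum suppression
        cv::FastFeatureDetector::TYPE_9_16);   // 16-pixel ring neighborhood
    std::vector<cv::KeyPoint> corners;
    fast->detect(filledContourImage, corners);
    return corners;
}
```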
S603, screening characteristic corner points of the corner information of the geometric figure to obtain a characteristic corner point set.
Geometric corner information can be obtained by searching the contour image for geometric corners, but because the input building edge contour usually contains small jagged segments, redundant corners are detected, as shown in fig. 12.
In order to remove redundant corner points, the feature corner point screening processing steps comprise: (1) corner ordering; and (2) redundant corner point elimination.
(1) Corner sorting.
The feature corner set obtained in the previous step is unordered, so the corners must be sorted to form a complete, closed polygon. For a single building polygon, the center of gravity lies inside the building; the polygon point set can therefore be ordered by calculating the polygon's center of gravity (also referred to as the centroid or geometric center) and then determining the order of the vertices from each vertex's position relative to the center of gravity and the direction of the corresponding vector.
Define the polygon point set as C = {P1, P2, …, Pn}, where the points are in arbitrary order. Taking counterclockwise ordering as an example, the ordering rule is: if point P1 lies in the counterclockwise direction from point P2, then point P1 is greater than point P2. The algorithm steps are as follows:
① The center of gravity O of the point set is calculated from the known point coordinates.
② Calculate the order relation between points: from the center of gravity O, take a unit vector OX parallel to the X axis; the unit vector from O toward point Pi is OPi. The included angle θi between OPi and OX is then calculated for each point in turn, and the angles are compared to determine the order of the points: the larger the included angle θi between OPi and OX, the smaller the point Pi. As shown in fig. 15, the order of the corner points is determined by these included angles.
According to the algorithm, the corner point set in fig. 15 is sorted as follows: C = {P2, P6, P8, P1, P4, P5, P3, P7}.
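A sketch of this ordering in C++ (the Pt struct and the use of atan2 as the angle comparator are illustrative; sorting by ascending polar angle about O yields a counterclockwise traversal):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Pt { double x, y; };

void sortCounterClockwise(std::vector<Pt>& pts) {
    Pt o{0.0, 0.0};                                   // center of gravity O
    for (const Pt& p : pts) { o.x += p.x; o.y += p.y; }
    o.x /= pts.size();
    o.y /= pts.size();
    std::sort(pts.begin(), pts.end(), [&o](const Pt& a, const Pt& b) {
        // Compare the polar angles of OPa and OPb about O; ascending order
        // traverses the polygon counterclockwise.
        return std::atan2(a.y - o.y, a.x - o.x) < std::atan2(b.y - o.y, b.x - o.x);
    });
}
```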
(2) Redundant corner elimination.
The sorted feature corner set is traversed in order. The corner currently visited is taken as the current corner, the two corners immediately before and after it in the sequence are taken out, and the included angle θ formed by connecting the current corner to the previous and next corners is calculated. If the included angle θ is close to a right angle, the corner is judged to be a feature corner that fixes the geometric shape of the building outline and is kept; otherwise it is judged to be a redundant corner and eliminated. This continues until all corners have been processed, yielding the final corner set. In the concrete algorithm, the expression |θ - 90°| < 15° determines whether the angle is close to a right angle. Redundant corner elimination is shown in fig. 13: fig. 13 (a) is a schematic diagram before elimination, where the corners inside the circles are redundant; fig. 13 (b) is a schematic diagram after elimination.
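A sketch of this elimination step, reusing the Pt struct from the sorting sketch above; the angle normalization details are an assumption.

```cpp
#include <cmath>
#include <vector>

// Keep only corners whose included angle is within 15 degrees of a right angle.
std::vector<Pt> removeRedundantCorners(const std::vector<Pt>& sorted) {
    std::vector<Pt> kept;
    const int n = static_cast<int>(sorted.size());
    for (int i = 0; i < n; ++i) {
        const Pt& prev = sorted[(i + n - 1) % n];
        const Pt& cur  = sorted[i];
        const Pt& next = sorted[(i + 1) % n];
        const double a1 = std::atan2(prev.y - cur.y, prev.x - cur.x);
        const double a2 = std::atan2(next.y - cur.y, next.x - cur.x);
        double theta = std::fabs(a1 - a2) * 180.0 / M_PI;
        if (theta > 180.0) theta = 360.0 - theta;      // included angle in [0, 180]
        if (std::fabs(theta - 90.0) < 15.0) kept.push_back(cur);  // feature corner
    }
    return kept;
}
```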
S604, performing graph approximation fitting on the characteristic angle point set according to the geometry of the monomer and the edge contour characteristics to obtain a vector contour surface of the monomer.
The edge contour fitting process takes the obtained corner set and, combining the building's geometric shape and edge contour characteristics, performs polygonal or combined-polygon approximation fitting on it. Following the principle that long straight lines are more reliable than short ones, and taking into account that the contour segments of the polygon tend to be mutually perpendicular, the fitted edge contour is brought close to the building's true edge contour, as shown in fig. 14.
As shown in fig. 7, in this embodiment, the method for creating a monomer stereoscopic bounding box according to the height information of each monomer includes:
S701, extracting height information from the corresponding DSM data based on the monomer's vector contour surface, to obtain the elevation of each corner point on the vector contour.
S702, according to the principle that the top-surface elevations are the same, the top-surface elevation is obtained by the least squares method, and the monomer three-dimensional bounding box is generated from the monomer edge contour and the corresponding height information. The effect of the singulated three-dimensional bounding box is shown in fig. 16.
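Because fitting a single constant elevation to the sampled corner elevations by least squares reduces to taking their mean, the S701-S702 height estimate can be sketched as follows (function names are illustrative):

```cpp
#include <vector>

// Least-squares fit of the constant model z = h to the corner elevations;
// for a constant, the least-squares solution is simply the mean.
double fitTopElevation(const std::vector<double>& cornerElevations) {
    double sum = 0.0;
    for (double z : cornerElevations) sum += z;
    return sum / static_cast<double>(cornerElevations.size());
}
// The bounding box is then extruded from the contour's base elevation up to
// the fitted top-surface elevation.
```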
S103, projecting the monomer stereoscopic bounding box to a two-dimensional screen.
As shown in fig. 8, in this embodiment, the projecting the monomer stereoscopic bounding box onto a two-dimensional screen includes:
S801, model transformation, namely converting the monomer stereoscopic bounding box from a local coordinate system to a world coordinate system.
In this embodiment, the singulated bounding boxes are transformed into the world coordinate system for subsequent presentation on the two-dimensional screen. The world coordinates here are the world coordinate system range of the bounding boxes. The three-dimensional scene is presented on the screen in real time, responding to operations as they occur: as soon as the mouse rotates or zooms the view, that is, whenever the camera viewpoint changes, the scene is re-rendered and displayed on the screen.
As an embodiment of the present invention, rendering the three-dimensional model to a two-dimensional screen may employ the OpenGL graphics library. In the world coordinate system, OpenGL provides three functions for model transformation:
(1) Model translation: glTranslate{fd}(TYPE x, TYPE y, TYPE z). This function moves the drawing coordinate system along the x, y, and z axes (of the world coordinate system) by the specified x, y, z values (i.e., translates the object by the same amounts).
(2) Model rotation: glRotate{fd}(TYPE angle, TYPE x, TYPE y, TYPE z). The first argument, angle, sets the model's rotation angle in degrees; the last three arguments define the axis, from the origin (0, 0, 0) to the point (x, y, z) in the world coordinate system, about which the drawing coordinate system is rotated counterclockwise.
(3) Model scaling: glScale{fd}(TYPE x, TYPE y, TYPE z). This function scales the object along the x, y, and z axes; its three parameters are the scaling factors in the x, y, and z directions, respectively. The default is 1.0, leaving the object unchanged. For example, with a y-axis factor of 2.0 and the others 1.0, a cube becomes a rectangular cuboid.
The world coordinate system is the coordinate system where all models are in the same scene.
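An illustrative combination of the three transformations when placing a singulated bounding box in the scene with the legacy fixed-function API; the concrete translation, rotation, and scale values are assumptions.

```cpp
#include <GL/gl.h>

void placeBoundingBox() {
    glMatrixMode(GL_MODELVIEW);
    glPushMatrix();
    glTranslatef(120.5f, 36.2f, 0.0f);   // move the box to its position in the scene
    glRotatef(30.0f, 0.0f, 0.0f, 1.0f);  // rotate 30 degrees counterclockwise about z
    glScalef(1.0f, 2.0f, 1.0f);          // y-axis factor 2.0: a cube becomes a cuboid
    // ... draw the bounding-box geometry here ...
    glPopMatrix();
}
```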
S802, view transformation, namely converting the monomer stereoscopic bounding box under the world coordinate system into relative coordinates under the camera space.
The view transformation determines the position and orientation of the point of view (i.e., camera) of the object in the scene, allowing the camera to be aimed at the object to be photographed.
Specifically, the view transformation includes:
Translation: the object is moved from the world coordinate system into the camera coordinate system with the viewpoint as the origin. The panning operation causes the viewpoint to become the origin of the camera coordinate system.
And (3) rotation: by the rotation operation, the camera coordinate system is aligned with the world coordinate system. This can be achieved by aligning the observation target with the negative line of sight direction of the camera coordinate system.
Orientation: the upward direction of the camera coordinate system is determined. The upward direction is used to designate the upper side in the camera view angle.
In the above embodiment, openGL provides gluLookAt () function having three variables defining the position of the viewpoint, the reference point of the camera aiming direction, and the upward direction of the camera, respectively. The function is specifically:
void gluLookAt(GLdouble eyex,GLdouble eyey,GLdouble eyez,GLdouble centerx,GLdouble centery,GLdouble upx,GLdouble upy,GLdouble upz)
The function defines a viewpoint matrix and multiplies the current matrix by the matrix. eyex, eyey, eyez define the position of the viewpoint; the centerx, centery and centerz variables specify the location of a reference point, which is typically a point on the central axis of the scene at which the camera is aimed; the upx, upy, upz variable specifies the direction of the upward vector.
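An illustrative call placing the camera above and behind the scene center; the eye, center, and up values are assumptions.

```cpp
#include <GL/glu.h>

void setupView() {
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    gluLookAt(0.0, -50.0, 30.0,   // eye: position of the viewpoint
              0.0,   0.0,  0.0,   // center: reference point the camera aims at
              0.0,   0.0,  1.0);  // up: upward direction of the camera
}
```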
S803, projective transformation, namely creating a view volume, and projectively transforming relative coordinates under the camera space through orthographic projection transformation and/or perspective projection transformation.
In this embodiment, in order to enable the displayed object to be displayed in a proper position, size and orientation, the dimension must be reduced by projection. The purpose of projective transformation is to define a view volume so that the redundant parts outside the view volume are cut off and only the relevant parts inside the view volume enter the image finally.
Projection includes perspective projection and orthographic projection. Orthographic projection is also known as parallel projection. Its view volume is a rectangular parallelepiped, i.e., a cuboid. The most notable feature of orthographic projection is that the projected size of an object is unchanged no matter how far the object is from the camera.
In some embodiments, OpenGL provides two orthographic projection functions: glOrtho() and gluOrtho2D().
(1) Creating a parallel view volume, glOrtho: void glOrtho(GLdouble left, GLdouble right, GLdouble bottom, GLdouble top, GLdouble near, GLdouble far) creates a parallel view volume. It builds an orthographic projection matrix and multiplies the current matrix by it. The near clipping plane is a rectangle whose lower-left corner has three-dimensional coordinates (left, bottom, -near) and whose upper-right corner is (right, top, -near); the far clipping plane is also a rectangle, with lower-left corner (left, bottom, -far) and upper-right corner (right, top, -far). near and far are either both positive or both negative. In the absence of other transformations, the direction of orthographic projection is parallel to the Z axis and the viewpoint faces the negative Z axis, which means far and near are both negative for objects in front of the viewpoint and both positive for objects behind it.
(2) Two-dimensional orthographic projection, gluOrtho2D: void gluOrtho2D(GLdouble left, GLdouble right, GLdouble bottom, GLdouble top). This is a special orthographic projection function, mainly used for projecting two-dimensional images onto a two-dimensional screen. Its near and far defaults are -1.0 and 1.0, respectively, and the Z coordinates of all two-dimensional objects are 0.0, so its clipping region is the rectangle with lower-left corner (left, bottom) and upper-right corner (right, top).
Perspective projection accords with people's visual intuition: objects near the viewpoint appear large, objects far away appear small, and objects at the extreme distance vanish into a vanishing point. Its view volume resembles a frustum, i.e., a pyramid with both the top and bottom cut off.
In some embodiments, OpenGL provides two perspective projection functions: glFrustum() and gluPerspective().
(1) Creating a perspective view volume, glFrustum: void glFrustum(GLdouble left, GLdouble right, GLdouble bottom, GLdouble top, GLdouble near, GLdouble far) creates a perspective view volume. It builds a perspective projection matrix and multiplies the current matrix by it. The function's parameters define only the three-dimensional coordinates of the lower-left and upper-right corners of the near clipping plane, i.e., (left, bottom, -near) and (right, top, -near); the last parameter far is the distance to the far clipping plane along the negative Z axis, whose lower-left and upper-right corner coordinates are generated automatically by the function according to the perspective projection principle. near and far represent distances from the viewpoint and are always positive.
(2) Perspective projection transformation, gluPerspective: void gluPerspective(GLdouble fovy, GLdouble aspect, GLdouble zNear, GLdouble zFar) creates a symmetric perspective projection matrix and multiplies the current matrix by it. The parameter fovy defines the field-of-view angle in the y direction, in the range [0.0, 180.0]; the parameter aspect is the ratio of the projection plane's width to its height; the parameters zNear and zFar are the distances from the viewpoint to the near and far clipping planes, respectively, and are always positive.
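An illustrative projection setup using gluPerspective, with the equivalent glFrustum form indicated in a comment; the field of view, aspect ratio, and clipping distances are assumptions.

```cpp
#include <GL/glu.h>
#include <cmath>

void setupProjection(int viewportWidth, int viewportHeight) {
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    const double aspect = static_cast<double>(viewportWidth) / viewportHeight;
    gluPerspective(45.0, aspect, 0.1, 10000.0);  // fovy, aspect, zNear, zFar
    // Equivalent glFrustum form, with top = zNear * tan(fovy / 2):
    // const double top = 0.1 * std::tan(45.0 / 2.0 * M_PI / 180.0);
    // glFrustum(-top * aspect, top * aspect, -top, top, 0.1, 10000.0);
    glMatrixMode(GL_MODELVIEW);
}
```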
S804, viewport transformation: displaying the object projected within the view volume, after the projection transformation, on a two-dimensional viewport plane.
The viewport transformation displays the object projected within the view volume on a two-dimensional viewport plane. After the geometric, projection, and clipping transformations, the object is displayed in a designated area of the screen window; this area is generally rectangular and is called the viewport.
In some embodiments, the viewport transformation function in OpenGL is glViewport: void glViewport(GLint x, GLint y, GLsizei width, GLsizei height) defines a viewport. The parameters (x, y) are the coordinates of the viewport's lower-left corner in the screen window coordinate system, and the parameters width and height are the viewport's width and height, respectively. By default the viewport is (0, 0, winWidth, winHeight), i.e., the actual size of the screen window. All of these values are in pixels and are integers.
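For illustration only (winWidth and winHeight are hypothetical window dimensions, and a current OpenGL context is assumed), a window-resize handler typically resets the viewport so that it tracks the window:

#include <GL/gl.h>

/* Hypothetical resize handler: map the viewport to the full window,
 * matching the default (0, 0, winWidth, winHeight) described above. */
void onWindowResize(int winWidth, int winHeight)
{
    if (winHeight == 0) winHeight = 1;  /* guard against a zero-height window */
    glViewport(0, 0, (GLsizei)winWidth, (GLsizei)winHeight);
}

Keeping the projection's aspect parameter equal to width/height of the viewport avoids distorting the projected geometry.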
After the above model transformation, viewing transformation, projection transformation, and viewport transformation, a model in three-dimensional space can be drawn as its corresponding two-dimensional plane figure and correctly displayed on a two-dimensional computer screen.
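Putting the stages together, a minimal per-frame sketch might read as follows; this is an assumed arrangement under a current fixed-function context, and drawBoundingBoxes() is a hypothetical stand-in for the application's own drawing of the monomer bounding-box geometry:

#include <GL/gl.h>
#include <GL/glu.h>

void drawBoundingBoxes(void);  /* hypothetical application drawing routine */

/* One frame: viewport, then projection, then model/view, as in the
 * transformation chain summarized above. */
void renderFrame(int winWidth, int winHeight)
{
    /* Viewport transformation: map to the full window. */
    glViewport(0, 0, (GLsizei)winWidth, (GLsizei)winHeight);

    /* Projection transformation: a perspective view volume. */
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluPerspective(45.0, (double)winWidth / (double)winHeight, 0.1, 1000.0);

    /* Model and viewing transformations: place the camera. */
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    gluLookAt(0.0, 50.0, 200.0,   /* eye position */
              0.0,  0.0,   0.0,   /* look-at target */
              0.0,  1.0,   0.0);  /* up direction */

    drawBoundingBoxes();
}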
According to the embodiment of the invention, automatic extraction is realized by defining the quantitative rule of the geometric characteristics; optimizing the efficiency of logic singulation, extracting the feature information of the entity objects in the three-dimensional scene and calculating the geometric topology, projecting the feature information into a two-dimensional screen in real time for display, and drawing the geometric shape of the model singulation in the screen to realize the balance of accuracy and efficiency.
Following the above description of the method embodiments, the solution of the present invention is further described below by way of device embodiments.
As shown in fig. 17, the apparatus 1700 includes:
an obtaining module 1710, configured to obtain viewpoint information of a three-dimensional scene, calculate a view cone according to the viewpoint information, and obtain a target three-dimensional model in a visual range according to the view cone;
a building module 1720, configured to calculate a vector contour plane of each monomer based on the target three-dimensional model, and build a monomer three-dimensional bounding box according to the height information of each monomer;
a projection module 1730, configured to project the monomer stereoscopic bounding box onto a two-dimensional screen.
According to an embodiment of the invention, an electronic device is further provided.
Fig. 18 illustrates a schematic block diagram of an electronic device 1800 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers. The electronic device may also represent various forms of mobile devices.
The electronic device 1800 includes a computing unit 1801 that can perform various suitable actions and processes in accordance with computer programs stored in a Read Only Memory (ROM) 1802 or loaded from a storage unit 1808 into a Random Access Memory (RAM) 1803. In the RAM 1803, various programs and data required for the operation of the electronic device 1800 may also be stored. The computing unit 1801, ROM 1802, and RAM 1803 are connected to each other by a bus 1804. An input/output (I/O) interface 1805 is also connected to the bus 1804.
The computing unit 1801 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 1801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor. The computing unit 1801 performs the respective methods and processes described above, for example, the methods S101 to S103.

Claims (8)

1. A live-action three-dimensional logic singulation method, comprising:
obtaining viewpoint information of a three-dimensional scene, calculating a view cone according to the viewpoint information, and obtaining a target three-dimensional model in a visual range according to the view cone;
calculating vector outline surfaces of all monomers based on the target three-dimensional model, and establishing a monomer three-dimensional bounding box according to the height information of all monomers;
projecting the monomer stereoscopic bounding box onto a two-dimensional screen;
the calculating the vector outline surface of each monomer based on the target three-dimensional model comprises the following steps:
generating DSM data of the target three-dimensional model, and extracting edge contour information of each monomer by using the DSM data;
fitting the edge contour of each monomer by combining the edge contour information to obtain a vector contour surface of each monomer;
the generating the DSM data of the target three-dimensional model includes:
constructing an empty grid and a DSM image according to the planar area range of the target three-dimensional model and the preset spatial resolution;
projecting all triangular patch vertices in the target three-dimensional model into the regular grid from a directly overhead, top-down viewing angle;
calculating the DSM image pixel value at each corresponding position from the vertices projected into the grid;
performing interpolation calculation on the DSM image to obtain an interpolated DSM image;
and combining the spatial resolution of the interpolated DSM image with the planar area range coordinates of the target three-dimensional model to obtain a DSM image with spatial geographic coordinates.
2. The method of claim 1, wherein the acquiring viewpoint information of the three-dimensional scene, calculating a view cone from the viewpoint information, and acquiring a three-dimensional model in a visual range from the view cone, comprises:
acquiring a coordinate range of a view cone according to the position, the view angle and the screen range of the view point;
and performing an intersection operation on the coordinate range of the view cone and the coordinate range of each model, judging whether the model lies within the view cone, and taking the models lying within the view cone as the target three-dimensional model.
3. The method according to claim 2, wherein the obtaining the coordinate range of the view cone according to the position, the viewing angle, and the screen range of the viewpoint comprises:
converting the normalized device coordinate system in which the viewpoint is located into the camera's local coordinate system;
converting the camera's local coordinate system into the world coordinate system;
and converting the viewpoint coordinates in the world coordinate system into longitude and latitude coordinates, and taking the longitude and latitude coordinates as the coordinate range of the view cone.
4. The method according to claim 1, wherein the fitting the edge contour of each monomer by combining the edge contour information to obtain the vector contour surface of each monomer comprises:
carrying out geometric shape regularization on the edge contour of the monomer according to the geometric shape and edge contour characteristics of the monomer;
carrying out geometric corner searching on the regularized edge contour to obtain corner point information of the geometric figure;
screening the corner point information of the geometric figure for characteristic corner points to obtain a characteristic corner point set;
and performing graph approximation fitting on the characteristic corner point set according to the geometric shape and edge contour characteristics of the monomer to obtain the vector contour surface of the monomer.
5. The method of claim 4, wherein the establishing a monomer three-dimensional bounding box according to the height information of each monomer comprises:
extracting height information from the corresponding DSM data based on the vector contour surface of the monomer to obtain the elevation of each corner point on the vector contour;
and obtaining the top surface elevation by a least-squares method, on the assumption that the top surface elevations are identical, and generating the monomer three-dimensional bounding box.
6. The method of claim 1, wherein the projecting the monomer stereoscopic bounding box onto a two-dimensional screen comprises:
converting the monomer stereoscopic bounding box from a local coordinate system to the world coordinate system;
converting the monomer stereoscopic bounding box in the world coordinate system into relative coordinates in camera space;
creating a view volume, and performing projection transformation on the relative coordinates in camera space through orthographic projection transformation and/or perspective projection transformation;
and displaying the object projected within the view volume after the projection transformation on a two-dimensional viewport plane.
7. A live-action three-dimensional logic singulation apparatus, comprising:
the acquisition module is used for acquiring viewpoint information of the three-dimensional scene, calculating a view cone according to the viewpoint information and acquiring a target three-dimensional model in a visual range according to the view cone;
the building module is used for calculating the vector outline surface of each monomer based on the target three-dimensional model and establishing a monomer three-dimensional bounding box according to the height information of each monomer;
the projection module is used for projecting the monomer stereoscopic bounding box to a two-dimensional screen;
the calculating the vector outline surface of each monomer based on the target three-dimensional model comprises the following steps:
generating DSM data of the target three-dimensional model, and extracting edge contour information of each monomer by using the DSM data;
fitting the edge contour of each monomer by combining the edge contour information to obtain a vector contour surface of each monomer;
the generating the DSM data of the target three-dimensional model includes:
constructing an empty grid and a DSM image according to the planar area range of the target three-dimensional model and the preset spatial resolution;
projecting all triangular patch vertices in the target three-dimensional model into the regular grid from a directly overhead, top-down viewing angle;
calculating the DSM image pixel value at each corresponding position from the vertices projected into the grid;
performing interpolation calculation on the DSM image to obtain an interpolated DSM image;
and combining the spatial resolution of the interpolated DSM image with the planar area range coordinates of the target three-dimensional model to obtain a DSM image with spatial geographic coordinates.
8. An electronic device comprising at least one processor; and
a memory communicatively coupled to the at least one processor; characterized in that
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410411161.3A CN118015197B (en) 2024-04-08 2024-04-08 Live-action three-dimensional logic singulation method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN118015197A (en) 2024-05-10
CN118015197B (en) 2024-06-18

Family

ID=90954246

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410411161.3A Active CN118015197B (en) 2024-04-08 2024-04-08 Live-action three-dimensional logic singulation method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN118015197B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116229003A (en) * 2023-03-08 2023-06-06 重庆地质矿产研究院 Three-dimensional model monomerization rapid construction method based on multi-source data

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108648269B (en) * 2018-05-11 2023-10-20 北京建筑大学 Method and system for singulating three-dimensional building models
CN114565722A (en) * 2022-03-02 2022-05-31 山东瑞智飞控科技有限公司 Three-dimensional model monomer realization method
CN114648640B (en) * 2022-05-23 2022-09-06 深圳市其域创新科技有限公司 Target object monomer method, device, equipment and storage medium
CN117745969A (en) * 2022-09-15 2024-03-22 丰图科技(深圳)有限公司 Building singulation method, device and related equipment
CN115600307B (en) * 2022-12-01 2023-03-10 北京飞渡科技有限公司 Method for generating single building from Mesh model of urban scene

Also Published As

Publication number Publication date
CN118015197A (en) 2024-05-10

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant