CN116681854A - Virtual city generation method and device based on target detection and building reconstruction

Virtual city generation method and device based on target detection and building reconstruction

Info

Publication number: CN116681854A
Application number: CN202310694723.5A
Authority: CN (China)
Legal status: Pending
Prior art keywords: building, virtual, building model, information, actual
Inventors: 郑媛媛, 肖舟旻, 徐庶, 连辉
Assignee: Nanhu Research Institute Of Electronic Technology Of China
Filing date: 2023-06-12
Publication date: 2023-09-01

Classifications

    • G06T 17/05 — Three-dimensional [3D] modelling: geographic models
    • G06T 17/20 — Three-dimensional [3D] modelling: finite element generation, e.g. wire-frame surface description, tessellation
    • G06T 19/003 — Manipulating 3D models or images for computer graphics: navigation within 3D models or images
    • G06T 19/20 — Manipulating 3D models or images for computer graphics: editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G06T 7/50 — Image analysis: depth or shape recovery
    • G06V 10/774 — Image or video recognition using machine learning: generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/82 — Image or video recognition using machine learning: using neural networks
    • G06V 20/64 — Scenes; scene-specific elements: three-dimensional objects
    • G06V 20/70 — Scenes; scene-specific elements: labelling scene content, e.g. deriving syntactic or semantic representations
    • G06T 2207/20081 — Indexing scheme for image analysis: training; learning
    • G06T 2207/20084 — Indexing scheme for image analysis: artificial neural networks [ANN]
    • G06V 2201/07 — Indexing scheme for image or video recognition: target detection
    • Y02A 30/60 — Technologies for adaptation to climate change: planning or developing urban green infrastructure

Abstract

The application discloses a virtual city generation method and device based on target detection and building reconstruction. The method comprises: collecting visual information and depth information of an actual physical sand table; performing target detection based on the visual information, and obtaining the spatial information, form information and building category of each actual building model from the detection result and the depth information; converting the spatial information, form information and building category of each actual building model into the shape rule of the corresponding virtual building model in a digital virtual sand table, and generating component segmentation rules from the shape rule; and generating, based on the shape rule and the component segmentation rules, the virtual building model corresponding to each actual building model, thereby completing the construction of the virtual city. The application realizes a real-time twin of the actual physical sand table in the digital virtual sand table.

Description

Virtual city generation method and device based on target detection and building reconstruction
Technical Field
The application belongs to the technical field of virtual city construction, and particularly relates to a virtual city generation method and device based on target detection and building reconstruction.
Background
The sand table originated as a way to display a single building structure or the arrangement of large-scale terrain and urban buildings. With the development of modern information technology, digital virtual sand tables have emerged that perceive and process multi-modal information by combining augmented reality or virtual reality technology with deep learning, and they are gradually being applied to building design, urban construction planning, military deduction and other fields.
Judging from the currently published technical material on digital sand tables, the prior art is clearly deficient in the perception and processing of physical sand table models. Existing digital sand table technology cannot identify the categories and positions of the various models in a physical sand table, and therefore cannot generate virtual building models of corresponding category and size in the simulation environment, leaving a gap between the actual and virtual environments. Moreover, existing digital sand tables usually display preset, fixed building models whose structure the user cannot reconstruct in real time, so interactive operability is lacking.
For example, patent CN202110802083.6 discloses an immersive sand table display system in which a user can select a scene from a scene model library pre-built in the system to start an immersive demonstration. Although the system uses virtual technology to improve the display effect and visual experience compared with a traditional physical sand table, its demonstration scenes and viewing positions are fixed and the models in a demonstration scene cannot be adjusted by the user, so interaction diversity is lacking.
The prior art also includes the holographic 3D intelligent interactive digital virtual sand table system disclosed in patent CN202022592987.4. The system is equipped with a motion-capture camera and displays the scene through a 3D display device after determining the user's observation angle from the user's position. The system also includes an interaction component that lets the user translate and rotate the models in the sand table, but its interactive functions are limited and the user cannot reconstruct the building models in the system in real time.
The prior art further includes the paperless interactive review method for urban design based on urban three-dimensional space entities disclosed in patent CN112270027B. The method extracts the control points from statutory plans and specification tables, and realizes paperless generation of review results and all-round display in a realistic three-dimensional scene through embedded coding and a three-dimensional digital sand table. However, the three-dimensional models used in that digital sand table are pre-authored modelling data, which is time-consuming to build, and user interactivity is lacking, so users' customization needs cannot be met.
Disclosure of Invention
The aim of the application is to provide a virtual city generation method based on target detection and building reconstruction that realizes a real-time twin of an actual physical sand table in a digital virtual sand table.
To achieve the above aim, the application adopts the following technical scheme:
a virtual city generation method based on target detection and building reconstruction, comprising:
acquiring visual information and depth information of an actual physical sand table, wherein the actual physical sand table comprises one or more actual building models;
performing target detection based on the visual information, and obtaining the spatial information, form information and building category of each actual building model according to the target detection result and the depth information;
converting the spatial information, form information and building category of each actual building model into the shape rule of the corresponding virtual building model in the digital virtual sand table, and generating component segmentation rules according to the shape rule, wherein the component segmentation rules comprise basic segmentation rules and refined segmentation rules, the basic segmentation rules describing the division of the virtual building model into basic blocks and the refined segmentation rules describing the types of the basic blocks;
and generating the virtual building model corresponding to each actual building model based on the shape rule and the component segmentation rules, completing the construction of the virtual city.
Several preferred alternatives are provided below. They are not additional limitations on the overall scheme described above, but only further additions or preferences; each alternative may be combined with the overall scheme on its own, and multiple alternatives may also be combined with one another, provided there is no technical or logical contradiction.
Preferably, the actual physical sand table comprises a sand table base and one or more actual building models arranged on the sand table base;
the sand table base is connected with a sensor bracket on which a visual sensor is mounted, and the visual sensor collects the visual information and depth information of the actual physical sand table.
Preferably, the target detection is performed by a deep learning target detection model; when the deep learning target detection model is pre-trained, the building category, orientation, anchor-box length and width, and anchor-box coordinates of each actual building model in the training images are annotated manually.
Preferably, the obtaining of the spatial information, form information and building category of each actual building model according to the target detection result and the depth information comprises:
the target detection result comprising the building category, orientation, anchor-box length and width, and anchor-box coordinates of the actual building model;
obtaining the length and width of the actual building model in vertical projection from the anchor-box length and width;
calculating the depth difference between the actual building model and the sand table base from the depth information to obtain the height of the actual building model, and taking the length, width and height as the form information of the actual building model;
and converting the anchor-box coordinates into world coordinates, taking the world coordinates as the position of the actual building model in the actual physical sand table, and taking the position and the orientation as the spatial information of the actual building model.
Preferably, the shape rule consists of string attributes, geometric attributes and numerical attributes;
the string attributes encode the building style, the building style of the virtual building model being determined by the building category of the actual building model;
the geometric attributes comprise a rotation angle and a preset scaling, the rotation angle being the rotation of the virtual building model relative to the orientation of the actual building model;
the numerical attributes comprise position and size, the position of the virtual building model being the same as that of the actual building model, and the size of the virtual building model being obtained from the form information of the actual building model and the preset scaling.
Preferably, the refined segmentation rule is generated as follows:
taking one basic block, and determining the main type of the basic block from the building style of the virtual building model;
judging whether the main type of the basic block contains preset sub-types: if so, dividing the basic block according to the contained sub-types and assigning the corresponding sub-type to each divided component; if not, assigning the main type to the whole basic block.
Preferably, the generating of the virtual building model corresponding to the actual building model based on the shape rule and the component segmentation rules comprises:
determining the starting point of the virtual building model from the position in the shape rule;
determining the three-axis directions of the virtual building model from the rotation angle in the shape rule;
generating a three-dimensional space of the corresponding size along the three axes according to the size in the shape rule;
determining the grid segmentation of the three-dimensional space according to the basic segmentation rules among the component segmentation rules, obtaining a plurality of grid space blocks;
reading the type of each basic block according to the refined segmentation rules among the component segmentation rules: if the type is a main type containing no sub-types, a pre-built model is read directly from the material library according to the main type and filled into the grid space block at the corresponding position; if the type is a main type containing sub-types, the pre-built models corresponding to the sub-types are first read from the material library and spliced, and the spliced model is then filled into the grid space block at the corresponding position.
Preferably, the virtual city generating method based on target detection and building reconstruction further comprises:
in an editing mode, receiving user input and adjusting the component segmentation rules of the virtual building model; and in an operation mode, receiving user input and adjusting the shape rule and the component segmentation rules of the virtual building model.
Preferably, the virtual city generating method based on target detection and building reconstruction further comprises:
and providing a first person visual angle visual display of the virtual city, or providing a third person visual angle visual display of the virtual city.
The virtual city generation method based on target detection and building reconstruction provided by the application can scan the actual building models in an actual physical sand table, identify their categories, locate them precisely in space, map their actual spatial positions synchronously into the digital virtual sand table, and automatically construct complex virtual building models at the corresponding positions. The parameters of the virtual building models in the digital virtual sand table can be adjusted in the virtual environment: a user can change a building's structure, appearance, size, accessories and so on as required, and can browse the whole urban scene by switching viewpoints freely. Based on this method, the effect of displaying a complex urban environment in a virtual environment can be achieved from nothing more than simple actual building models combined in the physical sand table, which makes it applicable to real-estate display, urban construction planning, and strategy or tactics planning games.
A second object of the application is to provide a virtual city generation device based on target detection and building reconstruction, comprising a processor and a memory storing computer instructions which, when executed by the processor, implement the steps of the virtual city generation method based on target detection and building reconstruction.
Drawings
FIG. 1 is a flow chart of the virtual city generation method based on target detection and building reconstruction of the application;
FIG. 2 is a schematic illustration of one construction of the actual physical sand table and the visual sensor mounting of the application;
FIG. 3 is a schematic view of the bounding box defined by a shape rule of the application;
FIG. 4 is a schematic diagram of the basic blocks and types described by the component segmentation rules of the application;
FIG. 5 is a schematic diagram of the first-person-perspective visual display of the application;
FIG. 6 is a schematic diagram of the third-person-perspective visual display of the application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein in the description of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application.
To overcome the deficiencies of virtual city generation in the prior art, this embodiment provides a sand table-to-virtual-city procedural generation technique based on deep-learning target detection and real-time building reconstruction. It detects the category, actual position and height of each actual building model in an actual physical sand table in real time, and generates in a simulation platform the corresponding virtual city buildings, whose structure can be freely adjusted in real time, thereby realizing real-time procedural generation from the actual physical sand table to the virtual city.
As shown in FIG. 1, the virtual city generation method based on target detection and building reconstruction of this embodiment comprises the following steps.
and 1, acquiring visual information and depth information of an actual physical sand table, wherein the actual physical sand table comprises one or more actual building models.
The actual physical sand table includes a sand table base and one or more actual building models disposed on the sand table base. The actual building models are constructed manually: a model may be integrally formed, for example produced by 3D printing or similar techniques, or it may be assembled by splicing loose building blocks together. This embodiment does not limit how the actual building models are constructed.
The visual information (typically RGB images; infrared images, point clouds and similar data may also be acquired when the actual sand table requires it) and the depth information can be collected with a hand-held visual sensor (such as a ZED binocular vision sensor), with an external device separate from the physical sand table (such as a robot), or with a visual sensor mounted on a sensor bracket attached to the sand table base, as shown in FIG. 2.
This embodiment does not limit how the visual sensor is fixed, but because the method is mainly implemented from a top view of the actual building models, when only one visual sensor is used it is kept shooting perpendicularly onto the sand table base. When multiple visual sensors are used, at least one is kept perpendicular to the base, while the others may capture the actual physical sand table from any angle for auxiliary use.
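As a concrete illustration of this acquisition step, the following is a minimal sketch assuming the ZED Python SDK (pyzed) and a single downward-facing camera; the patent does not publish capture code, so this only strings together the SDK's standard calls:

    import pyzed.sl as sl

    # Open the ZED binocular vision sensor with depth measurement enabled.
    zed = sl.Camera()
    init_params = sl.InitParameters()
    init_params.depth_mode = sl.DEPTH_MODE.ULTRA      # dense depth for small models
    init_params.coordinate_units = sl.UNIT.METER
    if zed.open(init_params) != sl.ERROR_CODE.SUCCESS:
        raise RuntimeError("cannot open ZED camera")

    image, depth = sl.Mat(), sl.Mat()
    if zed.grab(sl.RuntimeParameters()) == sl.ERROR_CODE.SUCCESS:
        zed.retrieve_image(image, sl.VIEW.LEFT)         # visual information (RGB view)
        zed.retrieve_measure(depth, sl.MEASURE.DEPTH)   # depth information (metres)
        rgb_np, depth_np = image.get_data(), depth.get_data()

    zed.close()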
Step 2: performing target detection based on the visual information, and obtaining the spatial information, form information and building category of each actual building model from the target detection result and the depth information.
Target detection is performed by a deep learning target detection model. To build the training data set, RGB images and depth information of the actual building models in the sand table are first collected with a ZED binocular vision sensor mounted vertically above the actual physical sand table; the building category, orientation, anchor box, anchor-box length and width, and anchor-box coordinates of each building model are then annotated manually with the Roboflow image-annotation tool, producing the corresponding annotation files. The annotation data comprise the building-category code, the orientation code, and the coordinates of one corner of the anchor box relative to the origin together with the anchor-box length and width, all normalized by the image size.
This embodiment adopts a YOLOv5s target detection model and augments the collected training data set during training with the Mosaic method (random image scaling, random arrangement, random cropping, multi-image stitching, and the like). The trained YOLOv5s model is deployed on the server side of the system. In practical operation, the system calls the ZED binocular vision sensor to collect RGB images and depth information of the sand table in real time; the server detects the building category and the image bounding-box coordinates of every actual building model in real time through the deployed YOLOv5s model, and outputs target detection results comprising the building category, orientation, anchor-box length and width, and anchor-box coordinates of each actual building model.
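For illustration, a minimal sketch of the server-side inference call, assuming the Ultralytics YOLOv5 hub interface; the checkpoint path and class names are hypothetical, and the orientation output of the patent's model is omitted here:

    import torch

    # Load custom-trained YOLOv5s weights (path is illustrative).
    model = torch.hub.load('ultralytics/yolov5', 'custom', path='sandtable_yolov5s.pt')

    results = model(rgb_np)                  # rgb_np: HxWx3 RGB frame from the sensor
    for x1, y1, x2, y2, conf, cls in results.xyxy[0].tolist():
        box_w, box_h = x2 - x1, y2 - y1      # anchor-box length and width in pixels
        category = results.names[int(cls)]   # building category, e.g. "china_shop"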
The length and width of each actual building model in vertical projection are then obtained from the anchor-box length and width; the height of the model is obtained by computing the depth difference between the model and the sand table base from the depth information; and the length, width and height are stored as the form information of the actual building model. The anchor-box coordinates are converted to world coordinates — in this embodiment the visual information consists of RGB images only, so the conversion is specifically from RGB image coordinates to world coordinates — giving the position of the actual building model in the actual physical sand table; the position and the orientation together constitute the spatial information of the actual building model.
The conversion from RGB image coordinates to world coordinates is realized from the camera intrinsics and extrinsics of the ZED binocular vision sensor; this is a conventional coordinate transformation and is not described in detail in this embodiment.
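For completeness, a standard pinhole back-projection sketch of this conventional step (NumPy; K is the 3x3 intrinsic matrix, R and t the camera-to-world extrinsics), together with the depth-difference height estimate described above; the median-based masking is an assumption, not the patent's stated procedure:

    import numpy as np

    def pixel_to_world(u, v, z, K, R, t):
        """Back-project pixel (u, v) with metric depth z into world coordinates."""
        fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
        p_cam = np.array([(u - cx) * z / fx,   # camera-frame X
                          (v - cy) * z / fy,   # camera-frame Y
                          z])                  # camera-frame Z (the depth)
        return R @ p_cam + t                   # camera frame -> world frame

    def model_height(depth_map, base_mask, top_mask):
        # Height as the depth difference between the sand-table base and the
        # model's top surface, using medians over the two regions for robustness.
        return float(np.median(depth_map[base_mask]) - np.median(depth_map[top_mask]))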
Step 3: converting the spatial information, form information and building category of each actual building model into the shape rule of the corresponding virtual building model in the digital virtual sand table, and generating the component segmentation rules from the shape rule.
To make building reconstruction easier to implement, this embodiment converts the information obtained about each actual building model into information about the corresponding virtual building model for later use. A shape rule in this embodiment consists of string attributes, geometric attributes and numerical attributes.
The string attributes encode the building style, and the style of the virtual building model is determined by the building category of the actual building model. For example, if the actual building model is categorized as a Chinese shop, the virtual building model's style is also a Chinese shop, described as "China" and "shop".
The geometric attributes comprise a rotation angle and a preset scaling. The scaling consists of the factors S_X, S_Y, S_Z along the three spatial axes X, Y and Z. The rotation angle is the rotation of the virtual building model relative to the orientation of the actual building model (for example, some angle from due north, due south, due east or due west) and, referred to the same three axes, likewise consists of the rotation angles R_X, R_Y, R_Z. Both the rotation angle and the scaling are predefined attribute values, predefined for example per building style.
The numerical attributes comprise the position P and the size (length X, height Y, width Z). The position of the virtual building model is the same as that of the actual building model; "the same" here means the same offset from the respective coordinate origin, i.e. the position of the virtual building model in the virtual world coordinate system equals the position of the actual building model in the actual world coordinate system. The size of the virtual building model is calculated from the form information of the actual building model and the preset scaling. The final shape rule can therefore be described, for example, as ShapeBuild("China", "shop", S_X, S_Y, S_Z, R_X, R_Y, R_Z, P, X, Y, Z), which defines a bounding box in space, as shown in FIG. 3.
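A sketch of the shape rule as a data structure; the field names and the example values are illustrative, since the patent fixes only the three attribute groups:

    from dataclasses import dataclass

    @dataclass
    class ShapeRule:
        style: tuple      # string attributes, e.g. ("China", "shop")
        scale: tuple      # geometric: preset scaling (S_X, S_Y, S_Z)
        rotation: tuple   # geometric: rotation angles (R_X, R_Y, R_Z)
        position: tuple   # numerical: world-space position P
        size: tuple       # numerical: (length X, height Y, width Z)

    def sized(form, scale):
        # size = measured form information of the actual model * preset scaling
        return tuple(m * s for m, s in zip(form, scale))

    rule = ShapeRule(style=("China", "shop"), scale=(1.0, 1.0, 1.0),
                     rotation=(0.0, 90.0, 0.0), position=(12.0, 0.0, 8.0),
                     size=sized((10.0, 13.5, 6.0), (1.0, 1.0, 1.0)))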
In this embodiment the anchor-box coordinates output by the target detection model are the coordinates of one corner of the anchor box, so the length, width and height of the virtual building model can be generated directly in the positive XYZ directions starting from the position P. In other embodiments, if the anchor-box coordinates refer to another point of the box (e.g. its centre), the extents in both the positive and negative directions must be computed when generating the length, width and height, so as to bound the space of the whole virtual building model.
To make the construction of the virtual building model easy to describe, this embodiment proposes component segmentation rules comprising basic segmentation rules, which describe how the virtual building model is divided into basic blocks, and refined segmentation rules, which describe the type of each basic block obtained by the basic segmentation.
Basic block segmentation mainly divides the building into floors and units. In other embodiments the scope of a basic block can be adjusted as appropriate: when reconstruction is done per unit only, the basic segmentation may be unit segmentation only; likewise, when reconstruction is done per floor only, it may be floor segmentation only.
Floor segmentation splits the current range along one axis (e.g. the Y axis), i.e. it splits the height of the virtual building model; dividing a model into four floors may be expressed as Subdiv("Y", 3.5, 3, 3, 3). Unit segmentation splits the current range along another axis (e.g. the X axis), i.e. it splits the length of the virtual building model; dividing one floor into four units may be expressed as Subdiv("X", 3, 4, 4, 3). In this embodiment floor segmentation is applied to the virtual building model as a whole and unit segmentation to each floor, yielding a set of independent basic blocks after segmentation.
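A minimal sketch of such a Subdiv operation, using interval arithmetic only; the helper and its return format are assumptions:

    def subdiv(axis, *extents):
        """Split the current range along one axis into consecutive segments,
        as in Subdiv("Y", 3.5, 3, 3, 3); returns (axis, start, end) triples."""
        blocks, start = [], 0.0
        for length in extents:
            blocks.append((axis, start, start + length))
            start += length
        return blocks

    floors = subdiv("Y", 3.5, 3, 3, 3)   # four floors along the height
    units = subdiv("X", 3, 4, 4, 3)      # four units along the length of one floor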
Refined segmentation expresses the type of each basic block: the type may apply to the whole block, or the block may first be subdivided into smaller components, each of which is then typed. To generate the refined segmentation rule, a basic block is taken and its main type is determined from the building style of the virtual building model — for example, the top floor of a Chinese shop is a roof, so the main type of the top-floor basic block is "roof". It is then judged whether the main type contains preset sub-types: if so, the basic block is divided according to those sub-types and each resulting component is assigned its corresponding sub-type; if not, the main type is assigned to the whole basic block.
Taking the roof as an example: if the roof of a Chinese shop is a single integral model component, the roof is a main type with no sub-types; if the roof consists, from top to bottom, of a first, second and third eave, the roof is a main type containing three sub-types, so the top-floor basic block must be divided top-to-bottom into three components, each tagged with its corresponding sub-type.
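A sketch of this refinement decision; the style table and sub-type names are assumptions — the patent defines only the contains-sub-types branching:

    # Hypothetical style table: main type -> ordered sub-types (top to bottom).
    SUB_TYPES = {
        "roof": ["first_eave", "second_eave", "third_eave"],
    }

    def refine(main_type):
        """Return the typed components of one basic block: either the block
        itself with its main type, or one component per contained sub-type."""
        subs = SUB_TYPES.get(main_type)
        return [main_type] if subs is None else list(subs)

    print(refine("roof"))   # ['first_eave', 'second_eave', 'third_eave']
    print(refine("gate"))   # ['gate'] -- a main type with no sub-types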
As shown in FIG. 4, the component segmentation rule of a virtual building model is expressed as Comp(type){A|B|…|Z}, where type is the type of a basic block, such as the type of a pre-built roof or eave model; F denotes a roof, B a building corner module, A a building middle module, C a ground-floor shop, and D a gate. The types in the expression may be recorded following a fixed convention, for example from the top-left basic block to the bottom-right one.
Step 4: generating the virtual building model corresponding to each actual building model based on the shape rule and the component segmentation rules, completing the construction of the virtual city.
After the shape rule and the component segmentation rules have been constructed, the corresponding virtual building model can be generated automatically according to them. The generation process is as follows:
determining the starting point of the virtual building model from the position in the shape rule; determining the three-axis directions of the virtual building model from the rotation angle in the shape rule; generating a three-dimensional space of the corresponding size along the three axes according to the size in the shape rule; and determining the grid segmentation of the three-dimensional space according to the basic segmentation rules, obtaining a plurality of grid space blocks;
then reading the type of each basic block according to the refined segmentation rules: if the type is a main type containing no sub-types, a pre-built model is read directly from the material library according to the main type and filled into the grid space block at the corresponding position; if the type is a main type containing sub-types, the pre-built models corresponding to the sub-types are first read from the material library and spliced, and the spliced model is then filled into the grid space block at the corresponding position.
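Putting the two rule sets together, a sketch of this generation loop; the material library, the merge and place helpers and the GridBlock fields are assumptions standing in for the simulation platform's own facilities:

    from dataclasses import dataclass

    @dataclass
    class GridBlock:
        main_type: str   # type read via the refined segmentation rule
        origin: tuple    # world-space origin of this grid space block

    def merge(meshes):
        # Stand-in for splicing several pre-built sub-models into one component.
        return list(meshes)

    def place(mesh, position, rotation):
        # Stand-in for instantiating a mesh in the virtual scene.
        print(f"place {mesh} at {position}, rotation {rotation}")

    def build_virtual_model(rule, grid_blocks, library):
        """Fill every grid space block with pre-built models from the library."""
        for block in grid_blocks:
            types = refine(block.main_type)   # see the refinement sketch above
            if len(types) == 1:
                mesh = library[types[0]]      # main type without sub-types
            else:
                mesh = merge(library[t] for t in types)  # splice sub-type models
            place(mesh, position=block.origin, rotation=rule.rotation)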
To speed up virtual city generation, this embodiment pre-builds a model for every building style; a pre-built model may be a composite component, such as a roof comprising several eaves, or a single component, such as an integral roof. During building reconstruction, the pre-built models are read from the material library according to the corresponding rules and a complex virtual building model is generated at the corresponding position, completing the scene construction and forming a city block.
In other embodiments, if later modification of the virtual building model is not required, the back-end database returns the building category information obtained by the visual sensor, the shape rule and component segmentation rules are determined, the building-construction parameters required by the interface are obtained, and a complete virtual building model is generated directly by a procedural algorithm and filled into the corresponding position according to the data returned by the visual sensor.
To improve interactivity and allow customization of the three-dimensional block scene, this embodiment also supports interactive building editing. In the editing mode, user input is received to adjust the component segmentation rules of the virtual building model; this mode is mainly for users such as artists to edit and shape the appearance of the model. In the operation mode, user input is received to adjust both the shape rule and the component segmentation rules; this mode is mainly for users (for example, through a client) to adjust the model's attribute parameters through a UI.
In either the editing mode or the operation mode, adjustment may consist of directly editing parameter values in the shape rule and the component segmentation rules and regenerating the virtual building model from the new values, or of editing the virtual building model visually (for example, through the UI) and synchronizing the adjusted model's parameters back into the shape rule and the component segmentation rules.
After building reconstruction is completed, the virtual city is visualized from a first-person perspective as shown in FIG. 5, or from a third-person perspective as shown in FIG. 6; that is, when a scene is presented, the user can browse the virtual city in the three-dimensional environment from either perspective without restriction.
The method adopts deep-learning-based target detection to identify the categories of the various actual building models in the actual physical sand table, estimates each model's height from the depth difference, obtains its precise spatial position through coordinate conversion, and constructs the corresponding virtual building model in the simulation environment from the model and position information.
In addition, the virtual building models in existing digital sand tables are usually preset; the user cannot customize the city style, building style or model structure, so interactive operability is poor. With the present method, when the actual physical sand table changes, the change is mapped into the three-dimensional virtual environment in real time. This largely removes the time and labour costs of traditional scene construction and 3D modelling, helps build virtual environments quickly, and improves the efficiency of three-dimensional environment construction. It also enables diversified real-time reconstruction of three-dimensional city blocks: the relevant building components only need to be prepared once, after which they can be combined and recombined under the rules to generate countless required scenes. At the same time, it offers a better interactive experience that gives full play to users' creativity: the layout of the actual physical sand table specifies the architectural style of the scene, adjustments of building position and structure are available at run time, and a good three-dimensional virtual deduction platform can be provided for virtual tactics in games.
Construction tests of multiple three-dimensional building scenes were carried out with the actual physical sand table according to the building construction rules. The tests show that a sand table combining deep-learning target detection with real-time building reconstruction achieves procedural generation of large-scale city-block buildings and provides a very high degree of freedom and interactive experience.
In another embodiment, the application also provides a virtual city generation device based on target detection and building reconstruction, comprising a processor and a memory storing computer instructions which, when executed by the processor, implement the steps of the virtual city generation method based on target detection and building reconstruction.
For the specific limitations of the virtual city generation device based on target detection and building reconstruction, reference may be made to the limitations of the virtual city generation method above; they are not repeated here.
The memory and the processor are electrically connected, directly or indirectly, for data transmission or interaction; for example, the components may be connected by one or more communication buses or signal lines. The memory stores a computer program executable on the processor, and the processor implements the method of the embodiments of the application by running the computer program stored in the memory.
The memory may be, but is not limited to, random-access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), or electrically erasable programmable read-only memory (EEPROM). The memory is used for storing a program, and the processor executes the program after receiving an execution instruction.
The processor may be an integrated circuit chip with data processing capability. It may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like, and may implement or execute the methods, steps and logic blocks disclosed in the embodiments of the application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
It should be noted that FIGS. 5 and 6 are mainly effect diagrams of the first-person and third-person visual displays respectively; the graphics in them are merely elements of the software's running interface, do not concern the focus of improvement of the application, and, since the clarity of the running interface depends on pixel count and scaling, their display quality is limited.
The technical features of the above embodiments may be combined arbitrarily; for brevity, not all possible combinations are described, but any combination of these technical features that contains no contradiction should be considered within the scope of this description.
The above examples represent only a few embodiments of the application and are described in relative detail, but they are not therefore to be construed as limiting the scope of the application. It should be noted that those of ordinary skill in the art can make several variations and improvements without departing from the concept of the application, and these all fall within the protection scope of the application. Accordingly, the protection scope of the application shall be subject to the appended claims.

Claims (10)

1. A virtual city generation method based on target detection and building reconstruction, characterized by comprising the following steps:
acquiring visual information and depth information of an actual physical sand table, wherein the actual physical sand table comprises one or more actual building models;
performing target detection based on the visual information, and obtaining the spatial information, form information and building category of each actual building model according to the target detection result and the depth information;
converting the spatial information, form information and building category of each actual building model into the shape rule of the corresponding virtual building model in the digital virtual sand table, and generating component segmentation rules according to the shape rule, wherein the component segmentation rules comprise basic segmentation rules and refined segmentation rules, the basic segmentation rules describing the division of the virtual building model into basic blocks and the refined segmentation rules describing the types of the basic blocks;
and generating the virtual building model corresponding to each actual building model based on the shape rule and the component segmentation rules, completing the construction of the virtual city.
2. The virtual city generation method based on object detection and building reconstruction of claim 1, wherein the actual physical sand table comprises a sand table base and one or more actual building models disposed on the sand table base;
the sand table base is connected with a sensor bracket on which a visual sensor is mounted, and the visual sensor collects the visual information and depth information of the actual physical sand table.
3. The virtual city generation method based on target detection and building reconstruction according to claim 1, wherein the target detection is performed by a deep learning target detection model, and when the deep learning target detection model is pre-trained, the building category, orientation, anchor-box length and width, and anchor-box coordinates of each actual building model in the training images are annotated manually.
4. The virtual city generation method based on object detection and building reconstruction of claim 1, wherein the obtaining of the spatial information, form information and building category of each actual building model according to the target detection result and the depth information comprises:
the target detection result comprising the building category, orientation, anchor-box length and width, and anchor-box coordinates of the actual building model;
obtaining the length and width of the actual building model in vertical projection from the anchor-box length and width;
calculating the depth difference between the actual building model and the sand table base from the depth information to obtain the height of the actual building model, and taking the length, width and height as the form information of the actual building model;
and converting the anchor-box coordinates into world coordinates, taking the world coordinates as the position of the actual building model in the actual physical sand table, and taking the position and the orientation as the spatial information of the actual building model.
5. The virtual city generation method based on object detection and building reconstruction of claim 4, wherein the shape rule consists of string attributes, geometric attributes and numerical attributes;
the string attributes encode the building style, the building style of the virtual building model being determined by the building category of the actual building model;
the geometric attributes comprise a rotation angle and a preset scaling, the rotation angle being the rotation of the virtual building model relative to the orientation of the actual building model;
the numerical attributes comprise position and size, the position of the virtual building model being the same as that of the actual building model, and the size of the virtual building model being obtained from the form information of the actual building model and the preset scaling.
6. The virtual city generation method based on object detection and building reconstruction of claim 5, wherein the refined segmentation rule is generated as follows:
taking one basic block, and determining the main type of the basic block from the building style of the virtual building model;
judging whether the main type of the basic block contains preset sub-types: if so, dividing the basic block according to the contained sub-types and assigning the corresponding sub-type to each divided component; if not, assigning the main type to the whole basic block.
7. The virtual city generation method based on object detection and building reconstruction of claim 6, wherein the generating of the virtual building model corresponding to the actual building model based on the shape rule and the component segmentation rules comprises:
determining the starting point of the virtual building model from the position in the shape rule;
determining the three-axis directions of the virtual building model from the rotation angle in the shape rule;
generating a three-dimensional space of the corresponding size along the three axes according to the size in the shape rule;
determining the grid segmentation of the three-dimensional space according to the basic segmentation rules among the component segmentation rules, obtaining a plurality of grid space blocks;
reading the type of each basic block according to the refined segmentation rules among the component segmentation rules: if the type is a main type containing no sub-types, a pre-built model is read directly from the material library according to the main type and filled into the grid space block at the corresponding position; if the type is a main type containing sub-types, the pre-built models corresponding to the sub-types are first read from the material library and spliced, and the spliced model is then filled into the grid space block at the corresponding position.
8. The virtual city generation method based on object detection and building reconstruction of claim 1, further comprising:
in an editing mode, receiving user input and adjusting the component segmentation rules of the virtual building model; and in an operation mode, receiving user input and adjusting the shape rule and the component segmentation rules of the virtual building model.
9. The virtual city generation method based on object detection and building reconstruction of claim 1, further comprising:
providing a first-person-perspective visual display of the virtual city, or a third-person-perspective visual display of the virtual city.
10. A virtual city generation device based on object detection and building reconstruction, comprising a processor and a memory storing a plurality of computer instructions, wherein the computer instructions, when executed by the processor, implement the steps of the virtual city generation method based on object detection and building reconstruction of any one of claims 1 to 9.
Application CN202310694723.5A, filed 2023-06-12 — Virtual city generation method and device based on target detection and building reconstruction — status: Pending — publication: CN116681854A

Priority Applications (1)

Application: CN202310694723.5A — priority/filing date: 2023-06-12 — title: Virtual city generation method and device based on target detection and building reconstruction

Publications (1)

Publication number: CN116681854A — publication date: 2023-09-01

Family

ID=87781891

Family Applications (1)

Application: CN202310694723.5A — publication: CN116681854A (en) — priority/filing date: 2023-06-12 — title: Virtual city generation method and device based on target detection and building reconstruction

Country Status (1)

CN — CN116681854A (en)

Cited By (2)

* Cited by examiner, † Cited by third party

CN117218310A * — priority date 2023-09-22 — published 2023-12-12 — 北京三友卓越科技有限公司 — Virtual reconstruction method, device, equipment and medium based on digital twin
CN117218310B * — priority date 2023-09-22 — granted 2024-04-05 — 北京三友卓越科技有限公司 — Virtual reconstruction method, device, equipment and medium based on digital twin

Similar Documents

Publication — Title
CN108648269B (en) Method and system for singulating three-dimensional building models
JP5299173B2 (en) Image processing apparatus, image processing method, and program
US9208607B2 (en) Apparatus and method of producing 3D model
US5694533A (en) 3-Dimensional model composed against textured midground image and perspective enhancing hemispherically mapped backdrop image for visual realism
JP3148045B2 (en) 3D object CG creation device
KR100727034B1 (en) Method for representing and animating 2d humanoid character in 3d space
EP3533218B1 (en) Simulating depth of field
JP3245336B2 (en) Modeling method and modeling system
JPH09319896A (en) Three-dimensional image generating device
CN116681854A (en) Virtual city generation method and device based on target detection and building reconstruction
CN112530005A (en) Three-dimensional model linear structure recognition and automatic restoration method
CN116310188B (en) Virtual city generation method and storage medium based on instance segmentation and building reconstruction
JPH06348815A (en) Method for setting three-dimensional model of building aspect in cg system
CN113144613A (en) Model-based volume cloud generation method
JP2832463B2 (en) 3D model reconstruction method and display method
CN116310041A (en) Rendering method and device of internal structure effect, electronic equipment and storage medium
CN114638926A (en) Three-dimensional scene automatic generation system and method
JP2000329552A (en) Three-dimensional map preparing method
JP3850080B2 (en) Image generation and display device
KR20140019199A (en) Method of producing 3d earth globes based on natural user interface using motion-recognition infrared camera
CN113569326A (en) Urban building automatic generation method based on land planning map
Tao A VR/AR-based display system for arts and crafts museum
JPH08305894A (en) Three-dimensional image generating device capable of representing wrinkle
JP2000057376A (en) Method for generating new viewpoint image
KR100370366B1 (en) The image converting method using vanishing line

Legal Events

PB01 — Publication
SE01 — Entry into force of request for substantive examination