WO2022064242A1

WO2022064242A1 - The method of automatic 3d designing of constructions and colonies in an smart system using a combination of machine scanning and imaging and machine learning and reconstruction of 3d model through deep learning and with the help of machine learning methods

Info

Publication number: WO2022064242A1
Application number: PCT/IB2020/058820
Authority: WO
Inventors: Soroush SARABI; Ali SOLTANMORADI; Mohyeddin ASADI; Ameneh SHADLO; Ali SARABI; Ala ZOBDEH
Original assignee: Sarabi Soroush; Soltanmoradi Ali; Asadi Mohyeddin; Shadlo Ameneh; Sarabi Ali
Priority date: 2020-09-22
Filing date: 2020-09-22
Publication date: 2022-03-31

Abstract

The invention is a method of automatic 3D designing of constructions and colonies in a smart system that combines machine scanning and imaging with machine learning, reconstructs the 3D model through deep learning, and uses machine learning methods to create texture and develop the final view of the construction. It comprises a smart system and a system for photographing different kinds of constructions using drones or other aircraft, and it also relates to data integration and the 3D design of constructions and volumes based on information extracted from images and photos with the help of machine learning and deep learning systems. Machine vision techniques can provide high-resolution information about the depth and colors of each construction. First, 2D pictures are captured and converted to 2D matrices; the 2D matrices are then processed by machine learning techniques to produce 3D matrices.

Description

TITLE OF INVENTION

THE METHOD OF AUTOMATIC 3D DESIGNING OF CONSTRUCTIONS AND COLONIES IN AN SMART SYSTEM USING A COMBINATION OF MACHINE SCANNING AND IMAGING AND MACHINE LEARNING AND RECONSTRUCTION OF 3D MODEL THROUGH DEEP LEARNING AND WITH THE HELP OF MACHINE LEARNING METHODS

TECHNICAL FIELD OF THE INVENTION

The present invention relates to a smart system and a system for photographing different kinds of constructions using drones or other aircraft. It also relates to data integration and the three dimensional design of constructions and volumes based on information extracted from images and photos, and to machine learning and deep learning systems.

PRIOR ART

With the growth of humanity’s capability to construct buildings and cities and the creation of human colonies next to water resources, the form of cities and the development of modern buildings have made three dimensional models of buildings, of urban structures and of the general form of urban architecture a strategic need for controlling and managing cities. Sufficient information about the urban design system and about urban architecture, for the future formation and zoning of specific architectural cases, is an inseparable part of the urban management structure. On the other hand, many historical buildings and ancient heritage sites, because of their age and the absence of related maps, have no coherent architectural map, and when such maps are needed the common methods of designing and drawing maps cannot meet the need. For natural topography, including faults, mountains and high-angle rocks such as the Grand Canyon, it is likewise not possible to draw an accurate map.

The dominant method used to prepare building plans has been to employ human experts to observe and record the architectural pattern in the field. This way of recording undoubtedly faces problems when drawing models of large buildings: because the method is manual, it wastes time and produces many human errors, and because the individual maps are not coordinated with each other or with the urban space, drawing the three dimensional volumetric architecture of buildings, and of very large environments in general, is almost impossible. It is also theoretically proven that the human perceptual ability of urban architecture is incomplete, discontinuous and distorted, especially with the fast-expanding construction of modern cities; the intervention of human resources can lead to lower speed, unintentional errors and missing details.

Since buildings are the most substantial objects in urban perspectives, generating accurate and timely three dimensional models of the buildings is essential for city modeling. Up-to-date three dimensional models of buildings are necessary for various applications such as land management, disaster management, resource management, urban planning, communication, transportation and tourism. As the urban region develops, the demand for up-to-date three dimensional models of the existing buildings increases. It is hardly possible for planners or authorities to give an accurate, prompt answer to some practical questions without using three dimensional building models. Three dimensional models can be used to investigate how the architectural pattern of the city is changing and to identify parts of the city that lack services. Using machine vision and machine learning techniques is a fast and low-cost way to provide accurate information on urban architecture. When machine vision techniques come into use, automatic, highly accurate modeling of urban constructions becomes attainable. City planners often face problems such as the lack of updated maps of existing constructions and the absence of plans for some ancient buildings. Machine vision is a fast and affordable candidate to address these issues.

With the emergence of machine learning methods and their strong impact on different fields, three dimensional modeling of existing buildings has been well addressed. Machine learning techniques and algorithms are a promising way to bridge gaps in urban architecture knowledge, and machine learning procedures have shown solid progress in industry and the economy. As an economical approach for most industries, machine learning aims to facilitate urban development and the housing process, to track architectural changes in the city, and to find structural defects and eliminate them. It can smooth the way to generating three dimensional maps of existing buildings in human settlements and urban environments automatically and as quickly as possible. Also, using three dimensional models of buildings created by machine learning instead of two dimensional plans sketched by building planners allows information to be transferred faster. Three dimensional modeling makes it possible to analyze buildings from different aspects, including shape, size, material, color, texture and structure. Machine learning is a promising tool for making the accurate study of urban areas faster and cheaper. To solve an urban architectural problem, the first step is to have complete knowledge of the architectural texture of the city, which is time-consuming and costly without machine learning methods.

To generate a detailed three dimensional model of a building quickly and affordably, machine vision and machine learning techniques and deep learning algorithms are employed; they are a strong candidate for preparing a three dimensional structural map of a building in a short time and at low cost. In this regard, different methods have been presented for scanning, data integration and the creation of specific three dimensional structures, including the following cases.

The invention with patent No. JP2019514240A, filed on 29 February 2016 at Japan’s patent office, is named “Three-dimensional scanning support system and method”. This invention is a three-dimensional scanning system comprising a camera configured to acquire images, a processor, and a memory coupled to the camera and the processor, the memory being configured to store the images acquired by the camera and instructions. When executed by the processor, the instructions cause the processor to control the camera to obtain one or more initial images of the subject from a first pose of the camera, to calculate a guidance map according to the one or more initial images in order to identify next poses, to control the camera to obtain one or more additional images from at least one of the next poses, to update the guidance map according to the one or more additional images, and to output the images acquired by the camera to generate a three-dimensional model. The “Deep Camera” is also used in this method.

Also, an invention with patent No. KR101388133B1, filed on 16 February 2007 at Korea’s patent office, is named “Method and apparatus for creating a 3D model from 2D photograph image”. It relates to a method and apparatus for generating a three-dimensional model from a two-dimensional photorealistic image. The plane equation of the plane corresponding to a vanishing point is calculated using the straight-line equations of edge extension lines that converge to at least one vanishing point in an object of the two-dimensional photorealistic image, and a three-dimensional model of the object is generated from it, so that the three-dimensional geometric information needed in the model generation process can be calculated easily and accurately.

Another invention, with patent No. US8532368, filed on February 11, 2011 at the USPTO and named “Method and apparatus for producing 3D model of an environment”, provides a system (method and apparatus) for creating photorealistic 3D models of environments and/or objects from a plurality of stereo images obtained from a mobile stereo camera and optional monocular cameras. The cameras may be handheld, mounted on a mobile platform, manipulator or a positioning device. The system automatically detects and tracks features in image sequences and self-references the stereo camera in 6 degrees of freedom by matching the features to a database to track the camera motion, while building the database simultaneously. A motion estimate may be also provided from external sensors and fused with the motion computed from the images. That invention also provides a system (method and apparatus) for generating photo-realistic 3D models of underground environments such as tunnels, mines, voids and caves, including automatic registration of the 3D models with pre-existing underground maps.

An invention with patent No. US9407904, filed on 1 May 2013 at the USPTO and named “Method for creating 3D virtual reality from 2D images”, is a method that enables creation of a 3D virtual reality environment from a series of 2D images of a scene. Embodiments map 2D images onto a sphere to create a composite spherical image, divide the composite image into regions, and add depth information to the regions. Depth information may be generated by mapping regions onto flat or curved surfaces, and positioning these surfaces in 3D space. Some embodiments enable inserting, removing, or extending objects in the scene, adding or modifying depth information as needed. The collection of images may cover an entire sphere, providing 360° viewing in all directions.

The next invention, with patent No. US 9618934, filed on 12 September 2014 at the USPTO and named “Unmanned aerial vehicle 3D mapping system”, is an automatic unmanned aerial vehicle (UAV) flight control system for 3D aerial mapping that includes a UAV with an onboard camera and a controller capable of communication with a flight control module of the UAV. The controller is configured to determine an area mapping flight path based on terrain characteristics of an area to be mapped. The controller also can determine a structure mapping flight path based on the existence and location of vertical structures within the area to be mapped. The area mapping flight path can be appended with the structure mapping flight path, thus providing an integrated and optimized UAV flight path for 3D mapping and modeling. This system also has a non-transitory computer-readable storage medium with instructions stored thereon to control a flight path of a UAV-based image capture system for 3D modeling.

Another invention, with patent No. US10187806, filed on 14 April 2015 at the USPTO and named “Systems and methods for obtaining accurate 3D modeling data using multiple cameras”, provides a system and method for using an Unmanned Aerial Vehicle (UAV) to obtain data capture at a cell site for developing a three dimensional (3D) model thereof. The method includes causing the UAV to fly a given flight path about a cell tower at the cell site; obtaining data capture during the flight path about the cell tower, wherein the data capture includes a plurality of photos or video subject to a plurality of constraints, and wherein the plurality of photos are obtained by a plurality of cameras which are coordinated with one another; and, subsequent to the obtaining, processing the data capture to define a 3D model of the cell site based on one or more objects of interest in the data capture.

An invention with publication No. US20100156901 A1, filed on 22 December 2008 at the USPTO and named “Method and apparatus for reconstructing 3d model”, is a method of reconstructing a 3D model that includes reconstructing a 3D voxel-based visual hull model using input images of an object captured by a multi-view camera; converting the 3D voxel-based visual hull model into a mesh model; and generating a result of view-dependent rendering of a 3D model by performing view-dependent texture mapping on the mesh model obtained through the conversion. Further, the reconstructing includes defining a 3D voxel space to be reconstructed and excluding voxels not belonging to the object from the defined 3D voxel space.

An invention with patent No. US6990228, filed on 17 December 1999 at the USPTO and named “Image processing apparatus”, describes an image processing apparatus having a processor for processing image data representing images of an object taken from a plurality of different camera positions, and a method of processing the image data to derive a representation of a three-dimensional surface of the object. The method comprises defining an initial volume containing the object surface as an initial space formed of voxels; accessing data representing images of the object recorded at different camera positions with respect to the object; checking whether a voxel meets at least one criterion by projecting that voxel into at least one of the images; and, if the voxel does not meet the at least one criterion, dividing the voxel into subsidiary voxels.

Another invention, with patent No. EP2584534A2, filed on 20 October 2011 at the European Patent Office and named “Computer-implemented method for creating a virtual 3D model of a real three-dimensional real object and product formed on this basis”, provides a computer-implemented method for generating a virtual 3D model of a real three-dimensional object by means of a scanner and a computer connected to it. A plurality of layered 2D scans are generated in a pixel format by means of the scanner, and structures are identified in the 2D scans. The 2D scans are each converted from the pixel format into a vector format. The 3D scan can be used for scientific, medical, therapeutic or cosmetic purposes by means of conventional imaging in this field.

The next invention, with patent No. US8953024B2, filed on 21 February 2012 at the USPTO and named “3D scene model from collection of images”, is a method for determining a three-dimensional model of a scene from a collection of digital images, wherein the collection includes a plurality of digital images captured from a variety of camera positions. A set of the digital images from the collection is selected, wherein each digital image contains overlapping scene content with at least one other digital image in the set, and wherein the set of digital images overlaps to cover a contiguous portion of the scene. Pairs of digital images from the set are analyzed to determine a camera position for each digital image.

DESCRIPTION OF THE INVENTION

The present invention is a smart system and a method for preparing three-dimensional maps and images of buildings and urban or natural features from two-dimensional photos, using artificial intelligence together with machine learning. There is a growing need for up-to-date and comprehensive information on the geometric form, dimensions, materials and use (residential, commercial, etc.) of the buildings and urban features (such as traffic signs and traffic lights) in a city. Because this information is applied in crisis control and management (including investigating and detecting routes blocked or left open by earthquake debris) and in passive defense, it is a constant need of urban management and planning in critical situations. Today, building information modeling (BIM) is widely used from design and construction to operation and even the demolition of buildings. This technology helps building project managers and executives make the right decision at every stage by digitally displaying building properties. By using this system, the control and management of construction projects and of the demolition of buildings can be done at lower cost, faster, more accurately and without the need for a specialist. The aim of this invention is to prepare comprehensive and accurate information about the facade quality, dimensions and geometric form of each building in the city using three-dimensional modeling of buildings. In general, the ultimate goal is a comprehensive analysis of the appearance of existing buildings.

To carry out three-dimensional modeling of buildings and urban features in terms of quality, dimensions and geometric form, two-dimensional images are first prepared using a number of mobile robots (including drones) that carry cameras with appropriate resolution. These two-dimensional images are then converted into three-dimensional models of the buildings using machine learning and deep learning techniques as well as machine vision, which provide information such as the quality of the facade structures, the exact dimensions and the geometric shape of the building comprehensively and accurately.

In this invention, the two dimensional images are captured from the top and front views of the building. The photography robot has to be equipped with several different systems. The first is a ranging system based on stereo imaging and image processing; professional and ultrasonic rangefinders can also be part of the robot. A gyroscope module and a module for accurately locating the robot and the subject of photography can be considered auxiliary systems of the robot, and an accurate photography module can be fixed on the robot as the photography unit. The robot can be a flying robot such as a drone or a land robot. Building imaging is done with the method and strategy of map projection. At present, drones can be remote controlled, and in most cases their movement and operation are controlled by a human.
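
Stereo ranging of this kind is commonly modeled with the rectified pinhole relation Z = f·B/d, where f is the focal length in pixels, B the baseline between the two cameras and d the disparity of a point between the two images. The following is only a minimal sketch under that assumption; the function name, parameter names and example numbers are illustrative and are not taken from the invention.

```python
def stereo_range_m(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Range to a point seen by a rectified stereo pair (assumed pinhole model).

    focal_px     : focal length expressed in pixels (hypothetical calibration value)
    baseline_m   : distance between the two camera centres, in metres
    disparity_px : horizontal shift of the same point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Illustrative numbers only: 1000 px focal length, 0.20 m baseline, 8 px disparity.
    print(stereo_range_m(1000.0, 0.20, 8.0))
```

Because range is inversely proportional to disparity, a small baseline gives poor resolution at long range, which is one reason a rangefinder may complement the stereo system.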

In this invention, because of the need for high speed and accuracy, the part of the operation that would otherwise be done by a human must be done by smart structures and machines.

The purpose of this invention is for the drone to capture photos automatically using artificial intelligence techniques. Taking into account the dimensions and geometric complexity of the building, as well as the resolution needed to estimate the quality and material of the facade, the drone can provide appropriate and comprehensive two dimensional images; the imaging performed with the artificial intelligence that the robot is equipped with is therefore accurate, comprehensive and sufficient.

According to Figure 1, the purpose is to provide a three dimensional model of the building shown in the figure. The robot is equipped with a system for sending and receiving information, which sends the captured photos to a server that is equipped with and connected to an artificial intelligence for analysis. This allows the captured photos to be analyzed correctly and the correct position of the robot to be detected using the coordinates of the subject and of the robot together with the information from the ranging and basic imaging systems. First, the artificial intelligence must estimate a suitable distance for photographing the total view of the building. According to the protocols designed in the software, the first images are considered the reference: if the distance estimation was done correctly, an image of the total view of the building is captured exactly perpendicular to the plane of the facade (the view sheet). If the distance estimation was not correct, the suitable distance is estimated again, and the robot, which can be a drone, moves back or forth to the suitable distance, perpendicular to the view sheet, and captures an image of the total view of the building. After imaging the total view, in order to find a resolution suitable for estimating the quality of the facade, the drone is first located at the central point of the building and, using the image of the total view, divides this view into several separate areas for photography. For each area, it positions itself at the center of that area at a convenient distance and captures further images of that area only. It continues this division until the number of divisions is such that the geometric form and the material of the facade can be accurately estimated.

The imaging process is done in 3 steps for each part: according to Figure 3, after the first shot the robot, which can be a drone, moves away from the subject by a certain amount and shoots again, and then changes its position toward the subject and shoots a third time. Using the laser rangefinder and the captured photos, the artificial intelligence on the system can then calculate the size and dimensions of the section accurately. Also, based on the initial definitions in the system, the artificial intelligence can accurately identify the type of facade by comparing the pixels in each image, as well as images related to the surface.
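
One way to see how shots taken at different distances constrain size and range is the pinhole relation, in which apparent size is inversely proportional to distance. The sketch below is an illustration under that assumption and is not the calculation specified by the invention; the function names, the known step-back distance and the focal length in pixels are all hypothetical inputs. In practice the laser rangefinder could supply the range directly, and the two-shot estimate would serve as a cross-check.

```python
def range_from_two_shots(extent1_px: float, extent2_px: float, step_back_m: float) -> float:
    """Distance of the first shot from the facade, in metres.

    Pinhole assumption: apparent size is inversely proportional to range, so
    extent1_px * d == extent2_px * (d + step_back_m), solved here for d.
    """
    if step_back_m <= 0 or extent1_px <= extent2_px:
        raise ValueError("the second shot must be farther away and show a smaller extent")
    return extent2_px * step_back_m / (extent1_px - extent2_px)


def real_size_m(extent_px: float, range_m: float, focal_px: float) -> float:
    """Real size of a feature from its pixel extent, its range and the focal length in pixels."""
    return extent_px * range_m / focal_px


if __name__ == "__main__":
    # Illustrative numbers: a facade section spans 500 px, then 400 px after backing off 5 m.
    d = range_from_two_shots(500.0, 400.0, 5.0)
    print(d, real_size_m(500.0, d, 1000.0))
```

With these illustrative numbers the first shot is 20 m from the facade and the section is 10 m wide for an assumed focal length of 1000 px.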

Now consider the situation where the drone cannot move far enough back to capture the total view because of the density of neighboring buildings; that is, the distance needed to image the total view is greater than or equal to the distance between building X and the building in front of it, so the drone would hit the front building during photography. In this case the drone automatically senses the presence of the front building, divides the building for photography from the beginning, and then takes the photos. The drone continues this segmentation until the resolution is sufficient to estimate the quality and geometric structure of the facade. It should be noted that all images are taken perpendicular to the view sheet, that is, to the plane of the facade. If needed, because of the complexity of the geometric form and the diversity of the construction’s materials, this imaging can also be done from the other facades of the building.
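
The division logic described above can be sketched as a loop that keeps splitting the facade until every tile can be framed from within the available standoff distance at a sufficient resolution. This is an assumption-laden illustration rather than the protocol of the invention: the pinhole footprint formula, square pixels, the function name `plan_facade_tiles` and all parameters (fields of view, image width, resolution limit in metres per pixel) are hypothetical.

```python
import math


def plan_facade_tiles(facade_w_m, facade_h_m, hfov_deg, vfov_deg,
                      image_w_px, max_m_per_px, max_standoff_m):
    """Return (shooting distance, tile centres) for perpendicular shots of a facade."""
    tan_h = math.tan(math.radians(hfov_deg) / 2)
    tan_v = math.tan(math.radians(vfov_deg) / 2)
    cols = rows = 1
    while True:
        tile_w = facade_w_m / cols
        tile_h = facade_h_m / rows
        need_h = tile_w / (2 * tan_h)   # distance at which the tile width fills the frame
        need_v = tile_h / (2 * tan_v)   # distance at which the tile height fills the frame
        distance = max(need_h, need_v)
        m_per_px = 2 * distance * tan_h / image_w_px  # assumes square pixels
        if distance <= max_standoff_m and m_per_px <= max_m_per_px:
            break
        # Split along the direction that currently forces the larger distance.
        if need_h >= need_v:
            cols += 1
        else:
            rows += 1
    centres = [((c + 0.5) * tile_w, (r + 0.5) * tile_h)
               for r in range(rows) for c in range(cols)]
    return distance, centres


if __name__ == "__main__":
    # Illustrative case: 30 m x 12 m facade, neighbour allows at most 8 m of standoff.
    dist, centres = plan_facade_tiles(30.0, 12.0, 90.0, 60.0, 4000, 0.005, 8.0)
    print(dist, centres)
```

The same loop covers both cases in the text: when the standoff is unrestricted the first pass may already succeed with a single tile, and when a neighboring building limits the distance the facade is divided from the start.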

Throughout the imaging process, the drone is connected to an artificial intelligence system and requires no human intervention for its control. In fact, setting the appropriate distance, shooting perpendicular to the facade, the number of pixels required for imaging and the required resolution are all taught to the drone using artificial intelligence techniques. Eliminating human control makes the imaging faster, more accurate and cheaper.

Finally, the accurate and comprehensive images are converted to data and then to the three dimensional model of the building using machine vision and machine learning techniques. With comprehensive information and three dimensional models of every building, or even of every urban feature, the three dimensional model of a region or city can then be obtained.

A practical and fast way to generate a three dimensional map is to extract three dimensional data from photographed two dimensional images by employing machine learning and machine vision techniques. Multiple two dimensional images should be captured from different views of each construction. The two dimensional images should be photographed thoroughly, without missing details, to increase the validity of the final result. Precise cameras should be used to capture two dimensional images of every building in the city, covering every aspect, in order to produce three dimensional models.

Three dimensional projection mapping is used as a fast and low-cost way to prepare two dimensional images of the building. A drone carrying several RGB-D cameras and light sources should be employed to photograph the front view of a building for texture-making and exact depth estimation. Moving the drone along the building's height increases the chance that the three dimensional structural map of the building is accurate and that texture-making is done appropriately. The drone should also move along the depth direction while capturing photos, for accurate depth estimation based on the construction's height and the density of neighboring buildings.
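
One established technique that uses several light sources for shape recovery is Lambertian photometric stereo, which estimates a surface normal for each pixel from the intensities observed under known light directions. The description does not state that this is the method used, so the following is only a sketch of that swapped-in technique: the Lambertian reflectance model, the three known unit light directions and the function names are assumptions.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def pixel_normal(light_dirs, intensities):
    """Unit normal and albedo for one pixel from three lit observations.

    Assumed model: intensity_i = albedo * dot(light_dir_i, normal).
    The 3x3 system is solved with Cramer's rule for g = albedo * normal.
    """
    d = det3(light_dirs)
    if abs(d) < 1e-12:
        raise ValueError("light directions must not be coplanar")
    g = []
    for k in range(3):
        mk = [list(row) for row in light_dirs]
        for i in range(3):
            mk[i][k] = intensities[i]  # replace column k with the observations
        g.append(det3(mk) / d)
    albedo = math.sqrt(sum(c * c for c in g))
    if albedo == 0:
        raise ValueError("pixel is completely dark; normal is undefined")
    return tuple(c / albedo for c in g), albedo


if __name__ == "__main__":
    # Illustrative unit light directions and intensities for a surface facing the camera.
    lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
    print(pixel_normal(lights, [0.5, 0.4, 0.4]))
```

Repeating the calculation for every pixel gives a normal map, which can then be combined with the measured depth to refine the surface.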

After two dimensional images containing comprehensive information about every facet of the construction have been prepared, the images should be processed with machine vision, machine learning and deep learning algorithms to extract three dimensional data from them. Machine vision techniques can provide high-resolution information about the depth and colors of each construction. Machine learning and deep learning algorithms are employed to process the two dimensional data and generate three dimensional data: the captured two dimensional images of the buildings are processed by machine learning approaches to obtain three dimensional models of each construction. Deep learning algorithms identify the building's boundaries, and the background can then be removed using other algorithms. The two dimensional pictures are converted to two dimensional matrices, and the two dimensional matrices are processed by machine learning techniques to produce three dimensional matrices. Finally, raw three dimensional models of the buildings can be extracted from the pixel matrices. Machine learning methods also facilitate the production of the structural map, and they can be employed for texture making to generate a precise and realistic view of each construction. By processing the captured two dimensional images of the buildings, the dimensions of each structure can be determined using machine vision and machine learning methods.
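
The text does not specify how the two dimensional matrices become three dimensional data. As a point of reference, the sketch below shows the standard pinhole back-projection of an RGB-D depth matrix into 3D points, with a foreground mask standing in for the background-removal step. It is not the learned method of the invention; the intrinsics `fx`, `fy`, `cx`, `cy`, the mask and the function name are assumptions.

```python
def depth_to_points(depth, mask, fx, fy, cx, cy):
    """Back-project a depth matrix into 3D points in the camera frame.

    depth : 2D list of range values along the optical axis (metres), row-major
    mask  : 2D list of truthy/falsy values; falsy pixels are treated as background
    fx, fy, cx, cy : assumed pinhole intrinsics in pixels
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if not mask[v][u] or z <= 0:
                continue  # skip background pixels and invalid depth readings
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Tiny illustrative 2x2 depth matrix with one background and one invalid pixel.
    depth = [[2.0, 2.0], [0.0, 4.0]]
    mask = [[1, 0], [1, 1]]
    print(depth_to_points(depth, mask, fx=2.0, fy=2.0, cx=0.5, cy=0.5))
```

Point sets obtained from several views could then be merged into a common frame using the recorded position and orientation of each shot.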

BRIEF DESCRIPTION OF FIGURES

Figure 1 shows the position of the robot relative to the imaging object.

Figure 2 shows how the large view is divided into smaller parts to speed up imaging and increase its accuracy.

Figure 3 shows the multiple imaging stations of the robot for one part of the facade.

Claims

What is claimed is:

1. The invention of a method of automatic 3D designing of constructions and colonies in a smart system using a combination of machine scanning and imaging and machine learning and reconstruction of the 3D model through deep learning and with the help of machine learning methods, to create texture and develop the ultimate view of the construction, which contains at least an imaging system fixed on a flying robot or a land robot, an online system for continuous analysis of received information, a stereo ranging system, an accurate localization system, an artificial intelligence structure for image analysis, and a server equipped with software for machine learning and deep learning.

2. The invention of claim one which may be a method of extracting multiple images in parallel with the help of several flying robots.

3. The invention of claim one in which the system of finding distance from the subject can be done by capturing a few concentric images of the whole or part of the subject and analyzing images and setting calculations based on initial reference.

4. The invention of claim one in which the captured photos each has an accurate identity for distance, subject’s dimensions, the imaging angle and the geographical and local coordinates.

5. The invention of claim one in which the imaging robot carries several RGB-D cameras to prepare two dimensional images with depth.

6. The invention of claim one in which flying or land robots can be used.

7. The invention of claim one in which the flying robots also carry several light sources to generate the exact dimensions.

8. The invention of claim one in which, depending on the construction's height and the neighborhood building density, an image is captured from the maximum possible distance to prepare a general view of the construction, and this image is used to determine the number of facade divisions as well as the accurate location of the divisions.

9. The invention of claim one in which the flying robots move along the building's height and depth to record every detail for applying texture and color to the three dimensional model of the construction.

10. The invention of claim one in which, after imaging, recording and sending details, the size of the two dimensional images of the construction and the number of images captured of the surrounding environment are determined based on the complexity and size of the structure and its neighborhood density.

11. The invention of claim one in which the boundary of the construction is detected using machine learning algorithms by processing the pixels of the captured images.

12. The invention of claim one in which the background of image will be eliminated using machine learning algorithms, and the pure images of the construction without the surrounding environment are prepared.

13. The invention of claim one in which the images are processed by machine vision and machine learning techniques and multiple matrices from the captured images are used to form three dimensional matrices.

14. The invention of claim one in which by combining several matrices and calculating the amount of light reflection from several different angles, the normal vector is identified for each pixel.

15. The invention of claim one in which a two dimensional profile from different views is prepared to exhibit each pixel with its normal vector.

16. The invention of claim one in which each pixel's data will be compared with the normal vector, and by combining the different views of every pixel, a 3D matrix will be generated in X, Y, and Z directions.

17. The invention of claim one in which a three dimensional view of the construction is extracted out of the pixel matrices using machine learning techniques without describing the texture and color.

18. The invention of claim one in which by aiming to extract an actual and accurate three dimensional model of the construction, the texture and color of the building will be created using deep learning algorithms.

19. The invention of claim one in which the final accurate three dimensional model of the structure is obtained by combining the raw three dimensional model and the texture using machine learning methods.

20. The invention of claim one in which the three dimensional structural models of different constructions in a specific neighborhood can be combined to prepare a special region's three dimensional structural model.

21. The invention of claim one in which the three dimensional structural models of different neighborhoods in a city can be combined to reconstruct a three dimensional structural model of a city.

22. The invention of claim one in which, when similar buildings are encountered, the machine learning technique correctly applies previously created algorithms and selects one of the best imaging algorithms based on past operations.

23. The invention of claim one in which aggregating the photos captured from different views of a building, as well as from its roof, provides accurate volumetric matrices to determine the three dimensional pattern of the building.

24. The invention of claim one in which the machine can be trained to identify the quality of the facade accurately by analyzing the facade surface and comparing it with an available database.
