WO2024038551A1 - Building interior structure recognition system and building interior structure recognition method

Building interior structure recognition system and building interior structure recognition method

Info

Publication number
WO2024038551A1
WO2024038551A1 PCT/JP2022/031247 JP2022031247W WO2024038551A1 WO 2024038551 A1 WO2024038551 A1 WO 2024038551A1 JP 2022031247 W JP2022031247 W JP 2022031247W WO 2024038551 A1 WO2024038551 A1 WO 2024038551A1
Authority
WO
WIPO (PCT)
Prior art keywords
building
image
machine
learned model
scanning
Prior art date
Application number
PCT/JP2022/031247
Other languages
French (fr)
Japanese (ja)
Inventor
徹 伊藤
康文 福間
ザイシン マオ
央 塚田
Original Assignee
株式会社 Sai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社 Sai
Priority to PCT/JP2022/031247
Publication of WO2024038551A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation

Definitions

  • the present invention relates to an in-building structure recognition system and an in-building structure recognition method, and in particular to an in-building structure recognition system that recognizes structures placed inside a building using deep learning with a neural network.
  • the present invention relates to a recognition system, a method and program for recognizing structures inside buildings.
  • Another conceivable approach is to perform machine learning and create a trained model using rendered images of a completed 3D model of the construction site that closely resemble its actual appearance, rather than photos of the actual construction site.
  • However, rendered images are created mainly for the commercial promotion of buildings, and their production costs are high, making it difficult to prepare a sufficient number of rendered images as learning images. Further, annotating the structures included in rendered images is an enormous task that takes considerable time and effort when performed manually.
  • the trained model created thereby is required to be able to recognize structures with high accuracy.
  • the system is also required to recognize structures with high accuracy in site images that include noise components such as wire mesh, protective nets and sheets, temporarily installed iron fences and poles, garbage, and materials present at the construction site.
  • the presence of such noise components changes from moment to moment as they are moved or added to suit the construction situation, so it is desirable to use a model matched to the current state of the site. In this case, it is desirable to minimize the time and cost of regenerating a model tailored to the site.
  • Non-Patent Document 1 discusses the problem of the huge amount of point cloud data in as-built modeling, which creates 3D models based on 3D measurements of existing large-scale equipment. It points out that "the measurement principle of the devices used in as-built modeling of large-scale equipment differs from that of point cloud measurement devices for small parts. For point cloud measurement of small parts, triangulation is generally performed using a laser output device and a CCD camera. However, with this method, the equipment becomes larger as the size of the object increases. Also, when measuring small parts, the measured point cloud is often only a few million points at most, but in the case of large equipment, a large number of point clouds are required for modeling."
  • Patent Document 1 discloses a system comprising: existing-part survey means in which digitized data of the existing parts of a building, acquired from existing drawings, is converted into 3D CAD data and stored together with various field survey data, including point cloud data acquired by a 3D laser scanner or a 3D polygon model created from that point cloud data; construction member design means in which newly constructed part objects for the 3D polygon model are selected from member objects stored in advance in a member library; a CPU functioning as member construction position output means that searches the three-dimensional CAD model designed by the construction member design means and outputs the member object corresponding to a unique ID together with its construction position information; and an automatic position pointing device that indicates the construction position of the member in the existing part based on the construction position information of the member object output by the member construction position output means.
  • an image processing device is also disclosed that includes: an image acquisition unit that acquires an input image generated by imaging a real space using an imaging device; a recognition unit that recognizes the relative position and orientation between the real space and the imaging device based on the positions of one or more feature points appearing in the input image; an application unit that provides an augmented reality application using the recognized relative position and orientation; and a display control unit that superimposes on the input image, according to the distribution of the feature points, a guiding object that guides the user operating the imaging device so that the recognition processing performed by the recognition unit is stabilized.
  • although Patent Documents 1 and 2 both disclose techniques for grasping a three-dimensional space or an object in a three-dimensional space, they do not solve the problem of the huge amount of three-dimensional point cloud data in large-scale facilities such as buildings and factories, and they are not suited to automating the recognition of structures in images in order to quickly understand the situation at a construction site in the middle of construction.
  • Patent Documents 1 and 2 also give no consideration to recognizing structures with high accuracy in site images that include noise components such as wire mesh, protective nets and sheets, temporarily installed iron fences and poles, garbage, and materials present at the construction site, or to retraining the model to match the latest site conditions.
  • the present invention solves the above-mentioned problems and provides an in-building structure recognition system and an in-building structure recognition method that can recognize target structures with high accuracy, even in site images that include noise components such as wire mesh, protective nets and sheets, temporarily installed iron fences and poles, garbage, and materials present at the construction site, and that can do so in accordance with the latest site conditions.
  • the present invention also provides a program for causing a computer to execute each step of the method for recognizing structures inside a building.
  • the present invention provides an in-building structure recognition system for recognizing in-building structures using a machine learning model, the system comprising: a first machine learning model generation unit that performs machine learning using BIM (Building Information Modeling) data and a first site image to generate a first machine learned model; a second machine learning model generation unit that performs relearning on the first machine learned model using a second on-site image containing an image of a noise construct that does not have BIM data to generate a second machine learned model; a scanning unit that scans the interior of the building and acquires three-dimensional point cloud data and a third on-site image; a noise component removal unit that removes the image of the noise constituent from the third on-site image acquired by the scanning unit; a third machine learning model generation unit that performs relearning on the second machine learned model using the image from which noise components have been removed to generate a third machine learned model; an in-building structure recognition unit that recognizes the structures in the building using the third machine learned model; and a point cloud data output unit that extracts and outputs, from the three-dimensional point cloud data acquired by the scanning unit, the point cloud data of the recognized structures.
  • the first machine learning model generation unit is characterized in that it uses an image generated from BIM data as correct data, uses as observation data an image obtained by processing an image generated by rendering the BIM data with information obtained from the first site image, and performs machine learning to generate a first machine learned model.
  • the second machine learning model generation unit is characterized in that it inputs a set of correct data and observation data for the second site image, which includes an image of a noise component that does not have BIM data, into the first machine learned model and performs relearning to generate a second machine learned model.
  • the scanning unit acquires images inside the building and determines whether at least one corresponding reference point or reference structure exists between successive frames. If none exists, it issues an alert prompting rescanning.
  • the noise component removal unit is characterized in that it extracts an image of the noise construct from the third on-site image acquired by the scanning unit by stereo matching, generates a mask image of the noise construct, and reconstructs the image so as to interpolate the portion of the mask image, thereby generating an image from which noise components have been removed.
  • the third machine learning model generation unit is characterized in that it inputs a set of correct data and observed data for the image obtained by the noise component removal unit, in which noise components have been removed from the third on-site image, into the second machine learned model and performs relearning to generate a third machine learned model.
  • the in-building structure recognition unit is characterized in that it inputs the third on-site image to the third machine learned model and recognizes the in-building structures included in the third on-site image.
  • the present invention also provides an in-building structure recognition method for recognizing structures in a building using a machine learning model, the method including: a step of performing machine learning using BIM (Building Information Modeling) data and a first site image to generate a first machine learned model; a step of retraining the first machine learned model using a second site image that includes an image of a noise composition that does not have BIM data; a step of scanning the building while determining the success or failure of scanning the structures inside the building, and acquiring three-dimensional point cloud data of the structures inside the building and images inside the building; steps of removing noise components, generating a third machine learned model, and recognizing the structures inside the building; and a step of extracting and outputting, from the three-dimensional point cloud data acquired in the scanning step, the point cloud data of the recognized structures inside the building.
  • the scanning step is characterized in that it includes acquiring an image inside the building and determining whether at least one corresponding reference point or reference structure exists, and in that an alert prompting rescanning is issued if none exists.
  • the present invention provides a program that causes a computer to execute each step of the above method for recognizing structures inside a building.
  • the accuracy of structure recognition can be improved by relearning the model to match the latest site conditions in response to noise components that change moment by moment.
  • FIG. 1 is a schematic diagram showing the entire structure recognition system in a building according to the present invention.
  • FIG. 2 is a diagram showing the flow of each process of the building structure recognition system according to the present invention.
  • FIG. 3 is a schematic diagram showing the first machine learning model generation section of the present invention.
  • FIG. 4 is a schematic diagram showing the second machine learning model generation section of the present invention.
  • FIG. 5 is a schematic diagram showing the scanning section of the present invention.
  • FIG. 6 is a schematic diagram illustrating the noise construct removal section of the present invention.
  • FIG. 7 is a schematic diagram showing the third machine learning model generation section of the present invention.
  • FIG. 8 is a schematic diagram showing the intra-building structure recognition section of the present invention.
  • FIG. 9 is a diagram showing the overall flow of the method for recognizing structures inside a building according to the present invention.
  • FIG. 1 is a schematic diagram showing the entire building structure recognition system 1 according to the present invention.
  • the in-building structure recognition system 1 according to the present invention includes: a first machine learning model generation unit 11 that performs machine learning using BIM (Building Information Modeling) data and a first site image to generate a first machine learned model M1; a second machine learning model generation unit 12 that performs relearning on the first machine learned model M1 using a second site image including an image of a noise component that does not have BIM data, and generates a second machine learned model M2; a scanning unit 20 that scans the inside of the building while determining the success or failure of scanning the structures inside the building, and acquires three-dimensional point cloud data inside the building and a third on-site image; a noise constituent removal unit 30 that extracts an image from which noise constituents have been removed from the image inside the building acquired by the scanning unit 20; a third machine learning model generation unit 13 that performs relearning on the second machine learned model M2 using the image from which noise components have been removed, and generates a third machine learned model M3; an in-building structure recognition unit 40 that recognizes the structures in the building using the third machine learned model M3; and a point cloud data output unit 50 that extracts and outputs, from the three-dimensional point cloud data acquired by the scanning unit 20, the point cloud data of the structures recognized by the in-building structure recognition unit 40.
  • the first machine learning model generation unit 11 performs machine learning using BIM (Building Information Modeling) data and the first site image to generate a first machine learned model. Specifically, the first machine learning model generation unit 11 uses the image generated from the BIM data as correct data, and processes the image generated by rendering the BIM data using the information obtained from the first site image. Machine learning is performed using the image obtained by this as observation data to generate a first machine learned model.
  • the second machine learning model generation unit 12 performs relearning on the first machine learned model M1 using a second site image including an image of a noise component that does not have BIM data, and generates a second machine learned model M2. Specifically, the second machine learning model generation unit 12 inputs a set of correct data and observed data for the second site image, which includes an image of a noise component that does not have BIM data, into the first machine learned model and performs relearning to generate the second machine learned model.
  • the scanning unit 20 scans the inside of the building while determining the success or failure of scanning the structures inside the building, and acquires three-dimensional point cloud data inside the building and a third on-site image.
  • the scanning unit 20 acquires an image inside the building and determines whether at least one corresponding reference point or reference structure exists. If none exists, it issues an alert prompting a rescan.
  • the noise constituent removal unit 30 extracts an image from which noise constituents have been removed from the image inside the building acquired by the scanning unit 20.
  • the noise constituent removing unit 30 extracts an image of the noise constituent from the third scene image acquired by the scanning unit 20 by stereo matching, generates a mask image of the noise constituent, and interpolates the portion of the mask image. The image is reconstructed to generate an image from which noise components have been removed.
  • the third machine learning model generation unit 13 performs relearning on the second machine learned model using the image, extracted by the noise component removal unit, from which the noise components have been removed, and generates a third machine learned model M3. Specifically, the third machine learning model generation unit 13 inputs a set of correct data and observed data for the image obtained by the noise component removal unit 30, in which noise components have been removed from the third on-site image, into the second machine learned model and performs relearning to generate the third machine learned model.
  • the in-building structure recognition unit 40 recognizes the structures in the building using the third machine learned model M3.
  • the intra-building structure recognition unit 40 inputs the third on-site image to the third machine-learned model and recognizes the intra-building structures included in the third on-site image.
  • the point cloud data output unit 50 extracts and outputs point cloud data of the structures inside the building recognized by the building structure recognition unit from the three-dimensional point cloud data acquired by the scanning unit.
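  • The extraction performed by the point cloud data output unit can be sketched as follows. This is only an illustrative sketch: the text does not specify how points are associated with image pixels, so the sketch assumes (hypothetically) that each 3D point already comes with the pixel coordinates at which it appears in the site image. The function name `extract_structure_points` and its parameters are illustrative, not taken from the source.

```python
import numpy as np

def extract_structure_points(points, pixels, structure_mask):
    """Keep only the 3D points whose pixel falls on a recognized structure.

    points:         (N, 3) array of x, y, z coordinates from the scan.
    pixels:         (N, 2) integer array of (u, v) = (column, row) per point
                    (assumed to be known; not specified in the source).
    structure_mask: (H, W) binarized recognition result, non-zero = structure.
    """
    h, w = structure_mask.shape
    u, v = pixels[:, 0], pixels[:, 1]
    # Points projecting outside the image cannot be attributed to a structure.
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    keep = np.zeros(len(points), dtype=bool)
    keep[inside] = structure_mask[v[inside], u[inside]].astype(bool)
    return points[keep]
```

In this sketch the binarized output of the recognition step acts as a filter over the scan, so only the recognized structure's points are output rather than the whole point cloud.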
  • FIG. 2 is a diagram showing the flow of each process of the building structure recognition system according to the present invention.
  • Figure 2 shows the relationships between machine learning model generation processing, data acquisition/scanning processing, noise component removal processing, and building structure recognition processing among the processes executed in the building structure recognition system.
  • the machine learning model generation process is executed by the first machine learning model generation unit 11, the second machine learning model generation unit 12, or the third machine learning model generation unit 13.
  • the data acquisition/scanning process is executed by the scanning section 20 or the point cloud data output section 50.
  • the pre-processing before scanning may be performed by any external imaging device or scanning device (not shown).
  • the noise constituent removal process is executed by the noise constituent removal unit 30.
  • the building structure recognition process is executed by the building structure recognition section 40.
  • the overall processing flow is roughly divided into pre-processing before scanning and processing after scanning.
  • the pre-processing before scanning includes generating a first machine learned model and relearning the first machine learned model to generate a second machine learned model.
  • BIM data and a first site image for generating a first machine learned model are acquired.
  • a first machine learned model is generated using the acquired BIM data and the first site image.
  • the first machine learned model is created assuming an ideal site, and can be used universally for various sites. However, in the actual site, there may be items (wire mesh, protective nets and sheets, temporarily installed iron fences and poles, trash, materials, etc.) that are not included in the BIM data.
  • therefore, a second site image that includes such noise components is acquired, and relearning is performed on the first machine learned model so that it learns the noise components as noise. As a result, a second machine learned model matching the actual site situation is obtained.
  • the second site image is acquired when inspecting the site before performing actual scanning at the site.
  • Processing after scanning includes scanning, removing noise components, generating a third machine learned model, recognizing structures inside the building, and acquiring point cloud data.
  • the success or failure of scanning is determined, and if it is necessary to rescan, an alert is sent to prompt rescanning. If an alert prompting a rescan is received, a rescan will be performed. Repeat this until all objects in the building are scanned. Noise components are then removed from the scanned third scene image.
  • FIG. 3 is a schematic diagram showing the first machine learning model generation section of the present invention.
  • the first machine learning model generation unit 11 uses the image generated from the BIM data as correct data (correct image), and processes the image generated by rendering the BIM data using the information obtained from the first site image. Machine learning is performed using the image obtained by this as observation data (observed image) to generate a first machine learned model.
  • the correct image generated from the BIM data shows the structure in the image in a way that distinguishes it from the background.
  • the correct image may be one that is manually generated, such as by manually filling in parts of the structure within the image.
  • the correct image may be, for example, a binarized image in which a structure part and a background part can be distinguished. Observation images are also generated from BIM data.
  • a rendered image is generated by rendering BIM data.
  • using information such as textures extracted from the first site image, textures, etc. are added to the rendered image to generate an image that is more similar to the actual photograph of the site, and this is used as an observation image.
  • a machine learning model generation process is performed using such a set of the correct image and observed image to generate a first machine learned model M1.
  • the first machine learned model M1 can be used as a general-purpose machine learned model for recognizing structures inside a building. In particular, when scanning a building in an ideal environment where no noise components exist and recognizing structures within the building, the first machine-learned model M1 can be used.
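  • The construction of one training pair for the first machine learned model can be sketched as follows. This is a minimal sketch under stated assumptions: the BIM data is assumed to have been exported as a per-pixel label map and a color rendering, and "adding texture" is approximated by a simple alpha blend with a tiled patch from the site photo. The function names, the `alpha` parameter, and the blending method are illustrative assumptions, not the patent's procedure.

```python
import numpy as np

def make_correct_image(label_map, structure_ids):
    """Binarize a BIM-derived label map: 1 = structure part, 0 = background."""
    return np.isin(label_map, list(structure_ids)).astype(np.uint8)

def make_observed_image(rendered, site_texture, alpha=0.5):
    """Blend a texture patch taken from a site photo into a BIM rendering.

    rendered:     (H, W, 3) uint8 image rendered from the BIM data.
    site_texture: (h, w, 3) uint8 patch cut from the first site image.
    alpha:        blend weight of the texture (assumed value).
    """
    h, w = rendered.shape[:2]
    # Tile the patch until it covers the rendering, then crop to size.
    reps = (-(-h // site_texture.shape[0]), -(-w // site_texture.shape[1]), 1)
    tiled = np.tile(site_texture, reps)[:h, :w]
    blended = (1.0 - alpha) * rendered.astype(np.float32) \
        + alpha * tiled.astype(np.float32)
    return np.clip(blended, 0, 255).astype(np.uint8)
```

Because both images derive from the same BIM data, the correct image is pixel-aligned with the observed image without manual annotation, which is the cost the text aims to avoid.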
  • FIG. 4 is a schematic diagram showing the second machine learning model generation section of the present invention.
  • the second machine learning model generation unit 12 inputs a set of correct data and observation data regarding the noise component image that does not have BIM data included in the second site image into the first machine learned model M1. Then, relearning is performed to generate a second machine learned model M2.
  • for the first machine learned model M1, the correct image and observed image are generated based on BIM data and used for machine learning, so the model is suited to scanning an ideal building environment in which no noise components exist. In actual sites, on the other hand, noise components that do not have BIM data often exist. Therefore, by relearning with an image that includes such noise components, a second machine learned model M2 that is capable of handling them is generated.
  • a set of correct images and observed images to be used for relearning the first machine learned model M1 is generated from the second on-site image.
  • the correct image of the second scene image shows the structure in the image so as to be distinguished from the background.
  • the correct image of the second site image may be one that is manually generated, such as by manually filling in a portion of the structure in the image.
  • the correct image of the second site image may be, for example, a binarized image in which a structure part and a background part can be distinguished.
  • as the observed image of the second on-site image, the second on-site image including noise components may be used as is. Alternatively, an image obtained by pre-processing the second on-site image including noise components as necessary may be used. Using such a set of the correct image and the observed image of the second on-site image, a relearning process is performed on the first machine learned model M1 to generate a second machine learned model M2.
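  • The idea of relearning, that is, continuing training from the parameters of an already trained model instead of starting from scratch, can be illustrated with a deliberately small stand-in. The sketch below uses a per-sample logistic classifier trained by gradient descent in place of the neural network described in the text; the model, the features, and the hyperparameters `lr` and `epochs` are all assumptions made for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, X, y):
    """Mean binary cross-entropy of weights w on features X and labels y."""
    p = np.clip(sigmoid(X @ w), 1e-7, 1 - 1e-7)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def relearn(w_init, X, y, lr=0.5, epochs=200):
    """Continue gradient descent from existing weights on new data.

    w_init plays the role of the already trained model (e.g. M1);
    the returned weights play the role of the relearned model (e.g. M2).
    """
    w = w_init.copy()  # leave the earlier model untouched
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

# Usage: train on "ideal" data, then relearn on data from the actual site.
rng = np.random.default_rng(0)
X_ideal = rng.normal(size=(200, 3))
y_ideal = (X_ideal[:, 0] > 0).astype(float)
m1 = relearn(np.zeros(3), X_ideal, y_ideal)

X_site = rng.normal(size=(200, 3))
y_site = (X_site[:, 0] + X_site[:, 1] > 0).astype(float)
m2 = relearn(m1, X_site, y_site)
```

The same pattern would apply again when the second model is relearned into the third; only the data passed in changes.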
  • FIG. 5 is a schematic diagram showing the scanning section of the present invention.
  • the scanning unit 20 acquires an image inside the building and determines whether at least one corresponding reference point or reference structure exists. If none exists, it issues an alert prompting a rescan.
  • the scanning unit 20 may include a scanning success/failure determination unit 203, an alert notification unit 204, and a rescan processing unit 205 for each function.
  • the scanning success/failure determination unit 203 determines whether there is at least one corresponding reference point or reference structure.
  • the "reference point" is a point that serves as a reference for matching consecutive frames; for example, a marker or the like may be attached to a structure in a building to serve as the reference point.
  • a "reference structure" is a structure that serves as a reference for matching consecutive frames; for example, a structure that has a straight part, such as a corner of a column, may be used as the reference structure.
  • the alert notification unit 204 notifies an alert prompting rescanning when at least one corresponding reference point or reference structure does not exist.
  • the alert notifies the user that rescanning is necessary, and may be, for example, an icon or message shown on the display screen of a scanning device such as the distance measuring scanner 201 or the imaging device 202, or a warning sound.
  • the rescan processing unit 205 receives a rescan instruction from the user and performs the rescan.
  • the processing in the scanning success/failure determination unit 203, alert notification unit 204, and rescan processing unit 205 is repeated until scanning of the entire scan target is completed; when the scanning is completed, the third site image and the three-dimensional point cloud data are obtained.
  • the third on-site image and three-dimensional point cloud data acquired by the scanning unit 20 may include information on noise components.
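  • The scanning success/failure determination and alert described above can be sketched as follows. The sketch assumes (hypothetically) that a detector has already reduced each frame to the set of identifiers of the reference points or reference structures visible in it; how those are detected is outside the sketch. The names `find_scan_gaps` and `scan_with_alerts` are illustrative.

```python
def find_scan_gaps(frames):
    """Return the indices i where frame i shares no reference with frame i-1.

    frames: list of sets; each set holds the IDs of the reference points or
            reference structures detected in that frame (assumed input).
    """
    return [i for i in range(1, len(frames)) if not (frames[i - 1] & frames[i])]

def scan_with_alerts(frames, notify=print):
    """Check a frame sequence and issue one rescan alert per gap.

    Returns True when every pair of consecutive frames shares at least one
    reference, i.e. the scan is judged successful.
    """
    gaps = find_scan_gaps(frames)
    for i in gaps:
        notify(f"Rescan needed: no shared reference between frames {i - 1} and {i}")
    return not gaps
```

The `notify` callback stands in for the alert notification unit; in a device it could show an icon or message or play a warning sound instead of printing.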
  • FIG. 6 is a schematic diagram illustrating the noise construct removal section of the present invention.
  • the noise constituent removing unit 30 extracts an image of the noise constituent from the third scene image acquired by the scanning unit 20 by stereo matching, generates a mask image of the noise constituent, and interpolates the portion of the mask image. The image is reconstructed to generate an image from which noise components have been removed.
  • the noise component removal unit 30 may include a stereo matching unit 301, a mask image generation unit 302, and an image reconstruction unit 303 for each function.
  • the stereo matching unit 301 inputs two images of the same object taken from different viewpoints (typically a right image and a left image) into an existing machine learned model for stereo matching, estimates the three-dimensional depth of each pixel, and generates a distance image in which the estimated depth of each pixel is represented by a gradation.
  • an existing machine learned model for performing stereo matching may be one in which a mapping function for obtaining a disparity image, which represents the disparity between two images of the same object taken from different viewpoints (typically a right image and a left image), is learned using a convolutional neural network (CNN). Such a model may also use recursive refinement to update the disparity from coarse to fine, or a hierarchical network with a layered cascade architecture for inference.
  • the stereo matching unit 301 may use an existing stereo matching method that does not use a machine learning model instead of the above method that uses an existing machine learned model for stereo matching.
  • with such a method, the three-dimensional depth of each pixel may be estimated using two images (typically a right image and a left image) taken from two viewpoints, and a distance image in which the estimated depth of each pixel is represented by a gradation may be generated.
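  • As one example of a stereo matching method that does not use a machine learning model, classic block matching on a rectified image pair can be sketched as follows. This is a swapped-in textbook technique, not a method named in the text. It assumes grayscale, rectified images in which a point at column x of the left image appears at column x - d of the right image, and it converts disparity to depth with the standard relation depth = focal length × baseline / disparity. All function and parameter names are illustrative.

```python
import numpy as np

def block_match_disparity(left, right, max_disp, half=1):
    """Per-pixel disparity by minimizing the sum of absolute differences (SAD).

    left, right: (H, W) rectified grayscale images.
    max_disp:    largest disparity searched, in pixels.
    half:        half window size; the window is (2*half+1) squared.
    """
    h, w = left.shape
    k = 2 * half + 1
    L = np.pad(left.astype(np.float32), half, mode="edge")
    R = np.pad(right.astype(np.float32), half, mode="edge")
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(h):
        for x in range(w):
            patch = L[y:y + k, x:x + k]
            best_cost, best_d = np.inf, 0
            # The candidate match in the right image sits d pixels to the left.
            for d in range(min(max_disp, x) + 1):
                cost = np.abs(patch - R[y:y + k, x - d:x - d + k]).sum()
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp

def disparity_to_depth(disp, focal_px, baseline_m):
    """Convert disparity (pixels) to depth (metres); zero disparity -> inf."""
    depth = np.full(disp.shape, np.inf, dtype=np.float32)
    valid = disp > 0
    depth[valid] = focal_px * baseline_m / disp[valid]
    return depth
```

The resulting depth array corresponds to the distance image; scaling it to a gray-level range would give the gradation representation mentioned in the text.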
  • the mask image generation unit 302 performs threshold processing on the distance image generated by the stereo matching unit 301 to generate a mask image of the noise component. For example, if a wire mesh that is a noise component exists in front of a structure to be recognized, a mask image of the wire mesh portion that is a noise component is generated.
  • the image reconstruction unit 303 removes the mask portion of the mask image generated by the mask image generation unit 302 from the original image, and obtains a reconstructed image by interpolating the mask image portion.
  • the image reconstruction unit 303 may input the third on-site image containing noise components and the mask image of the noise components generated by the mask image generation unit 302 into an existing machine learned model for image reconstruction, and obtain a reconstructed image in which the regions of the mask image are removed and the portions of the mask image are interpolated.
  • the existing machine learned model for reconstructing an image may be an existing machine learned model generated by deep learning using a neural network. Thereby, the image reconstruction unit 303 obtains an image from which noise components have been removed from the third on-site image.
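  • The mask generation and reconstruction steps can be sketched as follows. The thresholding follows the text: pixels nearer than a limit in the distance image are treated as noise components in front of the structure. For reconstruction, the text uses an existing machine learned model; the sketch substitutes a very simple neighbour-averaging fill purely for illustration. The threshold `near_limit`, the grayscale input, and both function names are assumptions.

```python
import numpy as np

def make_noise_mask(depth, near_limit):
    """Mark pixels closer than near_limit as noise (True = masked)."""
    return depth < near_limit

def fill_masked(image, mask, max_iters=100):
    """Interpolate masked pixels from known 4-neighbours, growing inward.

    image: (H, W) grayscale image; mask: (H, W) bool, True where removed.
    This is a simple stand-in for the learned reconstruction model.
    """
    img = image.astype(np.float32).copy()
    known = ~mask
    for _ in range(max_iters):
        if known.all():
            break
        pad_img = np.pad(img * known, 1)
        pad_known = np.pad(known.astype(np.float32), 1)
        # Sum and count of known values among the four direct neighbours.
        total = (pad_img[:-2, 1:-1] + pad_img[2:, 1:-1]
                 + pad_img[1:-1, :-2] + pad_img[1:-1, 2:])
        count = (pad_known[:-2, 1:-1] + pad_known[2:, 1:-1]
                 + pad_known[1:-1, :-2] + pad_known[1:-1, 2:])
        fillable = (~known) & (count > 0)
        img[fillable] = total[fillable] / count[fillable]
        known = known | fillable
    return img
```

A thin foreground object such as wire mesh yields narrow masked regions, which is the case where filling from surrounding pixels is most plausible; wide occlusions would need the learned model the text describes.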
  • FIG. 7 is a schematic diagram showing the third machine learning model generation section of the present invention.
  • the third machine learning model generation unit 13 inputs a set of correct data and observed data for the image extracted by the noise component removal unit 30, in which noise components have been removed from the image inside the building, into the second machine learned model and performs relearning to generate a third machine learned model.
  • for the second machine learned model M2, the correct image and observed image of the second on-site image containing noise components, obtained in pre-processing before scanning, were used for relearning. In contrast, for the third machine learned model, an image in which noise components have been removed from the third on-site image, which contained the noise components and was acquired during the actual scanning, is used for relearning. As a result, a third machine learned model M3 that is capable of recognizing structures in a manner further suited to the target site is generated.
  • the set of correct images and observed images to be used for relearning the second machine learned model M2 is generated from the image obtained by the noise component removal unit 30, in which noise components have been removed from the third on-site image.
  • The correct image for the noise-removed image shows the structures in the image so that they can be distinguished from the background.
  • The correct image may be generated manually, for example by filling in the structure portions of the image by hand.
  • The correct image may be, for example, a binarized image in which structure portions and background portions can be distinguished.
  • As the observed image, the image from which the noise components have been removed may be used as is.
  • Alternatively, the observed image may be an image obtained by applying preprocessing, as necessary, to the image in which the noise components have been removed from the third on-site image. Using the set of correct and observed images obtained in this way, relearning is performed on the second machine-learned model M2 to generate the third machine-learned model M3.
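  • A minimal sketch of how one correct/observed pair might be assembled for relearning is given here. It assumes the manual fill is available as a grayscale label image and that the optional preprocessing is a simple intensity normalization; the function names and the 0.5 threshold are hypothetical, and the relearning routine itself is outside the sketch.

```python
def binarize(label_image, threshold=0.5):
    """Correct image: 1 where the structure was filled in, 0 for background."""
    return [[1 if px > threshold else 0 for px in row] for row in label_image]


def normalize(image, max_value=255.0):
    """Example of optional preprocessing: scale pixel values to [0, 1]."""
    return [[px / max_value for px in row] for row in image]


def make_training_pair(denoised_image, filled_label, preprocess=None):
    """Return (observed, correct) for one noise-removed image.

    denoised_image: noise-removed third on-site image (2D list).
    filled_label:   manually filled structure regions (2D list, values in [0, 1]).
    preprocess:     optional callable; if None, the image is used as is.
    """
    observed = preprocess(denoised_image) if preprocess else denoised_image
    correct = binarize(filled_label)
    return observed, correct
```

  • Pairs produced this way would then be passed to whatever training routine updates the existing model, which is what distinguishes relearning from training a new model from scratch.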
  • FIG. 8 is a schematic diagram showing an intra-building structure recognition section and a point cloud data output section of the present invention.
  • The in-building structure recognition unit 40 inputs the third on-site image into the third machine-learned model M3 and recognizes the structures inside the building that appear in the third on-site image.
  • The output of the third machine-learned model M3 shows the structures inside the building so that they can be distinguished from the background.
  • The output of the third machine-learned model M3 may be a binarized image in which the structures inside the building and the background can be distinguished.
  • The third on-site image may include noise components that were present at the site during scanning.
  • Because the third machine-learned model M3 is obtained by additional learning on images in which noise components have been removed from a third on-site image containing noise components, structures inside the building can be recognized with high accuracy even when the third on-site image includes noise components.
  • Based on the image information of the structures recognized by the in-building structure recognition unit 40, the point cloud data output unit 50 extracts, from the three-dimensional point cloud data scanned by the scanning unit 20, the three-dimensional point cloud data of the portions corresponding to the recognized structures, and outputs it as point cloud data. Three-dimensional point cloud data of the structures inside the building is thereby obtained, and by rendering or otherwise processing it, the data can be used, for example, to generate three-dimensional CAD data of the building interior including the structures.
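  • The extraction of structure points from the scanned point cloud can be sketched as follows. The text does not say how image pixels and 3D points are associated; this sketch assumes the points are expressed in the camera coordinate frame of the third on-site image and uses a pinhole projection with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`. The binarized recognition result is assumed to be a 2D list with 1 for structure pixels.

```python
def extract_structure_points(points, mask, fx, fy, cx, cy):
    """Keep the 3D points whose image projection falls on a structure pixel.

    points: iterable of (x, y, z) in camera coordinates, z pointing forward.
    mask:   2D list, 1 where the model recognized a structure, 0 for background.
    fx, fy, cx, cy: assumed pinhole camera intrinsics (not given in the text).
    """
    h, w = len(mask), len(mask[0])
    kept = []
    for x, y, z in points:
        if z <= 0:
            continue  # point is behind the camera and cannot appear in the image
        u = int(round(fx * x / z + cx))  # pixel column
        v = int(round(fy * y / z + cy))  # pixel row
        if 0 <= v < h and 0 <= u < w and mask[v][u] == 1:
            kept.append((x, y, z))
    return kept
```

  • Only the points that fall on recognized structures are kept, which is how the approach reduces the amount of point cloud data that has to be handled downstream.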
  • FIG. 9 is a diagram showing the overall flow of the method for recognizing structures inside a building according to the present invention.
  • The in-building structure recognition method according to the present invention includes: step S901 of performing machine learning using BIM (Building Information Modeling) data and a first site image to generate a first machine-learned model M1; step S902 of performing relearning on the first machine-learned model M1 using a second site image, which includes images of noise components that have no BIM data, to generate a second machine-learned model M2; scanning step S903 of scanning the inside of the building while determining the success or failure of scanning of the structures in the building, and acquiring three-dimensional point cloud data of the structures in the building and images of the inside of the building; step S904 of removing the images of the noise components from the images of the inside of the building acquired in the scanning step; step S905 of performing relearning on the second machine-learned model M2 using the images from which the noise components have been removed, to generate a third machine-learned model M3; step S906 of recognizing the structures in the building using the third machine-learned model M3; and step S907 of extracting and outputting, from the three-dimensional point cloud data acquired in the scanning step, the point cloud data of the structures in the building recognized in step S906.
  • The scanning step S903 acquires images of the inside of the building and determines whether at least one corresponding reference point or reference structure exists. If none exists, an alert prompting a rescan is issued.
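  • The scan success check can be sketched as follows, assuming (as the description of the scanning unit states) that correspondence is checked between consecutive frames. How reference points or reference structures are detected is not covered; each frame is represented here simply as a set of hypothetical reference-point IDs.

```python
def check_scan(frames, min_shared=1):
    """Return the indices of frames that lose correspondence with the previous frame.

    frames: list of sets; each set holds the IDs of reference points or
            reference structures detected in that frame (hypothetical format).
    """
    failures = []
    for i in range(1, len(frames)):
        shared = frames[i - 1] & frames[i]  # reference points seen in both frames
        if len(shared) < min_shared:
            failures.append(i)
    return failures


def alert_messages(failures):
    """Build the rescan alerts shown to the operator during scanning."""
    return [f"Frame {i}: no reference point shared with the previous frame; please rescan."
            for i in failures]
```

  • Running the check while scanning is still in progress lets the operator rescan on the spot rather than return to the site later.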
  • The present invention also provides a program that causes a computer to execute each step of the above in-building structure recognition method.
  • The in-building structure recognition system of Example 2, which recognizes structures in a building using a machine learning model, comprises: a first machine learning model generation unit 11 that performs machine learning using BIM (Building Information Modeling) data and a first site image to generate a first machine-learned model M1; a scanning unit 20 that scans the inside of the building while determining the success or failure of scanning of the structures in the building, and acquires three-dimensional point cloud data of the inside of the building and a third on-site image; a noise component removal unit 30 that removes images of noise components from the image of the inside of the building acquired by the scanning unit 20; a third machine learning model generation unit 13 that performs relearning on the first machine-learned model M1 using the image from which the noise components have been removed, to generate a third machine-learned model M3; an in-building structure recognition unit 40 that recognizes the structures in the building using the third machine-learned model M3; and a point cloud data output unit 50 that extracts and outputs, from the three-dimensional point cloud data acquired by the scanning unit 20, the point cloud data of the structures in the building recognized by the in-building structure recognition unit 40.
  • The difference from the in-building structure recognition system 1 of Example 1 is that the system does not include the second machine learning model generation unit 12, which performs relearning on the first machine-learned model using a second site image including images of noise components that have no BIM data, to generate a second machine-learned model.
  • Accordingly, in Example 2, the third machine learning model generation unit 13 performs relearning on the first machine-learned model M1, instead of the second machine-learned model M2, to generate the third machine-learned model M3. That is, the third machine learning model generation unit 13 inputs the set of correct data and observed data, prepared for the image obtained by the noise component removal unit 30 in which noise components have been removed from the image inside the building, into the first machine-learned model M1 and performs relearning to generate the third machine-learned model M3.
  • The set of correct images and observed images used for relearning the first machine-learned model M1 is generated from the image, obtained by the noise component removal unit 30, in which noise components have been removed from the third on-site image.
  • The correct image for the noise-removed image shows the structures in the image so that they can be distinguished from the background.
  • The correct image may be generated manually, for example by filling in the structure portions of the image by hand.
  • The correct image may be, for example, a binarized image in which structure portions and background portions can be distinguished.
  • As the observed image, the image from which the noise components have been removed may be used as is.
  • Alternatively, the observed image may be an image obtained by applying preprocessing, as necessary, to the image in which the noise components have been removed from the third on-site image. Using the set of correct and observed images obtained in this way, relearning is performed on the first machine-learned model M1 to generate the third machine-learned model M3.
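  • The in-building structure recognition method of Example 2, which recognizes structures in a building using a machine learning model, includes: step S901 of performing machine learning using BIM (Building Information Modeling) data and a first site image to generate a first machine-learned model M1; scanning step S903 of scanning the inside of the building while determining the success or failure of scanning of the structures in the building, and acquiring three-dimensional point cloud data and images of the inside of the building; step S904 of removing images of noise components from the acquired images; step S905 of performing relearning on the first machine-learned model M1 using the images from which the noise components have been removed, to generate a third machine-learned model M3; step S906 of recognizing the structures inside the building using the third machine-learned model M3; and step S907 of extracting and outputting, from the three-dimensional point cloud data acquired in the scanning step, the point cloud data of the recognized structures.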
  • The difference from the in-building structure recognition method of Example 1 is that the method does not include step S902, in which the first machine-learned model is relearned using a second site image including images of noise components that have no BIM data, to generate the second machine-learned model.
  • Accordingly, in Example 2, step S905 of generating the third machine-learned model performs relearning on the first machine-learned model M1, instead of the second machine-learned model M2, to generate the third machine-learned model M3. That is, the set of correct data and observed data, prepared for the image obtained by the noise component removal unit 30 in which noise components have been removed from the third on-site image, is input into the first machine-learned model M1 and relearning is performed to generate the third machine-learned model M3.
  • According to the in-building structure recognition system and in-building structure recognition method of the present invention described above, the shape and position of members of interest at a construction site can be measured, improving both accuracy and speed. Furthermore, the number of members to be managed at the construction site can be reduced, and the amount of data handled by a construction site member management system can be reduced significantly as a result.
  • Even on-site images containing noise components present at the construction site, such as wire mesh, protective nets and sheets, temporarily installed iron fences and poles, garbage, and materials, can be recognized with high accuracy.
  • The accuracy of structure recognition can be improved by relearning the model to match the latest site conditions, in response to noise components that change from moment to moment.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

This building interior structure recognition system comprises: a first machine learning model generation unit that uses BIM data and a first site image to generate a first machine-learned model; a second machine learning model generation unit that uses a second site image containing an image of a noise component to carry out relearning with respect to the first machine-learned model; a scanning unit that scans the interior of a building while determining whether scanning of a structure inside the building is successful, and acquires a third site image and 3D point cloud data of a structure inside the building; a noise component removal unit that extracts an image of a noise component; a third machine learning model generation unit that uses the image of the noise component to carry out relearning with respect to the second machine-learned model; a building interior structure recognition unit that uses the third machine-learned model to recognize a structure inside the building; and a point cloud data output unit that extracts and outputs point cloud data of the structure inside the building.

Description

Building interior structure recognition system and building interior structure recognition method
The present invention relates to an in-building structure recognition system and an in-building structure recognition method, and in particular to an in-building structure recognition system, an in-building structure recognition method, and a program that use deep learning with a neural network to recognize structures placed inside a building such as an office building.
Conventionally, the construction status of a building under construction has been checked either by a person measuring directly at the construction site with instruments, using two-dimensional construction drawings or the like, or by comparing against a BIM (Building Information Modeling) model using a remote sensing technology that can measure distance from reflected light, such as LiDAR (Light Detection and Ranging).
However, when measuring with LiDAR or the like, multiple locations at the construction site must be measured according to site conditions and based on experience, so the accuracy of the resulting data varies with the skill of the person measuring. It also takes effort to register the acquired point cloud data and to manually identify in-building structures such as pipes and measure their positions and sizes. In addition, there are problems with the accuracy of the captured point cloud data and of data derived from it, and the data is difficult to reuse.
Measuring every point of the construction site for the sake of data accuracy is not realistic, because the amount of information becomes enormous. A highly skilled measurer can measure only the necessary locations based on experience, but automation of measurement is needed to eliminate variation due to skill and to make measurement more efficient.
To compare the construction status of a site under construction with the completed form, it is desirable to automate both the identification of the regions occupied by structures installed at the site and the recognition of what each structure is. A trained model obtained by deep learning with a neural network is a promising means of doing so.
Creating a trained model for automating the recognition of structures in images requires a necessary and sufficient number of construction site images as input data for training. It also requires, as correct data for training, annotations of the structures contained in those images, that is, the result of recognizing which part of each image is what. However, it is difficult to collect a large number of photographs of actual construction sites that can be used as input data, and to annotate the enormous number of structures for use as correct data.
Another approach would be to perform machine learning not with photographs of actual construction sites but with rendered images, in which a three-dimensional model of the completed site is rendered to look close to its actual appearance, and to create a trained model from them. However, rendered images are created mainly for marketing of buildings and are costly to produce, so it is difficult to prepare a necessary and sufficient number of them as training images. Annotating the structures in rendered images is also an enormous task that takes considerable effort to do by hand.
It is therefore required that a necessary and sufficient number of training images of construction sites can be prepared and that annotation of the structures in those training images can be automated. It is also required that the trained model created in this way can recognize structures with high accuracy.
Furthermore, an actual construction site contains, besides the structures to be measured, various items such as wire mesh, protective nets and sheets, temporarily installed iron fences and poles, garbage, and materials. People may also be captured unintentionally during scanning of the site. These noise components hinder recognition of the structures to be measured and affect recognition accuracy.
A system is therefore required that can recognize with high accuracy even site images containing noise components such as wire mesh, protective nets and sheets, temporarily installed iron fences and poles, garbage, and materials present at the construction site. In addition, at an actual site the presence of such noise components changes from moment to moment as they are moved or added in step with the progress of construction, so it is desirable to use a model matched to the latest site conditions. In doing so, it is desirable to minimize the time and cost of regenerating a model tailored to the site.
Non-Patent Document 1, discussing the problem that the volume of point cloud data becomes enormous in as-built modeling, in which a 3D model is created from three-dimensional measurement of existing large-scale facilities, points out that "it should be noted that the measurement devices used for as-built modeling of large-scale facilities operate on a different measurement principle from point cloud measurement devices for small parts. Point cloud measurement of small parts generally uses triangulation with a laser output device and a CCD camera, but with this method the device grows as the object becomes larger. Moreover, in measuring small parts the measured point cloud is often a few million points at most, whereas large-scale facilities require a large number of points for modeling."
For example, Patent Document 1 discloses a building production system comprising "a CPU that functions as: existing part survey means that converts digitized data of the existing part of a building, acquired from existing drawings, into three-dimensional CAD data and stores it together with various site survey data, including point cloud data acquired by a three-dimensional laser scanner and a three-dimensional polygon model created from the point cloud data; construction member design means that places, on the three-dimensional polygon model, member objects to be newly constructed, selected from member objects stored in advance in a member library; and member construction position output means that retrieves, from the three-dimensional CAD model designed by the construction member design means, the member object corresponding to the member-object-specific ID obtained by reading, with an ID reader, an electronic tag attached to a member precut at a member factory according to the member object placed by the construction member design means, and outputs it together with its construction position information; and an automatic position indicating device that indicates the construction position of the member in the existing part, based on the construction position information of the member object output by the member construction position output means of the CPU."
Patent Document 2 discloses "an image processing device comprising: an image acquisition unit that acquires an input image generated by imaging a real space with an imaging device; a recognition unit that recognizes the relative position and orientation between the real space and the imaging device based on the positions of one or more feature points appearing in the input image; an application unit that provides an augmented reality application using the recognized relative position and orientation; and a display control unit that superimposes on the input image, according to the distribution of the feature points, a guidance object that guides the user operating the imaging device so that the recognition processing performed by the recognition unit is stabilized."
However, although Patent Documents 1 and 2 both disclose techniques for grasping a three-dimensional space or objects within it, neither solves the problem that the volume of data such as three-dimensional point cloud data becomes enormous, particularly for large-scale facilities such as buildings and factories, and neither is suited to automating the recognition of structures in images in order to quickly grasp the status of a site under construction.
Nor does either of Patent Documents 1 and 2 consider recognizing the target structures with high accuracy in site images that contain noise components such as wire mesh, protective nets and sheets, temporarily installed iron fences and poles, garbage, and materials present at the construction site, or relearning the model to match the latest site conditions.
Japanese Patent Application Publication No. 2013-149119
Japanese Patent Application Publication No. 2013-225245
The present invention therefore solves the above problems and provides an in-building structure recognition system and an in-building structure recognition method that can recognize the target structures with high accuracy even in site images containing noise components such as wire mesh, protective nets and sheets, temporarily installed iron fences and poles, garbage, and materials present at the construction site, and that can recognize structures with high accuracy in accordance with the latest site conditions.
The present invention also provides a program for causing a computer to execute each step of the in-building structure recognition method.
To solve the above problems, the present invention provides an in-building structure recognition system for recognizing structures in a building using a machine learning model, comprising: a first machine learning model generation unit that performs machine learning using BIM (Building Information Modeling) data and a first site image to generate a first machine-learned model; a second machine learning model generation unit that performs relearning on the first machine-learned model using a second on-site image, which includes images of noise components that have no BIM data, to generate a second machine-learned model; a scanning unit that scans the inside of the building while determining the success or failure of scanning of the structures in the building, and acquires three-dimensional point cloud data of the inside of the building and a third on-site image; a noise component removal unit that removes images of noise components from the third on-site image acquired by the scanning unit; a third machine learning model generation unit that performs relearning on the second machine-learned model using the image, obtained by the noise component removal unit, from which the noise components have been removed, to generate a third machine-learned model; an in-building structure recognition unit that recognizes the structures in the building contained in the third on-site image using the third machine-learned model; and a point cloud data output unit that extracts and outputs, from the three-dimensional point cloud data acquired by the scanning unit, the point cloud data of the structures in the building recognized by the in-building structure recognition unit.
In an in-building structure recognition system according to one aspect of the present invention, the first machine learning model generation unit performs machine learning using, as correct data, an image generated from the BIM data and, as observed data, an image obtained by processing an image rendered from the BIM data using information obtained from the first site image, and thereby generates the first machine-learned model.
In an in-building structure recognition system according to one aspect of the present invention, the second machine learning model generation unit inputs a set of correct data and observed data for the second on-site image, which includes images of noise components that have no BIM data, into the first machine-learned model and performs relearning to generate the second machine-learned model.
In an in-building structure recognition system according to one aspect of the present invention, the scanning unit acquires images of the inside of the building, determines whether at least one corresponding reference point or reference structure exists between consecutive frames, and issues an alert prompting a rescan if no corresponding reference point or reference structure exists.
In an in-building structure recognition system according to one aspect of the present invention, the noise component removal unit extracts images of noise components from the third on-site image acquired by the scanning unit by stereo matching, generates a mask image of the noise components, reconstructs the image so as to interpolate the masked portions, and thereby generates an image from which the noise components have been removed.
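As a minimal sketch of how stereo matching could yield a noise mask, the code below computes a per-pixel disparity by block matching along each scanline and marks pixels with large disparity (near the camera) as noise. The text does not state how stereo matching separates noise components from structures; the assumption that noise components lie closer to the camera, the function names, and the threshold are all hypothetical.

```python
def disparity_row(left, right, max_disp, half_win=1):
    """Disparity for one scanline by sum-of-absolute-differences block matching.

    left, right: lists of pixel intensities from a rectified stereo pair.
    For each left pixel, try shifts 0..max_disp and keep the cheapest one.
    """
    w = len(left)
    disp = [0] * w
    for x in range(w):
        best_cost, best_d = None, 0
        for d in range(max_disp + 1):
            cost = 0
            for k in range(-half_win, half_win + 1):
                xl = min(max(x + k, 0), w - 1)      # clamp at image borders
                xr = min(max(x + k - d, 0), w - 1)
                cost += abs(left[xl] - right[xr])
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, d
        disp[x] = best_d
    return disp


def noise_mask(left_img, right_img, max_disp, near_threshold, half_win=1):
    """Mask is True where disparity >= near_threshold (assumed: near objects are noise)."""
    return [[d >= near_threshold for d in disparity_row(l, r, max_disp, half_win)]
            for l, r in zip(left_img, right_img)]
```

Background pixels that are hidden in one of the two views have no true match and may be mislabelled next to an object boundary, so a practical mask would need some cleanup before the reconstruction step.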
In an in-building structure recognition system according to one aspect of the present invention, the third machine learning model generation unit inputs a set of correct data and observed data, prepared for the image obtained by the noise component removal unit in which noise components have been removed from the third on-site image, into the second machine-learned model and performs relearning to generate the third machine-learned model.
In an in-building structure recognition system according to one aspect of the present invention, the in-building structure recognition unit inputs the third on-site image into the third machine-learned model and recognizes the structures in the building contained in the third on-site image.
 また、本発明では、機械学習モデルを用いて建屋内の構造物を認識するための建屋内構造物認識方法であって、BIM(Building Information Modeling)データ及び第1の現場画像を用いて機械学習を行い、第1の機械学習済モデルを生成するステップと、第1の機械学習済モデルに対して、BIMデータを有しないノイズ構成物の画像を含む第2の現場画像を用いて再学習を行い、第2の機械学習済モデルを生成するステップと、建屋内の構造物のスキャニングの成否を判定しながら建屋内をスキャニングし、建屋内の構造物の3次元点群データ及び建屋内の画像を取得する、スキャニング・ステップと、スキャニング部で取得した建屋内の画像からノイズ構成物の画像を除去するステップと、除去するステップで得たノイズ構成物が除去された画像を用いて、第2の機械学習済モデルに対して再学習を行い、第3の機械学習済モデルを生成するステップと、第3の機械学習済モデルを用いて、建屋内の構造物を認識するステップと、スキャニング部で取得した3次元点群データから、建屋内構造物認識部で認識された建屋内の構造物の点群データを抽出して出力するステップとを含む、建屋内構造物認識方法を提供する。 Further, the present invention provides an in-building structure recognition method for recognizing structures in a building using a machine learning model, which performs machine learning using BIM (Building Information Modeling) data and a first site image. and retraining the first machine learned model using a second site image that includes an image of a noise composition that does not have BIM data. scanning the building while determining the success or failure of scanning the structures inside the building, and generating 3D point cloud data of the structures inside the building and images inside the building. a scanning step of acquiring a noise component, a step of removing an image of the noise component from the image inside the building acquired by the scanning unit, and a second re-learning the machine learned model to generate a third machine learned model; recognizing structures in the building using the third machine learned model; and a scanning unit A method for recognizing a structure inside a building is provided, which includes a step of extracting and outputting point cloud data of a structure inside a building recognized by an inside structure recognition unit from the three-dimensional point cloud data acquired in the above.
In the in-building structure recognition method according to an aspect of the present invention, the scanning step acquires an image of the inside of the building, determines whether at least one corresponding reference point or reference structure exists, and issues an alert prompting a rescan when no corresponding reference point or reference structure exists.
The present invention also provides a program that causes a computer to execute each step of the above in-building structure recognition method.
In the present invention, "BIM (Building Information Modeling) data" refers to the data of a three-dimensional model of a building reproduced on a computer.
According to the present invention, structures can be recognized with high accuracy even in site images that include noise components present at a construction site, such as wire mesh, protective nets and sheets, temporarily installed iron fences and poles, debris, and materials.
In addition, because the model is retrained to match the latest site conditions in response to noise components that change from moment to moment, the recognition accuracy for structures can be improved.
Furthermore, the user can be notified of the success or failure of scanning while the on-site scanning is still in progress and, if the scanning is defective, can be prompted to rescan on the spot, which avoids having to redo the scanning work at a later date.
Other objects, features, and advantages of the present invention will become apparent from the following description of embodiments of the invention, taken in conjunction with the accompanying drawings.
FIG. 1 is a schematic diagram showing the whole of the in-building structure recognition system according to the present invention.
FIG. 2 is a diagram showing the flow of each process of the in-building structure recognition system according to the present invention.
FIG. 3 is a schematic diagram showing the first machine learning model generation unit of the present invention.
FIG. 4 is a schematic diagram showing the second machine learning model generation unit of the present invention.
FIG. 5 is a schematic diagram showing the scanning unit of the present invention.
FIG. 6 is a schematic diagram showing the noise component removal unit of the present invention.
FIG. 7 is a schematic diagram showing the third machine learning model generation unit of the present invention.
FIG. 8 is a schematic diagram showing the in-building structure recognition unit of the present invention.
FIG. 9 is a diagram showing the overall flow of the in-building structure recognition method according to the present invention.
FIG. 1 is a schematic diagram showing the whole of the in-building structure recognition system 1 according to the present invention.
The in-building structure recognition system 1 according to the present invention includes: a first machine learning model generation unit 11 that performs machine learning using BIM (Building Information Modeling) data and a first site image to generate a first machine-learned model M1; a second machine learning model generation unit 12 that retrains the first machine-learned model M1 using a second site image including an image of a noise component having no BIM data, to generate a second machine-learned model M2; a scanning unit 20 that scans the inside of the building while determining the success or failure of scanning of the structures in the building, and acquires three-dimensional point cloud data of the inside of the building and a third site image; a noise component removal unit 30 that removes the image of the noise component from the image of the inside of the building acquired by the scanning unit 20; a third machine learning model generation unit 13 that retrains the second machine-learned model M2 using the image, obtained by the noise component removal unit 30, in which the noise component has been removed from the third site image, to generate a third machine-learned model M3; an in-building structure recognition unit 40 that recognizes the structures in the building using the third machine-learned model M3; and a point cloud data output unit 50 that extracts and outputs, from the three-dimensional point cloud data acquired by the scanning unit 20, the point cloud data of the structures in the building recognized by the in-building structure recognition unit 40.
The first machine learning model generation unit 11 performs machine learning using BIM (Building Information Modeling) data and the first site image to generate the first machine-learned model. Specifically, the first machine learning model generation unit 11 performs machine learning using, as correct data, an image generated from the BIM data and, as observation data, an image obtained by processing a rendering of the BIM data with information obtained from the first site image, and thereby generates the first machine-learned model.
The second machine learning model generation unit 12 retrains the first machine-learned model M1 using the second site image, which includes an image of a noise component having no BIM data, and generates the second machine-learned model M2. Specifically, the second machine learning model generation unit 12 inputs, to the first machine-learned model, sets of correct data and observation data for the second site image including the image of the noise component having no BIM data, performs retraining, and generates the second machine-learned model.
The scanning unit 20 scans the inside of the building while determining the success or failure of scanning of the structures in the building, and acquires three-dimensional point cloud data of the inside of the building and the third site image. The scanning unit 20 acquires an image of the inside of the building, determines whether at least one corresponding reference point or reference structure exists, and issues an alert prompting a rescan when no corresponding reference point or reference structure exists.
The noise component removal unit 30 obtains, from the image of the inside of the building acquired by the scanning unit 20, an image from which the noise components have been removed. The noise component removal unit 30 extracts the image of the noise components from the third site image acquired by the scanning unit 20 by stereo matching, generates a mask image of the noise components, reconstructs the image so that the masked portion is interpolated, and generates an image from which the noise components have been removed.
The third machine learning model generation unit 13 retrains the second machine-learned model using the image, obtained by the noise component removal unit, from which the noise components have been removed, and generates the third machine-learned model M3. Specifically, the third machine learning model generation unit 13 inputs, to the second machine-learned model, sets of correct data and observation data for the image, obtained by the noise component removal unit 30, in which the noise components have been removed from the third site image, performs retraining, and generates the third machine-learned model.
The in-building structure recognition unit 40 recognizes the structures in the building using the third machine-learned model M3. The in-building structure recognition unit 40 inputs the third site image to the third machine-learned model and recognizes the structures in the building that are included in the third site image.
The point cloud data output unit 50 extracts and outputs, from the three-dimensional point cloud data acquired by the scanning unit, the point cloud data of the structures in the building recognized by the in-building structure recognition unit.
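The extraction performed by the point cloud data output unit 50 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: it assumes that each 3D point is associated with the (row, column) of an image pixel and that the recognition result is a binarized mask (1 for structure, 0 for background). The function name `extract_structure_points` and the data layout are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def extract_structure_points(
    points: Sequence[Point],
    pixels: Sequence[Tuple[int, int]],
    structure_mask: Sequence[Sequence[int]],
) -> List[Point]:
    """Keep the 3D points whose associated pixel is labelled as structure.

    points[i] is a 3D point and pixels[i] is the (row, col) of the image
    pixel it corresponds to (this pairing is an assumption of the sketch).
    structure_mask is a binarized recognition result: 1 = structure,
    0 = background.
    """
    height = len(structure_mask)
    width = len(structure_mask[0]) if height else 0
    kept: List[Point] = []
    for point, (row, col) in zip(points, pixels):
        inside = 0 <= row < height and 0 <= col < width
        if inside and structure_mask[row][col] == 1:
            kept.append(point)
    return kept
```

Points whose pixel falls outside the image are dropped rather than raising an error, which is one possible design choice for scans that cover more than the camera's field of view.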
FIG. 2 is a diagram showing the flow of each process of the in-building structure recognition system according to the present invention.
FIG. 2 shows the relationships among the processes executed in the in-building structure recognition system, namely the machine learning model generation process, the data acquisition/scanning process, the noise component removal process, and the in-building structure recognition process. The machine learning model generation process is executed by the first machine learning model generation unit 11, the second machine learning model generation unit 12, or the third machine learning model generation unit 13. The data acquisition/scanning process is executed by the scanning unit 20 or the point cloud data output unit 50. The pre-processing before scanning may be executed by any external imaging device or scanning device (not shown). The noise component removal process is executed by the noise component removal unit 30. The in-building structure recognition process is executed by the in-building structure recognition unit 40.
The overall processing flow is broadly divided into pre-processing before scanning and processing from scanning onward. First, the pre-processing performed before the actual on-site scanning is described. The pre-processing before scanning includes generating the first machine-learned model and retraining the first machine-learned model to generate the second machine-learned model. First, the BIM data and the first site image for generating the first machine-learned model are acquired, and the first machine-learned model is generated using them. The first machine-learned model is created on the assumption of an ideal site and can be used generically for a variety of sites. At an actual site, however, there are usually items that are not included in the BIM data (wire mesh, protective nets and sheets, temporarily installed iron fences and poles, debris, materials, and the like). In addition to such items, people may unintentionally appear in the images. Therefore, a second site image that includes such noise components, which act as noise when recognizing structures in the building, is acquired, and the first machine-learned model is retrained so that it learns the noise components as noise. This retraining yields a second machine-learned model adapted to the conditions of the actual site. The second site image is acquired, for example, when the site is inspected before the actual on-site scanning is performed.
Next, the processing from the actual on-site scanning onward is described. This processing includes performing the scanning, removing the noise components, generating the third machine-learned model, recognizing the structures in the building, and acquiring the point cloud data. First, the inside of the building that is the site is scanned to acquire the third site image and the three-dimensional point cloud data. During scanning, the success or failure of the scanning is determined, and if the scan needs to be redone, an alert prompting a rescan is issued. When such an alert is issued, a rescan is performed. This is repeated until all scan targets in the building have been scanned. Next, the noise components are removed from the scanned third site image. The noise components present at a site often change from moment to moment, and the third site image may contain new noise components that were not included in the second site image acquired in the pre-processing before scanning. Noise components that were included in the second site image may also have been moved elsewhere. People who unintentionally appear in the images can also be noise components. Therefore, the noise components are removed from the scanned images, and the second machine-learned model is retrained so that it further learns the images from which the noise components have been removed. This retraining yields a third machine-learned model adapted to the latest conditions of the actual site. Next, the third machine-learned model is used to recognize the structures in the building in the third site image. Finally, the three-dimensional point cloud data corresponding to the recognized structures in the building is obtained.
FIG. 3 is a schematic diagram showing the first machine learning model generation unit of the present invention.
The first machine learning model generation unit 11 performs machine learning using, as correct data (correct image), an image generated from the BIM data and, as observation data (observed image), an image obtained by processing a rendering of the BIM data with information obtained from the first site image, and generates the first machine-learned model. The correct image generated from the BIM data shows the structures in the image so that they are distinguished from the background. The correct image may be generated manually, for example by filling in the structure portions of the image by hand. The correct image may be, for example, a binarized image in which structure portions and background portions can be distinguished. The observed image is also generated from the BIM data. First, a rendered image is generated by rendering the BIM data. Next, using information such as textures extracted from the first site image, textures and the like are added to the rendered image to produce an image closer to an actual photograph of the site, and this image is used as the observed image. The machine learning model generation process is performed using such sets of correct images and observed images to generate the first machine-learned model M1. The first machine-learned model M1 can be used as a general-purpose machine-learned model for recognizing structures in a building. In particular, the first machine-learned model M1 can be used when scanning the inside of a building in an ideal environment free of noise components and recognizing the structures in that building.
FIG. 4 is a schematic diagram showing the second machine learning model generation unit of the present invention.
The second machine learning model generation unit 12 inputs, to the first machine-learned model M1, sets of correct data and observation data for the noise component images, which have no BIM data, included in the second site image, performs retraining, and generates the second machine-learned model M2. Because the correct images and observed images used to generate the first machine-learned model M1 are produced from the BIM data, the first machine-learned model M1 is suited to scanning an ideal in-building environment in which no noise components exist. At an actual site, on the other hand, noise components having no BIM data are often present. Therefore, the first machine-learned model M1 is additionally trained on the second site image containing noise components, which produces a second machine-learned model M2 capable of recognizing structures accurately even in an in-building environment that contains noise components. The sets of correct images and observed images used to retrain the first machine-learned model M1 are generated from the second site image. The correct image for the second site image shows the structures in the image so that they are distinguished from the background. It may be generated manually, for example by filling in the structure portions of the image by hand, and it may be, for example, a binarized image in which structure portions and background portions can be distinguished. As the observed image for the second site image, the second site image containing noise components may be used as is, or an image obtained by pre-processing that second site image as necessary may be used. Using such sets of correct images and observed images for the second site image, the first machine-learned model M1 is retrained to generate the second machine-learned model M2.
FIG. 5 is a schematic diagram showing the scanning unit of the present invention.
The scanning unit 20 acquires an image of the inside of the building, determines whether at least one corresponding reference point or reference structure exists, and issues an alert prompting a rescan when no corresponding reference point or reference structure exists. The scanning unit 20 may have, as separate functions, a scanning success/failure determination unit 203, an alert notification unit 204, and a rescan processing unit 205.
The scanning success/failure determination unit 203 determines whether at least one corresponding reference point or reference structure exists. Here, a "reference point" is a point that serves as a reference for matching consecutive frames; for example, a marker or the like attached to a structure in the building may be used as a reference point. A "reference structure" is a structure that serves as a reference for matching consecutive frames; for example, a structure having a straight portion, such as the corner of a column, may be used as a reference structure.
The alert notification unit 204 issues an alert prompting a rescan when no corresponding reference point or reference structure exists. The alert notifies the user that a rescan is necessary, and may be an icon or message shown on the display screen of a scanning device such as the distance measuring scanner 201 or the imaging device 202, a warning sound, or the like.
The rescan processing unit 205 receives a rescan instruction from the user and performs the rescan.
The processing in the scanning success/failure determination unit 203, the alert notification unit 204, and the rescan processing unit 205 is repeated until all scan targets have been scanned, and when scanning is complete the third site image and the three-dimensional point cloud data are obtained. The third site image and the three-dimensional point cloud data acquired by the scanning unit 20 may include information on noise components.
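The loop formed by the determination, alert, and rescan functions can be sketched as below. This is an illustrative outline only, with several assumptions that are not stated in the source: each captured frame is represented by the set of reference identifiers (markers or reference structures) detected in it, a frame is judged successful when it shares at least one reference with the previously accepted frame, the first frame is always accepted, and a retry limit is imposed. The names `frames_match`, `scan_all`, `capture`, and `notify` are hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Set


def frames_match(prev_refs: Set[str], curr_refs: Set[str]) -> bool:
    """Success test: at least one reference is shared by consecutive frames."""
    return len(prev_refs & curr_refs) >= 1


def scan_all(
    targets: Iterable[str],
    capture: Callable[[str], Set[str]],
    notify: Callable[[str], None],
    max_retries: int = 3,
) -> List[Set[str]]:
    """Scan every target, alerting and rescanning when matching fails."""
    accepted: List[Set[str]] = []
    prev: Optional[Set[str]] = None
    for target in targets:
        for _attempt in range(max_retries + 1):
            refs = capture(target)
            # The first frame has no predecessor, so it is accepted as is
            # (an assumption of this sketch).
            if prev is None or frames_match(prev, refs):
                accepted.append(refs)
                prev = refs
                break
            # Corresponds to the alert prompting the user to rescan.
            notify(f"Rescan needed for {target}: no shared reference found")
        else:
            raise RuntimeError(f"Scanning of {target} kept failing")
    return accepted
```

In this sketch `capture` stands in for both the initial scan and the user-triggered rescan, so one call is made per attempt.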
FIG. 6 is a schematic diagram showing the noise component removal unit of the present invention.
The noise component removal unit 30 extracts the image of the noise components from the third site image acquired by the scanning unit 20 by stereo matching, generates a mask image of the noise components, reconstructs the image so that the masked portion is interpolated, and generates an image from which the noise components have been removed. The noise component removal unit 30 may have, as separate functions, a stereo matching unit 301, a mask image generation unit 302, and an image reconstruction unit 303.
The stereo matching unit 301 inputs, to an existing machine-learned model for stereo matching, two images of the same target captured from different viewpoints (typically a right image and a left image) from among the third site images acquired by the scanning unit 20, estimates the three-dimensional depth for each pixel, and generates a distance image in which the estimated per-pixel depth is represented as a gradation. The existing machine-learned model for stereo matching may be, for example, a machine-learned model in which a convolutional neural network (CNN) has learned a mapping function that obtains, from two images of the same target captured from different viewpoints (typically a right image and a left image), a disparity image representing the disparity between the two images. The existing machine-learned model for stereo matching may also use recursive refinement that updates the disparity from coarse to fine, or a hierarchical network with a stacked cascade architecture for inference.
In another aspect, instead of the above method that uses an existing machine-learned model for stereo matching, the stereo matching unit 301 may use an existing stereo matching technique that does not use a machine learning model. In this case, for example, two images captured from two positions (typically a right image and a left image) may be used to estimate the three-dimensional depth for each pixel and to generate a distance image in which the estimated per-pixel depth is represented as a gradation.
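As a rough illustration of how a distance image can be derived without a learned model, the sketch below converts a per-pixel disparity map into depth and then into a grayscale gradation. It relies on the standard relation Z = f·B/d for a rectified stereo pair, which the source does not specify; the focal length, baseline, maximum depth, and the function names `disparity_to_depth` and `depth_to_gray` are all assumptions made for this example.

```python
from typing import List, Optional, Sequence


def disparity_to_depth(
    disparity: Sequence[Sequence[float]],
    focal_px: float,
    baseline_m: float,
) -> List[List[Optional[float]]]:
    """Depth Z = f * B / d per pixel; non-positive disparity gives None."""
    return [
        [(focal_px * baseline_m / d) if d > 0 else None for d in row]
        for row in disparity
    ]


def depth_to_gray(
    depth: Sequence[Sequence[Optional[float]]],
    max_depth_m: float,
) -> List[List[int]]:
    """Map depth to 0..255 (near = dark); unknown or far pixels become 255."""
    gray: List[List[int]] = []
    for row in depth:
        out: List[int] = []
        for z in row:
            if z is None or z >= max_depth_m:
                out.append(255)
            else:
                out.append(int(round(255 * z / max_depth_m)))
        gray.append(out)
    return gray
```

The disparity map itself would come from whichever matching technique is used; only the conversion to a gradation image is shown here.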
The mask image generation unit 302 applies threshold processing to the distance image generated by the stereo matching unit 301 and generates a mask image of the noise components. For example, when a wire mesh, which is a noise component, is present in front of the structure to be recognized, a mask image of the wire mesh portion is generated.
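The threshold processing can be sketched in a few lines. The example assumes that the distance image holds a distance per pixel and that anything nearer than a chosen threshold is treated as a noise component in front of the structure, as in the wire mesh case; the threshold value and the function name `make_noise_mask` are hypothetical, and the source does not state how the threshold is chosen.

```python
from typing import List, Sequence


def make_noise_mask(
    distance_image: Sequence[Sequence[float]],
    threshold: float,
) -> List[List[int]]:
    """Return 1 where a pixel is nearer than the threshold, else 0."""
    return [[1 if d < threshold else 0 for d in row] for row in distance_image]
```

For instance, with a structure about 3 m away and a mesh about 0.5 m away, a threshold of 1.0 marks only the mesh pixels.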
The image reconstruction unit 303 removes, from the original image, the masked portion of the mask image generated by the mask image generation unit 302 and obtains an image reconstructed so that the masked portion is interpolated. The image reconstruction unit 303 may input, to an existing machine-learned model for image reconstruction, the third site image containing the noise components and the mask image of the noise components generated by the mask image generation unit 302, and obtain an image in which the masked region has been removed and reconstructed by interpolation. The existing machine-learned model for image reconstruction may be one generated by deep learning using a neural network. In this way, the image reconstruction unit 303 obtains an image in which the noise components have been removed from the third site image.
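To make the idea of interpolating the masked portion concrete, the sketch below fills masked pixels by repeatedly averaging their already-known 4-neighbours. This is a deliberately simple stand-in, not the learned reconstruction model described above, and the function name `fill_masked` and the pass limit are assumptions of the example.

```python
from typing import Dict, List, Sequence, Set, Tuple


def fill_masked(
    image: Sequence[Sequence[float]],
    mask: Sequence[Sequence[int]],
    max_passes: int = 100,
) -> List[List[float]]:
    """Fill pixels where mask == 1 from the average of known neighbours."""
    height, width = len(image), len(image[0])
    result = [[float(v) for v in row] for row in image]
    unknown: Set[Tuple[int, int]] = {
        (r, c) for r in range(height) for c in range(width) if mask[r][c] == 1
    }
    for _ in range(max_passes):
        if not unknown:
            break
        updates: Dict[Tuple[int, int], float] = {}
        for r, c in unknown:
            neighbours = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            values = [
                result[nr][nc]
                for nr, nc in neighbours
                if 0 <= nr < height and 0 <= nc < width and (nr, nc) not in unknown
            ]
            if values:
                updates[(r, c)] = sum(values) / len(values)
        if not updates:
            break  # nothing left that touches a known pixel
        for (r, c), value in updates.items():
            result[r][c] = value
        unknown.difference_update(updates)
    return result
```

Filling proceeds inward from the mask boundary one ring per pass, so thin occluders such as mesh wires are filled in a single pass while large masked areas become progressively smoother toward their centre.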
 図7は、本発明の第3の機械学習モデル生成部を示す概略図である。
FIG. 7 is a schematic diagram showing the third machine learning model generation unit of the present invention.
The third machine learning model generation unit 13 inputs, into the second machine-learned model M2, a set of correct data and observation data for the image obtained by the noise component removal unit 30, that is, the image of the building interior from which noise components have been removed, performs relearning, and generates a third machine-learned model M3. When the second machine-learned model M2 is generated, the correct images and observed images of the second on-site image containing noise components, acquired in the pre-processing before scanning, are used for relearning. However, to also handle new noise components that did not exist at the time of that pre-processing, the image obtained by removing noise components from the third on-site image, which was acquired during the actual scanning and contains noise components, is additionally learned. This generates a third machine-learned model M3 capable of recognizing structures in a manner better suited to the target site. The set of correct images and observed images used for relearning the second machine-learned model M2 is generated from the image, obtained by the noise component removal unit 30, in which noise components have been removed from the third on-site image. The correct image of the noise-removed image shows the structures in the image so that they are distinguished from the background. The correct image may be generated manually, for example by filling in the structure portions of the image by hand. The correct image may be, for example, a binarized image in which structure portions and background portions can be distinguished. As the observed image, the image obtained by removing noise components from the third on-site image may be used as is. Alternatively, an image obtained by applying preprocessing, as necessary, to that noise-removed image may be used as the observed image. Using such a set of a correct image and an observed image of the noise-removed image, relearning is performed on the second machine-learned model M2 to generate the third machine-learned model M3.
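As a non-limiting sketch of how such a set of a correct image and an observed image might be assembled, the following Python fragment pairs a noise-removed image with a hand-painted label and binarizes the label into structure (1) and background (0). The list-of-rows image representation, the function names `binarize` and `make_training_pair`, and the threshold of 128 are assumptions made for illustration and are not taken from the embodiment.

```python
from typing import List, Tuple

# Assumption: a grayscale image is a list of rows of integer pixel values (0..255).
Image = List[List[int]]


def binarize(label_image: Image, threshold: int = 128) -> Image:
    """Turn a hand-painted label into a binary image: 1 = structure, 0 = background."""
    return [[1 if px >= threshold else 0 for px in row] for row in label_image]


def make_training_pair(noise_removed: Image, hand_label: Image) -> Tuple[Image, Image]:
    """Return (correct image, observed image) for relearning.

    The observed image is the noise-removed image used as is; the correct
    image is the binarized hand label of the same size.
    """
    same_height = len(noise_removed) == len(hand_label)
    same_width = all(len(a) == len(b) for a, b in zip(noise_removed, hand_label))
    if not (same_height and same_width):
        raise ValueError("observed image and label must have the same size")
    return binarize(hand_label), noise_removed


if __name__ == "__main__":
    correct, observed = make_training_pair([[5, 6], [7, 8]], [[255, 0], [0, 255]])
    print(correct)   # [[1, 0], [0, 1]]
    print(observed)  # [[5, 6], [7, 8]]
```

Where preprocessing is applied to the observed image, it would be inserted before the pair is returned; the sketch omits it because the embodiment leaves the preprocessing unspecified.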
FIG. 8 is a schematic diagram showing the in-building structure recognition unit and the point cloud data output unit of the present invention.
The in-building structure recognition unit 40 inputs the third on-site image to the third machine-learned model M3 and recognizes the structures inside the building that are contained in the third on-site image. The output of the third machine-learned model M3 shows the structures inside the building in the image so that they are distinguished from the background. The output of the third machine-learned model M3 may be a binarized image in which the portions of the structures inside the building and the background portions can be distinguished. The third on-site image may contain noise components that were present when the site was scanned. Because the third machine-learned model M3 has been additionally trained on images obtained by removing noise components from the third on-site image containing noise components, it can recognize the structures inside the building accurately even when the third on-site image contains noise components.
Based on the image information of the structures inside the building recognized by the in-building structure recognition unit 40, the point cloud data output unit 50 extracts, from the three-dimensional point cloud data scanned by the scanning unit 20, the three-dimensional point cloud data of the portions corresponding to the recognized structures, and outputs it as point cloud data. This yields three-dimensional point cloud data of the structures inside the building, which can be used, for example, to generate three-dimensional CAD data of the building interior including the structures by performing rendering or similar processing on the obtained data.
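The extraction described above can be sketched, purely as an illustration, by projecting each scanned point into the recognized binarized image and keeping the points that land on structure pixels. The embodiment does not state how points are associated with pixels; the pinhole projection, the camera-coordinate convention, the intrinsic parameters `fx`, `fy`, `cx`, `cy`, and the function name `extract_structure_points` are all assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # assumed camera coordinates (x, y, z), z pointing forward
Mask = List[List[int]]              # binarized recognition result: 1 = structure, 0 = background


def extract_structure_points(points: List[Point], mask: Mask,
                             fx: float, fy: float,
                             cx: float, cy: float) -> List[Point]:
    """Keep only the 3D points whose pinhole projection falls on a structure pixel."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    kept: List[Point] = []
    for x, y, z in points:
        if z <= 0:
            continue  # behind the assumed camera; cannot be projected
        u = int(round(fx * x / z + cx))  # pixel column
        v = int(round(fy * y / z + cy))  # pixel row
        if 0 <= v < height and 0 <= u < width and mask[v][u] == 1:
            kept.append((x, y, z))
    return kept


if __name__ == "__main__":
    mask = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]
    cloud = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 0.0, -1.0), (5.0, 5.0, 1.0)]
    print(extract_structure_points(cloud, mask, 1.0, 1.0, 1.0, 1.0))
    # [(0.0, 0.0, 1.0)]
```

A scanner that already stores a pixel index per point would not need the projection step; in that case the mask lookup alone would suffice.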
An in-building structure recognition method for recognizing structures inside a building using a machine learning model is described below.
FIG. 9 is a diagram showing the overall flow of the in-building structure recognition method according to the present invention.
The in-building structure recognition method according to the present invention includes: step S901 of performing machine learning using BIM (Building Information Modeling) data and a first on-site image to generate a first machine-learned model M1; step S902 of performing relearning on the first machine-learned model M1 using a second on-site image containing images of noise components having no BIM data, to generate a second machine-learned model M2; scanning step S903 of scanning the inside of the building while determining the success or failure of scanning of the structures inside the building, and acquiring three-dimensional point cloud data of the structures inside the building and images of the building interior; step S904 of removing images of noise components from the images of the building interior acquired by the scanning unit; step S905 of performing relearning on the second machine-learned model M2 using the noise-removed image obtained in the removing step, to generate a third machine-learned model M3; step S906 of recognizing the structures inside the building using the third machine-learned model M3; and step S907 of extracting and outputting, from the three-dimensional point cloud data acquired by the scanning unit, the point cloud data of the structures inside the building recognized by the in-building structure recognition unit.
In an in-building structure recognition method according to an aspect of the present invention, the scanning step S903 acquires images of the building interior, determines whether at least one corresponding reference point or reference structure exists, and issues an alert prompting a rescan when no corresponding reference point or reference structure exists.
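The success-or-failure determination in scanning step S903 can be illustrated with the following minimal sketch, which reports the frames that share no reference point or reference structure with the frame immediately before them. Representing each frame as a set of identifiers, and the function name `find_rescan_alerts`, are assumptions for illustration; how reference points are detected and matched is outside the scope of the sketch.

```python
from typing import List, Set


def find_rescan_alerts(frames: List[Set[str]]) -> List[int]:
    """Return the indices of frames that share no reference with the previous frame.

    Each frame is modeled as the set of identifiers of the reference points or
    reference structures detected in it (an assumed representation). An empty
    intersection with the preceding frame means no corresponding reference
    exists, so a rescan alert would be issued for that frame.
    """
    alerts: List[int] = []
    for i in range(1, len(frames)):
        if not (frames[i - 1] & frames[i]):
            alerts.append(i)
    return alerts


if __name__ == "__main__":
    scan = [{"a", "b"}, {"b", "c"}, {"d"}, {"d", "e"}]
    for index in find_rescan_alerts(scan):
        print(f"frame {index}: no corresponding reference, please rescan")
    # frame 2: no corresponding reference, please rescan
```

Checking each frame as it arrives, rather than after the whole scan, is what would allow the alert to be raised on site while rescanning is still inexpensive.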
The present invention also provides a program that causes a computer to execute each step of the above in-building structure recognition method.
Example 2 is described below. In Example 2, the in-building structure recognition system 1 described in Example 1 does not have the second machine learning model generation unit 12, and the in-building structure recognition method described in Example 1 does not have the step of generating the second machine-learned model. Points not specifically described below are the same as in the in-building structure recognition system 1 of Example 1.
The in-building structure recognition system of Example 2 for recognizing structures inside a building using a machine learning model comprises: a first machine learning model generation unit 11 that performs machine learning using BIM (Building Information Modeling) data and a first on-site image to generate a first machine-learned model M1; a scanning unit 20 that scans the inside of the building while determining the success or failure of scanning of the structures inside the building, and acquires three-dimensional point cloud data of the inside of the building and a third on-site image; a noise component removal unit 30 that removes images of noise components from the images of the building interior acquired by the scanning unit 20; a third machine learning model generation unit 13 that performs relearning on the first machine-learned model M1 using the noise-removed image obtained by the noise component removal unit 30, to generate a third machine-learned model M3; an in-building structure recognition unit 40 that recognizes the structures inside the building using the third machine-learned model M3; and a point cloud data output unit 50 that extracts and outputs, from the three-dimensional point cloud data acquired by the scanning unit 20, the point cloud data of the structures inside the building recognized by the in-building structure recognition unit 40.
The system of Example 2 differs from the in-building structure recognition system 1 of Example 1 in that it does not have the second machine learning model generation unit 12, which performs relearning on the first machine-learned model using a second on-site image containing images of noise components having no BIM data and generates the second machine-learned model. In Example 2, the third machine learning model generation unit 13 performs relearning on the first machine-learned model M1, not on the second machine-learned model M2, to generate the third machine-learned model M3. That is, in Example 2, the third machine learning model generation unit 13 inputs, into the first machine-learned model M1, a set of correct data and observation data for the image obtained by the noise component removal unit 30, that is, the image of the building interior from which noise components have been removed, performs relearning, and generates the third machine-learned model M3.
The set of correct images and observed images used for relearning the first machine-learned model M1 is generated from the image, obtained by the noise component removal unit 30, in which noise components have been removed from the third on-site image. The correct image of the noise-removed image shows the structures in the image so that they are distinguished from the background. The correct image may be generated manually, for example by filling in the structure portions of the image by hand. The correct image may be, for example, a binarized image in which structure portions and background portions can be distinguished. As the observed image, the image obtained by removing noise components from the third on-site image may be used as is. Alternatively, an image obtained by applying preprocessing, as necessary, to that noise-removed image may be used as the observed image. Using such a set of a correct image and an observed image of the noise-removed image, relearning is performed on the first machine-learned model M1 to generate the third machine-learned model M3.
The in-building structure recognition method of Example 2 for recognizing structures inside a building using a machine learning model includes: step S901 of performing machine learning using BIM (Building Information Modeling) data and a first on-site image to generate a first machine-learned model; scanning step S903 of scanning the inside of the building while determining the success or failure of scanning of the structures inside the building, and acquiring three-dimensional point cloud data of the structures inside the building and images of the building interior; step S904 of removing images of noise components from the images of the building interior acquired by the scanning unit; step S905 of performing relearning on the first machine-learned model using the noise-removed image obtained in the removing step, to generate a third machine-learned model; step S906 of recognizing the structures inside the building using the third machine-learned model; and step S907 of extracting and outputting, from the three-dimensional point cloud data acquired by the scanning unit, the point cloud data of the structures inside the building recognized by the in-building structure recognition unit.
The method of Example 2 differs from the in-building structure recognition method of Example 1 in that it does not have step S902 of performing relearning on the first machine-learned model using a second on-site image containing images of noise components having no BIM data and generating the second machine-learned model. In Example 2, step S905 of generating the third machine-learned model performs relearning on the first machine-learned model M1, not on the second machine-learned model M2, to generate the third machine-learned model M3. That is, in Example 2, a set of correct data and observation data for the image obtained by the noise component removal unit 30, in which noise components have been removed from the third on-site image containing noise components, is input into the first machine-learned model M1, relearning is performed, and the third machine-learned model M3 is generated.
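The relationship between the two examples can be summarized in a short sketch: Example 1 runs steps S901 to S907 and retrains M2 in step S905, whereas Example 2 omits step S902 and retrains M1 instead. The function name `plan_steps` and the boolean flag `use_second_model` are hypothetical names introduced only for this illustration; the step labels and model names are those used in the description.

```python
from typing import List, Tuple


def plan_steps(use_second_model: bool) -> Tuple[List[str], str]:
    """Return the ordered step labels and the model that step S905 retrains.

    use_second_model=True corresponds to Example 1 (S902 is performed and M2 is
    the base for M3); False corresponds to Example 2 (S902 is omitted and M1 is
    the base for M3).
    """
    steps = ["S901"]                 # generate M1 from BIM data and first on-site image
    if use_second_model:
        steps.append("S902")         # relearn M1 with second on-site image to get M2
    steps += ["S903",                # scan: point cloud and images of the interior
              "S904",                # remove noise components from the images
              "S905",                # relearn the base model to get M3
              "S906",                # recognize structures with M3
              "S907"]                # extract and output structure point cloud
    base_model = "M2" if use_second_model else "M1"
    return steps, base_model


if __name__ == "__main__":
    print(plan_steps(True))
    # (['S901', 'S902', 'S903', 'S904', 'S905', 'S906', 'S907'], 'M2')
    print(plan_steps(False))
    # (['S901', 'S903', 'S904', 'S905', 'S906', 'S907'], 'M1')
```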
According to the in-building structure recognition system and the in-building structure recognition method of the present invention described above, the shape and position of the members that matter at a construction site can be measured by focusing on those members, which improves accuracy and speed. The number of members to be managed at the construction site can also be reduced, and with it the amount of data handled by the site's member management system can be reduced significantly. Furthermore, on-site images containing noise components present at a construction site, such as wire mesh, protective nets and sheets, temporarily installed iron fences and poles, debris, and materials, can also be recognized with high accuracy. In addition, by relearning the model to match the latest site conditions in response to noise components that change from moment to moment, the accuracy of structure recognition can be increased. Moreover, the user can be notified of the success or failure of scanning while the site is being scanned and, if the scan is defective, can be prompted to rescan on the spot, which avoids having to redo the scanning work at a later date. Thus, according to the present invention, the various situations specific to an actual site can be handled, and the time and cost of regenerating a model suited to that site can be minimized.
Although the above description has been made with respect to the embodiments, the present invention is not limited thereto, and it will be apparent to those skilled in the art that various changes and modifications can be made within the principles of the present invention and the scope of the appended claims.
1 In-building structure recognition system
11 First machine learning model generation unit
12 Second machine learning model generation unit
13 Third machine learning model generation unit
20 Scanning unit
30 Noise component removal unit
40 In-building structure recognition unit
50 Point cloud data output unit

Claims (14)

  1.  An in-building structure recognition system for recognizing structures inside a building using a machine learning model, the system comprising:
    a first machine learning model generation unit that performs machine learning using BIM (Building Information Modeling) data and a first on-site image to generate a first machine-learned model;
    a second machine learning model generation unit that performs relearning on the first machine-learned model using a second on-site image containing images of noise components having no BIM data, to generate a second machine-learned model;
    a scanning unit that scans the inside of the building while determining the success or failure of scanning of the structures inside the building, and acquires three-dimensional point cloud data of the inside of the building and a third on-site image;
    a noise component removal unit that removes images of noise components from the third on-site image acquired by the scanning unit;
    a third machine learning model generation unit that performs relearning on the second machine-learned model using the image, obtained by the noise component removal unit, from which the noise components have been removed, to generate a third machine-learned model;
    an in-building structure recognition unit that recognizes the structures inside the building using the third machine-learned model; and
    a point cloud data output unit that extracts and outputs, from the three-dimensional point cloud data acquired by the scanning unit, point cloud data of the structures inside the building recognized by the in-building structure recognition unit.
  2.  An in-building structure recognition system for recognizing structures inside a building using a machine learning model, the system comprising:
    a first machine learning model generation unit that performs machine learning using BIM (Building Information Modeling) data and a first on-site image to generate a first machine-learned model;
    a scanning unit that scans the inside of the building while determining the success or failure of scanning of the structures inside the building, and acquires three-dimensional point cloud data of the inside of the building and a third on-site image;
    a noise component removal unit that removes images of noise components from the third on-site image acquired by the scanning unit;
    a third machine learning model generation unit that performs relearning on the first machine-learned model using the image, obtained by the noise component removal unit, from which the noise components have been removed, to generate a third machine-learned model;
    an in-building structure recognition unit that recognizes the structures inside the building using the third machine-learned model; and
    a point cloud data output unit that extracts and outputs, from the three-dimensional point cloud data acquired by the scanning unit, point cloud data of the structures inside the building recognized by the in-building structure recognition unit.
  3.  The in-building structure recognition system according to claim 1 or 2, wherein the first machine learning model generation unit performs machine learning using, as correct data, an image generated from the BIM data and, as observation data, an image obtained by processing an image generated by rendering the BIM data using information obtained from the first on-site image, and thereby generates the first machine-learned model.
  4.  The in-building structure recognition system according to claim 1, wherein the second machine learning model generation unit inputs, into the first machine-learned model, a set of correct data and observation data for the second on-site image containing images of noise components having no BIM data, performs relearning, and generates the second machine-learned model.
  5.  The in-building structure recognition system according to claim 1 or 2, wherein the scanning unit acquires images of the inside of the building, determines whether at least one corresponding reference point or reference structure exists between consecutive frames, and issues an alert prompting a rescan when the at least one corresponding reference point or reference structure does not exist.
  6.  The in-building structure recognition system according to claim 1 or 2, wherein the noise component removal unit extracts, by stereo matching, the images of the noise components from the third on-site image acquired by the scanning unit, generates a mask image of the noise components, reconstructs the image so as to interpolate the portion of the mask image, and generates the image from which the noise components have been removed.
  7.  The in-building structure recognition system according to claim 1, wherein the third machine learning model generation unit inputs, into the second machine-learned model, a set of correct data and observation data for the image, obtained by the noise component removal unit, in which the noise components have been removed from the third on-site image, performs relearning, and generates the third machine-learned model.
  8.  The in-building structure recognition system according to claim 2, wherein the third machine learning model generation unit inputs, into the first machine-learned model, a set of correct data and observation data for the image, obtained by the noise component removal unit, in which the noise components have been removed from the third on-site image, performs relearning, and generates the third machine-learned model.
  9.  The in-building structure recognition system according to claim 1 or 2, wherein the in-building structure recognition unit inputs the third on-site image to the third machine-learned model and recognizes the structures inside the building contained in the third on-site image.
  10.  An in-building structure recognition method for recognizing structures inside a building using a machine learning model, the method comprising:
    a step of performing machine learning using BIM (Building Information Modeling) data and a first on-site image to generate a first machine-learned model;
    a step of performing relearning on the first machine-learned model using a second on-site image containing images of noise components having no BIM data, to generate a second machine-learned model;
    a scanning step of scanning the inside of the building while determining the success or failure of scanning of the structures inside the building, and acquiring three-dimensional point cloud data of the structures inside the building and a third on-site image;
    a step of removing images of noise components from the images of the inside of the building acquired in the scanning step;
    a step of performing relearning on the second machine-learned model using the image, obtained in the removing step, from which the noise components have been removed, to generate a third machine-learned model;
    a step of recognizing the structures inside the building using the third machine-learned model; and
    a step of extracting and outputting, from the three-dimensional point cloud data acquired in the scanning step, point cloud data of the structures inside the building recognized in the recognizing step.
  11.  The in-building structure recognition method according to claim 10, wherein the scanning step acquires images of the inside of the building, determines whether at least one corresponding reference point or reference structure exists between consecutive frames, and issues an alert prompting a rescan when the at least one corresponding reference point or reference structure does not exist.
  12.  An in-building structure recognition method for recognizing structures inside a building using a machine learning model, the method comprising:
    a step of performing machine learning using BIM (Building Information Modeling) data and a first on-site image to generate a first machine-learned model;
    a scanning step of scanning the inside of the building while determining the success or failure of scanning of the structures inside the building, and acquiring three-dimensional point cloud data of the structures inside the building and images of the inside of the building;
    a step of removing images of noise components from the images of the inside of the building acquired in the scanning step;
    a step of performing relearning on the first machine-learned model using the image, obtained in the removing step, from which the noise components have been removed, to generate a third machine-learned model;
    a step of recognizing the structures inside the building using the third machine-learned model; and
    a step of extracting and outputting, from the three-dimensional point cloud data acquired in the scanning step, point cloud data of the structures inside the building recognized in the recognizing step.
  13.  The in-building structure recognition method according to claim 12, wherein the scanning step acquires images of the inside of the building, determines whether at least one corresponding reference point or reference structure exists between consecutive frames, and issues an alert prompting a rescan when the at least one corresponding reference point or reference structure does not exist.
  14.  A program that causes a computer to execute each step of the method according to any one of claims 10 to 13.
PCT/JP2022/031247 2022-08-18 2022-08-18 Building interior structure recognition system and building interior structure recognition method WO2024038551A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/031247 WO2024038551A1 (en) 2022-08-18 2022-08-18 Building interior structure recognition system and building interior structure recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/031247 WO2024038551A1 (en) 2022-08-18 2022-08-18 Building interior structure recognition system and building interior structure recognition method

Publications (1)

Publication Number Publication Date
WO2024038551A1 true WO2024038551A1 (en) 2024-02-22

Family

ID=89941509

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/031247 WO2024038551A1 (en) 2022-08-18 2022-08-18 Building interior structure recognition system and building interior structure recognition method

Country Status (1)

Country Link
WO (1) WO2024038551A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180018805A1 (en) * 2016-07-13 2018-01-18 Intel Corporation Three dimensional scene reconstruction based on contextual analysis
JP2021140445A (en) * 2020-03-05 2021-09-16 株式会社トプコン Information processing apparatus, inference model construction method, information processing method, inference model, program, and recording medium
JP2021140379A (en) * 2020-03-04 2021-09-16 株式会社フジタ Learning method for abnormality detection model of building, learning device, generation method for landscape information of building, generation device, and computer program
JP2022089663A (en) * 2020-12-04 2022-06-16 株式会社竹中工務店 Information processing apparatus
JP7118490B1 (en) * 2021-12-13 2022-08-16 株式会社センシンロボティクス Information processing system, information processing method, program, mobile object, management server

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22955725

Country of ref document: EP

Kind code of ref document: A1