WO2023108210A1 - Infrastructure safety inspection system - Google Patents
- Publication number
- WO2023108210A1 WO2023108210A1 PCT/AU2022/051501 AU2022051501W WO2023108210A1 WO 2023108210 A1 WO2023108210 A1 WO 2023108210A1 AU 2022051501 W AU2022051501 W AU 2022051501W WO 2023108210 A1 WO2023108210 A1 WO 2023108210A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- safety inspection
- map
- damage
- images
- inspection method
- Prior art date
Links
- 238000007689 inspection Methods 0.000 title claims abstract description 55
- 230000006378 damage Effects 0.000 claims abstract description 40
- 238000000034 method Methods 0.000 claims abstract description 39
- 238000013528 artificial neural network Methods 0.000 claims abstract description 6
- 238000010801 machine learning Methods 0.000 claims description 18
- 238000013507 mapping Methods 0.000 claims description 9
- 230000009466 transformation Effects 0.000 claims description 7
- 230000004807 localization Effects 0.000 claims description 4
- 238000004458 analytical method Methods 0.000 claims description 3
- 230000000007 visual effect Effects 0.000 claims description 3
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 claims description 3
- 238000005286 illumination Methods 0.000 claims description 2
- 238000001454 recorded image Methods 0.000 claims description 2
- 230000007547 defect Effects 0.000 description 15
- 230000007170 pathology Effects 0.000 description 12
- 238000001514 detection method Methods 0.000 description 10
- 230000008569 process Effects 0.000 description 9
- 238000012545 processing Methods 0.000 description 6
- 238000012549 training Methods 0.000 description 6
- 238000013519 translation Methods 0.000 description 6
- 230000014616 translation Effects 0.000 description 6
- 238000013459 approach Methods 0.000 description 5
- 238000013527 convolutional neural network Methods 0.000 description 5
- 230000007797 corrosion Effects 0.000 description 5
- 238000005260 corrosion Methods 0.000 description 5
- 238000005336 cracking Methods 0.000 description 5
- 230000032258 transport Effects 0.000 description 5
- 238000012937 correction Methods 0.000 description 4
- 239000011159 matrix material Substances 0.000 description 4
- 238000011179 visual inspection Methods 0.000 description 4
- 238000012935 Averaging Methods 0.000 description 3
- 230000032798 delamination Effects 0.000 description 3
- 230000008439 repair process Effects 0.000 description 3
- 230000009471 action Effects 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 238000013481 data capture Methods 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 230000004927 fusion Effects 0.000 description 2
- 238000011835 investigation Methods 0.000 description 2
- 230000015654 memory Effects 0.000 description 2
- 238000004901 spalling Methods 0.000 description 2
- 238000001228 spectrum Methods 0.000 description 2
- 230000001360 synchronised effect Effects 0.000 description 2
- 230000001960 triggered effect Effects 0.000 description 2
- 238000012800 visualization Methods 0.000 description 2
- 229910001294 Reinforcing steel Inorganic materials 0.000 description 1
- 229910000831 Steel Inorganic materials 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 239000003086 colorant Substances 0.000 description 1
- 238000001816 cooling Methods 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 238000009499 grossing Methods 0.000 description 1
- 238000002329 infrared spectrum Methods 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 238000012423 maintenance Methods 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 239000000463 material Substances 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000008447 perception Effects 0.000 description 1
- 230000011218 segmentation Effects 0.000 description 1
- 230000003595 spectral effect Effects 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
- 239000010959 steel Substances 0.000 description 1
- 239000000126 substance Substances 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 238000001931 thermography Methods 0.000 description 1
- 238000002211 ultraviolet spectrum Methods 0.000 description 1
- 238000010200 validation analysis Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S17/00—Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
- G01S17/88—Lidar systems specially adapted for specific applications
- G01S17/89—Lidar systems specially adapted for specific applications for mapping or imaging
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/84—Systems specially adapted for particular applications
- G01N21/88—Investigating the presence of flaws or contamination
- G01N21/8851—Scan or image signal processing specially adapted therefor, e.g. for scan signal adjustment, for detecting different kinds of defects, for compensating for structures, markings, edges
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01M—TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
- G01M5/00—Investigating the elasticity of structures, e.g. deflection of bridges or air-craft wings
- G01M5/0033—Investigating the elasticity of structures, e.g. deflection of bridges or air-craft wings by determining damage, crack or wear
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01M—TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
- G01M5/00—Investigating the elasticity of structures, e.g. deflection of bridges or air-craft wings
- G01M5/0091—Investigating the elasticity of structures, e.g. deflection of bridges or air-craft wings by using electromagnetic excitation or detection
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/38—Concrete; ceramics; glass; bricks
- G01N33/383—Concrete, cement
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S17/00—Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
- G01S17/86—Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/003—Transmission of data between radar, sonar or lidar systems and remote stations
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/48—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00
- G01S7/4802—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/48—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00
- G01S7/4808—Evaluating distance, position or velocity data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
- G06T7/0008—Industrial image inspection checking presence/absence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
- G06T7/001—Industrial image inspection using an image reference approach
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
- G06V20/647—Three-dimensional objects by matching two-dimensional images to three-dimensional objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10032—Satellite or aerial image; Remote sensing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30108—Industrial image inspection
- G06T2207/30132—Masonry; Concrete
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30181—Earth observation
- G06T2207/30184—Infrastructure
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/521—Depth or shape recovery from laser ranging, e.g. using interferometry; from the projection of structured light
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W84/00—Network topologies
- H04W84/02—Hierarchically pre-organised networks, e.g. paging networks, cellular networks, WLAN [Wireless Local Area Network] or WLL [Wireless Local Loop]
- H04W84/10—Small scale networks; Flat hierarchical networks
- H04W84/12—WLAN [Wireless Local Area Networks]
Definitions
- the present invention relates to the field of visual inspections of structures and infrastructure for signs of damage, cracking and corrosion.
- US Patent US20210174492A1 represented an advance in inspection methods by providing the operator with a mixed reality headset which photographs a structure, sends the data to a server for machine-learning defect identification, and returns the results to the mixed reality headset visualisation, where the user interacts with the data through the headset as an inspection tool.
- the improved method does not remove the need for an operator to work at heights or in confined spaces.
- forcing an operator to wear a mixed reality headset while working may in fact increase the level of risk by inhibiting the field of view and perception of the operator.
- More traditional methods of manual inspection, such as photographing discovered defects, writing reports and triggering workflows, are subjective processes that depend on the inspector’s judgement and diligence, and are not infallible. Defects can be missed.
- the invention resides in a method of detecting and mapping damage in structures including the steps of: recording images of a structure; acquiring a 3D map of the structure; detecting damage in the structure by analyzing 2D images with a neural network; and locating the detected damage on the 3D map of the structure.
- the recorded images are selected from 3D depth images and 2D video images.
- the 3D map is suitably a 3D point cloud or 3D mesh of the structure that is pre-existing, or is generated using Simultaneous Localisation and Mapping (SLAM) with position estimation obtained from one or more of: odometry from 3D depth images or 2D images; Inertial Navigation System (INS); Global Navigation Satellite System (GNSS); Real-Time Kinematics [RTK]; Post- Processed Kinematics (PPK); and dense point cloud generated from 3D depth images or 2D images.
- SLAM Simultaneous Localisation and Mapping
- the step of detecting damage suitably includes using machine learning to compare images of the concrete structure to a library of images indicative of damage.
- the step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from SLAM.
- the step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from INS.
- the step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from GNSS, or an optimised GNSS position using RTK or PPK.
- the step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from visual odometry.
- the step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface using non-rigid, non-linear pointcloud transformation.
- the method may include the further steps of generating alerts and triggering workflow.
- the system is mounted on an unmanned vehicle, such as a drone for aerial inspections, or a ground-based or water-based remote inspection vehicle, as appropriate.
- the invention resides in a concrete safety inspection system whereby a 3D map of a concrete structure is generated in real-time using Simultaneous Localisation and Mapping (SLAM) using one or more of odometry from 3D depth images, odometry from 2D video data, position estimation from inertial navigation system (INS) and a Global Navigation Satellite System (GNSS), and optimised GNSS position using Real-Time Kinematics [RTK] from a rover and base station.
- SLAM Simultaneous Localisation and Mapping
- INS inertial navigation system
- GNSS Global Navigation Satellite System
- optimised GNSS position using Real-Time Kinematics [RTK] from a rover and base station.
- a neural network analyses the 2D video data for the detection of concrete damage. The location of the damage is mapped from the 2D imagery onto the 3D map.
- FIG 1 is a block diagram of the major components of the invention.
- FIG 2 is a flowchart depicting the major steps in the implementation of the invention.
- FIG 3 shows a concrete bridge undergoing a safety inspection
- FIG 4 shows a concrete enclosed space undergoing a safety inspection
- FIG 5 shows an indoor and outdoor concrete inspection
- Embodiments of the present invention reside primarily in an infrastructure safety inspection system. Accordingly, the method steps and elements have been illustrated in concise schematic form in the drawings, showing only those specific details that are necessary for understanding the embodiments of the present invention, so as not to obscure the disclosure with excessive detail that will be readily apparent to those of ordinary skill in the art having the benefit of the present description.
- adjectives such as first and second, left and right, and the like may be used solely to distinguish one element or action from another element or action without necessarily requiring or implying any actual such relationship or order.
- Words such as “comprises” or “includes” are intended to define a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed, including elements that are inherent to such a process, method, article, or apparatus.
- FIG 1 presents a block diagram of the major components required to implement the invention.
- the elements within the larger block [110] are co-located sensors which comprise the scanner element of the system. A user interface [120] can be physically separated from the scanner [110], while an optional separate GNSS reference data stream [130] may also be leveraged by the system.
- the scanner [110] is transported around the surfaces of a concrete structure by operators in a manner which suits the asset being inspected. The user interface can be used to control and visualise the output of the scanner wherever the human operator is situated, either remotely or on the structure, with data streamed wirelessly between [110] and [120].
- the scanner [110] includes a 2D video capture system [111] which preferably operates in the visible light spectrum for ease of human interpretation, but may occupy a selected spectrum such as the infrared spectrum, which is especially useful for thermal imaging and may reveal signs of active concrete deformation from dynamic kinematic motion, or the relative coolness of moisture in some areas. It may also operate in the ultraviolet spectrum, or indeed in hyper-spectral or multi-spectral bands, which can aid in the detection of moisture content or changes in the concrete material properties.
- the source of the video data may optionally be the left or right image of a stereo camera, but the inventors have found a separate video capture system allows greatest flexibility.
- a 2D video capture system can also be aided by internal signal processing, such as automatic white balance and also high dynamic range (HDR) signal processing.
- HDR signal processing involves taking several images at high speed at slightly different exposure values, and integrating them into a single image that captures a larger range of tonal values. This can be helpful for areas of a concrete structure that are back-lit by the sun or sky, and for scenes containing both bright light and shadow.
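The exposure-integration step can be illustrated with a minimal sketch (an illustration in the spirit of Mertens-style exposure fusion, not the scanner's actual HDR pipeline; the function name and the Gaussian well-exposedness weighting are assumptions made for this example):

```python
import numpy as np

def fuse_exposures(frames):
    """Fuse a burst of differently exposed frames (float arrays in [0, 1])
    into one image by weighting each pixel toward well-exposed values."""
    frames = [np.asarray(f, dtype=np.float64) for f in frames]
    # Well-exposedness weight: Gaussian centred on mid-grey (0.5).
    weights = [np.exp(-((f - 0.5) ** 2) / (2 * 0.2 ** 2)) for f in frames]
    total = np.sum(weights, axis=0) + 1e-12          # avoid divide-by-zero
    fused = np.sum([w * f for w, f in zip(weights, frames)], axis=0) / total
    return np.clip(fused, 0.0, 1.0)

# Example: a dark, a mid and a bright exposure of the same flat scene.
dark  = np.full((4, 4), 0.1)
mid   = np.full((4, 4), 0.5)
light = np.full((4, 4), 0.9)
hdr = fuse_exposures([dark, mid, light])
```

A real pipeline would additionally align the frames and operate per colour channel; this sketch shows only the weighted integration of tonal values.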
- an external lighting source to illuminate the scene can be useful, especially in dark concrete structures such as tunnels, or under the eaves of bridges.
- This lighting source may be co-located with the scanner or may be external to it.
- the 2D video data capture system [111] may be co-located with a 3D depth image capture system [112] with a common origin and line of sight.
- the 3D depth images contain a 3D spatial reference of the scene from a local origin and field of view along a line of sight.
- Stereo camera systems provide high-density depth images with moderate depth accuracy, while LiDAR generally provides lower-density depth images with higher depth accuracy.
- Red Green Blue Depth (RGB-D) cameras which use active illumination patterns such as Microsoft Kinect or Intel Realsense RGB-D cameras or the like also produce depth images fused with Red, Green and Blue channels and may be used in the invention.
- RGB-D Red Green Blue Depth
- the inventors have found the properties of depth images derived from LiDAR and stereo-cameras to be most beneficial for the application.
- a concrete safety inspection system acquires GNSS data when it is available.
- a GNSS capture system [113] may be co-located with the other sensors in the scanner.
- an external GNSS correction reference data stream [130] may be available.
- An inertial navigation system, comprising one or more of a gyroscope, accelerometer and magnetometer, preferably with a temperature sensor and pressure sensor for INS data correction, may also be co-located with the sensors in the scanner.
- the INS fuses the data from each inertial sensor, for which several options are available, including INS sensors which pre-fuse the data to generate a single position and pose estimate in 6 degrees of freedom. Alternatively, a Kalman Filter may be used to fuse the data and obtain an updated pose and position estimate.
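The Kalman Filter fusion mentioned above can be sketched in its simplest scalar form (illustrative only; a real INS fuses full 6-degree-of-freedom state vectors, and the estimates and variances below are invented values):

```python
import numpy as np

def kalman_update(x, P, z, R):
    """One scalar Kalman measurement update: fuse state estimate x
    (with variance P) with a new measurement z (with variance R)."""
    K = P / (P + R)            # Kalman gain: how much to trust z over x
    x_new = x + K * (z - x)    # corrected estimate
    P_new = (1 - K) * P        # reduced uncertainty after the update
    return x_new, P_new

# Fuse an odometry position estimate with a GNSS fix (1D illustration).
x, P = 10.0, 4.0       # odometry says 10 m, variance 4
z, R = 12.0, 1.0       # GNSS says 12 m, variance 1
x, P = kalman_update(x, P, z, R)   # estimate moves toward the GNSS fix
```

The corrected estimate lands closer to the lower-variance measurement, and the variance of the fused estimate is smaller than either input.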
- the user interface [120] communicates with the embedded computer to allow an operator to control the scanner and visualise the output data in near real-time remotely or locally to the operator while scanning is taking place.
- 2D video data capture system [111] and the 3D depth image capture system [112] generate data which can be fused by establishing a common origin and line of sight of the two sensors.
- a static frame from the 2D video image data may be taken at the same time as a frame from the 3D depth image data, and a template matching method using an iterative closest point (ICP) algorithm or similar is applied.
- ICP is a family of template matching algorithms which use an affine transformation matrix to move a target image and a reference image in iterative steps of translation and rotation until the sum of the distances between the two templates is minimised.
- the output of the ICP is a transformation matrix for the 6 degrees of freedom, namely the pitch, roll and yaw angles, and the x, y and z translations. Applying this matrix to one of the two sensor data types delivers a common origin and line of sight for the sensors.
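The iterative alignment described above can be sketched in Python (an illustrative implementation, not the patent's own code; for clarity it solves for a rigid rotation and translation at each step, the common point-to-point ICP variant, rather than a general affine transform):

```python
import numpy as np

def best_fit_transform(A, B):
    """Least-squares rigid transform (R, t) mapping points A onto B
    via the SVD (Kabsch) method."""
    cA, cB = A.mean(axis=0), B.mean(axis=0)
    H = (A - cA).T @ (B - cB)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:            # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cB - R @ cA
    return R, t

def icp(source, target, iters=20):
    """Minimal point-to-point ICP: repeatedly match each source point to
    its nearest target point, then solve for the best rigid alignment."""
    src = source.copy()
    for _ in range(iters):
        # Nearest-neighbour correspondences (brute force for clarity).
        d = np.linalg.norm(src[:, None] - target[None, :], axis=2)
        matched = target[d.argmin(axis=1)]
        R, t = best_fit_transform(src, matched)
        src = src @ R.T + t
    return src

# Recover a known small offset between two copies of the same point grid.
g = np.linspace(0.0, 1.0, 4)
target = np.array([[x, y, z] for x in g for y in g for z in g])
source = target + np.array([0.05, -0.03, 0.02])   # translated copy
aligned = icp(source, target)
```

With a good initial alignment, the iterative distance minimisation converges and the residual between the two point sets vanishes.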
- the 2D image capture and 3D image capture systems are mechanically aligned in pitch, roll and yaw angles, and are aligned in two of the three translation axes, and are offset by a known translation in the third axis which optionally can be accounted for by a translation of the output data. This method gives a common line of sight and a common origin.
- Another effective embodiment for this alignment is to use the sensor output data of both a 2D image and 3D depth image, and input a selection of three or more common points in a scene in both a frame from the 2D video image data and the matching frame from the 3D depth image data. These points are used to create a transformation matrix to align the two data types and generate a common origin and line of sight for the two sensors.
- the purpose is to fuse the data between sensors, bringing together the video capture data and the depth image data and the derivatives of these data.
- In FIG 2 there are two main streams illustrated in this workflow, namely a SLAM stream [270] and a defect detection stream using machine learning [280]; the streams are shaded in light grey for ease of understanding the flow chart.
- the two streams fuse to locate any defects on the 3D map.
- BIM data or similar 3D maps of a structure are typically point clouds or 3D meshes.
- the user may elect to provide an existing 3D map of the structure [250] which is used as a ‘ground truth’ 3D reference guide for the SLAM stream [270] later in the process.
- a 3D map of the structure is generated specifically for this purpose as a process separate from the concrete inspection, whereby the structure is modelled using a laser scanning system, photogrammetry, or other appropriate surveying methods.
- a user may elect to use such a 3D map [260] as a 3D ‘ground truth’ reference for the SLAM processing [270].
- the 3D map is generated during the concrete safety inspection by the scanner by using a SLAM algorithm, without any external reference, which is explained by the SLAM stream [270] of FIG 2.
- the main process loop begins at the start of the flow chart [201], with the simultaneous acquisition of 3D depth image data [202], GNSS data and GNSS reference data [203] where available, INS data [204] and 2D video data [205] by the scanner, all synchronised by the same real-time clock.
- FIG 2 illustrates all four sensors; however, only one or more of the sensors is required to locate defects on a 3D map and implement the invention.
- 2D image data alone is acquired, monocular SLAM using visual odometry is implemented on the 2D image data to estimate its origin and line of sight, and the 2D image data is projected onto a 3D map.
- all four of the sensors are used in the SLAM stream.
- an algorithm fuses the image odometry estimates [206], INS pose and position estimates [207] and GNSS data where available [208], using a Kalman Filter, averaging, weighted averaging, or another suitable method.
- the output of this fusion is a single position and pose estimate [209] for the scanner; this could also be described as the updated origin and line of sight of the 3D depth image and its pre-fused image data.
- the depth image can be transformed to a global co-ordinate system [210]. Once transformed, the plurality of 3D depth images contribute to a single high-density 3D model of the scene, referred to as a global point cloud [210].
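The transformation of depth-image points into the global co-ordinate system can be sketched as follows (illustrative values only; in the system described, the rotation R and translation t would come from the fused pose estimate [209]):

```python
import numpy as np

def to_global(points_local, R, t):
    """Transform local depth-image points into the global frame using the
    scanner's estimated pose: p_global = R @ p_local + t."""
    return points_local @ R.T + t

# Illustrative pose: a 90-degree yaw and a 2 m offset along x.
yaw = np.pi / 2
R = np.array([[np.cos(yaw), -np.sin(yaw), 0.0],
              [np.sin(yaw),  np.cos(yaw), 0.0],
              [0.0,          0.0,         1.0]])
t = np.array([2.0, 0.0, 0.0])

local = np.array([[1.0, 0.0, 0.0]])     # one point 1 m ahead of the sensor
global_pts = to_global(local, R, t)     # its position in the global map
```

Each incoming depth frame transformed this way can simply be concatenated (e.g. with `np.vstack`) into the growing global point cloud.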
- the SLAM output can be optimised using one or more additional point cloud corrections: estimating the shape of the surface being scanned to correct irregularities in the point cloud [211], and using Loop Closure [212] to correct map drift errors by recognising previously scanned areas and closing the scan loop, making corrections to the entire map to ensure both scans of the same area have matching coordinates.
- updating the point cloud [211] in the SLAM algorithm by matching the shape of the local point cloud to the global point cloud using shape estimates can be beneficial for mapping concrete assets with distinctive geometric forms.
- the inventors have found that methods such as plane-of-best-fit, RANSAC, and gridding methods such as the Normal Distribution Transform (NDT) can aid SLAM mapping by matching planes and shapes in the 3D data.
- NDT Normal Distribution Transform
- the SLAM output map is put through a non-linear transformation so that its point cloud set matches the reference map [290].
- This non-rigid transformation may be done using a special-case polyharmonic spline algorithm for smoothing and interpolation: Thin Plate Spline Robust Point Matching (TPS-RPM). TPS-RPM morphs and smooths the SLAM 3D map to closely match the ground truth 3D reference map.
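The non-rigid warping step can be illustrated with a small sketch (an assumption-laden illustration, not the system's implementation: it fits a 3D polyharmonic spline with kernel phi(r) = r, a 3D analogue of the thin plate spline, given known landmark correspondences, and omits the robust point-matching (RPM) correspondence estimation of the full TPS-RPM algorithm):

```python
import numpy as np

def fit_polyharmonic(ctrl, targets):
    """Fit a 3D polyharmonic spline warp that maps control points `ctrl`
    exactly onto `targets` (no regularisation), returning a warp function."""
    n = len(ctrl)
    K = np.linalg.norm(ctrl[:, None] - ctrl[None, :], axis=2)  # phi(r) = r
    P = np.hstack([np.ones((n, 1)), ctrl])                     # affine part
    A = np.zeros((n + 4, n + 4))
    A[:n, :n] = K
    A[:n, n:] = P
    A[n:, :n] = P.T
    b = np.zeros((n + 4, 3))
    b[:n] = targets
    coeff = np.linalg.solve(A, b)        # spline weights + affine terms
    w, a = coeff[:n], coeff[n:]

    def warp(pts):
        U = np.linalg.norm(pts[:, None] - ctrl[None, :], axis=2)
        return U @ w + np.hstack([np.ones((len(pts), 1)), pts]) @ a
    return warp

# Warp a drifted SLAM cloud so five landmarks land on reference positions.
rng = np.random.default_rng(1)
ctrl = rng.random((5, 3))
targets = ctrl + rng.normal(0.0, 0.02, (5, 3))   # slightly drifted landmarks
warp = fit_polyharmonic(ctrl, targets)
```

Applying `warp` to the full SLAM point cloud would smoothly morph it so the landmarks coincide with the reference map while interpolating in between.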
- the fused 3D point cloud with its global coordinate system is also useful when saved for future reference by concrete inspectors and operators to compare changes in shape and appearance over time.
- the global point cloud is down-sampled to a low-density 3D occupancy grid of voxels.
- a voxel is a cell of a regular 3D grid defined by three axes, forming regular rectangular prisms or cubes.
- a voxel is determined to be occupied if a threshold number of points is present in the voxel.
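The occupancy-threshold down-sampling can be sketched as follows (the voxel size and point threshold are illustrative values, not the system's settings):

```python
import numpy as np

def occupancy_grid(points, voxel_size, min_points=3):
    """Down-sample a point cloud into an occupancy set: a voxel counts
    as occupied only if it contains at least `min_points` points."""
    idx = np.floor(points / voxel_size).astype(int)       # voxel indices
    keys, counts = np.unique(idx, axis=0, return_counts=True)
    return {tuple(k) for k, c in zip(keys, counts) if c >= min_points}

# Five points inside one 0.3 m voxel, plus one stray outlier point.
pts = np.array([[0.05, 0.05, 0.05], [0.10, 0.10, 0.10], [0.20, 0.10, 0.05],
                [0.15, 0.20, 0.10], [0.25, 0.25, 0.25], [2.00, 2.00, 2.00]])
occupied = occupancy_grid(pts, voxel_size=0.3, min_points=3)
```

Only the densely populated voxel is marked occupied; the lone stray point falls below the threshold and is discarded as noise.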
- This data is classically used by robotic navigation systems for local and wide-area navigation [216], which is not the topic of this invention.
- the inventors have found it useful to create multiple levels of detail and sizes, preferably three levels of 30 cm³, 60 cm³ and 1 m³ voxels, to aid in navigation and collision avoidance for unmanned vehicles whilst optimising memory and processing speed [214].
- 2D image data is acquired [205] and a machine learning (ML) algorithm is used to identify concrete pathologies in the 2D video data [220].
- the algorithm ingests the 2D video data as a series of 2D images or frames. Each image is analysed and the ML algorithm searches for patterns in the shape and colour channels of the pixel data of a group of frames and produces a probability that concrete damage is present in the image.
- supervised learning is used whereby the algorithm has been trained for this pattern detection on a dataset of human-labelled 2D video data images or frames.
- the human labels the images into categories such as ‘crack’, ‘concrete’, ‘high porosity concrete’, ‘corrosion’, ‘internal corrosion’, ‘delamination’, ‘repaired concrete’, ‘not concrete’, or other categories as may be appropriate. Ideally, thousands of such labelled images are required.
- before labelling takes place, each image is ideally broken into smaller image tiles with the same pixel size and dimensions per tile, such as 224 x 224 pixels or 256 x 256 pixels, although any other size may be used.
- Each image tile is separately labelled by the human according to its classification.
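The tiling step above can be sketched as follows (an illustrative helper, not the system's code; it records each tile's pixel offset so tile coordinates can later be mapped back to the full frame, and for simplicity it drops partial tiles at the image edges):

```python
import numpy as np

def tile_image(image, tile=224):
    """Split an image into non-overlapping square tiles, keeping the
    (row, col) pixel offset of each tile for later coordinate mapping."""
    H, W = image.shape[:2]
    tiles = []
    for y in range(0, H - tile + 1, tile):
        for x in range(0, W - tile + 1, tile):
            tiles.append(((y, x), image[y:y + tile, x:x + tile]))
    return tiles

frame = np.zeros((448, 672, 3), dtype=np.uint8)   # fits a 2 x 3 grid of tiles
tiles = tile_image(frame)                          # six 224 x 224 tiles
```

Each `(offset, tile)` pair can then be labelled independently, and the offset lets the tile's detections be located within the original frame.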
- object detection is used, whereby the visible extents of the object being labelled in each frame is annotated by the human with a bounding box the corners of which reach the maximum and minimum X and Y coordinates of the object in the image.
- Another embodiment takes the approach of semantic segmentation, where the human labeller segments each frame into its separate categories with complex polygons covering only the pixels belonging to this category, for example an image of a crack in competent concrete are segmented in such a way that pixels belonging to the crack are grouped together and likewise the competent concrete are also segmented.
- CNNs Convolutional Neural Networks
- the training phase is where the algorithm, running on a computer, ingests the labelled image data and detects patterns in the labelled datasets. This is done by leveraging convoluted layers, starting with an input layer, various levels of hidden layers which spatially convolute image image and retain pattern information, before finally passing to an output layer, which is the label given by the human.
- the image data in this case is 2D video data with both its natural and fused reference systems [217].
- the image is then broken into the same-size image tiles as were used in the training phase, and the global coordinates (taken from the SLAM algorithm described earlier) and preferably the global coordinates of the four corners of each image tile are recorded by default.
- the global coordinates of the centre pixel of each image tile or other forms of averaging such as a mean or median of each axis of the coordinate system or excluding pixel coordinates from such calculations by means of only using pixels with coordinate values within one or two standard deviations of the mean may be selected by a user.
- a user may elect to apply a depth mage threshold to the fused 2D image data whereby images with distant pixels are excluded from the fused 2D image data hence removing distant objects like the ground or trees or water courses adjacent to a concrete structure which may appear in part of an image tile.
- the ML algorithm then separates each image tile, region or pixel into its pre-trained categories. If the algorithm identifies a type of pathology, the tile is saved as a 2D image, and as a 3D object with a bounding box defined by the coordinates of the image tile, or other point data or image masked data as applicable depending on the ML algorithm selected by the operator. These objects can be visualised in runtime for a user on the user interface, and are also saved in a process we shall now explore.
- Concrete structures can have a wide range of surfaces that can pose challenges to machine learning models. Concrete surfaces can be painted, they can have exposed aggregate, or have sandy or smooth surfaces, they can be weathered, stained or blackened, they can have a range of base colours such as light-grey, white, pink, dark grey or be actively coloured during manufacturing.
- the large database is split into surface-types, and separate machine learning models are generated by surface type.
- the appropriate model is then applied to a concrete asset being inspected. This method gives a higher accuracy, recall and precision, but requires a lot more training data to be useful.
- Alerting and workflow logic is then applied to the data for useful outputs, which may include on-screen alerts and visualisations of the pathologies so an operator realises a pathology is present while performing the inspection [222].
- the 3D global map can be visualised to allow operators to determine if they have scanned all areas of interest to them [215], which is a second useful output, reducing the need for rework or revisits by an operator who would otherwise lack real-time feedback.
- Another useful output [223] is to pass all image tile coordinates of detected defects, or 2D image tiles or 3D image tile objects to work-flow software, which can trigger human-led tasks such as repair work, further inspections, or more detailed engineering investigations. In many cases the detailed inspections can be done using the output data of the concrete safety inspection system and do not require any further field work. Triggers of investigations may be based on the type, location and density of pathologies detected by a concrete safety inspection system. This work can then be scheduled and carried out to ensure concrete assets are kept safe and operational.
- the workflow can be triggered when the scanner is connected to the internet and can contact a workflow server.
- the scanner sends the server an output, such as a message, command or alert to trigger a workflow based on the location, type and density of pathologies detected.
- This data and the 3D defect detection data can be stored in a database to trigger workflows, or individual alerts can be raised by email, SMS, social media messaging, or other messaging services such as WhatsApp.
- Alerts may be filtered by pathology type, such as cracks of a certain size or exposed reinforcing steel bars. If these types of defects are detected, one or more of the image data, location and 3D map may be sent to a user.
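The alert filtering described above might be sketched as a simple rules table; the field names, thresholds and helper below are illustrative only and not part of the described system:

```python
def filter_alerts(detections, rules):
    """Keep only detections that match an alerting rule, e.g. cracks over a
    minimum size or any exposed reinforcing steel."""
    alerts = []
    for det in detections:
        rule = rules.get(det["type"])
        if rule is not None and det.get("size_mm", 0.0) >= rule["min_size_mm"]:
            alerts.append(det)
    return alerts

# Illustrative rules: alert on cracks wider than 0.3 mm, and on any exposed rebar.
rules = {"crack": {"min_size_mm": 0.3}, "exposed_rebar": {"min_size_mm": 0.0}}
detections = [
    {"type": "crack", "size_mm": 0.5, "location": (1.2, 0.4, 3.1)},
    {"type": "crack", "size_mm": 0.1, "location": (1.5, 0.4, 3.0)},
    {"type": "exposed_rebar", "size_mm": 12.0, "location": (2.0, 1.1, 0.5)},
]
alerts = filter_alerts(detections, rules)
```

Each surviving alert carries its 3D location, so it can be passed directly to workflow software or messaging services as described.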
- Concrete structures come in many forms, from bridges to dams, from tunnels to buildings, from wide open pavements and roads to confined spaces of tanks and pipes and exposed foundations. All concrete structures, regardless of the form they take, require regular visual inspection for detection of damage, such as cracking, spalling, delamination, chemical damage, seepage, pitting and internal corrosion of reinforcing steel.
- FIG 3 shows a concrete bridge with cracking and damage in the deck [321], a pylon [331] and the pavement [311] of the bridge.
- a concrete safety inspection system [312, 322, 332] is used to inspect the structure, detect damage and locate defects on a 1:1 scale 3D map.
- the means of transporting the scanner is secondary to operation of the invention, as the scanner has no awareness of the method of transport.
- an operator [310] may choose to walk with the scanner on a pavement of the bridge while it is being used by other pedestrians [313], later mount the scanner on a human-operated vehicle [324] to scan the deck of the bridge even while the deck is being used by other cars, and perhaps finally use an Unmanned Aerial Vehicle (UAV) [334] for all other surfaces, or other such methods.
- The detail of FIG 3 reveals a vehicle [324] with an operator [320] transporting a concrete safety inspection system [322] aided by a user interface [325].
- the damage [321] on the deck of the bridge is illuminated by the scanner’s field of view [326].
- an unmanned aerial vehicle [334] transports the same concrete safety inspection system [332] with a scanner footprint [336] covering damage [331], with a remote operator [330] aided by an integrated user interface acting as a system controller and visualiser.
- a Global Navigation Satellite System (GNSS) base station [340] transmits real-time GNSS position reference data [341]. GNSS position reference data streams [342] may also be obtained by subscribing to available public GNSS base stations [343].
- the concrete safety inspection system receives this data [342 or 341] in real-time when it is detectable and available, preferably at some point during the inspection.
- a concrete safety inspection system [403] with an integrated user interface [407] is used by an operator [406] inside an enclosed space [401] to scan the walls and ceiling of the space for cracking or damage [402]; the virtual footprint of the scanner is shown for illustrative purposes [404].
- the means of transport for the scanner inside an enclosed space may also be by other suitable means such as on a UAV or an operated vehicle. Inspections can take place outdoors, indoors or in other enclosed spaces or may commute amongst these spaces as required to inspect the structure.
- An indoor example is shown in FIG 5, where a concrete safety inspection system [503] with an integrated user interface, mounted on a cart [506] for transportation and pushed by an operator [508], scans for damage [502]; the scanner’s footprint is shown [504].
Abstract
The invention is an infrastructure safety inspection method including the steps of recording images of a structure, acquiring a 3D map of the structure, detecting damage in the structure by analysing 2D images with a neural network and locating the detected damage on the 3D map of the structure.
Description
TITLE
INFRASTRUCTURE SAFETY INSPECTION SYSTEM
FIELD OF THE INVENTION
[001] The present invention relates to the field of visual inspections of structures and infrastructure for signs of damage, cracking and corrosion.
BACKGROUND TO THE INVENTION
[002] Concrete structures and infrastructure such as bridges, pylons, buildings, dams, cooling towers, foundations, tunnels and sewers require regular safety inspections to detect signs of damage such as cracking, concrete cancer, delamination, spalling, corrosion or other pathologies. Detection of these defects can trigger certain workflows, such as structural integrity modelling, or repair and maintenance work, or in extreme cases the temporary or permanent closure of infrastructure due to safety concerns.
[003] This work is largely done by visual inspection, and at times requires inspectors to work in confined spaces, at heights, on abseiling ropes, or suspended in baskets from a crane, or on busy and congested roads, all of which can represent safety hazards to inspectors or the local community from falling tools or equipment. Infrastructure inspection is dangerous work.
[004] US Patent US20210174492A1 represented an advance in inspection methods by providing a mixed reality headset to an operator which photographs a structure, sends the data to a server for machine learning defect identification, and returns answers which are updated in the mixed reality headset visualisation, where a user interacts with the data through the headset as an inspection tool. Despite this improved method, it does not remove the need for an operator to work at heights or in confined spaces. Indeed, forcing an operator to wear a mixed reality headset while working may in fact increase the level of risk by inhibiting the field of view and perception of the operator.
[005] More traditional methods of manual inspection, such as photographing discovered defects, writing reports and triggering workflows, are a subjective process based on the inspector’s judgement and diligence, and are not infallible. Defects can be missed.
[006] Attempts have been made to remove this subjectivity, such as Chinese patent CN113096088A which uses machine learning to identify defects in 2D image data. A limitation of this method is the lack of geospatial reference or 3D measurements, meaning either a coarse estimate of the location must be given by the operator ‘by eye’, or it may in fact trigger more inspections at height or in confined spaces.
[007] Likewise, using human-led visual inspections with photographs leaves no full 3D model of the condition and shape of the structure meaning future comparisons lack spatial context, and future inspections rely on unreliable human memories and 2D images to try to resolve changes which may have occurred in 3D.
[008] There is therefore a need for an infrastructure safety inspection system to provide remote, automatic damage detection, 3D mapping, and immediate alerts or triggering of workflows for damaged areas of concrete structures.
SUMMARY OF THE INVENTION
[009] In one form, although it need not be the only or indeed the broadest form, the invention resides in a method of detecting and mapping damage in structures including the steps of: recording images of a structure; acquiring a 3D map of the structure; detecting damage in the structure by analysing 2D images with a neural network; and locating the detected damage on the 3D map of the structure.
[0010] Suitably the recorded images are selected from 3D depth images and 2D video images.
[0011] The 3D map is suitably a 3D point cloud or 3D mesh of the structure that is pre-existing, or is generated using Simultaneous Localisation and Mapping (SLAM) with position estimation obtained from one or more of: odometry from 3D depth images or 2D images; Inertial Navigation System (INS); Global Navigation Satellite System (GNSS); Real-Time Kinematics (RTK); Post-Processed Kinematics (PPK); and dense point cloud generated from 3D depth images or 2D images.
[0012] The step of detecting damage suitably includes using machine learning to compare images of the concrete structure to a library of images indicative of damage.
[0013] The step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from SLAM.
[0014] The step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from INS.
[0015] The step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from GNSS or optimised GNSS position using RTK or PPK.
[0016] The step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from visual odometry.
[0017] The step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface using non-rigid, non-linear point cloud transformation.
[0018] The method may include the further steps of generating alerts and triggering workflow.
[0019] Preferably the system is mounted on an unmanned vehicle, such as a drone for aerial inspections, or a ground-based or water-based remote inspection vehicle as appropriate.
[0020] In a further form, the invention resides in a concrete safety inspection system whereby a 3D map of a concrete structure is generated in real-time using Simultaneous Localisation and Mapping (SLAM) using one or more of odometry from 3D depth images, odometry from 2D video data, position estimation from inertial navigation system (INS) and a Global Navigation Satellite System (GNSS), and optimised GNSS position using Real-Time Kinematics (RTK) from a rover and base station. A neural network analyses the 2D video data for the detection of concrete damage. The location of the damage is mapped from the 2D imagery to the 3D model, and subsequent alerts and workflows are triggered.
[0021] Further features and advantages of the present invention will become apparent from the following detailed description.
BRIEF DESCRIPTION OF THE DRAWINGS
[0022] To assist in understanding the invention and to enable a person skilled in the art to put the invention into practical effect, preferred embodiments of the invention will be described by way of example only with reference to the accompanying drawings, in which:
[0023] FIG 1 is a block diagram of the major components of the invention;
[0024] FIG 2 is a flowchart depicting the major steps in the implementation of the invention;
[0025] FIG 3 shows a concrete bridge undergoing a safety inspection;
[0026] FIG 4 shows a concrete enclosed space undergoing a safety inspection;
[0027] FIG 5 shows an indoor and outdoor concrete inspection;
DETAILED DESCRIPTION OF THE INVENTION
[0028] Embodiments of the present invention reside primarily in an infrastructure safety inspection system. Accordingly, the method steps and elements have been illustrated in concise schematic form in the drawings,
showing only those specific details that are necessary for understanding the embodiments of the present invention, but so as not to obscure the disclosure with excessive detail that will be readily apparent to those of ordinary skill in the art having the benefit of the present description.
[0029] In this specification, adjectives such as first and second, left and right, and the like may be used solely to distinguish one element or action from another element or action without necessarily requiring or implying any actual such relationship or order. Words such as “comprises” or “includes” are intended to define a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed, including elements that are inherent to such a process, method, article, or apparatus.
[0030] To assist in the understanding of the invention, FIG 1 presents a block diagram of the major components required to implement the invention. The elements within the larger block [110] are co-located sensors which comprise the scanner element of the system, including a user interface [120] which can be physically separated from the system [110], while an optional separate GNSS reference data stream [130] may also be leveraged by the system.
[0031] The scanner [110] is transported around the surfaces of a concrete structure by operators in a manner which suits the asset being inspected, while the user interface can be used to control and visualise the output of the scanner wherever the human operator is situated either remotely or on the structure, and data is streamed wirelessly between [110] and [120].
[0032] The scanner [110] includes a 2D video capture system [111] which is preferably in the visible light spectrum for ease of human understanding, but may occupy a selected spectrum such as an infrared spectrum, especially useful for thermal imaging which may give signs of active concrete deformation from dynamic kinematic motion or the relative coolness of moisture in areas. It may also operate in the ultra-violet spectrum, or indeed at hyper-spectral or multi-spectral frequencies which can aid in the detection of moisture content or changes in the concrete material properties. The source of the video data may optionally be the left or right image of a stereo camera, but the inventors have found a separate video capture system allows greatest flexibility.
[0033] A 2D video capture system can also be aided by internal signal processing, such as automatic white balance and also high dynamic range (HDR) signal processing. HDR signal processing involves taking several images at high speed at slightly different exposure values, and integrating the images into a single image to capture a larger array of tonal values in the same image. This can be helpful to deal with areas of a concrete structure that may be back-lit by the sun or sky, and also for areas with bright light and shadows in the same scene.
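The exposure-integration idea described above can be illustrated with a minimal, single-scale sketch: each pixel is weighted by how well exposed it is (closeness to mid-grey) and the exposures are blended. Production HDR fusion (e.g. Mertens-style) works on multi-scale pyramids and more weight terms, so this shows only the core weighting idea:

```python
import numpy as np

def fuse_exposures(images):
    """Blend several exposures of the same scene into one image.
    images: list of float arrays in [0, 1] with identical shapes (H, W)."""
    stack = np.stack(images)                       # (N, H, W)
    # Weight each pixel by "well-exposedness": closeness to mid-grey 0.5.
    weights = np.exp(-((stack - 0.5) ** 2) / (2 * 0.2 ** 2))
    weights /= weights.sum(axis=0, keepdims=True)  # normalise per pixel
    return (weights * stack).sum(axis=0)

# Example: a dark and a bright exposure of the same intensity gradient.
scene = np.linspace(0.0, 1.0, 5)[None, :].repeat(3, axis=0)
dark = np.clip(scene * 0.4, 0, 1)      # under-exposed: highlights preserved
bright = np.clip(scene * 1.6, 0, 1)    # over-exposed: shadows preserved
fused = fuse_exposures([dark, bright])
```

The fused result favours whichever exposure renders each region closest to mid-tone, recovering detail in both the shadowed and back-lit parts of a scene.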
[0034] Using an external lighting source to illuminate the scene can be useful, especially in dark concrete structures, such as tunnels, or in the eaves of bridges. This lighting source may be co-located with the scanner or may be external to it.
[0035] The 2D video data capture system [111] may be co-located with a 3D depth image capture system [112] with a common origin and line of sight. The 3D depth images contain a 3D spatial reference of the scene from a local origin and field of view along a line of sight. The inventors have found that using stereo-camera systems or LiDAR systems as the source of the depth image data is useful. Stereo camera systems provide high density depth images with moderate depth accuracy, while LiDAR systems generally produce lower density depth images at higher depth accuracy. Red Green Blue Depth (RGB-D) cameras which use active illumination patterns, such as Microsoft Kinect or Intel Realsense RGB-D cameras or the like, also produce depth images fused with Red, Green and Blue channels and may be used in the invention. The inventors have found the properties of depth images derived from LiDAR and stereo-cameras to be most beneficial for the application.
[0036] In one embodiment, a concrete safety inspection system acquires GNSS data when it is available. For this purpose a GNSS capture system [113] may be co-located with the other sensors in the scanner. Likewise, in a further
embodiment, an external GNSS correction reference data stream [130] may be available.
[0037] An inertial navigation system (INS) [114], comprising one or more of a gyroscope, accelerometer and magnetometer, preferably with a temperature sensor and pressure sensor for INS data correction, may also be co-located with the sensors in the scanner. The INS fuses the data from each inertial sensor, for which there are several available options, including INS sensors which pre-fuse the data to generate a single position and pose estimate in 6 degrees of freedom. Alternatively, a Kalman Filter may be used to fuse the data and obtain an updated pose and position estimate.
[0038] Sensors are synchronised in time by a Real Time Clock [115] and are processed by an embedded computer [116].
[0039] The user interface [120] communicates with the embedded computer to allow an operator to control the scanner and visualise the output data in near real-time remotely or locally to the operator while scanning is taking place.
[0040] The 2D video data capture system [111] and the 3D depth image capture system [112] generate data which can be fused by establishing a common origin and line of sight of the two sensors.
[0041] To achieve this fusion a static frame from the 2D video image data may be taken at the same time as a frame from the 3D depth image data, and a template matching method using an iterative closest point (ICP) algorithm or similar is applied. ICP is a family of template matching algorithms which use an affine transformation matrix to move a target image and reference image in iterative steps of translation and rotation until the sum of the distances between the two templates is minimised.
[0042] The output of the ICP is a transformation matrix covering the 6 degrees of freedom, namely pitch, roll and yaw angles, and x, y and z translations. Applying this matrix to one of the two sensor data types delivers a common origin and line of sight for the sensors.
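As a rough illustration of the ICP family described above, the sketch below repeatedly pairs each point with its nearest counterpart and solves the rigid rotation and translation with the standard SVD (Kabsch) step. It is a minimal point-to-point variant; real systems would use a k-d tree and outlier rejection:

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping paired points
    src onto dst (the SVD / Kabsch solution)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                     # proper rotation (det = +1)
    return R, cd - R @ cs

def icp(src, dst, iters=30):
    """Pair each source point with its nearest target point, solve for the
    rigid transform minimising the paired distances, and repeat."""
    cur, R_tot, t_tot = src.copy(), np.eye(3), np.zeros(3)
    for _ in range(iters):
        # Brute-force nearest neighbours; a k-d tree would be used in practice.
        d = np.linalg.norm(cur[:, None, :] - dst[None, :, :], axis=2)
        R, t = best_rigid_transform(cur, dst[d.argmin(axis=1)])
        cur = cur @ R.T + t
        R_tot, t_tot = R @ R_tot, R @ t_tot + t   # accumulate the transform
    return R_tot, t_tot

# Demo: recover a small known rotation about z plus a translation.
rng = np.random.default_rng(0)
src = rng.uniform(0.0, 1.0, (40, 3))
th = np.deg2rad(2.0)
Rz = np.array([[np.cos(th), -np.sin(th), 0.0],
               [np.sin(th),  np.cos(th), 0.0],
               [0.0,         0.0,        1.0]])
dst = src @ Rz.T + np.array([0.04, -0.02, 0.01])
R_est, t_est = icp(src, dst)
```

The recovered rotation and translation together form the 6-degree-of-freedom transform the paragraph describes; applying it to one sensor's data aligns the two origins and lines of sight.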
[0043] In a further embodiment to generate a common origin and line of sight, the 2D image capture and 3D image capture systems are mechanically aligned
in pitch, roll and yaw angles, and are aligned in two of the three translation axes, and are offset by a known translation in the third axis which optionally can be accounted for by a translation of the output data. This method gives a common line of sight and a common origin.
[0044] Another effective embodiment for this alignment is to use the sensor output data of both a 2D image and 3D depth image, and input a selection of three or more common points in a scene in both a frame from the 2D video image data and the matching frame from the 3D depth image data. These points are used to create a transformation matrix to align the two data types and generate a common origin and line of sight for the two sensors.
[0045] In any of these embodiments for establishing a common origin and line of sight, the purpose is to fuse the data between sensors, bringing together the video capture data and the depth image data and the derivatives of these data.
[0046] Turning now to FIG 2, there are two main streams illustrated in this workflow, namely a SLAM stream [270] and a defect detection stream using machine learning [280]; the streams are shaded in light grey for ease of understanding the flow chart. The two streams fuse to locate any defects on the 3D map.
[0047] Many concrete structures have accompanying 3D digital representation of the physical and functional characteristics of the structure, often referred to as Building Information Models (BIM). BIM data or similar 3D maps of a structure are typically point clouds or 3D meshes. In one embodiment, the user may elect to provide an existing 3D map of the structure [250] which is used as a ‘ground truth’ 3D reference guide for the SLAM stream [270] later in the process.
[0048] In another embodiment, a 3D map of the structure is generated specifically for the process, separately from the concrete inspection, whereby the structure is modelled using a laser scanning system, photogrammetry, or other appropriate surveying methods. A user may elect to use such a 3D map [260] as a 3D ‘ground truth’ reference for the SLAM processing [270].
[0049] In a further embodiment, the 3D map is generated during the concrete safety inspection by the scanner by using a SLAM algorithm, without any external reference, which is explained by the SLAM stream [270] of FIG 2.
[0050] The main process loop begins at the start of the flow chart [201], with the simultaneous acquisition of 3D depth image data [202], GNSS data and GNSS reference data [203] where available, INS data [204] and 2D video data [205] by the scanner, all synchronised by the same real-time clock. FIG 2 illustrates all four sensors, however only one or more sensors are required to locate defects on a 3D map and implement the invention. In one embodiment of the invention, 2D image data alone is acquired, monocular SLAM using visual odometry is implemented on the 2D image data to gain an estimate of the 2D image data’s origin and line of sight, and the 2D image data is projected onto a 3D map. In another embodiment, all four of the sensors are used in the SLAM stream.
[0051] If more than one sensor is used, the image odometry estimates [206], INS pose and position estimates [207] and GNSS data where available [208] are fused by an algorithm such as a Kalman Filter, averaging, weighted averaging, or other suitable method.
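One of the simpler fusion options mentioned, weighted averaging, can be sketched as inverse-variance weighting of the per-sensor position estimates; the variance figures below are illustrative, not values from the described system:

```python
import numpy as np

def fuse_position_estimates(estimates, variances):
    """Inverse-variance weighted average of independent position estimates,
    the value a static Kalman update would also converge to.

    estimates: (N, 3) positions from e.g. odometry, INS and GNSS.
    variances: (N,) scalar variance per sensor (smaller = more trusted).
    """
    w = 1.0 / np.asarray(variances, dtype=float)
    w /= w.sum()
    return (w[:, None] * np.asarray(estimates, dtype=float)).sum(axis=0)

# Odometry, INS and GNSS each estimate the scanner position; here the
# GNSS fix (variance 0.05) is trusted ten times more than the others.
fused = fuse_position_estimates(
    [[1.0, 2.0, 0.9], [1.1, 2.1, 1.1], [1.02, 2.02, 1.0]],
    [0.5, 0.5, 0.05],
)
```

A full Kalman Filter extends this by propagating the variances over time and handling pose (orientation) as well as position.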
[0052] The output of this fusion is a single position and pose estimate [209] for the scanner; this could also be described as the updated origin and line of sight of the 3D depth image and its pre-fused image data. Once a new origin and line of sight is determined, the depth image can be transformed to a global co-ordinate system [210]. Once transformed, the plurality of 3D depth images contribute to a singular high density 3D model of the scene, referred to as a global point cloud [210].
[0053] Whilst this is sufficient for the SLAM stream of the invention, in one embodiment the SLAM output can be optimised using one or more additional point cloud corrections: estimating the shape of the surface being scanned to correct irregularities in the point cloud [211], and using Loop Closure [212] to correct for map drift errors by relocating previously scanned areas and closing the scan loop, making corrections to the entire map to ensure both scans of the same area have matching coordinates.
[0054] Optimising the SLAM algorithm by matching the shape of the local point cloud and the global point cloud to update the point cloud [211] by using shape estimates can be beneficial for mapping concrete assets with distinctive geometric forms. The inventors have found that methods such as plane-of-best-fit, RANSAC, and gridding methods such as the Normal Distribution Transform (NDT) can aid in SLAM mapping by matching planes and shapes in the 3D data.
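The RANSAC plane-fitting idea mentioned above can be sketched as follows, assuming an unorganised N x 3 point cloud; the iteration count and inlier threshold are illustrative:

```python
import numpy as np

def ransac_plane(points, n_iters=200, threshold=0.02, seed=0):
    """Repeatedly sample three points, form their plane, and keep the plane
    with the most inliers (points within `threshold` of the plane)."""
    rng = np.random.default_rng(seed)
    best_inliers, best_plane = np.zeros(len(points), dtype=bool), None
    for _ in range(n_iters):
        a, b, c = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(b - a, c - a)
        norm = np.linalg.norm(normal)
        if norm < 1e-12:                      # degenerate (collinear) sample
            continue
        normal = normal / norm
        inliers = np.abs((points - a) @ normal) < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_plane = inliers, (normal, a)
    return best_plane, best_inliers

# Demo: a noisy near-horizontal slab plus scattered off-plane outliers.
rng = np.random.default_rng(1)
slab = np.column_stack([rng.uniform(0, 1, (100, 2)), rng.normal(0, 0.003, 100)])
outliers = rng.uniform(0, 1, (20, 3)) + np.array([0.0, 0.0, 0.2])
(normal, anchor), inliers = ransac_plane(np.vstack([slab, outliers]))
```

In a SLAM context the fitted plane (a flat deck, wall or pavement) provides a geometric constraint for matching the local scan against the global map.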
[0055] If the user has imported a 3D reference map before the main loop began, then the SLAM output map is put through a non-linear transformation so that its point cloud set matches the reference map [290]. This non-rigid transformation may be done using a special-case polyharmonic spline algorithm for smoothing and interpolation: Thin Plate Spline Robust Point Matching (TPS-RPM). TPS-RPM morphs and smooths the SLAM 3D map to closely match the ground truth 3D reference map.
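The thin-plate-spline warp underlying TPS-RPM can be sketched with SciPy's RBF interpolator. This shows only the spline-warp half, assuming point correspondences between the SLAM map and the reference map are already known; TPS-RPM itself alternates between estimating those correspondences and re-fitting the warp:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def tps_warp(cloud, control_src, control_dst):
    """Fit a thin-plate-spline interpolant to control-point correspondences
    (e.g. SLAM map points matched to reference map points) and apply the
    resulting smooth, non-rigid warp to every point in the cloud."""
    spline = RBFInterpolator(control_src, control_dst,
                             kernel='thin_plate_spline')
    return spline(cloud)

# Demo: eight matched control points related by a pure shift; the fitted
# spline reproduces that shift smoothly across the whole volume.
corners = np.array([[x, y, z] for x in (0.0, 1.0)
                              for y in (0.0, 1.0)
                              for z in (0.0, 1.0)])
shifted = corners + np.array([0.1, 0.0, 0.0])
warped_centre = tps_warp(np.array([[0.5, 0.5, 0.5]]), corners, shifted)
```

Because the spline interpolates the control points exactly while minimising bending energy between them, local SLAM drift is absorbed smoothly rather than as abrupt jumps.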
[0056] The fused 3D point cloud with its global coordinate system is also useful when saved for future reference by concrete inspectors and operators to compare changes in shape and appearance over time.
[0057] The output of these processes is an optimised global point cloud with reduced errors, which is stored [213].
[0058] For the operator to be able to visualise which areas of the concrete structure have been inspected, it can be useful to render an updated 3D map on the user interface at regular intervals, generally at a rate of 1 Hz or 2 Hz or another suitable update rate. This allows an operator to see on the 3D map if some parts of the structure have been missed due to occlusions or shadows in a line of sight, and to ensure all desired areas are inspected [215].
[0059] In one embodiment of the invention, the global point cloud is downsampled to a low-density occupancy 3D grid or voxel. For clarity, a voxel is a regular 3D grid defined by three axes, making regular rectangular prisms or cubes. A voxel is determined to be occupied if a threshold number of points is present in the voxel. This data is classically used by robotic navigation systems for local and wide area navigation [216], which is not the topic of this invention. The inventors have found it useful to create multiple levels of detail and sizes, preferably three levels of 30 cm³, 60 cm³ and 1 m³ voxels, to aid in navigation and collision avoidance for unmanned vehicles whilst optimising memory and processing speed [214].
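The voxel downsampling described in this paragraph can be sketched as follows; the voxel size, occupancy threshold and coordinates are illustrative:

```python
import numpy as np

def occupancy_voxels(points, voxel_size, min_points=2):
    """Bucket points into cubic voxels of side `voxel_size` and report the
    integer grid indices of voxels holding at least `min_points` points."""
    idx = np.floor(points / voxel_size).astype(np.int64)  # voxel index per point
    voxels, counts = np.unique(idx, axis=0, return_counts=True)
    return voxels[counts >= min_points]

pts = np.array([[0.05, 0.10, 0.00],    # \ two points fall in voxel (0, 0, 0)
                [0.20, 0.10, 0.10],    # /
                [0.95, 0.10, 0.10]])   # a lone point in voxel (3, 0, 0)
occupied = occupancy_voxels(pts, voxel_size=0.3, min_points=2)
```

Running the same function at 0.3 m, 0.6 m and 1 m voxel sizes yields the three levels of detail described, each a small fraction of the memory of the dense global point cloud.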
[0060] Turning now to the Machine Learning flow [280] of FIG 2. 2D image data is acquired [205] and a machine learning (ML) algorithm is used to identify concrete pathologies in the 2D video data [220]. The algorithm ingests the 2D video data as a series of 2D images or frames. Each image is analysed: the ML algorithm searches for patterns in the shape and colour channels of the pixel data of a group of frames and produces a probability that concrete damage is present in the image.
[0061] In one embodiment of the invention, supervised learning is used whereby the algorithm has been trained for this pattern detection on a dataset of human-labelled 2D video data images or frames. The human labels the images into categories such as ‘crack’, ‘concrete’, ‘high porosity concrete’, ‘corrosion’, ‘internal corrosion’, ‘delamination’, ‘repaired concrete’, ‘not concrete’ or other categories as may be appropriate. Ideally, thousands of such labelled images are required.
[0062] In an embodiment of the invention, before labelling takes place, each image is ideally broken into smaller image tiles with the same pixel size and dimensions per tile, such as 224 x 224 pixels or 256 x 256 pixels, although any other size may be used. Each image tile is separately labelled by the human according to its classification.
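The tiling step might be sketched as follows, assuming a simple policy of dropping edge remainders smaller than a full tile (the description does not specify how edges are handled):

```python
import numpy as np

def tile_image(image, tile=224):
    """Split an image into equal square tiles, recording each tile's pixel
    bounds so its corner coordinates can later be looked up in the fused
    3D data. Edge remainders smaller than a full tile are dropped."""
    h, w = image.shape[:2]
    tiles = []
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            tiles.append((image[y:y + tile, x:x + tile],
                          (x, y, x + tile, y + tile)))  # (x0, y0, x1, y1)
    return tiles

frame = np.zeros((480, 640, 3), dtype=np.uint8)   # a mock 640 x 480 video frame
tiles = tile_image(frame, tile=224)
```

A 640 x 480 frame yields four whole 224-pixel tiles; the recorded pixel bounds are what later allow each tile's corners to be mapped to global 3D coordinates.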
[0063] In another embodiment, object detection is used, whereby the visible extent of the object being labelled in each frame is annotated by the human with a bounding box, the corners of which reach the maximum and minimum X and Y coordinates of the object in the image.
[0064] Another embodiment takes the approach of semantic segmentation, where the human labeller segments each frame into its separate categories with complex polygons covering only the pixels belonging to each category; for example, an image of a crack in competent concrete is segmented in such a way that pixels belonging to the crack are grouped together, and likewise the competent concrete is also segmented.
[0065] Once data has been labelled, a specific algorithm variant must be selected and the data is applied to the training phase of machine learning. Convolutional Neural Networks (CNNs) are well known for their accuracy and efficiency in image processing and understanding. After trialling many variants of CNNs, the inventors have found several CNN families to be the most accurate and efficient of the various methods developed and trialled for the implementation of the invention. These include TensorFlow Lite, MobileNet, AlexNet, YOLO, ResNet, classic CNNs, VGG, Inception and Xception.
[0066] The training phase is where the algorithm, running on a computer, ingests the labelled image data and detects patterns in the labelled datasets. This is done by leveraging convolutional layers: data enters at an input layer, passes through various hidden layers which spatially convolve the image and retain pattern information, and finally reaches an output layer, which corresponds to the label given by the human.
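For intuition only, the spatial convolution performed by such layers can be sketched in a few lines of pure Python; a real CNN stacks many learned multi-channel kernels with nonlinearities and pooling, which this sketch omits:

```python
def conv2d(image, kernel):
    """'Valid' 2D convolution (strictly, cross-correlation, as in most deep
    learning frameworks) of a single-channel image with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

# A horizontal-difference kernel responds where intensity changes left to
# right -- the kind of low-level edge pattern early CNN layers learn.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
kernel = [[-1, 1]]
edges = conv2d(image, kernel)
```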
[0067] At the completion of training, subsequent validation and testing occur on separate labelled data to which the algorithm has not been exposed, which allows the accuracy of the model to be determined.
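The usual held-out metrics — accuracy, precision and recall, the same quantities the description later weighs when comparing generalised and per-surface models — can be computed from the model's predictions on that unseen labelled data. A minimal sketch with illustrative labels:

```python
def binary_metrics(y_true, y_pred, positive="crack"):
    """Accuracy, precision and recall for one category, computed on
    held-out human-labelled tiles the model was not trained on."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return {
        "accuracy": correct / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

y_true = ["crack", "concrete", "crack", "concrete"]
y_pred = ["crack", "crack", "concrete", "concrete"]
m = binary_metrics(y_true, y_pred)
```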
[0068] This trained model can now be deployed on fresh image data. The image data in this case is 2D video data with both its natural and fused reference systems [217]. The image is then broken into the same-size image tiles as were used in the training phase, and by default the global coordinates (taken from the SLAM algorithm described earlier), preferably of the four corners of each image tile, are recorded. Optionally, a user may instead select the global coordinates of the centre pixel of each image tile, or other forms of averaging such as the mean or median of each axis of the coordinate system, or may exclude outlying pixel coordinates from such calculations by using only pixels with coordinate values within one or two standard deviations of the mean. Finally, a user may elect to apply a depth image threshold to the fused 2D image data, whereby pixels beyond the threshold are excluded from the fused 2D image data, hence removing distant objects such as the ground, trees or water courses adjacent to a concrete structure which may appear in part of an image tile.
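A hedged sketch of the optional centre-averaging, outlier-exclusion and depth-threshold steps just described (function and parameter names are illustrative, not from the description):

```python
import statistics

def robust_tile_centre(coords, depths=None, max_depth=None, k=2.0):
    """Mean 3D centre of a tile's pixel coordinates.

    Optionally excludes pixels beyond a depth threshold (removing distant
    ground, trees or water) and, per axis, pixels more than `k` standard
    deviations from the mean, as the user options above describe.
    """
    if depths is not None and max_depth is not None:
        coords = [c for c, d in zip(coords, depths) if d <= max_depth]
    centre = []
    for axis in range(3):
        vals = [c[axis] for c in coords]
        mu = statistics.mean(vals)
        sd = statistics.pstdev(vals)
        kept = [v for v in vals if abs(v - mu) <= k * sd] or vals
        centre.append(statistics.mean(kept))
    return tuple(centre)

coords = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0), (50.0, 50.0, 9.0)]
depths = [1.0, 1.0, 1.0, 9.0]  # the last pixel is distant background
centre = robust_tile_centre(coords, depths, max_depth=5.0)
```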
[0069] The ML algorithm then classifies each image tile, region or pixel into its pre-trained categories. If the algorithm identifies a type of pathology, the tile is saved as a 2D image and as a 3D object with a bounding box defined by the coordinates of the image tile, or as other point data or image-masked data as applicable, depending on the ML algorithm selected by the operator. These objects can be visualised at runtime on the user interface, and are also saved in a process we shall now explore.
[0070] As each pixel of the 2D video data has previously been fused to the corresponding 3D depth image [217], once the transformation of coordinates [210] and the subsequent updates [211 and 212] of the 3D depth data have taken place, the 2D image data pixels are in turn transformed into the global coordinate reference system. This data is used to geolocate the concrete pathologies recognised by the machine learning classification system in 3D [221]. The defects detected by ML are saved as 3D objects using the global coordinate system and added to a global defect map [222].
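For the rigid special case, transforming fused sensor-frame points into the global coordinate reference system is an application of the estimated pose (rotation plus translation). This sketch assumes a rigid transform; the description and claims also contemplate non-rigid, non-linear point-cloud transformations, which this does not cover:

```python
def to_global(points, rotation, translation):
    """Apply a rigid-body transformation (3x3 rotation matrix plus
    3-vector translation) to sensor-frame 3D points, yielding points in
    the global coordinate reference system."""
    out = []
    for x, y, z in points:
        gx = rotation[0][0] * x + rotation[0][1] * y + rotation[0][2] * z + translation[0]
        gy = rotation[1][0] * x + rotation[1][1] * y + rotation[1][2] * z + translation[1]
        gz = rotation[2][0] * x + rotation[2][1] * y + rotation[2][2] * z + translation[2]
        out.append((gx, gy, gz))
    return out

# Illustrative pose: a 90-degree rotation about Z plus a translation.
R = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
t = [10.0, 20.0, 0.0]
global_pts = to_global([(1.0, 0.0, 0.0)], R, t)
```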
[0071] Concrete structures can have a wide range of surfaces that pose challenges to machine learning models. Concrete surfaces can be painted; they can have exposed aggregate, or sandy or smooth finishes; they can be weathered, stained or blackened; and they can have a range of base colours, such as light grey, white, pink or dark grey, or be actively coloured during manufacturing.
These changes in appearance can be handled by machine learning using one of several approaches.
[0072] In the first approach, a large database covering all bridge surface types is collected to create a generalised model. The inventors have found this approach the easiest to use, but it can sacrifice precision, accuracy and recall.
[0073] In another approach, the large database is split by surface type, and separate machine learning models are generated for each surface type. The appropriate model is then applied to the concrete asset being inspected. This method gives higher accuracy, recall and precision, but requires considerably more training data to be useful.
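A sketch of routing each asset to a surface-specific model, falling back to the generalised model where no per-surface model exists (registry keys and model names are hypothetical, mirroring the trade-off described above):

```python
def select_model(surface_type, registry, fallback="generalised"):
    """Pick the model trained for this surface type; fall back to the
    generalised model when no surface-specific model is available."""
    return registry.get(surface_type, registry[fallback])

# Hypothetical registry of trained models keyed by surface type:
registry = {
    "generalised": "model_all_surfaces.v1",
    "painted": "model_painted.v3",
    "exposed_aggregate": "model_aggregate.v2",
}
chosen = select_model("painted", registry)
```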
[0074] An alternative method deals with varied surfaces by first converting images to a monochromatic scale, using either a combination of two or three of the RGB channels, a single RGB channel, or their derivative channels of hue, saturation and intensity. The inventors have also found this to be useful in that only a single model is required for all surfaces, but again it has lower recall, precision and accuracy than individual models. It is also useful if one surface type has only a small database of training data.
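A minimal sketch of the monochrome reduction just described. Note one assumption: the Python standard library provides HSV rather than the HSI decomposition mentioned in the text, so HSV channels are used here as a close stand-in:

```python
import colorsys

def to_monochrome(rgb_pixels, mode="luma"):
    """Reduce RGB pixels (0..1 floats) to a single channel.

    'luma' uses a weighted combination of all three RGB channels;
    'hue', 'saturation' and 'value' come from the HSV decomposition
    (an HSV stand-in for the hue/saturation/intensity channels of the
    description).
    """
    out = []
    for r, g, b in rgb_pixels:
        if mode == "luma":
            out.append(0.299 * r + 0.587 * g + 0.114 * b)
        else:
            h, s, v = colorsys.rgb_to_hsv(r, g, b)
            out.append({"hue": h, "saturation": s, "value": v}[mode])
    return out

pixels = [(0.5, 0.5, 0.5), (1.0, 0.0, 0.0)]
luma = to_monochrome(pixels, "luma")
sat = to_monochrome(pixels, "saturation")  # grey has zero saturation
```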
[0075] Alerting and workflow logic is then applied to the data to produce useful outputs, which may include on-screen alerts and visualisations of the pathologies so that an operator realises a pathology is present while performing the inspection [222].
[0076] The 3D global map can be visualised to allow operators to determine whether they have scanned all areas of interest to them [215], a second useful output that reduces the need for rework or revisits by an operator who would otherwise lack real-time feedback.
[0077] Another useful output [223] is to pass all image tile coordinates of detected defects, or 2D image tiles or 3D image tile objects, to workflow software, which can trigger human-led tasks such as repair work, further inspections, or more detailed engineering investigations. In many cases the detailed inspections can be performed using the output data of the concrete safety inspection system and do not require any further field work. Investigations may be triggered based on the type, location and density of pathologies detected by a concrete safety inspection system. This work can then be scheduled and carried out to ensure concrete assets are kept safe and operational.
[0078] In such an embodiment, the workflow can be triggered when the scanner is connected to the internet and can contact a workflow server. The scanner sends the server an output, such as a message, command or alert, to trigger a workflow based on the location, type and density of pathologies detected. This data, along with the 3D defect detection data, can be stored in a database to trigger workflows, or individual alerts can be raised by email, SMS, social media messaging, or other messaging services such as WhatsApp.
[0079] Alerts may be filtered by pathology type, such as cracks of a certain size or exposed reinforcing steel bars, for example. If these types of defects are detected, one or more of the image data, the location and the 3D map may be sent to a user.
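The type-and-size filtering just described can be sketched as a small rules table (field names, thresholds and the sample detections are all illustrative):

```python
def filter_alerts(detections, rules):
    """Select detections that warrant an alert, filtered by pathology type
    and a minimum size threshold. Each qualifying detection would then be
    sent to a user with its image data, location and 3D map excerpt."""
    alerts = []
    for d in detections:
        threshold = rules.get(d["type"])
        if threshold is not None and d["size_mm"] >= threshold:
            alerts.append(d)
    return alerts

# Alert on cracks of at least 2 mm and on any exposed reinforcing steel:
rules = {"crack": 2.0, "exposed_rebar": 0.0}
detections = [
    {"type": "crack", "size_mm": 0.4, "location": (12.1, 3.4, 7.7)},
    {"type": "crack", "size_mm": 3.2, "location": (12.6, 3.1, 7.9)},
    {"type": "exposed_rebar", "size_mm": 15.0, "location": (2.0, 1.0, 4.5)},
]
alerts = filter_alerts(detections, rules)
```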
[0080] Leveraging the collaboration of multiple skilled operators or civil engineers may yield a far better engineering assessment of the state of a concrete structure than using only the judgement of an inspector in the field. To facilitate this collaboration, multiple users in workflow software may be alerted when a pathology is detected, and their input gathered in the form of opinions and comments on the 3D map, the location data, the image data and the detection classification. The workflow may also evolve collaboratively by passing through stage gates: from remote analysis of the collected data, to additional field work, to modelling of a pathology’s impact on structural integrity, to triggering a repair workflow, or in some extreme cases even closing a structure on safety grounds.
[0081] Concrete structures come in many forms, from bridges to dams, from tunnels to buildings, from wide open pavements and roads to confined spaces of tanks and pipes and exposed foundations. All concrete structures, regardless of the form they take, require regular visual inspection for detection of damage, such as cracking, spalling, delamination, chemical damage, seepage, pitting and internal corrosion of reinforcing steel.
[0082] Due to the many forms a concrete structure can take, it is necessary to highlight how a concrete safety inspection system can be used to map and locate concrete damage and pathologies in 3D space to trigger workflows. To assist with this, FIG 3 shows a concrete bridge with cracking and damage in the deck [321], a pylon [331] and the pavement [311] of the bridge. A concrete safety
inspection system [312, 322, 332] is used to inspect the structure, detect damage and locate defects on a 1:1 scale 3D map.
[0083] The inventors have found that the means of transporting the scanner is secondary to the operation of the invention, as the scanner has no awareness of the method of transport. In the case of a concrete bridge, as represented in FIG 3, an operator [310] may choose to walk with the scanner on a pavement of the bridge while it is being used by other pedestrians [313], later mount the scanner on a human-operated vehicle [324] to scan the deck of the bridge even while the deck is being used by other cars, and perhaps finally use an Unmanned Aerial Vehicle (UAV) [334] for all other surfaces, or other such methods. Regardless of the transport method, the output of the three scanning epochs used in this example can be combined into one single dataset.
[0084] While it is effective to transport the system manually and by vehicles, the inventors have found it most practical to mount the scanner on an unmanned vehicle, such as a UAV [334].
[0085] The detail of FIG 3 reveals a vehicle [324] with an operator [320] transporting a concrete safety inspection system [322] aided by a user interface [325]. The damage [321] on the deck of the bridge is illuminated by the scanner’s field of view [326].
[0086] Independently, an unmanned aerial vehicle (UAV) [334] transports the same concrete safety inspection system [332] with a scanner footprint [336] covering damage [331], with a remote operator [330] aided by an integrated user interface acting as a system controller and visualiser.
[0087] The operator [310] walking along the pavement of the bridge holds the concrete safety scanner [312] with an integrated user interface illuminating a portion of the bridge covered by the footprint of the scanner [316] over damaged pavement [311].
[0088] A Global Navigation Satellite System (GNSS) base station [340] is shown, which transmits real-time GNSS position reference data [341]. GNSS position reference data streams [342] may also be obtained by subscribing to available public GNSS base stations [343]. In one embodiment of the invention,
the concrete safety inspection system receives this data [342 or 341] in real-time when it is detectable and available, preferably at some point during the inspection.
[0089] For a concrete structure taking the form of an enclosed space, such as a tunnel, underground mine or pipeline, such as the one illustrated in FIG 4, a concrete safety inspection system [403] with an integrated user interface [407] is used by an operator [406] inside an enclosed space [401] to scan the walls and ceiling of the space for cracking or damage [402]; the virtual footprint of the scanner is shown for illustrative purposes [404]. As with an outdoor inspection, the scanner may also be transported inside an enclosed space by other suitable means, such as a UAV or an operated vehicle. Inspections can take place outdoors, indoors or in other enclosed spaces, or may move amongst these spaces as required to inspect the structure.
[0090] An indoor example is shown in FIG 5, where a concrete safety inspection system [503] with an integrated user interface is mounted on a cart [506] for transportation and, pushed by an operator [508], scans for damage [502]; the scanner’s footprint is shown [504].
[0091] On the outside of this concrete structure, a UAV [510] with a scanner [511] scans the walls of the concrete building, and its footprint is illuminated for illustrative purposes [512]. The system detects defects [515] in the concrete, which are displayed on the remote user interface [516] held by the operator [517].
[0092] The above description of various embodiments of the present invention is provided for purposes of description to one of ordinary skill in the related art. It is not intended to be exhaustive or to limit the invention to a single disclosed embodiment. As mentioned above, numerous alternatives and variations to the present invention will be apparent to those skilled in the art of the above teaching. Accordingly, while some alternative embodiments have been discussed specifically, other embodiments will be apparent or relatively easily developed by those of ordinary skill in the art. Accordingly, this invention is intended to embrace all alternatives, modifications and variations of the present
invention that have been discussed herein, and other embodiments that fall within the spirit and scope of the above described invention.
Claims
CLAIMS

1. A safety inspection method for detecting and mapping damage in structures including the steps of: recording images of a structure; acquiring a 3D map of the structure; detecting damage in the structure by analyzing 2D images with a neural network; and locating the detected damage on the 3D map of the structure.

2. The safety inspection method as in Claim 1 wherein the recorded images are selected from 3D depth images and 2D video images.

3. The safety inspection method as in Claim 1 wherein the 3D map is suitably a 3D point cloud or 3D mesh of the structure that is pre-existing.

4. The safety inspection method as in Claim 1 wherein the step of detecting damage includes using machine learning to compare images of the structure to a library of images indicative of damage.

5. The safety inspection method as in Claim 1 wherein the 3D map is generated using Simultaneous Localisation and Mapping (SLAM) with position estimation obtained from one or more of: odometry from 3D depth images or 2D images; Inertial Navigation System (INS); Global Navigation Satellite System (GNSS); Real-Time Kinematics (RTK); Post-Processed Kinematics (PPK); and a dense point cloud generated from 3D depth images or 2D images.

6. The safety inspection method as in Claim 5, wherein the step of locating the detected damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from SLAM.

7. The safety inspection method as in Claim 1 wherein the step of locating the detected damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from any one of: an Inertial Navigation System (INS); a Global Navigation Satellite System (GNSS); an optimised GNSS position using Real-Time Kinematics (RTK) or Post-Processed Kinematics (PPK); or visual odometry.

8. The safety inspection method as in Claim 1 wherein the step of locating the damage on the 3D map is performed by projecting the 2D image onto the 3D surface using a non-rigid, non-linear point-cloud transformation.

9. The safety inspection method as in Claim 1 further including the step of generating alerts and triggering workflows when damage is detected.

10. A safety inspection system that detects and maps damage in structures comprising: a mobile platform; a user interface in communication with and in control of the mobile platform; a 2D video camera mounted on the mobile platform, the 2D video camera generating 2D video data of the structure from which 2D imagery of the structure is produced; a neural network that analyses the 2D video data and detects damage; and a processor that generates or retrieves from a library a 3D map of the structure and locates the detected damage on the 3D map of the structure.

11. The safety inspection system as in Claim 10 wherein the processor generates a 3D map of the structure in real-time using Simultaneous Localisation and Mapping (SLAM) using one or more of: odometry from 3D depth images; odometry from 2D video data; position estimation from an inertial navigation system (INS); position estimation from a Global Navigation Satellite System (GNSS); or position estimation from an optimised GNSS position using Real-Time Kinematics (RTK) or Post-Processed Kinematics (PPK).

12. The safety inspection system of Claim 11 wherein the processor maps the location of the damage from the 2D imagery to the 3D map.

13. The safety inspection system of Claim 10 further comprising a processor that triggers alerts and workflows.

14. The safety inspection system as in Claim 10 wherein the mobile platform is an unmanned vehicle such as a drone for aerial inspections, or a ground or water based remote inspection vehicle.

15. The safety inspection system as in Claim 10 further comprising an illumination source mounted on the mobile platform.

16. The safety inspection system as in Claim 10 further comprising a storage device storing a database of machine-learning models for different structure surface types.

17. A concrete safety inspection system comprising: a base station; a mobile platform; a neural network; and a processor.

SUBSTITUTE SHEET (RULE 26)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
AU2021904044A0 (en) | 2021-12-14 | | Concrete Safety Inspection System
AU2021904044 | 2021-12-14 | |
Publications (1)
Publication Number | Publication Date
---|---
WO2023108210A1 (en) | 2023-06-22
Family
ID=86775133
Family Applications (1)
Application Number | Title | Priority Date | Filing Date
---|---|---|---
PCT/AU2022/051501 (WO2023108210A1) | Infrastructure safety inspection system | 2021-12-14 | 2022-12-14
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2023108210A1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017213718A1 (en) * | 2016-06-09 | 2017-12-14 | Lockheed Martin Corporation | Automating the assessment of damage to infrastructure assets |
US10832332B1 (en) * | 2015-12-11 | 2020-11-10 | State Farm Mutual Automobile Insurance Company | Structural characteristic extraction using drone-generated 3D image data |
US10989542B2 (en) * | 2016-03-11 | 2021-04-27 | Kaarta, Inc. | Aligning measured signal data with slam localization data and uses thereof |
CN112767391A (en) * | 2021-02-25 | 2021-05-07 | 国网福建省电力有限公司 | Power grid line part defect positioning method fusing three-dimensional point cloud and two-dimensional image |
KR102254773B1 (en) * | 2020-09-14 | 2021-05-24 | 한국건설기술연구원 | Automatic decision and classification system for each defects of building components using image information, and method for the same |
2022
- 2022-12-14: WO PCT/AU2022/051501 patent/WO2023108210A1/en — unknown
Legal Events
Date | Code | Title | Description
---|---|---|---
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22905526; Country of ref document: EP; Kind code of ref document: A1