WO2023108210A1 - Infrastructure safety inspection system - Google Patents

Infrastructure safety inspection system

Info

Publication number
WO2023108210A1
Authority
WO
WIPO (PCT)
Prior art keywords
safety inspection
map
damage
images
inspection method
Prior art date
Application number
PCT/AU2022/051501
Other languages
French (fr)
Inventor
Lachlan CAMPBELL
Original Assignee
Geobotica Survey Pty Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU2021904044A external-priority patent/AU2021904044A0/en
Application filed by Geobotica Survey Pty Ltd filed Critical Geobotica Survey Pty Ltd
Publication of WO2023108210A1 publication Critical patent/WO2023108210A1/en

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88Lidar systems specially adapted for specific applications
    • G01S17/89Lidar systems specially adapted for specific applications for mapping or imaging
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/84Systems specially adapted for particular applications
    • G01N21/88Investigating the presence of flaws or contamination
    • G01N21/8851Scan or image signal processing specially adapted therefor, e.g. for scan signal adjustment, for detecting different kinds of defects, for compensating for structures, markings, edges
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01MTESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M5/00Investigating the elasticity of structures, e.g. deflection of bridges or air-craft wings
    • G01M5/0033Investigating the elasticity of structures, e.g. deflection of bridges or air-craft wings by determining damage, crack or wear
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01MTESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M5/00Investigating the elasticity of structures, e.g. deflection of bridges or air-craft wings
    • G01M5/0091Investigating the elasticity of structures, e.g. deflection of bridges or air-craft wings by using electromagnetic excitation or detection
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/38Concrete; ceramics; glass; bricks
    • G01N33/383Concrete, cement
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/86Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/003Transmission of data between radar, sonar or lidar systems and remote stations
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/48Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00
    • G01S7/4802Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/48Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00
    • G01S7/4808Evaluating distance, position or velocity data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • G06T7/0008Industrial image inspection checking presence/absence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • G06T7/001Industrial image inspection using an image reference approach
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/64Three-dimensional objects
    • G06V20/647Three-dimensional objects by matching two-dimensional images to three-dimensional objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10032Satellite or aerial image; Remote sensing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30108Industrial image inspection
    • G06T2207/30132Masonry; Concrete
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30181Earth observation
    • G06T2207/30184Infrastructure
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/521Depth or shape recovery from laser ranging, e.g. using interferometry; from the projection of structured light
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W84/00Network topologies
    • H04W84/02Hierarchically pre-organised networks, e.g. paging networks, cellular networks, WLAN [Wireless Local Area Network] or WLL [Wireless Local Loop]
    • H04W84/10Small scale networks; Flat hierarchical networks
    • H04W84/12WLAN [Wireless Local Area Networks]

Definitions

  • the present invention relates to the field of visual inspections of structures and infrastructure for signs of damage, cracking and corrosion.
  • US patent US20210174492A1 represented an advance in inspection methods by providing an operator with a mixed reality headset that photographs a structure, sends the data to a server for machine-learning defect identification, and returns results that are updated in the mixed reality headset visualisation, where the user interacts with the data through the headset as an inspection tool.
  • the improved method does not remove the need for an operator to work at heights or in confined spaces.
  • forcing an operator to wear a mixed reality headset while working may in fact increase the level of risk by inhibiting the field of view and perception of the operator.
  • More traditional methods of manual inspections, such as photographing discovered defects, writing reports and triggering workflows, can be a subjective process based on the inspector’s judgement and diligence, and are not infallible. Defects can be missed.
  • the invention resides in a method of detecting and mapping damage in structures including the steps of: recording images of a structure; acquiring a 3D map of the structure; detecting damage in the structure by analyzing 2D images with a neural network; and locating the detected damage on the 3D map of the structure.
  • the recorded images are selected from 3D depth images and 2D video images.
  • the 3D map is suitably a 3D point cloud or 3D mesh of the structure that is pre-existing, or is generated using Simultaneous Localisation and Mapping (SLAM) with position estimation obtained from one or more of: odometry from 3D depth images or 2D images; Inertial Navigation System (INS); Global Navigation Satellite System (GNSS); Real-Time Kinematics [RTK]; Post- Processed Kinematics (PPK); and dense point cloud generated from 3D depth images or 2D images.
  • the step of detecting damage suitably includes using machine learning to compare images of the concrete structure to a library of images indicative of damage.
  • the step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from SLAM.
  • the step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from INS.
  • the step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from GNSS or optimized GNSS position using RTK or PPK.
  • the step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from visual odometry.
  • the step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface using non-rigid, non-linear pointcloud transformation.
  • the method may include the further steps of generating alerts and triggering workflow.
  • the system is mounted on an unmanned vehicle such as a drone for aerial inspections, or a ground- or water-based remote inspection vehicle as appropriate.
  • the invention resides in a concrete safety inspection system whereby a 3D map of a concrete structure is generated in real-time using Simultaneous Localisation and Mapping (SLAM) using one or more of odometry from 3D depth images, odometry from 2D video data, position estimation from inertial navigation system (INS) and a Global Navigation Satellite System (GNSS), and optimised GNSS position using Real-Time Kinematics [RTK] from a rover and base station.
  • a neural network analyses the 2D video data for the detection of concrete damage. The location of the damage is mapped from the 2D imagery to the 3D model, and subsequent alerts and workflows are triggered.
  • FIG 1 is a block diagram of the major components of the invention.
  • FIG 2 is a flowchart depicting the major steps in the implementation of the invention.
  • FIG 3 shows a concrete bridge undergoing a safety inspection
  • FIG 4 shows a concrete enclosed space undergoing a safety inspection
  • FIG 5 shows an indoor and outdoor concrete inspection
  • Embodiments of the present invention reside primarily in an infrastructure safety inspection system. Accordingly, the method steps and elements have been illustrated in concise schematic form in the drawings, showing only those specific details that are necessary for understanding the embodiments of the present invention, but so as not to obscure the disclosure with excessive detail that will be readily apparent to those of ordinary skill in the art having the benefit of the present description.
  • adjectives such as first and second, left and right, and the like may be used solely to distinguish one element or action from another element or action without necessarily requiring or implying any actual such relationship or order.
  • Words such as “comprises” or “includes” are intended to define a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed, including elements that are inherent to such a process, method, article, or apparatus.
  • FIG 1 presents a block diagram of the major components required to implement the invention.
  • the elements within the larger block [110] are co-located sensors which comprise the scanner element of the system, including a user interface [120] which can be physically separated from the system [110], while an optional separate GNSS reference data stream [130] may also be leveraged by the system.
  • the scanner [110] is transported around the surfaces of a concrete structure by operators in a manner which suits the asset being inspected, while the user interface can be used to control and visualise the output of the scanner wherever the human operator is situated either remotely or on the structure, and data is streamed wirelessly between [110] and [120].
  • the scanner [110] includes a 2D video capture system [111] which is preferably in the visible light spectrum for ease of human understanding, but may occupy a selected spectrum such as an infrared spectrum, especially useful for thermal imaging which may give signs of active concrete deformation from dynamic kinematic motion or the relative coolness of moisture in areas. It may also operate in the ultra-violet spectrum or indeed at hyper-spectral or multi-spectral frequencies which can aid in the detection of moisture content or changes in the concrete material properties.
  • the source of the video data may optionally be the left or right image of a stereo camera, but the inventors have found a separate video capture system allows greatest flexibility.
  • a 2D video capture system can also be aided by internal signal processing, such as automatic white balance and also high dynamic range (HDR) signal processing.
  • HDR signal processing involves taking several images at high speed at slightly different exposure values, and integrating the images into a single image to capture a larger array of tonal values in the same image. This can be helpful to deal with areas of a concrete structure that may be back-lit by the sun or sky, and also for areas with bright light and shadows in the same scene.
  • an external lighting source to illuminate the scene can be useful, especially in dark concrete structures, such as tunnels, or in the eaves of bridges.
  • This lighting source may be co-located with the scanner or may be external to it.
  • the 2D video data capture system [111] may be co-located with a 3D depth image capture system [112] with a common origin and line of sight.
  • the 3D depth images contain a 3D spatial reference of the scene from a local origin and field of view along a line of sight.
  • Stereo camera systems provide high density depth images with moderate depth accuracy, while LiDAR systems generally provide lower density depth images at higher depth accuracy.
  • Red Green Blue Depth (RGB-D) cameras which use active illumination patterns such as Microsoft Kinect or Intel Realsense RGB-D cameras or the like also produce depth images fused with Red, Green and Blue channels and may be used in the invention.
  • the inventors have found the properties of depth images derived from LiDAR and stereo-cameras to be most beneficial for the application.
  • a concrete safety inspection system acquires GNSS data when it is available.
  • a GNSS capture system [113] may be co-located with the other sensors in the scanner.
  • an external GNSS correction reference data stream [130] may be available.
  • An inertial navigation system comprising one or more of a gyroscope, accelerometer and magnetometer, preferably with a temperature sensor and pressure sensor for INS data correction, may also be collocated with the sensors in the scanner.
  • the INS fuses the data from each inertial sensor, for which there are several available options, including INS sensors which pre-fuse the data to generate a single position and pose estimate in 6 degrees of freedom. Alternatively, the data may be fused using a Kalman Filter to obtain an updated pose and position estimate.
  • the user interface [120] communicates with the embedded computer to allow an operator to control the scanner and visualise the output data in near real-time remotely or locally to the operator while scanning is taking place.
  • 2D video data capture system [111] and the 3D depth image capture system [112] generate data which can be fused by establishing a common origin and line of sight of the two sensors.
  • a static frame from the 2D video image data may be taken at the same time as a frame from the 3D depth image data, and a template matching method using an iterative closest point (ICP) algorithm or similar is applied.
  • ICP is a family of template matching algorithms which use an affine transformation matrix to move a target image and reference image in iterative steps of translation and rotation until the sum of the distance between the two templates is minimised.
  • the output of the ICP is a translation matrix for the 6 degrees of freedom, namely pitch, roll and yaw angles, and x, y and z translations. Applying this matrix to one of the two sensor data types delivers a common origin and line of sight for the sensors.
  • the 2D image capture and 3D image capture systems are mechanically aligned in pitch, roll and yaw angles, and are aligned in two of the three translation axes, and are offset by a known translation in the third axis which optionally can be accounted for by a translation of the output data. This method gives a common line of sight and a common origin.
  • Another effective embodiment for this alignment is to use the sensor output data of both a 2D image and 3D depth image, and input a selection of three or more common points in a scene in both a frame from the 2D video image data and the matching frame from the 3D depth image data. These points are used to create a transformation matrix to align the two data types and generate a common origin and line of sight for the two sensors.
  • the purpose is to fuse the data between sensors, bringing together the video capture data and the depth image data and the derivatives of these data.
  • in FIG 2 there are two main streams illustrated in this workflow, namely a SLAM stream [270] and a defect detection stream using machine learning [280]; the streams are shaded in light grey for ease of understanding the flow chart.
  • the two streams fuse to locate any defects on the 3D map.
  • BIM data or similar 3D maps of a structure are typically point clouds or 3D meshes.
  • the user may elect to provide an existing 3D map of the structure [250] which is used as a ‘ground truth’ 3D reference guide for the SLAM stream [270] later in the process.
  • a 3D map of the structure is generated as a separate process to the concrete inspection, specifically for this purpose, whereby the structure is modelled using a laser scanning system, or photogrammetry, or by other appropriate surveying methods.
  • a user may elect to use such a 3D map [260] as a 3D ‘ground truth’ reference for the SLAM processing [270].
  • the 3D map is generated during the concrete safety inspection by the scanner by using a SLAM algorithm, without any external reference, which is explained by the SLAM stream [270] of FIG 2.
  • the main process loop begins at the start of the flow chart [201], with the simultaneous acquisition of 3D depth image data [202], GNSS data and GNSS reference data [203] where available, INS data [204] and 2D video data [205] by the scanner, all synchronised by the same real-time clock.
  • FIG 2 illustrates all four sensors; however, not all are required, and one or more sensors suffice to locate defects on a 3D map and implement the invention.
  • 2D image data alone is acquired, monocular SLAM using visual odometry is implemented on the 2D image data to estimate the 2D image data’s origin and line of sight, and the 2D image data is projected onto a 3D map.
  • all four of the sensors are used in the SLAM stream.
  • an algorithm is used that fuses image odometry estimates [206], INS pose and position estimates [207], and GNSS data where available [208] using a Kalman Filter, averaging, weighted averaging, or another suitable method.
  • the output of this fusion is a single position and pose estimate [209] for the scanner; this could also be described as the updated origin and line of sight of the 3D depth image and its pre-fused image data.
  • the depth image can be transformed to a global co-ordinate system [210]. Once transformed, the plurality of 3D depth image data contribute to become a singular high density 3D model of the scene, referred to as a global point cloud [210].
  • the SLAM output can be optimised using one or more additional point cloud corrections: estimating the shape of the surface being scanned to correct irregularities in the point cloud [211], and using Loop Closure [212] to correct for map drift errors by relocating previously scanned areas and closing the scan loop by making corrections to the entire map to ensure both scans of the same area have matching coordinates.
  • optimising the SLAM algorithm by matching the shape of the local point cloud to the global point cloud using shape estimates [211] can be beneficial for mapping concrete assets with distinctive geometric forms.
  • the inventors have found that methods such as plane-of-best-fit, RANSAC, and gridding methods such as the Normal Distribution Transform (NDT) can aid in SLAM mapping by matching planes and shapes in the 3D data.
  • the SLAM output map is put through a non-linear transformation so that its point cloud set matches the reference map [290].
  • This non-rigid transformation may be done using a special case polyharmonic spline algorithm for smoothing and interpolation: Thin Plate Spline Robust Point Matching (TPS-RPM). TPS-RPM morphs and smooths the SLAM 3D map to closely match the ground truth 3D reference map.
  • the fused 3D point cloud with its global coordinate system is also useful when saved for future reference by concrete inspectors and operators to compare changes in shape and appearance over time.
  • the global point cloud is down-sampled to a low-density occupancy 3D grid or voxel grid.
  • a voxel is a regular 3D grid defined by three axes, making regular rectangular prisms or cubes.
  • a voxel is determined to be occupied if a threshold number of points is present in the voxel.
  • This data is classically used by robotic navigation systems for local and wide area navigation [216] which is not the topic of this invention.
  • the inventors have found it useful to create multiple levels of detail and sizes, preferably three levels of 30 cm³, 60 cm³ and 1 m³ voxels, to aid in navigation and collision avoidance for unmanned vehicles whilst optimising memory and processing speed [214].
  • 2D image data is acquired [205] and a machine learning (ML) algorithm is used to identify concrete pathologies in the 2D video data [220].
  • the algorithm ingests the 2D video data as a series of 2D images or frames. Each image is analysed and the ML algorithm searches for patterns in the shape and colour channels of the pixel data of a group of frames and produces a probability that concrete damage is present in the image.
  • supervised learning is used whereby the algorithm has been trained for this pattern detection on a dataset of human-labelled 2D video data images or frames.
  • the human labels the images into categories such as ‘crack’, ‘concrete’, ‘high porosity concrete’, ‘corrosion’, ‘internal corrosion’, ‘delamination’, ‘repaired concrete’, ‘not concrete’ or other categories as may be appropriate. Ideally, thousands of such labelled images are required.
  • before labelling takes place, each image is ideally broken into smaller image tiles with the same pixel size and dimensions per tile, such as 224 x 224 pixels or 256 x 256 pixels, although any other size may be used.
  • Each image tile is separately labelled by the human according to its classification.
  • object detection is used, whereby the visible extents of the object being labelled in each frame are annotated by the human with a bounding box whose corners reach the maximum and minimum X and Y coordinates of the object in the image.
  • Another embodiment takes the approach of semantic segmentation, where the human labeller segments each frame into its separate categories with complex polygons covering only the pixels belonging to each category; for example, an image of a crack in competent concrete is segmented in such a way that pixels belonging to the crack are grouped together and likewise the competent concrete is also segmented.
  • the training phase is where the algorithm, running on a computer, ingests the labelled image data and detects patterns in the labelled datasets. This is done by leveraging convolutional layers, starting with an input layer, then various levels of hidden layers which spatially convolve the image and retain pattern information, before finally passing to an output layer, which is the label given by the human.
  • the image data in this case is 2D video data with both its natural and fused reference systems [217].
  • the image is then broken into the same-size image tiles as were used in the training phase, and the global coordinates (taken from the SLAM algorithm described earlier), preferably of the four corners of each image tile, are recorded by default.
  • alternatively, the global coordinates of the centre pixel of each image tile, or other forms of averaging such as a mean or median of each axis of the coordinate system, or excluding outlier pixel coordinates from such calculations by only using pixels with coordinate values within one or two standard deviations of the mean, may be selected by a user.
  • a user may elect to apply a depth image threshold to the fused 2D image data, whereby images with distant pixels are excluded from the fused 2D image data, hence removing distant objects like the ground or trees or water courses adjacent to a concrete structure which may appear in part of an image tile (a filtering sketch is given after this list).
  • the ML algorithm then separates each image tile, region or pixel into its pre-trained categories. If the algorithm identifies a type of pathology, the tile is saved as a 2D image, and as a 3D object with a bounding box defined by the coordinates of the image tile, or other point data or image masked data as applicable depending on the ML algorithm selected by the operator. These objects can be visualised in runtime for a user on the user interface, and are also saved in a process we shall now explore.
  • Concrete structures can have a wide range of surfaces that can pose challenges to machine learning models. Concrete surfaces can be painted; they can have exposed aggregate, or sandy or smooth surfaces; they can be weathered, stained or blackened; and they can have a range of base colours such as light-grey, white, pink or dark grey, or be actively coloured during manufacturing.
  • the large database is split into surface-types, and separate machine learning models are generated by surface type.
  • the appropriate model is then applied to a concrete asset being inspected. This method gives a higher accuracy, recall and precision, but requires a lot more training data to be useful.
  • Alerting and workflow logic is then applied to the data for useful outputs, which may include on-screen alerts and visualisations of the pathologies so an operator realises a pathology is present while the operator is performing the inspection [222].
  • the 3D global map can be visualised to allow operators to determine if they have scanned all areas of interest to them [215], which is a second useful output reducing the need for rework or revisit by an operator who does not have real-time feedback.
  • Another useful output [223] is to pass all image tile coordinates of detected defects, or 2D image tiles or 3D image tile objects to work-flow software, which can trigger human-led tasks such as repair work, further inspections, or more detailed engineering investigations. In many cases the detailed inspections can be done using the output data of the concrete safety inspection system and do not require any further field work. Triggers of investigations may be based on the type, location and density of pathologies detected by a concrete safety inspection system. This work can then be scheduled and carried out to ensure concrete assets are kept safe and operational.
  • the workflow can be triggered when the scanner is connected to the internet and can contact a workflow server.
  • the scanner sends the server an output, such as a message, command or alert to trigger a workflow based on the location, type and density of pathologies detected.
  • This data and the 3D defect detection data can be stored in a database to trigger workflows, or individual alerts can be raised by email, SMS, social media messaging, or other messaging services such as WhatsApp.
  • Alerts may be filtered by pathology type, such as cracks of a certain size or exposed reinforcing steel bars, for example. If these types of defects are detected, one or more of the image data, location and 3D map may be sent to a user.
  • Concrete structures come in many forms, from bridges to dams, from tunnels to buildings, from wide open pavements and roads to confined spaces of tanks and pipes and exposed foundations. All concrete structures, regardless of the form they take, require regular visual inspection for detection of damage, such as cracking, spalling, delamination, chemical damage, seepage, pitting and internal corrosion of reinforcing steel.
  • FIG 3 shows a concrete bridge with cracking and damage in the deck [321], a pylon [331] and the pavement [311] of the bridge.
  • a concrete safety inspection system [312, 322, 332] is used to inspect the structure, detect damage and locate defects on a 1:1 scale 3D map.
  • the means of transporting the scanner is secondary to operation of the invention, as the scanner has no awareness of the method of transport.
  • an operator [310] may choose to walk with the scanner on a pavement of the bridge while it is being used by other pedestrians [313], and later by mounting the scanner on a human operated vehicle [324] to scan the deck of the bridge even if the deck is being used by other cars, and perhaps finally by an Unmanned Aerial Vehicle (UAV) [334] for all other surfaces, or by other such methods.
  • UAV Unmanned Aerial Vehicle
  • The detail of FIG 3 reveals a vehicle [324] with an operator [320] transporting a concrete safety inspection system [322] aided by a user interface [325].
  • the damage [321] on the deck of the bridge is illuminated by the scanner’s field of view [326].
  • an unmanned aerial vehicle [332] transports the same concrete safety inspection system [331], with a scanner footprint [336] covering damage [331], while a remote operator [330] is aided by an integrated user interface acting as a system controller and visualiser.
  • a Global Navigation Satellite System (GNSS) base station [340] transmits real-time GNSS position reference data [341]. GNSS position reference data streams [342] may also be attained by subscribing to available public GNSS base stations [343].
  • the concrete safety inspection system receives this data [342 or 341] in real-time when it is detectable and available, preferably at some point during the inspection.
  • a concrete safety inspection system [403] with an integrated user interface [407] is used by an operator [406] inside an enclosed space [401] to scan the walls and ceiling of the space for cracking or damage [402]; the virtual footprint of the scanner is shown for illustrative purposes [404].
  • the means of transport for the scanner inside an enclosed space may also be by other suitable means such as on a UAV or an operated vehicle. Inspections can take place outdoors, indoors or in other enclosed spaces, or may move amongst these spaces as required to inspect the structure.
  • An indoor example is shown in FIG 5, where a concrete safety inspection system [503] with an integrated user interface, mounted on a cart [506] and pushed by an operator [508], scans for damage [502]; the scanner’s footprint is shown [504].
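
As an illustrative, non-authoritative sketch of the depth-threshold filtering referenced in the list above, the following Python snippet masks out pixels of a fused 2D/depth image whose depth exceeds a cut-off, so that distant background such as ground, trees or water does not enter the image tiles. The threshold value and the input arrays are assumed placeholders, not values from the specification.

```python
# Sketch: exclude distant pixels from a fused 2D/depth image before tiling.
# The 8 m cut-off and the random input arrays are illustrative only.
import numpy as np

def mask_distant_pixels(rgb, depth, max_depth_m=8.0):
    """Zero out RGB pixels whose fused depth exceeds max_depth_m."""
    mask = depth <= max_depth_m          # keep only near-field pixels
    filtered = rgb.copy()
    filtered[~mask] = 0
    return filtered, mask

rgb = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)   # stand-in fused 2D image
depth = np.random.uniform(0.5, 30.0, (480, 640))                  # stand-in depth channel (metres)
near_rgb, near_mask = mask_distant_pixels(rgb, depth)
```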

Abstract

The invention is an infrastructure safety inspection method including the steps of recording images of a structure, acquiring a 3D map of the structure, detecting damage in the structure by analysing 2D images with a neural network and locating the detected damage on the 3D map of the structure.

Description

TITLE
INFRASTRUCTURE SAFETY INSPECTION SYSTEM
FIELD OF THE INVENTION
[001] The present invention relates to the field of visual inspections of structures and infrastructure for signs of damage, cracking and corrosion.
BACKGROUND TO THE INVENTION
[002] Concrete structures and infrastructure such as bridges, pylons, buildings, dams, cooling towers, foundations, tunnels and sewers require regular safety inspections to detect signs of damage such as cracking, concrete cancer, delamination, spalling, corrosion or other pathologies. Detection of these defects can trigger certain workflows, such as structural integrity modelling, or repair and maintenance work, or in extreme cases the temporary or permanent closure of infrastructure due to safety concerns.
[003] This work is largely done by visual inspection, and at times requires inspectors to work in confined spaces, at heights, on abseiling ropes, or suspended in baskets from a crane, or on busy and congested roads, all of which can represent safety hazards to inspectors or the local community from falling tools or equipment. Infrastructure inspection is dangerous work.
[004] US patent US20210174492A1 represented an advance in inspection methods by providing an operator with a mixed reality headset that photographs a structure, sends the data to a server for machine-learning defect identification, and returns results that are updated in the mixed reality headset visualisation, where the user interacts with the data through the headset as an inspection tool. Despite the improved method, it does not remove the need for an operator to work at heights or in confined spaces. Indeed, forcing an operator to wear a mixed reality headset while working may increase the level of risk by inhibiting the field of view and perception of the operator.
[005] More traditional methods of manual inspections, such as photographing discovered defects, writing reports and triggering workflows, can be a subjective process based on the inspector’s judgement and diligence, and are not infallible. Defects can be missed.
[006] Attempts have been made to remove this subjectivity, such as Chinese patent CN113096088A which uses machine learning to identify defects in 2D image data. A limitation with this method is the lack of geospatial reference or 3D measurements, meaning either a coarse estimate of the location must be given by the operator ‘by eye’ or it may in fact trigger more inspections at height or in confined spaces.
[007] Likewise, using human-led visual inspections with photographs leaves no full 3D model of the condition and shape of the structure meaning future comparisons lack spatial context, and future inspections rely on unreliable human memories and 2D images to try to resolve changes which may have occurred in 3D.
[008] There is therefore a need for an infrastructure safety inspection system to provide remote, automatic damage detection, 3D mapping, and immediate alerts or triggering of workflows for damaged areas of concrete structures.
SUMMARY OF THE INVENTION
[009] In one form, although it need not be the only or indeed the broadest form, the invention resides in a method of detecting and mapping damage in structures including the steps of: recording images of a structure; acquiring a 3D map of the structure; detecting damage in the structure by analyzing 2D images with a neural network; and locating the detected damage on the 3D map of the structure.
[0010] Suitably the recorded images are selected from 3D depth images and 2D video images.
[0011] The 3D map is suitably a 3D point cloud or 3D mesh of the structure that is pre-existing, or is generated using Simultaneous Localisation and Mapping (SLAM) with position estimation obtained from one or more of: odometry from 3D depth images or 2D images; Inertial Navigation System (INS); Global Navigation Satellite System (GNSS); Real-Time Kinematics [RTK]; Post-Processed Kinematics (PPK); and dense point cloud generated from 3D depth images or 2D images.
[0012] The step of detecting damage suitably includes using machine learning to compare images of the concrete structure to a library of images indicative of damage.
[0013] The step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from SLAM.
[0014] The step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from INS.
[0015] The step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from GNSS or optimized GNSS position using RTK or PPK.
[0016] The step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from visual odometry.
[0017] The step of locating the damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface using non-rigid, non-linear pointcloud transformation.
[0018] The method may include the further steps of generating alerts and triggering workflow.
[0019] Preferably the system is mounted on an unmanned vehicle such as a drone for aerial inspections, or a ground- or water-based remote inspection vehicle as appropriate.
[0020] In a further form, the invention resides in a concrete safety inspection system whereby a 3D map of a concrete structure is generated in real-time using Simultaneous Localisation and Mapping (SLAM) using one or more of odometry from 3D depth images, odometry from 2D video data, position estimation from an inertial navigation system (INS) and a Global Navigation Satellite System (GNSS), and optimised GNSS position using Real-Time Kinematics [RTK] from a rover and base station. A neural network analyses the 2D video data for the detection of concrete damage. The location of the damage is mapped from the 2D imagery to the 3D model, and subsequent alerts and workflows are triggered.
[0021] Further features and advantages of the present invention will become apparent from the following detailed description.
BRIEF DESCRIPTION OF THE DRAWINGS
[0022] To assist in understanding the invention and to enable a person skilled in the art to put the invention into practical effect, preferred embodiments of the invention will be described by way of example only with reference to the accompanying drawings, in which:
[0023] FIG 1 is a block diagram of the major components of the invention;
[0024] FIG 2 is a flowchart depicting the major steps in the implementation of the invention;
[0025] FIG 3 shows a concrete bridge undergoing a safety inspection;
[0026] FIG 4 shows a concrete enclosed space undergoing a safety inspection;
[0027] FIG 5 shows an indoor and outdoor concrete inspection;
DETAILED DESCRIPTION OF THE INVENTION
[0028] Embodiments of the present invention reside primarily in an infrastructure safety inspection system. Accordingly, the method steps and elements have been illustrated in concise schematic form in the drawings, showing only those specific details that are necessary for understanding the embodiments of the present invention, but so as not to obscure the disclosure with excessive detail that will be readily apparent to those of ordinary skill in the art having the benefit of the present description.
[0029] In this specification, adjectives such as first and second, left and right, and the like may be used solely to distinguish one element or action from another element or action without necessarily requiring or implying any actual such relationship or order. Words such as “comprises” or “includes” are intended to define a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed, including elements that are inherent to such a process, method, article, or apparatus.
[0030] To assist in the understanding of the invention, FIG 1 presents a block diagram of the major components required to implement the invention. The elements within the larger block [110] are co-located sensors which comprise the scanner element of the system, including a user interface [120] which can be physically separated from the system [110], while an optional separate GNSS reference data stream [130] may also be leveraged by the system.
[0031] The scanner [110] is transported around the surfaces of a concrete structure by operators in a manner which suits the asset being inspected, while the user interface can be used to control and visualise the output of the scanner wherever the human operator is situated either remotely or on the structure, and data is streamed wirelessly between [110] and [120].
[0032] The scanner [110] includes a 2D video capture system [111] which is preferably in the visible light spectrum for ease of human understanding, but may occupy a selected spectrum such as an infrared spectrum, especially useful for thermal imaging which may give signs of active concrete deformation from dynamic kinematic motion or the relative coolness of moisture in areas. It may also operate in the ultra-violet spectrum or indeed at hyper-spectral or multi-spectral frequencies which can aid in the detection of moisture content or changes in the concrete material properties. The source of the video data may optionally be the left or right image of a stereo camera, but the inventors have found a separate video capture system allows greatest flexibility.
[0033] A 2D video capture system can also be aided by internal signal processing, such as automatic white balance and also high dynamic range (HDR) signal processing. HDR signal processing involves taking several images at high speed at slightly different exposure values, and integrating the images into a single image to capture a larger array of tonal values in the same image. This can be helpful to deal with areas of a concrete structure that may be back-lit by the sun or sky, and also for areas with bright light and shadows in the same scene.
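A minimal sketch of this kind of exposure integration, assuming OpenCV's Mertens exposure fusion and placeholder file names, is shown below; the patent does not prescribe a specific fusion algorithm.

```python
# Exposure-fusion sketch (Mertens method) approximating the HDR-style signal
# processing described: several frames at different exposures are merged into
# one image with a wider tonal range. File names are placeholders.
import cv2
import numpy as np

# Bracketed frames captured at low, normal and high exposure (hypothetical paths)
exposures = [cv2.imread(p) for p in ("under.png", "normal.png", "over.png")]

merge = cv2.createMergeMertens()          # exposure fusion, no camera response model needed
fused = merge.process(exposures)          # float32 image in [0, 1]
fused_8bit = np.clip(fused * 255, 0, 255).astype("uint8")

cv2.imwrite("fused.png", fused_8bit)
```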
[0034] Using an external lighting source to illuminate the scene can be useful, especially in dark concrete structures, such as tunnels, or in the eaves of bridges. This lighting source may be co-located with the scanner or may be external to it.
[0035] The 2D video data capture system [111] may be co-located with a 3D depth image capture system [112] with a common origin and line of sight. The 3D depth images contain a 3D spatial reference of the scene from a local origin and field of view along a line of sight. The inventors have found that using stereo-camera systems or LiDAR systems as the source of the depth image data is useful. Stereo camera systems provide high density depth images with moderate depth accuracy, while LiDAR systems generally provide lower density depth images at higher depth accuracy. Red Green Blue Depth (RGB-D) cameras which use active illumination patterns, such as Microsoft Kinect or Intel Realsense RGB-D cameras or the like, also produce depth images fused with Red, Green and Blue channels and may be used in the invention. The inventors have found the properties of depth images derived from LiDAR and stereo-cameras to be most beneficial for the application.
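By way of illustration of how a depth image can be derived from a stereo pair, the following hedged sketch uses OpenCV semi-global block matching on a rectified pair; the focal length, baseline and file names are assumed values, not parameters from the specification.

```python
# Sketch: depth image from a rectified stereo pair using semi-global block
# matching, then converting disparity to depth with depth = f * B / d.
# Intrinsics, baseline and file names are illustrative only.
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

matcher = cv2.StereoSGBM_create(
    minDisparity=0, numDisparities=64, blockSize=7,
    P1=8 * 7 * 7, P2=32 * 7 * 7, uniquenessRatio=10)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # SGBM output is fixed-point

focal_px, baseline_m = 1200.0, 0.12        # assumed focal length (pixels) and baseline (metres)
valid = disparity > 0
depth_m = np.zeros_like(disparity)
depth_m[valid] = focal_px * baseline_m / disparity[valid]
```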
[0036] In one embodiment, a concrete safety inspection system acquires GNSS data when it is available. For this purpose a GNSS capture system [113] may be co-located with the other sensors in the scanner. Likewise, in a further embodiment, an external GNSS correction reference data stream [130] may be available.
[0037] An inertial navigation system (INS) [114] comprising one or more of a gyroscope, accelerometer and magnetometer, preferably with a temperature sensor and pressure sensor for INS data correction, may also be collocated with the sensors in the scanner. The INS fuses the data from each inertial sensor, for which there are several available options, including INS sensors which pre-fuse the data to generate a single position and pose estimate in 6 degrees of freedom. Alternatively, the data may be fused using a Kalman Filter to obtain an updated pose and position estimate.
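A much simplified sketch of inertial fusion is shown below: a complementary filter blending gyroscope integration with accelerometer-derived attitude for roll and pitch only. A production INS would normally use a Kalman filter over all six degrees of freedom; the function name and blend factor are assumptions for illustration.

```python
# Minimal complementary-filter sketch for fusing gyroscope and accelerometer
# readings into roll/pitch estimates. Values and the 0.98 blend factor are
# illustrative; a full INS would fuse all 6 degrees of freedom with a Kalman filter.
import math

def update_attitude(roll, pitch, gyro, accel, dt, alpha=0.98):
    """gyro = (gx, gy, gz) in rad/s, accel = (ax, ay, az) in m/s^2."""
    gx, gy, _ = gyro
    ax, ay, az = accel
    # Integrate angular rate (smooth but drifts over time)
    roll_gyro = roll + gx * dt
    pitch_gyro = pitch + gy * dt
    # Absolute attitude from the gravity direction (noisy but drift-free)
    roll_acc = math.atan2(ay, az)
    pitch_acc = math.atan2(-ax, math.hypot(ay, az))
    # Blend: trust the gyro short-term, the accelerometer long-term
    roll = alpha * roll_gyro + (1 - alpha) * roll_acc
    pitch = alpha * pitch_gyro + (1 - alpha) * pitch_acc
    return roll, pitch
```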
[0038] Sensors are synchronised in time by a Real Time Clock [115] and are processed by an embedded computer [116].
[0039] The user interface [120] communicates with the embedded computer to allow an operator to control the scanner and visualise the output data in near real-time remotely or locally to the operator while scanning is taking place.
[0040] The 2D video data capture system [111] and the 3D depth image capture system [112] generate data which can be fused by establishing a common origin and line of sight of the two sensors.
[0041] To achieve this fusion, a static frame from the 2D video image data may be taken at the same time as a frame from the 3D depth image data, and a template matching method using an iterative closest point (ICP) algorithm or similar is applied. ICP is a family of template matching algorithms which use an affine transformation matrix to move a target image and reference image in iterative steps of translation and rotation until the sum of the distance between the two templates is minimised.
[0042] The output of the ICP is a translation matrix for the 6 degrees of freedom, namely pitch, roll and yaw angles, and x, y and z translations. Applying this matrix to one of the two sensor data types delivers a common origin and line of sight for the sensors.
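One common, off-the-shelf way to obtain such a transform is Open3D's point-to-point ICP, sketched below with placeholder point sets and a default identity initial guess; this is an assumed implementation, not the patented pipeline.

```python
# Sketch of ICP alignment with Open3D: estimate the rigid transform between
# two point sets (e.g. 2D-sensor-derived points and the matching depth frame).
# The point arrays, tolerance and initial guess are placeholders.
import numpy as np
import open3d as o3d

def to_cloud(points_xyz):
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(np.asarray(points_xyz, dtype=float))
    return pcd

source_points = np.random.rand(500, 3)                            # stand-in source points
target_points = source_points + np.array([0.02, 0.0, 0.01])       # stand-in target points

result = o3d.pipelines.registration.registration_icp(
    to_cloud(source_points), to_cloud(target_points),
    max_correspondence_distance=0.05,                              # metres, tuning value
    init=np.eye(4),
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint())

T = result.transformation            # 4x4 homogeneous transform
R, t = T[:3, :3], T[:3, 3]           # rotation (pitch/roll/yaw) and x, y, z translation
```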
[0043] In a further embodiment to generate a common origin and line of sight, the 2D image capture and 3D image capture systems are mechanically aligned in pitch, roll and yaw angles, and are aligned in two of the three translation axes, and are offset by a known translation in the third axis which optionally can be accounted for by a translation of the output data. This method gives a common line of sight and a common origin.
[0044] Another effective embodiment for this alignment is to use the sensor output data of both a 2D image and 3D depth image, and input a selection of three or more common points in a scene in both a frame from the 2D video image data and the matching frame from the 3D depth image data. These points are used to create a transformation matrix to align the two data types and generate a common origin and line of sight for the two sensors.
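A hedged sketch of deriving a rigid transform from three or more corresponding points is given below using the Kabsch algorithm (SVD of the cross-covariance); the example coordinates are made up, and the patent does not name a particular algorithm for this step.

```python
# Sketch: rigid transform (rotation + translation) from three or more common
# points selected in both the 2D-derived and 3D depth data (Kabsch algorithm).
import numpy as np

def rigid_transform(src, dst):
    """Return R (3x3) and t (3,) such that R @ src_i + t ~= dst_i."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)            # cross-covariance of centred points
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                       # guard against a reflection solution
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = dst_c - R @ src_c
    return R, t

# Corresponding points picked in the 2D and 3D frames (made-up values)
pts_2d_frame = [[0.0, 0.0, 2.0], [1.0, 0.0, 2.1], [0.0, 1.0, 1.9], [1.0, 1.0, 2.0]]
pts_3d_frame = [[0.1, 0.0, 2.0], [1.1, 0.0, 2.1], [0.1, 1.0, 1.9], [1.1, 1.0, 2.0]]
R, t = rigid_transform(pts_2d_frame, pts_3d_frame)
```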
[0045] In any of these embodiments for establishing a common origin and line of sight, the purpose is to fuse the data between sensors, bringing together the video capture data and the depth image data and the derivatives of these data.
[0046] Turning now to FIG 2, there are two main streams illustrated in this workflow, namely a SLAM stream [270] and a defect detection stream using machine learning [280]; the streams are shaded in light grey for ease of understanding the flow chart. The two streams fuse to locate any defects on the 3D map.
[0047] Many concrete structures have accompanying 3D digital representation of the physical and functional characteristics of the structure, often referred to as Building Information Models (BIM). BIM data or similar 3D maps of a structure are typically point clouds or 3D meshes. In one embodiment, the user may elect to provide an existing 3D map of the structure [250] which is used as a ‘ground truth’ 3D reference guide for the SLAM stream [270] later in the process.
[0048] In another embodiment, a 3D map of the structure is generated as a separate process to the concrete inspection, specifically for this purpose, whereby the structure is modelled using a laser scanning system, or photogrammetry, or by other appropriate surveying methods. A user may elect to use such a 3D map [260] as a 3D ‘ground truth’ reference for the SLAM processing [270].
[0049] In a further embodiment, the 3D map is generated during the concrete safety inspection by the scanner by using a SLAM algorithm, without any external reference, which is explained by the SLAM stream [270] of FIG 2.
[0050] The main process loop begins at the start of the flow chart [201], with the simultaneous acquisition of 3D depth image data [202], GNSS data and GNSS reference data [203] where available, INS data [204] and 2D video data [205] by the scanner, all synchronised by the same real-time clock. FIG 2 illustrates all four sensors; however, not all are required, and one or more sensors suffice to locate defects on a 3D map and implement the invention. In one embodiment of the invention, 2D image data alone is acquired, monocular SLAM using visual odometry is implemented on the 2D image data to estimate the 2D image data’s origin and line of sight, and the 2D image data is projected onto a 3D map. In another embodiment, all four of the sensors are used in the SLAM stream.
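A hedged two-frame sketch of monocular visual odometry with OpenCV is shown below: ORB features are matched between consecutive frames, an essential matrix is estimated and the relative pose recovered. The camera matrix and image paths are assumed values, and in the monocular case the translation is only known up to scale.

```python
# Sketch of two-frame monocular visual odometry: ORB features, essential-matrix
# estimation and pose recovery. Intrinsics and file names are illustrative.
import cv2
import numpy as np

K = np.array([[1200.0, 0.0, 640.0],
              [0.0, 1200.0, 360.0],
              [0.0, 0.0, 1.0]])                      # assumed camera intrinsics

prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(2000)
kp1, des1 = orb.detectAndCompute(prev, None)
kp2, des2 = orb.detectAndCompute(curr, None)

matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
# R, t give the camera's relative rotation and (unit-scale) translation,
# i.e. an updated origin and line of sight for projecting the 2D image.
```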
[0051] If more than one sensor is used, an algorithm is used that fuses the image odometry estimates [206], INS pose and position estimates [207], and GNSS data where available [208] using a Kalman Filter, averaging, weighted averaging, or another suitable method.
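As one illustration of the weighted-averaging option, the sketch below fuses position estimates with inverse-variance weights; the variances and coordinates are made up, a Kalman filter would normally do this recursively, and orientation fusion would additionally require quaternion averaging.

```python
# Sketch: inverse-variance weighted average of position estimates from
# visual odometry, INS and GNSS. Values are illustrative placeholders.
import numpy as np

estimates = {                      # position estimate (m) and variance (m^2) per source
    "odometry": (np.array([10.02, 5.01, 2.00]), 0.04),
    "ins":      (np.array([10.05, 4.98, 2.02]), 0.09),
    "gnss":     (np.array([10.00, 5.03, 1.95]), 0.25),
}

weights = np.array([1.0 / var for _, var in estimates.values()])
positions = np.stack([pos for pos, _ in estimates.values()])
fused = (weights[:, None] * positions).sum(axis=0) / weights.sum()
print("fused position estimate:", fused)
```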
[0052] The output of this fusion is a single position and pose estimate [209] for the scanner; this could also be described as the updated origin and line of sight of the 3D depth image and its pre-fused image data. Once a new origin and line of sight is determined, the depth image can be transformed to a global co-ordinate system [210]. Once transformed, the plurality of 3D depth image data contribute to become a singular high density 3D model of the scene, referred to as a global point cloud [210].
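A minimal sketch of this accumulation step is given below: the fused pose, expressed as a 4x4 homogeneous transform, is applied to a local depth-frame point set which is then appended to the growing global point cloud. The pose and points are placeholder values.

```python
# Sketch: transform one depth frame into the global frame and accumulate it
# into a global point cloud. Pose and point values are illustrative.
import numpy as np

def transform_points(points_local, pose_4x4):
    """points_local: (N, 3) array in the sensor frame; returns (N, 3) in the global frame."""
    homogeneous = np.hstack([points_local, np.ones((len(points_local), 1))])
    return (pose_4x4 @ homogeneous.T).T[:, :3]

global_cloud = []                                   # accumulated global point cloud
local_points = np.random.rand(1000, 3)              # stand-in points from one depth frame
pose = np.eye(4)
pose[:3, 3] = [12.0, 3.5, 1.2]                      # stand-in fused position estimate

global_cloud.append(transform_points(local_points, pose))
global_map = np.vstack(global_cloud)                # singular high-density 3D model
```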
[0053] Whilst this is sufficient for the SLAM stream of the invention, in one embodiment the SLAM output can be optimised using one or more additional point cloud corrections: estimating the shape of the surface being scanned to correct irregularities in the point cloud [211], and using Loop Closure [212] to correct for map drift errors by relocating previously scanned areas and closing the scan loop by making corrections to the entire map to ensure both scans of the same area have matching coordinates.
[0054] Optimising the SLAM algorithm by matching the shape of the local point cloud to the global point cloud to update the point cloud [211] by using shape estimates can be beneficial for mapping concrete assets with distinctive geometric forms. The inventors have found that methods such as plane-of-best-fit, RANSAC, and gridding methods such as the Normal Distribution Transform (NDT) can aid in SLAM mapping by matching planes and shapes in the 3D data.
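As a hedged example of one such shape estimate, the sketch below uses Open3D's RANSAC plane segmentation to extract the dominant planar surface from a point cloud; the input points and thresholds are placeholders, and a real run would operate on the SLAM local or global cloud.

```python
# Sketch: RANSAC plane segmentation with Open3D, one possible shape-estimation
# aid for planar concrete surfaces. Input points and thresholds are illustrative.
import numpy as np
import open3d as o3d

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(np.random.rand(2000, 3) * [5.0, 5.0, 0.05])

plane_model, inlier_idx = pcd.segment_plane(
    distance_threshold=0.02,     # metres from the plane to count as an inlier
    ransac_n=3,                  # points sampled per RANSAC iteration
    num_iterations=500)

a, b, c, d = plane_model         # plane a*x + b*y + c*z + d = 0
plane_points = pcd.select_by_index(inlier_idx)
```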
[0055] If the user has imported a 3D reference map before the main loop began, then the SLAM output map is put through a non-linear transformation so that its point cloud set matches the reference map [290]. This non-rigid transformation may be done using a special case polyharmonic spline algorithm for smoothing and interpolation: Thin Plate Spline Robust Point Matching (TPS-RPM). TPS-RPM morphs and smooths the SLAM 3D map to closely match the ground truth 3D reference map.
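A simplified sketch of the thin-plate-spline warping idea is given below using SciPy's radial basis function interpolator with a thin-plate-spline kernel; it assumes the control-point correspondences between the SLAM map and the reference map are already known, whereas full TPS-RPM also estimates those correspondences. All values are placeholders.

```python
# Simplified thin-plate-spline warp: fit a smooth non-rigid mapping from known
# SLAM-to-reference control points and apply it to the whole SLAM cloud.
# Full TPS-RPM also solves for the correspondences; that step is omitted here.
import numpy as np
from scipy.interpolate import RBFInterpolator

slam_ctrl = np.random.rand(30, 3) * 10.0                  # control points in the SLAM map
ref_ctrl = slam_ctrl + 0.05 * np.random.randn(30, 3)      # matching points in the reference map

warp = RBFInterpolator(slam_ctrl, ref_ctrl,
                       kernel="thin_plate_spline", smoothing=1e-3)

slam_cloud = np.random.rand(5000, 3) * 10.0               # stand-in SLAM point cloud
warped_cloud = warp(slam_cloud)                           # cloud morphed toward the reference map
```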
[0056] The fused 3D point cloud with its global coordinate system is also useful when saved for future reference by concrete inspectors and operators to compare changes in shape and appearance over time.
[0057] The output of these processes is to optimise the global point cloud to reduce errors and store it [213].
[0058] For the operator to be able to visualise which areas of the concrete structure have been inspected, it can be useful to render an updated 3D map on the user interface at regular intervals, generally at the rate of 1 Hz or 2 Hz or another suitable update rate. This allows an operator to see on the 3D map if some parts of the structure have been missed due to occlusions or shadows in a line of sight, and to ensure all desired areas are inspected [215].
[0059] In one embodiment of the invention, the global point cloud is down-sampled to a low-density 3D occupancy grid of voxels. For clarity, a voxel grid is a regular 3D grid defined by three axes, forming regular rectangular prisms or cubes. A voxel is determined to be occupied if a threshold number of points is present in the voxel. This data is classically used by robotic navigation systems for local and wide area navigation [216], which is not the topic of this invention. The inventors have found it useful to create multiple levels of detail and sizes, preferably three levels of 30 cm³, 60 cm³ and 1 m³ voxels, to aid in navigation and collision avoidance for unmanned vehicles whilst optimising memory and processing speed [214].
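A minimal sketch of such occupancy down-sampling follows; the occupancy threshold of five points per voxel is an illustrative assumption.

```python
import numpy as np

def occupancy_voxels(points, voxel_size, min_points=5):
    """Down-sample an N x 3 point cloud into occupied voxel indices.

    A voxel is reported as occupied when at least `min_points` points fall
    inside it; `voxel_size` is the cube edge length in metres.
    """
    indices = np.floor(np.asarray(points, dtype=float) / voxel_size).astype(np.int64)
    unique, counts = np.unique(indices, axis=0, return_counts=True)
    return unique[counts >= min_points]

# Three levels of detail, e.g. 0.3 m, 0.6 m and 1.0 m voxels [214]:
# grids = {size: occupancy_voxels(global_cloud, size) for size in (0.3, 0.6, 1.0)}
```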
[0060] Turning now to the Machine Learning flow [280] of FIG 1. 2D image data is acquired [205] and a machine learning (ML) algorithm is used to identify concrete pathologies in the 2D video data [220]. The algorithm ingests the 2D video data as a series of 2D images or frames. Each image is analysed and the ML algorithm searches for patterns in the shape and colour channels of the pixel data of a group of frames and produces a probability that concrete damage is present in the image.
[0061] In one embodiment of the invention, supervised learning is used whereby the algorithm has been trained for this pattern detection on a dataset of human-labelled 2D video data images or frames. The human labels the images into categories such as ‘crack’, ‘concrete’, ‘high porosity concrete’, ‘corrosion’, ‘internal corrosion’, ‘delamination’, ‘repaired concrete’, ‘not concrete’ or other categories as may be appropriate. Ideally, thousands of such labelled images are required.
[0062] In an embodiment of the invention, before labelling takes place, each image is ideally broken into smaller image tiles with the same pixel size and dimensions per tile, such as 224 x 224 pixels or 256 x 256 pixels, although any other size may be used. Each image tile is separately labelled by the human according to its classification.
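A minimal sketch of such tiling is shown below; discarding partial edge tiles is an illustrative simplification, and padding or overlapping strategies could be used instead.

```python
import numpy as np

def tile_image(image, tile=224):
    """Split an H x W x 3 image into non-overlapping tile x tile patches.

    Returns a list of ((row, col) offset, tile pixels) pairs; edge regions
    smaller than a full tile are discarded in this sketch.
    """
    h, w = image.shape[:2]
    tiles = []
    for row in range(0, h - tile + 1, tile):
        for col in range(0, w - tile + 1, tile):
            tiles.append(((row, col), image[row:row + tile, col:col + tile]))
    return tiles
```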
[0063] In another embodiment, object detection is used, whereby the visible extents of the object being labelled in each frame are annotated by the human with a bounding box whose corners reach the maximum and minimum X and Y coordinates of the object in the image.
[0064] Another embodiment takes the approach of semantic segmentation, where the human labeller segments each frame into its separate categories with complex polygons covering only the pixels belonging to each category; for example, an image of a crack in competent concrete is segmented so that pixels belonging to the crack are grouped together and, likewise, pixels belonging to the competent concrete are grouped together.
[0065] Once data has been labelled, a specific algorithm variant must be selected and the data is applied to the training phase of machine learning. Convolutional Neural Networks (CNNs) are well known for their accuracy and efficiency in image processing and understanding. After trialling many variants of CNNs, the inventors have found several CNN families to be the most accurate and efficient of the various methods developed and trialled for the implementation of the invention. These include TensorFlow Lite, MobileNet, AlexNet, YOLO, ResNet, classic CNNs, VGG, Inception and Xception.
[0066] The training phase is where the algorithm, running on a computer, ingests the labelled image data and detects patterns in the labelled datasets. This is done by leveraging convolutional layers, starting with an input layer, followed by various hidden layers which spatially convolve the image and retain pattern information, before finally passing to an output layer, which corresponds to the label given by the human.
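One possible transfer-learning arrangement is sketched below, using the MobileNet family and the 224 x 224 tile size mentioned above; the class count, dropout rate and training schedule are illustrative assumptions, not the specific network used by the invention.

```python
import tensorflow as tf

NUM_CLASSES = 8   # e.g. crack, concrete, high-porosity, corrosion, ... (illustrative)

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights='imagenet')
base.trainable = False   # train only the new classification head first

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(NUM_CLASSES, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# model.fit(train_tiles, train_labels, validation_data=(val_tiles, val_labels), epochs=10)
```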
[0067] At completion of training, subsequent validation and testing occurs on separate labelled data the algorithm has not been exposed to, which allows one to determine the accuracy of the model.
[0068] This trained model can now be deployed on fresh image data. The image data in this case is 2D video data with both its natural and fused reference systems [217]. Each image is then broken into the same-size image tiles as were used in the training phase, and the global coordinates (taken from the SLAM algorithm described earlier), preferably of the four corners of each image tile, are recorded by default. Optionally, a user may select the global coordinates of the centre pixel of each image tile, or other forms of averaging such as a mean or median of each axis of the coordinate system, or may exclude pixel coordinates from such calculations by only using pixels with coordinate values within one or two standard deviations of the mean. Finally, a user may elect to apply a depth image threshold to the fused 2D image data, whereby distant pixels are excluded from the fused 2D image data, hence removing distant objects such as the ground, trees or water courses adjacent to a concrete structure which may appear in part of an image tile.
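A minimal sketch of the depth-threshold filter follows; the 15 m range and the fraction of permitted far pixels are illustrative defaults, and the per-pixel range values are assumed to have been fused onto the 2D tile as described above.

```python
import numpy as np

def tile_passes_depth_filter(depth_tile, max_range=15.0, max_far_fraction=0.5):
    """Return True if a tile should be kept for classification.

    depth_tile: per-pixel range values (metres) fused onto the 2D tile.
    Tiles dominated by pixels beyond `max_range` (distant ground, trees,
    water courses) are excluded from the fused 2D image data.
    """
    depth_tile = np.asarray(depth_tile, dtype=float)
    far_fraction = np.count_nonzero(depth_tile > max_range) / depth_tile.size
    return far_fraction <= max_far_fraction
```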
[0069] The ML algorithm then separates each image tile, region or pixel into its pre-trained categories. If the algorithm identifies a type of pathology, the tile is saved as a 2D image and as a 3D object with a bounding box defined by the coordinates of the image tile, or other point data or image-masked data as applicable depending on the ML algorithm selected by the operator. These objects can be visualised at runtime for a user on the user interface, and are also saved in a process we shall now explore.
[0070] As each pixel of the 2D video data has previously been fused to the corresponding 3D depth image [217], once the transformation of coordinates [210] and subsequent updates [211 and 212] of the 3D depth data have taken place, the 2D image data pixels are in turn transformed into the global coordinate reference system. This data is used to geolocate in 3D the concrete pathologies recognised by the machine learning classification system [221]. The defects detected by ML are saved as 3D objects using the global coordinate system and added to a global defect map [222].
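A minimal sketch of projecting a single detected pixel into the global frame is given below, assuming a calibrated pinhole camera model and a SLAM-derived camera pose; the intrinsic matrix and pose inputs are assumptions for illustration.

```python
import numpy as np

def pixel_to_global(u, v, depth, K, rotation, translation):
    """Project a 2D pixel (u, v) with fused depth into global coordinates.

    K:           3 x 3 pinhole camera intrinsic matrix.
    depth:       fused depth along the optical axis (metres).
    rotation:    3 x 3 camera-to-global rotation from the SLAM pose.
    translation: camera origin in the global frame.
    """
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])   # ray in camera coordinates
    point_cam = ray * depth                           # scale to the fused depth
    return np.asarray(rotation, dtype=float) @ point_cam + np.asarray(translation, dtype=float)
```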
[0071] Concrete structures can have a wide range of surfaces that can pose challenges to machine learning models. Concrete surfaces can be painted, they can have exposed aggregate or sandy or smooth surfaces, they can be weathered, stained or blackened, and they can have a range of base colours such as light grey, white, pink or dark grey, or be actively coloured during manufacturing. These changes in appearance can be handled by machine learning using one of several approaches.
[0072] In the first approach, a large database is collected of all bridge surface types to create a generalised model. The inventors have found this approach the easiest to use, but it can sacrifice precision, accuracy and recall.
[0073] In another approach, the large database is split by surface type, and separate machine learning models are generated for each surface type. The appropriate model is then applied to the concrete asset being inspected. This method gives higher accuracy, recall and precision, but requires considerably more training data to be useful.
[0074] In an alternative method, images are first converted to a monochromatic scale using either a combination of two or three of the RGB channels, a single RGB channel, or their derived channels of hue, saturation and intensity. The inventors have also found this useful in that only a single model is required for all surfaces, but again it has lower recall, precision and accuracy than individual models. It is also useful if one surface type has only a small database of training data.
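A minimal sketch of such a monochromatic conversion follows; the default weights are the common luminance coefficients and are an illustrative choice, with a single-channel or hue/saturation/intensity derivative being equally possible as described above.

```python
import numpy as np

def to_monochrome(rgb, weights=(0.299, 0.587, 0.114)):
    """Collapse an H x W x 3 RGB image into a single channel.

    Pass e.g. weights=(0.0, 1.0, 0.0) to keep only the green channel.
    """
    return np.dot(np.asarray(rgb, dtype=float), np.asarray(weights, dtype=float))
```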
[0075] Alerting and workflow logic is then applied to the data to produce useful outputs, which may include on-screen alerts and visualisations of the pathologies so that an operator realises a pathology is present while performing the inspection [222].
[0076] The 3D global map can be visualised to allow operators to determine if they have scanned all areas of interest to them [215], which is a second useful output, reducing the need for rework or revisits by an operator who does not have real-time feedback.
[0077] Another useful output [223] is to pass all image tile coordinates of detected defects, or 2D image tiles or 3D image tile objects, to workflow software, which can trigger human-led tasks such as repair work, further inspections, or more detailed engineering investigations. In many cases the detailed inspections can be done using the output data of the concrete safety inspection system and do not require any further field work. Triggers of investigations may be based on the type, location and density of pathologies detected by a concrete safety inspection system. This work can then be scheduled and carried out to ensure concrete assets are kept safe and operational.
[0078] In such an embodiment, the workflow can be triggered when the scanner is connected to the internet and can contact a workflow server. The scanner sends the server an output, such as a message, command or alert, to trigger a workflow based on the location, type and density of pathologies detected. This data, along with the 3D defect detection data, can be stored in a database to trigger workflows, or individual alerts can be raised by email, SMS, social media messaging, or other messaging services such as WhatsApp.
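A minimal sketch of such a trigger message is shown below; the endpoint URL, payload fields and function name are hypothetical assumptions used only to illustrate posting a detection event to a workflow server.

```python
import requests

WORKFLOW_URL = "https://workflow.example.com/api/triggers"   # hypothetical endpoint

def send_defect_alert(defect_type, location_xyz, density, image_ref):
    """Send a detection event to the workflow server when a connection is available."""
    payload = {
        "type": defect_type,             # e.g. "crack"
        "location": list(location_xyz),  # global coordinates from the 3D defect map
        "density": density,              # pathology density in the vicinity
        "image": image_ref,              # reference to the stored 2D image tile
    }
    response = requests.post(WORKFLOW_URL, json=payload, timeout=10)
    response.raise_for_status()          # surface connection or server errors
```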
[0079] Alerts may be filtered by pathology type, such as cracks of a certain size or exposed reinforcing steel bars, for example. If these types of defects are detected, one or more of the image data, location and 3D map may be sent to a user.
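A short sketch of such a filter follows; the field names and thresholds on the detection record are illustrative assumptions.

```python
def should_alert(detection, alert_types=("crack", "exposed reinforcement"),
                 min_size_mm=2.0):
    """Return True when a detection matches the alert filter.

    `detection` is assumed to carry a "type" string and an estimated "size_mm".
    """
    return (detection["type"] in alert_types
            and detection.get("size_mm", 0.0) >= min_size_mm)
```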
[0080] Leveraging the collaboration of multiple skilled operators or civil engineers may yield a far better engineering assessment of the state of a concrete structure than using only the judgement of an inspector in the field. To facilitate this collaboration, multiple users of a workflow software may be alerted when a pathology is detected, and their input is gathered in the form of opinions and comments on the 3D map, the location data, the image data and the detection classification. The workflow may also evolve collaboratively by passing through stage gates: remote analysis of the collected data, additional field work, modelling of a pathology’s impact on structural integrity, triggering a repair workflow, or in some extreme cases even closing a structure on safety grounds.
[0081] Concrete structures come in many forms, from bridges to dams, from tunnels to buildings, from wide open pavements and roads to confined spaces of tanks and pipes and exposed foundations. All concrete structures, regardless of the form they take, require regular visual inspection for detection of damage, such as cracking, spalling, delamination, chemical damage, seepage, pitting and internal corrosion of reinforcing steel.
[0082] Due to the many forms a concrete structure can take, it is necessary to highlight how a concrete safety inspection system can be used to map and locate concrete damage and pathologies in 3D space to trigger workflows. To assist with this, FIG 3 shows a concrete bridge with cracking and damage in the deck [321], a pylon [331] and the pavement [311] of the bridge. A concrete safety inspection system [312, 322, 332] is used to inspect the structure, detect damage and locate defects on a 1:1 scale 3D map.
[0083] The inventors have found that the means of transporting the scanner is secondary to operation of the invention, as the scanner has no awareness of the method of transport. In the case of a concrete bridge, as represented in FIG 3, an operator [310] may choose to walk with the scanner on a pavement of the bridge while it is being used by other pedestrians [313], later mount the scanner on a human-operated vehicle [324] to scan the deck of the bridge even while the deck is being used by other cars, and perhaps finally use an Unmanned Aerial Vehicle (UAV) [334] for all other surfaces, or other such methods. Regardless of the transport method, the outputs of the three scanning epochs used in this example can be combined into one single dataset.
[0084] While it is effective to transport the system manually and by vehicles, the inventors have found it to be most practical to mount the scanner on an unmanned vehicle, such as a UAV [334].
[0085] The detail of FIG 3 reveals a vehicle [324] with an operator [320] transporting a concrete safety inspection system [322] aided by a user interface [325]. The damage [321] on the deck of the bridge is illuminated by the scanner’s field of view [326].
[0086] Independently, an unmanned aerial vehicle (UAV) [332] transports the same concrete safety inspection system [331] with a scanner footprint [336] covering damage [331] with a remote operator [330] aided by an integrated user interface acting as a system controller and visualiser.
[0087] The operator [310] walking along the pavement of the bridge holds the concrete safety scanner [312] with an integrated user interface illuminating a portion of the bridge covered by the footprint of the scanner [316] over damaged pavement [311].
[0088] A Global Navigation Satellite System (GNSS) base station [340] is shown, which transmits real-time GNSS position reference data [341]. GNSS position reference data streams [342] may also be obtained by subscribing to available public GNSS base stations [343]. In one embodiment of the invention, the concrete safety inspection system receives this data [342 or 341] in real-time when it is detectable and available, preferably at some point during the inspection.
[0089] For a concrete structure taking the form of an enclosed space, such as a tunnel, underground mine or pipeline, such as the one illustrated in FIG 4, a concrete safety inspection system [403] with an integrated user interface [407] is used by an operator [406] inside an enclosed space [401] to scan the walls and ceiling of the space for cracking or damage [402]; the virtual footprint of the scanner is shown for illustrative purposes [404]. As with an outdoor inspection, the means of transport for the scanner inside an enclosed space may also be by other suitable means, such as a UAV or an operated vehicle. Inspections can take place outdoors, indoors or in other enclosed spaces, or may move between these spaces as required to inspect the structure.
[0090] An indoor example is shown in FIG 5, where a concrete safety inspection system [503] with an integrated user interface, mounted on a cart [506] for transportation and pushed by an operator [508], scans for damage [502]; the scanner’s footprint is shown [504].
[0091] On the outside of this concrete structure, a UAV [510] with a scanner [511] scans the walls of the concrete building; its footprint is illuminated for illustrative purposes [512]. The system detects defects [515] in the concrete, which are displayed on the remote user interface [516] held by the operator [517].
[0092] The above description of various embodiments of the present invention is provided for purposes of description to one of ordinary skill in the related art. It is not intended to be exhaustive or to limit the invention to a single disclosed embodiment. As mentioned above, numerous alternatives and variations to the present invention will be apparent to those skilled in the art from the above teaching. Accordingly, while some alternative embodiments have been discussed specifically, other embodiments will be apparent or relatively easily developed by those of ordinary skill in the art. Accordingly, this invention is intended to embrace all alternatives, modifications and variations of the present invention that have been discussed herein, and other embodiments that fall within the spirit and scope of the above described invention.

Claims

CLAIMS

1. A safety inspection method for detecting and mapping damage in structures including the steps of: recording images of a structure; acquiring a 3D map of the structure; detecting damage in the structure by analyzing 2D images with a neural network; and locating the detected damage on the 3D map of the structure.

2. The safety inspection method as in Claim 1 wherein the recorded images are selected from 3D depth images and 2D video images.

3. The safety inspection method as in Claim 1 wherein the 3D map is suitably a 3D point cloud or 3D mesh of the structure that is pre-existing.

4. The safety inspection method as in Claim 1 wherein the step of detecting damage includes using machine learning to compare images of the structure to a library of images indicative of damage.

5. The safety inspection method as in Claim 1 wherein the 3D map is generated using Simultaneous Localisation and Mapping (SLAM) with position estimation obtained from one or more of: odometry from 3D depth images or 2D images; Inertial Navigation System (INS); Global Navigation Satellite System (GNSS); Real-Time Kinematics (RTK); Post-Processed Kinematics (PPK); and dense point cloud generated from 3D depth images or 2D images.

6. The safety inspection method as in Claim 5, wherein the step of locating the detected damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from SLAM.

7. The safety inspection method as in Claim 1 wherein the step of locating the detected damage on the 3D map is suitably performed by projecting the 2D image onto the 3D surface from an origin and line of sight estimated from any one of: an Inertial Navigation System (INS); a Global Navigation Satellite System (GNSS); an optimized GNSS position using Real-Time Kinematics (RTK) or Post-Processed Kinematics (PPK); visual odometry.

8. The safety inspection method as in Claim 1 wherein the step of locating the damage on the 3D map is performed by projecting the 2D image onto the 3D surface using non-rigid, non-linear point-cloud transformation.

9. The safety inspection method as in Claim 1 further including the step of generating alerts and triggering workflows when damage is detected.

10. A safety inspection system that detects and maps damage in structures comprising: a mobile platform; a user interface in communication with and in control of the mobile platform; a 2D video camera mounted on the mobile platform, the 2D video camera generating 2D video data of the structure from which 2D imagery of the structure is produced; a neural network that analyses the 2D video data and detects damage; and a processor that generates or retrieves from a library a 3D map of the structure and locates the detected damage on the 3D map of the structure.

11. The safety inspection system as in Claim 10 wherein the processor generates a 3D map of the structure in real-time using Simultaneous Localisation and Mapping (SLAM) using one or more of: odometry from 3D depth images; odometry from 2D video data; position estimation from an inertial navigation system (INS); position estimation from a Global Navigation Satellite System (GNSS); or position estimation from an optimised GNSS position using Real-Time Kinematics (RTK) or Post-Processed Kinematics (PPK).

12. The safety inspection system of Claim 11 wherein the processor maps the location of the damage from the 2D imagery to the 3D map.

13. The safety inspection system of Claim 10 further comprising a processor that triggers alerts and workflows.

14. The safety inspection system as in Claim 10 wherein the mobile platform is an unmanned vehicle such as a drone for aerial inspections, or a ground or water based remote inspection vehicle.

15. The safety inspection system as in Claim 10 further comprising an illumination source mounted on the mobile platform.

16. The safety inspection system as in Claim 10 further comprising a storage device storing a database of machine-learning models for different structure surface types.
A concrete safety inspection system comprising: a base station; a mobile platform; a neural network; and a processor.
PCT/AU2022/051501 2021-12-14 2022-12-14 Infrastructure safety inspection system WO2023108210A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AU2021904044A AU2021904044A0 (en) 2021-12-14 Concrete Safety Inspection System
AU2021904044 2021-12-14

Publications (1)

Publication Number Publication Date
WO2023108210A1 true WO2023108210A1 (en) 2023-06-22

Family

ID=86775133

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2022/051501 WO2023108210A1 (en) 2021-12-14 2022-12-14 Infrastructure safety inspection system

Country Status (1)

Country Link
WO (1) WO2023108210A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017213718A1 (en) * 2016-06-09 2017-12-14 Lockheed Martin Corporation Automating the assessment of damage to infrastructure assets
US10832332B1 (en) * 2015-12-11 2020-11-10 State Farm Mutual Automobile Insurance Company Structural characteristic extraction using drone-generated 3D image data
US10989542B2 (en) * 2016-03-11 2021-04-27 Kaarta, Inc. Aligning measured signal data with slam localization data and uses thereof
CN112767391A (en) * 2021-02-25 2021-05-07 国网福建省电力有限公司 Power grid line part defect positioning method fusing three-dimensional point cloud and two-dimensional image
KR102254773B1 (en) * 2020-09-14 2021-05-24 한국건설기술연구원 Automatic decision and classification system for each defects of building components using image information, and method for the same



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22905526

Country of ref document: EP

Kind code of ref document: A1