US12223632B2 - Intelligent detection method and unmanned surface vehicle for multiple type faults of near-water bridges - Google Patents

Intelligent detection method and unmanned surface vehicle for multiple type faults of near-water bridges

Info

Publication number
US12223632B2
Authority
US
United States
Prior art keywords
module
attention
max
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
US17/755,086
Other versions
US20230351573A1 (en)
Inventor
Jian Zhang
Zhili He
Shang Jiang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University filed Critical Southeast University
Assigned to SOUTHEAST UNIVERSITY reassignment SOUTHEAST UNIVERSITY ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: HE, Zhili, JIANG, Shang, ZHANG, JIAN
Publication of US20230351573A1 publication Critical patent/US20230351573A1/en
Application granted granted Critical
Publication of US12223632B2 publication Critical patent/US12223632B2/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T7/0002 Image analysis — inspection of images, e.g. flaw detection
    • G06T7/0004 Image analysis — industrial image inspection
    • G06T7/70 Image analysis — determining position or orientation of objects or cameras
    • G06T7/73 Image analysis — determining position or orientation using feature-based methods
    • B63B1/125 Hulls deriving lift mainly from water displacement, with rigidly interconnected multiple hulls comprising more than two hulls
    • B63B35/00 Vessels or similar floating structures specially adapted for specific purposes and not otherwise provided for
    • B63B45/04 Signalling or lighting devices intended to indicate the vessel or parts thereof
    • B63B79/40 Monitoring properties or operating parameters of vessels in operation, e.g. speed, routing or maintenance schedules
    • G01M5/0008 Investigating the elasticity of structures, e.g. deflection of bridges or aircraft wings — of bridges
    • G01M5/0033 Investigating the elasticity of structures — by determining damage, crack or wear
    • G01M5/0075 Investigating the elasticity of structures — by means of external apparatus, e.g. test benches or portable test systems
    • G01M5/0091 Investigating the elasticity of structures — by using electromagnetic excitation or detection
    • G06N3/045 Neural networks — combinations of networks
    • G06N3/048 Neural networks — activation functions
    • G06N3/08 Neural networks — learning methods
    • B63B2035/006 Unmanned surface vessels, e.g. remotely controlled
    • B63B2035/008 Unmanned surface vessels, remotely controlled
    • B63B2211/02 Applications — oceanography
    • G06T2207/10016 Image acquisition modality — video; image sequence
    • G06T2207/10024 Image acquisition modality — color image
    • G06T2207/10028 Image acquisition modality — range image; depth image; 3D point clouds
    • G06T2207/10032 Image acquisition modality — satellite or aerial image; remote sensing
    • G06T2207/20081 Special algorithmic details — training; learning
    • G06T2207/20084 Special algorithmic details — artificial neural networks [ANN]
    • G06T2207/30181 Subject of image — Earth observation
    • G06T2207/30184 Subject of image — infrastructure
    • G06T2207/30252 Subject of image — vehicle exterior; vicinity of vehicle
    • H04N23/56 Cameras or camera modules comprising electronic image sensors, provided with illuminating means

Definitions

  • the invention belongs to the field of structural fault detection in civil engineering, and in particular relates to an intelligent detection method for multi-type faults of a near-water bridge and an unmanned surface vehicle.
  • the intelligent detection method is represented by deep learning technology, which has brought revolutionary solutions to many industries, such as medicine and health, aerospace and material science.
  • the patent document with the publication number CN111862112A discloses a learning-based medical image segmentation method
  • the patent document with the publication number CN111651916A discloses a material property prediction method based on deep learning.
  • the use of deep learning techniques for intelligent detection of structural faults is attracting more and more attention.
  • researchers apply deep learning methods to the detection of different faults and different infrastructures.
  • the patent document with the publication number CN112171692A discloses a flying adsorption robot suitable for intelligent detection of bridge deflection;
  • the patent document with the publication number CN111413353A discloses an intelligent mobile comprehensive detection equipment for tunnel lining faults;
  • patent document with the publication number CN111021244A discloses an orthotropic steel bridge deck fatigue crack detection robot;
  • the patent document with the publication number CN109978847A discloses a cable-robot-based method for identifying faults of slings.
  • the current intelligent detection method is mainly based on the Anchor-based method, that is, a large number of a priori boxes need to be pre-set, that is, anchor boxes, so it is named Anchor-based method.
  • the patent document with the publication number CN111062437A discloses a bridge fault target detection model based on the Faster R-CNN model
  • the patent document with the publication number CN111310558A also discloses a road damage extraction method based on the Faster R-CNN model.
  • the patent document with the publication number CN111127399A discloses a method for detecting bridge pier faults based on the YOLOv3 model.
  • both the Faster R-CNN model and the YOLO series models are very classic Anchor-based methods.
  • the first prominent problem of Anchor-based methods is that the effect of the algorithm will be affected by the pre-set prior box. When dealing with features with complex shapes, having multiple aspect ratios and multiple sizes, the size and aspect ratio of the prior box may be too different from the target, which will reduce the recall rate of the prediction results. Therefore, in order to improve the detection accuracy, a large number of prior frames are often preset. This also brings about the second prominent problem of the Anchor-based method. A large number of a priori boxes will introduce a large number of hyperparameters and design choices, which will make the model very complex, and the computational load is large, and the computational efficiency is often not high.
  • the present invention discloses an intelligent detection method and unmanned surface vehicle for multi-type faults of near-water bridges, which are suitable for automatic and intelligent detection of faults at the bottom of small and medium bridges.
  • the proposed solution includes intelligent algorithms and hardware equipment. It can ensure the detection accuracy while taking into account the detection speed, and has a wide adaptability and applicability to complex engineering environments.
  • An intelligent detection system for detecting multiple types of faults for near-water bridges comprises a first component, a second component, and a third component.
  • the first component is an intelligent detection algorithm: CenWholeNet, an infrastructure fault target detection network based on deep learning.
  • the second component is an embedded parallel attention module PAM into the target detection network CenWholeNet, and the parallel attention module includes two sub-modules: a spatial attention sub-module and a channel attention sub-module.
  • the third component is an intelligent detection equipment assembly: an unmanned surface vehicle system based on lidar navigation, the unmanned surface vehicle includes four modules, a hull module, a video acquisition module, a lidar navigation module and a ground station module.
  • the present invention is the first application of an Anchor-free target detection algorithm in the field of structural fault detection.
  • the detection results of the traditional Anchor-based method are affected by the setting of the prior frames (that is, the anchor boxes); when such an algorithm deals with structural faults that have complex shapes, various sizes and various aspect ratios (for example, the aspect ratio of an exposed steel bar may be large, while that of flaking may be small), the size and aspect ratio of the preset a priori frames may differ greatly from the target, which causes a low recall rate of the detection results.
  • a large number of a priori frames are often preset. This introduces many hyperparameters and design choices.
  • the method disclosed by the present invention abandons the complex a priori frame setting, directly predicts key points and related vectors (i.e. width, height and other information), and composes them into a detection frame.
  • the method of the invention is simpler, more direct and effective, solves the problem fundamentally, and is more suitable for the detection of engineering structure faults with complex features.
  • the present invention proposes a novel and lightweight attention module by considering the gain effect of the attention mechanism on the expressive ability of the neural network model.
  • the experimental results show that the method described in the present invention is superior to multiple neural network models with extensive influence, and achieves a comprehensive and better effect in the two dimensions of efficiency and accuracy.
  • the disclosed attention module can also improve different neural network models by sacrificing negligible computation.
  • the present invention discloses an unmanned surface vehicle solution that does not rely on GPS signals to detect faults at the bottom of small and medium bridges. Due to design and performance constraints, current testing equipment is often of little use when inspecting the large number of small and medium-sized bridges. Taking drones as an example, their flight often requires a wide interference-free space and GPS-assisted positioning. However, in areas such as the bottom of small and medium bridges with very low clearance, urban underground culverts and sewers, the space is relatively closed, the GPS signal is often very weak, and the internal situation is very complicated; there are risks such as signal loss and collision damage when a drone flies in.
  • the present invention takes the lead in proposing a highly robust unmanned surface vehicle system suitable for fault detection in relatively closed areas.
  • the experimental results show that while improving the detection efficiency, the system can reduce the safety risk and detection difficulty of engineers and save a lot of manpower cost, has strong engineering applicability and broad application prospects.
  • the system proposed by the present invention is not only suitable for the bottom of medium and small bridges, but also has great application potential in engineering scenarios such as urban underground culverts and sewers.
  • FIG. 1 is a schematic diagram of the overall framework in accordance with the aspects of the present invention.
  • FIG. 2 is a schematic diagram of the CenWholeNet network in accordance with the aspects of the present invention.
  • FIG. 3 is a detailed diagram of the attention module PAM in accordance with the aspects of the present invention.
  • FIG. 4 is a schematic diagram of the architecture of the unmanned surface vehicle system in accordance with the aspects of the present invention.
  • FIG. 5 is a schematic diagram of the polar coordinate supplementary information in accordance with the aspects of the present invention.
  • FIG. 6 is a schematic diagram of the proposed PAM embedded in the ResNet network in accordance with the aspects of the present invention.
  • FIG. 7 is a schematic diagram of the PAM embedded in the Hourglass network in accordance with the aspects of the present invention.
  • FIG. 8 is a schematic diagram of the application of the method in accordance with the aspects of the present invention in a bridge group.
  • FIG. 9 is a schematic diagram of the real-time mapping of the unmanned surface vehicle in accordance with the aspects of the present invention.
  • FIG. 10 is a schematic diagram of the detection results of the method in accordance with the aspects of the present invention.
  • FIG. 11 is a comparison table of the detection results between the algorithm framework in the present invention and other advanced target detection algorithms.
  • FIG. 12 compares the training process of the algorithm framework of the present invention with that of other advanced target detection algorithms.
  • FIG. 1 illustrates an intelligent detection method for multi-type faults of near-water bridges.
  • the overall flow chart of the technical solution is shown in FIG. 1 , including the following components:
  • an intelligent detection algorithm CenWholeNet, an infrastructure fault target detection network based on deep learning, described and illustrated in FIG. 2 ;
  • a parallel attention module PAM embedded into the target detection network CenWholeNet, which includes two sub-modules, a spatial attention sub-module and a channel attention sub-module, as illustrated in FIG. 3 ;
  • an intelligent detection equipment assembly: an unmanned surface vehicle system based on lidar navigation; the unmanned surface vehicle includes four modules, a hull module, a video acquisition module, a lidar navigation module and a ground station module. The structural design of the unmanned surface vehicle is illustrated in FIG. 4 .
  • the infrastructure fault target detection network CenWholeNet described in the first component comprises the following steps.
  • Step 1 of the infrastructure fault target detection network CenWholeNet in the first component has the primary network
  • the method of using the primary network is as follows: giving an input image $P\in\mathbb{R}^{W\times H\times 3}$, wherein W is the width of the image, H is the height of the image, and 3 represents the number of channels of the image, that is, three RGB channels; extracting features of the input image P through the primary network, using two convolutional neural network models, the Hourglass network and the deep residual network ResNet.
  • Step 2 of the infrastructure fault target detection network CenWholeNet in the first component has the detector, the method of using the detector is as follows:
  • $H_{c,x,y}=\max_p\left[Y_p(c,x,y)\right],\quad c\in[1,C],\ x\in\left[1,\frac{W}{r}\right],\ y\in\left[1,\frac{H}{r}\right]$
  • $o_k=\left(\frac{x_k}{r}-\left\lfloor\frac{x_k}{r}\right\rfloor,\ \frac{y_k}{r}-\left\lfloor\frac{y_k}{r}\right\rfloor\right)$
  • $\theta_k=\pi-\arctan\left(\frac{y_k^2-y_k^1}{x_k^2-x_k^1}\right)$
  • Step 3 of the infrastructure fault target detection network CenWholeNet in the first component the method of outputting a result is as follows:
  • $\tilde{H}_{c,x,y}=\max_{i^2\le 1,\ j^2\le 1,\ i,j\in\mathbb{Z}}\left[\tilde{H}_{c,x+i,y+j}\right]$
  • non-maximum suppression (NMS) is not needed; a 3×3 max-pooling convolutional layer is used to extract candidate center points.
  • a method of establishing the parallel attention module in the second component is as follows.
  • attention plays a very important role in human perception.
  • when human eyes, ears and other organs acquire information, they tend to focus on more interesting targets and increase their attention, while suppressing uninteresting targets and reducing their attention.
  • the attention mechanism is inspired by this: by embedding attention modules in neural networks, the weights of feature tensors in meaningful regions are increased and the weights of areas such as meaningless backgrounds are reduced, which can improve the performance of the network.
  • the present invention discloses a lightweight, plug-and-play parallel attention module PAM, configured to improve the expressiveness of neural networks; PAM considers two dimensions of feature map attention, spatial attention and channel attention, and combines them in parallel;
  • the LIDAR-based unmanned surface vehicle of the third component comprises four modules including, a hull module, a video acquisition module, a lidar navigation module and ground station module, working together in a cooperative manner.
  • the hull module includes a trimaran and a power system; the trimaran is configured to be stable, resist level 6 wind and waves, and has an effective remote control distance of 500 meters, adaptable to engineering application scenarios; the size of the hull is 75×47×28 cm, which is convenient for transportation; the effective load of the surface vehicle is 5 kg, so that multiple scientific instruments can be installed; in addition, the unmanned surface vehicle has a constant speed cruise function, which reduces the control burden on personnel.
  • the video acquisition module is composed of a three-axis camera pan/tilt, a fixed front camera and a fill light; the three-axis camera pan/tilt supports 10× optical zoom, auto focus, photography and 60 FPS video recording, so that the video acquisition module meets the shooting requirements of faults of different scales and locations; the fixed front camera is configured to determine the hull posture; the picture is transmitted back to a ground station in real time through a wireless image transmission device, on the one hand for fault identification, and on the other hand for assisting control of the USV; a controllable LED fill light board containing 180 high-brightness LED lamp beads is installed to cope with low-light working environments such as the areas under small and medium-sized bridges; a 3D-printed pan/tilt carries the LED fill light board to meet the needs of multi-angle fill light; in addition, fixed front-view LED light beads are also installed, providing light source support for the front-view camera.
  • the lidar navigation module includes a lidar, a mini computer, a transmission system and a control system; the lidar is configured to perform 360° omnidirectional scanning; connected with the mini computer, it can perform real-time mapping of the surrounding environment of the unmanned surface vehicle; through wireless image transmission, the information of the surrounding scene is transmitted back to the ground station in real time, so as to realize lidar navigation of the unmanned surface vehicle; based on lidar navigation, the unmanned surface vehicle no longer needs GPS positioning in areas with weak GPS signals such as under bridges and in underground culverts; the wireless transmission system supports real-time transmission of 1080P video, with a maximum transmission distance of 10 kilometers; redundant transmission is used to ensure link stability and strong anti-interference; the control system consists of the wireless image transmission equipment, a Pixhawk 2.4.8 flight controller and a SKYDROID T12 receiver, and through the flight controller and receiver, the control system effectively controls the equipment on board.
  • the ground station module includes two remote controls and multiple display devices; a main remote control is used to control the unmanned surface vehicle, and a secondary remote control is used to control the surface vehicle borne scientific instruments, and the display device is used to monitor the real-time information returned by the camera and lidar; on the one hand, the display device displays the picture in real time, and on the other hand, it processes the image in real time to identify the fault; the devices cooperate with each other to realize the intelligent fault detection without a GPS signal.
  • the 3D lidar carried by the unmanned surface vehicle is combined with the SLAM algorithm, and the real-time mapping effect is shown in FIG. 9 .
  • the collected images include three types of faults: cracking, flaking and rebar exposure.
  • the pixel resolution of the fault images is 512×512.
  • the batch size during training is taken as 2.
  • the batch size during testing is taken as 1.
  • the learning rate is taken as 5×10⁻⁴.
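For clarity, the bullets above can be collected into a minimal training-configuration sketch, assuming a PyTorch-style setup; the optimizer choice (Adam) is an assumption of this sketch and is not specified in this excerpt:

```python
import torch
import torch.nn as nn

# Hypothetical training configuration collecting the values stated above.
# nn.Conv2d is only a stand-in for CenWholeNet (ResNet/Hourglass backbone).
model = nn.Conv2d(3, 3, kernel_size=3)
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)  # learning rate 5e-4
train_batch_size, test_batch_size = 2, 1                   # batch sizes above
input_resolution = (512, 512)                              # fault image resolution
```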
  • the detection result of the solution proposed by the present invention is shown in FIG. 10 ; the heat map is the visual result directly output by the network, which can provide evidence for the result of target detection.
  • the detection method disclosed in the present invention is also compared with state-of-the-art object detection models on the same dataset, including the widely influential Anchor-based object detection method Faster R-CNN, the latest model YOLOv5 of the widely used YOLO series, and the acclaimed Anchor-free method CenterNet.
  • the attention module PAM of the present invention is also compared with SENet and CBAM, the excellent and classic attention modules recognized by the deep learning community.
  • the chosen evaluation metrics are the average precision AP and average recall AR, which are commonly used in the deep learning field. They are the average values of different categories and different images.
  • the calculation process is briefly described below. First, a key concept, the intersection over union (IoU), is introduced. It is a common concept in the field of target detection; it measures the degree of overlap between the candidate box, that is, the prediction result of the model, and the ground-truth bounding box, as the ratio of intersection to union, calculated by the following formula:

$$IoU=\frac{area\left(\mathrm{Prediction\ results}\cap\mathrm{GroundTruth}\right)}{area\left(\mathrm{Prediction\ results}\cup\mathrm{GroundTruth}\right)}$$
  • given an IoU threshold, the recall rate can be calculated as the ratio of correctly detected targets (true positives) to all ground-truth targets.
  • the IoU threshold is usually divided into 10 levels, 0.50:0.05:0.95.
  • AP50 used in the example is the precision when the IoU threshold is 0.50.
  • AP75 is the precision when the IoU threshold is 0.75.
  • the average precision AP represents the average precision under the 10 IoU thresholds, that is:

$$AP=\frac{1}{10}\left(AP_{50}+AP_{55}+AP_{60}+\cdots+AP_{95}\right)$$
  • the average recall AR is the maximum recall for each image given 1, 10 and 100 detections, averaged over the categories and the 10 IoU thresholds, yielding 3 sub-indicators AR1, AR10 and AR100. Obviously, the closer the values of AP and AR are to 1, the better the detection results and the closer they are to the labels.
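As an illustration of the metrics described above, the following sketch computes the IoU of two boxes and averages precision over the 10 thresholds; `precision_at_threshold` is a hypothetical helper, and the (left, top, right, bottom) box layout is an assumption of this sketch:

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (left, top, right, bottom)."""
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# AP averages the precision over the 10 IoU thresholds 0.50:0.05:0.95:
thresholds = [0.50 + 0.05 * i for i in range(10)]
# AP = sum(precision_at_threshold(t) for t in thresholds) / 10  # hypothetical helper
```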
  • Comparable performance can only be achieved by training the best YOLO v5 sub-version YOLO v5x for more Epochs.
  • although YOLO v5 is slightly faster in running speed, its accuracy is far inferior to the method proposed by the present invention.
  • compared with CenterNet, the running speed is the same, but the detection effect is much better.
  • Two conclusions can be drawn from the comparison at the attention module level: (1) The PAM proposed by the present invention can achieve a general and substantial enhancement effect on different deep learning models under the premise of sacrificing a small amount of computation; (2) Compared with SENet and CBAM, PAM can obtain more enhancement, which is obviously better than SENet and CBAM.
  • the comparison of the training process between different methods is shown in FIG. 12 , and the method proposed in the present invention is marked with a circle. It can be clearly seen that although the training results oscillate to different degrees, our method can generally achieve higher AP and AR than the traditional methods; that is, a better target detection effect can be obtained.
  • the specific embodiment verifies the effectiveness of the technical solution of the present invention and the applicability to complex engineering.
  • the proposed intelligent detection method is more suitable for multi-fault detection with variable slenderness ratios and complex shapes.
  • the proposed unmanned surface vehicle system also has high robustness and high practicability.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Combustion & Propulsion (AREA)
  • Mechanical Engineering (AREA)
  • Ocean & Marine Engineering (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Fluid Mechanics (AREA)
  • Electromagnetism (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an intelligent detection method for multiple types of faults of near-water bridges and an unmanned surface vehicle. The method includes an infrastructure fault target detection network CenWholeNet and a bionics-based parallel attention module PAM. CenWholeNet is a deep learning-based Anchor-free target detection network, which mainly comprises a primary network and a detector, used to automatically detect faults in acquired images with high precision. The PAM introduces an attention mechanism into the neural network, including spatial attention and channel attention, and is used to enhance the expressive power of the neural network. The unmanned surface vehicle includes a hull module, a video acquisition module, a lidar navigation module and a ground station module; it supports lidar navigation without GPS information, long-range real-time video transmission and highly robust real-time control, and is used for automated acquisition of information from the underside of bridges.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS
This Application is a Section 371 National Stage of International Application No. PCT/CN2021/092393, filed on May 8, 2021, which claims priority to Chinese Patent Application No. 202110285996.5, filed on Mar. 7, 2021, the contents of which are incorporated herein by reference in their entireties.
TECHNICAL FIELD
The invention belongs to the field of structural fault detection in civil engineering, and in particular relates to an intelligent detection method for multi-type faults of a near-water bridge and an unmanned surface vehicle.
BACKGROUND
During the service lifetime of engineering structures, many faults will occur due to the influence of load and environment. Once these faults are generated, they easily accumulate and expand, thus affecting the service life and overall safety of the structure, and even endangering people's lives and property. In recent years, there have been many cases of structural damage, such as bridge collapse, due to the lack of effective inspection and maintenance. Therefore, regular inspection and maintenance of structures is essential.
Traditional infrastructure fault detection methods are mainly manual. These methods require the help of complicated tools, and have problems such as low efficiency, high labor costs, and large detection blind spots. Therefore, many researchers have recently introduced intelligent detection methods and intelligent detection equipment into the field of infrastructure fault detection. The intelligent detection method is represented by deep learning technology, which has brought revolutionary solutions to many industries, such as medicine and health, aerospace and material science. For example, the patent document with the publication number CN111862112A discloses a learning-based medical image segmentation method, and the patent document with the publication number CN111651916A discloses a material property prediction method based on deep learning. Similarly, the use of deep learning techniques for intelligent detection of structural faults is attracting more and more attention. Researchers apply deep learning methods to the detection of different faults and different infrastructures, such as concrete structure crack detection, reinforced concrete structure multi-fault detection, steel structure corrosion detection, bolt loosening detection, ancient building fault detection, and shield tunnel defect detection. However, intelligent algorithms alone are not enough; to achieve true automatic detection, intelligent detection equipment is also required. In order to meet the needs of different inspection projects, a variety of inspection robots have been proposed and applied, such as bridge inspection drones, mobile tunnel inspection vehicles, bridge deck inspection robots, and rope climbing robots. For example, the patent document with the publication number CN112171692A discloses a flying adsorption robot suitable for intelligent detection of bridge deflection; the patent document with the publication number CN111413353A discloses intelligent mobile comprehensive detection equipment for tunnel lining faults; the patent document with the publication number CN111021244A discloses an orthotropic steel bridge deck fatigue crack detection robot; and the patent document with the publication number CN109978847A discloses a cable-robot-based method for identifying faults of slings.
These methods have solved many engineering problems, but two outstanding shortcomings of the current solutions remain. (1) The current intelligent detection methods are mainly Anchor-based, that is, a large number of a priori boxes (anchor boxes) need to be pre-set, hence the name Anchor-based method. For example, the patent document with the publication number CN111062437A discloses a bridge fault target detection model based on the Faster R-CNN model, and the patent document with the publication number CN111310558A also discloses a road damage extraction method based on the Faster R-CNN model. The patent document with the publication number CN111127399A discloses a method for detecting bridge pier faults based on the YOLOv3 model. Both the Faster R-CNN model and the YOLO series models are very classic Anchor-based methods. The first prominent problem of Anchor-based methods is that the effect of the algorithm is affected by the pre-set prior boxes. When dealing with features having complex shapes, multiple aspect ratios and multiple sizes, the size and aspect ratio of the prior boxes may differ too much from the target, which reduces the recall rate of the prediction results. Therefore, in order to improve the detection accuracy, a large number of prior boxes are often preset. This brings about the second prominent problem of the Anchor-based method: a large number of a priori boxes introduces a large number of hyperparameters and design choices, which makes the model very complex, the computational load large, and the computational efficiency often low. Therefore, traditional intelligent detection methods are not well suited to structural fault detection, and the engineering community urgently needs new intelligent detection algorithms that are more efficient and concise and have wider adaptability. (2) At present, the areas where intelligent equipment can detect faults are still very limited, mainly easy-to-detect areas such as the outer surface of the structure. For example, the patent document with the publication number CN111260615A discloses a method for detecting apparent faults of bridges based on UAVs. However, a UAV system can hardly work in relatively closed spaces, such as the bottom area of a large number of small and medium bridges, where the headroom is low and the situation is complex, and both manual and intelligent detection equipment are often helpless. Taking the UAV as an example, its flight often requires a wide interference-free space as well as GPS-assisted positioning and manipulation. However, the GPS signal in the bottom area of small and medium-sized bridges with very low clearance is often very weak, and the internal situation is also very complicated; there are risks such as signal loss and collision damage when drones fly in. Some areas are also very small, may contain toxic gases, and are difficult for humans to reach, so these areas have been detection blind spots for many years. Effective detection of these areas is also the focus and difficulty of inspection projects. The engineering community urgently needs new types of intelligent detection equipment for such areas that are difficult for humans and other intelligent equipment to reach.
SUMMARY OF THE INVENTION
In order to solve the above problems, the present invention discloses an intelligent detection method and unmanned surface vehicle for multi-type faults of near-water bridges, which are suitable for automatic and intelligent detection of faults at the bottom of small and medium bridges. The proposed solution includes intelligent algorithms and hardware equipment. It can ensure the detection accuracy while taking into account the detection speed, and has a wide adaptability and applicability to complex engineering environments.
To achieve the above object, the technical scheme of the present invention is as follows.
An intelligent detection system for detecting multiple types of faults for near-water bridges, comprises a first component, a second component, and a third component. The first component is an intelligent detection algorithm: CenWholeNet, an infrastructure fault target detection network based on deep learning.
The second component is an embedded parallel attention module PAM into the target detection network CenWholeNet, and the parallel attention module includes two sub-modules: a spatial attention sub-module and a channel attention sub-module.
The third component is an intelligent detection equipment assembly: an unmanned surface vehicle system based on lidar navigation, the unmanned surface vehicle includes four modules, a hull module, a video acquisition module, a lidar navigation module and a ground station module.
Further, the infrastructure fault target detection network CenWholeNet described in the first component comprises the following steps:
    • Step 1: a primary network: using the primary network to extract features of images;
    • Step 2: a detector: converting the extracted image features, by the detector, into the tensor forms required for calculation, and optimizing the result through a loss function; and
    • Step 3: result output: the result output includes converting the tensor into a boundary box and outputting of prediction results of target detection.
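Before the detailed steps, the following sketch illustrates the tensor shapes the three steps exchange, assuming a PyTorch-style implementation (the disclosure itself does not prescribe a framework):

```python
import torch

# Illustrative shape walk-through of the CenWholeNet pipeline.
C, W, H, r = 3, 512, 512, 4                 # fault categories, input size, output stride
P = torch.rand(1, 3, H, W)                  # Step 1 input: an RGB image P

# Step 2 predicts C heatmap channels plus 6 regression channels per location:
heat  = torch.rand(1, C, H // r, W // r)    # center-point heatmap
size  = torch.rand(1, 2, H // r, W // r)    # box width/height
off   = torch.rand(1, 2, H // r, W // r)    # sub-pixel offsets
polar = torch.rand(1, 2, H // r, W // r)    # polar supplement (l/2, theta)
# Step 3 decodes [heat, size, off, polar] into bounding boxes.
```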
Wherein Step 1 of the infrastructure fault target detection network CenWholeNet in the first component has the primary network, and the method of using the primary network is as follows: giving an input image $P\in\mathbb{R}^{W\times H\times 3}$, wherein W is the width of the image, H is the height of the image, and 3 represents the number of channels of the image, that is, three RGB channels; extracting features of the input image P through the primary network;
    • using two convolutional neural network models, Hourglass network and deep residual network ResNet.
Wherein Step 2 of the infrastructure fault target detection network CenWholeNet in the first component has the detector, the method of using the detector is as follows:
converting the features extracted by the primary network into an output set consisting of 4 tensors, $[\tilde{H},\tilde{D},\tilde{O},\widetilde{Polar}]$, by the detector, as the core of CenWholeNet;
    • using $\tilde{H}\in[0,1]^{C\times\frac{W}{r}\times\frac{H}{r}}$ to represent the heat map of central key points, where C is the number of fault categories, taken as C=3 here, and r is the output step size, that is, the down-sampling ratio; the default step size is 4, and down-sampling improves the calculation efficiency;
    • defining $H\in[0,1]^{C\times\frac{W}{r}\times\frac{H}{r}}$ as the ground-truth heatmap; for category c, the ground-truth center point at location (i,j) is $p_{cij}\in\mathbb{R}^{C\times W\times H}$; first computing its down-sampled equivalent position $\hat{p}_{cxy}\in\mathbb{R}^{C\times\frac{W}{r}\times\frac{H}{r}}$, wherein $x=\lfloor i/r\rfloor$, $y=\lfloor j/r\rfloor$;
    • then, through a Gaussian kernel function, mapping $\hat{p}_{cxy}$ to a tensor $Y_p\in\mathbb{R}^{C\times\frac{W}{r}\times\frac{H}{r}}$, where $Y_p$ is defined by:

$$Y_p(c,x,y)=\exp\left(-\frac{\left(x-\hat{p}_{cxy}(x)\right)^2+\left(y-\hat{p}_{cxy}(y)\right)^2}{2\sigma_p^2}\right)$$
    • wherein $\hat{p}_{cxy}(x)$ and $\hat{p}_{cxy}(y)$ represent the center point position (x,y), and $\sigma_p=\text{gaussian\_radius}/3$, where gaussian_radius is the maximum radius of the offset of the corner points of a detection frame such that the intersection over union between the offset detection frame and the ground-truth detection frame still satisfies IoU≥t, with t=0.7 in all experiments; integrating all the corresponding $Y_p$ points gives the ground-truth heat map H:

$$H_{c,x,y}=\max_p\left[Y_p(c,x,y)\right],\quad c\in[1,C],\ x\in\left[1,\frac{W}{r}\right],\ y\in\left[1,\frac{H}{r}\right]$$
    • wherein $H_{c,x,y}$ represents the value of H at the position (c,x,y), i.e. the probability that this position is a center point; specifically, $H_{c,x,y}=1$ represents a central key point, a positive sample; conversely, $H_{c,x,y}=0$ is background, a negative sample; focal loss is used as the metric to measure the distance between $\tilde{H}$ and H, according to the following equation:

$$\mathcal{L}_{Heat}=-\frac{1}{N}\sum_{c=1}^{C}\sum_{x=1}^{W/r}\sum_{y=1}^{H/r}\begin{cases}\left(1-\tilde{H}_{c,x,y}\right)^{\alpha}\log\left(\tilde{H}_{c,x,y}\right)&\text{if }H_{c,x,y}=1\\\left(1-H_{c,x,y}\right)^{\beta}\left(\tilde{H}_{c,x,y}\right)^{\alpha}\log\left(1-\tilde{H}_{c,x,y}\right)&\text{otherwise}\end{cases}$$

    • wherein N is the total count of all central key points, and α and β are hyperparameters configured to control the weights; in all cases, α=2, β=4; by minimizing $\mathcal{L}_{Heat}$, the neural network model is configured to better predict the position of the center point of the target;
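A minimal sketch of the two operations just defined, the Gaussian splatting of ground-truth center points and the focal loss, assuming a PyTorch-style implementation:

```python
import torch

def splat_gaussian(heatmap, cx, cy, radius):
    """Write one ground-truth center (cx, cy) into a per-category heatmap slice
    with the Gaussian kernel above; H is the point-wise max over all Y_p."""
    sigma = radius / 3.0                        # sigma_p = gaussian_radius / 3
    h, w = heatmap.shape
    ys = torch.arange(h, dtype=torch.float32).view(-1, 1)
    xs = torch.arange(w, dtype=torch.float32).view(1, -1)
    g = torch.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    heatmap.copy_(torch.max(heatmap, g))

def focal_loss(pred, gt, alpha=2.0, beta=4.0, eps=1e-6):
    """Penalty-reduced focal loss between predicted and ground-truth heatmaps."""
    pos = gt.eq(1).float()                      # central key points (positives)
    neg = 1.0 - pos
    pos_term = ((1 - pred) ** alpha) * torch.log(pred + eps) * pos
    neg_term = ((1 - gt) ** beta) * (pred ** alpha) * torch.log(1 - pred + eps) * neg
    n = pos.sum().clamp(min=1.0)                # N = number of central key points
    return -(pos_term.sum() + neg_term.sum()) / n
```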
    • obtaining the size information W×H of a prediction box to finally determine the boundary box;
    • defining the size of the ground-truth boundary box corresponding to the kth key point $p_k$ as $d_k=(w_k,h_k)$, and integrating all $d_k$ to get the ground-truth boundary box dimension tensor $D\in\mathbb{R}^{2\times\frac{W}{r}\times\frac{H}{r}}$:

$$D=d_1\oplus d_2\oplus\cdots\oplus d_N$$

    • wherein ⊕ represents pixel-level addition; for all fault categories, the model is configured to give a predicted dimension tensor $\tilde{D}\in\mathbb{R}^{2\times\frac{W}{r}\times\frac{H}{r}}$, and smooth L1 loss is configured to measure the similarity of D and $\tilde{D}$, determined by the following equation:

$$\mathcal{L}_{D}=\frac{1}{N}\sum_{k=1}^{N}\mathrm{SmoothL1Loss}\left(\tilde{d}_k,d_k\right)=\frac{1}{N}\sum_{k=1}^{N}\begin{cases}0.5\left\|\tilde{d}_k-d_k\right\|_2^2&\text{if }\left\|\tilde{d}_k-d_k\right\|_1<1\\\left\|\tilde{d}_k-d_k\right\|_1-0.5&\text{otherwise}\end{cases}$$
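The piecewise smooth L1 loss above admits a direct sketch (PyTorch-style; the mean over keypoints realizes the 1/N factor):

```python
import torch

def smooth_l1(pred, target):
    """Smooth L1 loss per the piecewise definition above.
    pred, target: (N, 2) per-keypoint vectors, e.g. (w_k, h_k) or offsets o_k."""
    l1 = (pred - target).abs().sum(dim=1)       # ||.||_1 per keypoint
    l2sq = ((pred - target) ** 2).sum(dim=1)    # ||.||_2^2 per keypoint
    return torch.where(l1 < 1, 0.5 * l2sq, l1 - 0.5).mean()
```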
    • obtaining a rough width and height of each prediction box by minimizing $\mathcal{L}_{D}$;
    • correcting the error caused by down-sampling by introducing a position offset, because the image is scaled by a factor of r; recording the coordinates of the kth key point $p_k$ as $(x_k,y_k)$, the mapped coordinates are $(\lfloor x_k/r\rfloor,\lfloor y_k/r\rfloor)$, which gives the ground-truth offset:

$$o_k=\left(\frac{x_k}{r}-\left\lfloor\frac{x_k}{r}\right\rfloor,\ \frac{y_k}{r}-\left\lfloor\frac{y_k}{r}\right\rfloor\right)$$

    • integrating all $o_k$ to get the ground-truth offset matrix $O\in\mathbb{R}^{2\times\frac{W}{r}\times\frac{H}{r}}$:

$$O=o_1\oplus o_2\oplus\cdots\oplus o_N$$

    • wherein the 2 of the first dimension represents the offset of the key point (x,y) in the W and H directions; correspondingly, the model will give a prediction tensor $\tilde{O}\in\mathbb{R}^{2\times\frac{W}{r}\times\frac{H}{r}}$, and smooth L1 loss is used to train the offset:

$$\mathcal{L}_{Off}=\frac{1}{N}\sum_{k=1}^{N}\mathrm{SmoothL1Loss}\left(\tilde{o}_k,o_k\right)=\frac{1}{N}\sum_{k=1}^{N}\begin{cases}0.5\left\|\tilde{o}_k-o_k\right\|_2^2&\text{if }\left\|\tilde{o}_k-o_k\right\|_1<1\\\left\|\tilde{o}_k-o_k\right\|_1-0.5&\text{otherwise}\end{cases}$$
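The ground-truth offsets themselves reduce to one line (PyTorch-style sketch):

```python
import torch

def gt_offsets(centers, r=4):
    """Sub-pixel offsets o_k lost by down-sampling with stride r.
    centers: (N, 2) tensor of key-point coordinates (x_k, y_k) in input pixels."""
    scaled = centers.float() / r
    return scaled - scaled.floor()   # (x_k/r - floor(x_k/r), y_k/r - floor(y_k/r))
```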
    • introducing a new set of tensors to modify the prediction frame and improve the detection accuracy, so that the model pays more attention to the overall information of the target; specifically, taking the angle between the diagonal of the detection frame and the x-axis, together with the diagonal length of the detection frame, as the training targets; defining the coordinates of the upper left and lower right corners of the detection frame as $(x_k^1,y_k^1)$ and $(x_k^2,y_k^2)$, the diagonal length of the detection frame $l_k$ is calculated as:

$$l_k=\sqrt{\left(x_k^1-x_k^2\right)^2+\left(y_k^1-y_k^2\right)^2}$$

    • the inclination of the connecting line between the upper left and lower right corners $\theta_k$ is calculated by the following formula:

$$\theta_k=\pi-\arctan\left(\frac{y_k^2-y_k^1}{x_k^2-x_k^1}\right)$$

    • constructing a pair of complementary polar coordinates $polar_k=\left(\frac{1}{2}l_k,\ \theta_k\right)$ and further obtaining the ground-truth polar coordinate matrix $Polar\in\mathbb{R}^{2\times\frac{W}{r}\times\frac{H}{r}}$:

$$Polar=\left(\tfrac{1}{2}l_1,\theta_1\right)\oplus\left(\tfrac{1}{2}l_2,\theta_2\right)\oplus\cdots\oplus\left(\tfrac{1}{2}l_N,\theta_N\right)$$
    • the model also gives a prediction tensor $\widetilde{Polar}\in\mathbb{R}^{2\times\frac{W}{r}\times\frac{H}{r}}$, which is trained by the same L1 loss:

$$\mathcal{L}_{Polar}=\frac{1}{N}\sum_{k=1}^{N}\left\|\widetilde{polar}_k-polar_k\right\|_1$$
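A minimal sketch of the polar supplement for a single ground-truth box, following the two formulas above (plain Python; assumes $x_k^2>x_k^1$, so the slope is defined):

```python
import math

def gt_polar(x1, y1, x2, y2):
    """Half diagonal length and diagonal inclination for one detection frame.
    (x1, y1): upper-left corner; (x2, y2): lower-right corner (x2 > x1)."""
    l = math.hypot(x1 - x2, y1 - y2)                     # diagonal length l_k
    theta = math.pi - math.atan((y2 - y1) / (x2 - x1))   # inclination theta_k
    return 0.5 * l, theta                                # polar_k = (l_k / 2, theta_k)
```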
Finally, for each position, the model will predict C+6 outputs, which form the set $[\tilde{H},\tilde{D},\tilde{O},\widetilde{Polar}]$ and share the weights of the network; the overall loss function is defined by:

$$\mathcal{L}=\mathcal{L}_{Heat}+\lambda_{Off}\mathcal{L}_{Off}+\lambda_{D}\mathcal{L}_{D}+\lambda_{Polar}\mathcal{L}_{Polar}$$

wherein, in all the experiments, $\lambda_{Off}=10$, and $\lambda_{D}$ and $\lambda_{Polar}$ are both taken as 0.1.
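The weighted combination can then be sketched directly, using the λ values stated above:

```python
def total_loss(l_heat, l_off, l_d, l_polar,
               lam_off=10.0, lam_d=0.1, lam_polar=0.1):
    """L = L_Heat + lambda_Off*L_Off + lambda_D*L_D + lambda_Polar*L_Polar."""
    return l_heat + lam_off * l_off + lam_d * l_d + lam_polar * l_polar
```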
In Step 3 of the infrastructure fault target detection network CenWholeNet in the first component, the method of outputting a result is as follows:
    • outputting results by extracting possible center keypoint coordinates from the predicted heatmap tensor $\tilde{H}$, and then obtaining the predicted bounding box according to the information in the corresponding $\tilde{D}$, $\tilde{O}$ and $\widetilde{Polar}$; the greater the value of $\tilde{H}_{c,x,y}$, the more likely it is the center point; for category c, if the point $p_{cxy}$ satisfies the following formula, $p_{cxy}$ is considered a candidate center point:

$$\tilde{H}_{c,x,y}=\max_{i^2\le 1,\ j^2\le 1,\ i,j\in\mathbb{Z}}\left[\tilde{H}_{c,x+i,y+j}\right]$$
    • wherein non-maximum suppression (NMS) is not needed, and a 3×3 max-pooling convolutional layer is used to extract the candidate center points; letting the set of center points be $\tilde{P}=\{(\tilde{x}_k,\tilde{y}_k)\}_{k=1}^{N_p}$, wherein $N_p$ is the total number of selected center points; for any center point $(\tilde{x}_k,\tilde{y}_k)$, extracting the corresponding size information $(\tilde{w}_k,\tilde{h}_k)=(\tilde{D}_{1,\tilde{x}_k,\tilde{y}_k},\tilde{D}_{2,\tilde{x}_k,\tilde{y}_k})$, offset information $(\delta\tilde{x}_k,\delta\tilde{y}_k)=(\tilde{O}_{1,\tilde{x}_k,\tilde{y}_k},\tilde{O}_{2,\tilde{x}_k,\tilde{y}_k})$ and polar coordinate information $(\tilde{l}_k,\tilde{\theta}_k)=(\widetilde{Polar}_{1,\tilde{x}_k,\tilde{y}_k},\widetilde{Polar}_{2,\tilde{x}_k,\tilde{y}_k})$; first, calculating the prediction frame size correction values according to $(\tilde{l}_k,\tilde{\theta}_k)$:

$$\begin{cases}\Delta\tilde{h}_k=\tilde{l}_k\sin(\tilde{\theta}_k)\\\Delta\tilde{w}_k=-\tilde{l}_k\cos(\tilde{\theta}_k)\end{cases}$$
    • defining the specific location of the prediction box as:

$$\begin{cases}\mathrm{Top}=\tilde{y}_k+\delta\tilde{y}_k-\left(\alpha_y\cdot\frac{1}{2}\tilde{h}_k+\beta_y\cdot\Delta\tilde{h}_k\right)\\\mathrm{Bottom}=\tilde{y}_k+\delta\tilde{y}_k+\left(\alpha_y\cdot\frac{1}{2}\tilde{h}_k+\beta_y\cdot\Delta\tilde{h}_k\right)\\\mathrm{Left}=\tilde{x}_k+\delta\tilde{x}_k-\left(\alpha_x\cdot\frac{1}{2}\tilde{w}_k+\beta_x\cdot\Delta\tilde{w}_k\right)\\\mathrm{Right}=\tilde{x}_k+\delta\tilde{x}_k+\left(\alpha_x\cdot\frac{1}{2}\tilde{w}_k+\beta_x\cdot\Delta\tilde{w}_k\right)\end{cases}$$

    • wherein the bounding box resizing hyperparameters are taken as $\alpha_y=\alpha_x=0.9$ and $\beta_y=\beta_x=0.1$.
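The whole decoding step can be sketched as follows, assuming a PyTorch-style implementation with single-image tensors laid out as (channel, y, x); `top_k` is an illustrative cap on the number of candidate centers, not a value prescribed by this disclosure:

```python
import torch
import torch.nn.functional as F

def decode(heat, size, off, polar, top_k=100, ax=0.9, ay=0.9, bx=0.1, by=0.1):
    """Decode [heat (C,Hr,Wr), size/off/polar (2,Hr,Wr)] into
    (top_k, 6) boxes: [left, top, right, bottom, score, category]."""
    # A 3x3 max pooling keeps only local maxima, replacing NMS.
    hmax = F.max_pool2d(heat.unsqueeze(0), 3, stride=1, padding=1).squeeze(0)
    heat = heat * (hmax == heat).float()

    C, Hr, Wr = heat.shape
    scores, idx = heat.view(-1).topk(top_k)
    cat = idx // (Hr * Wr)                          # category index
    rem = idx % (Hr * Wr)
    ys, xs = rem // Wr, rem % Wr                    # candidate center coordinates

    w, h = size[0, ys, xs], size[1, ys, xs]         # size information
    dx, dy = off[0, ys, xs], off[1, ys, xs]         # offset information
    l, th = polar[0, ys, xs], polar[1, ys, xs]      # polar information

    dh, dw = l * torch.sin(th), -l * torch.cos(th)  # polar corrections
    half_h = ay * 0.5 * h + by * dh
    half_w = ax * 0.5 * w + bx * dw
    xf, yf = xs.float() + dx, ys.float() + dy       # offset-corrected centers
    return torch.stack([xf - half_w, yf - half_h,   # Left, Top
                        xf + half_w, yf + half_h,   # Right, Bottom
                        scores, cat.float()], dim=1)
```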
Further, a method of establishing the parallel attention module in the second component is as follows.
As we all know, attention plays a very important role in human perception. When human eyes or ears and other organs acquire information, they tend to focus on more interesting targets and improve their attention; while suppressing uninteresting targets, reduce its attention. Inspired by human attention, some researchers recently proposed a bionic idea, attention mechanism: by embedding attention modules in neural networks, increase the weight of feature tensors in meaningful regions, reducing the weights of areas such as meaningless backgrounds, which can improve the performance of the network.
The present invention discloses a lightweight, plug-and-play parallel attention module PAM, configured to improve the expressiveness of neural networks; wherein PAM considers two dimensions of feature map attention, spatial attention and channel attention, and combines them in parallel;
    • giving an input feature map as X∈ℝ^(C×W×H), wherein C, H and W denote channel, height and width, respectively; first, implementing the spatial attention sub-module transformation 𝒯1: X→Ũ∈ℝ^(C×W×H); then, implementing the channel attention sub-module transformation 𝒯2: X→Û∈ℝ^(C×W×H); finally, outputting the feature map U∈ℝ^(C×W×H); the transformations consist essentially of convolution, maximum pooling, mean pooling and the ReLU function; and the overall calculation process is as follows:
U = Ũ ⊕ Û = 𝒯1(X) ⊕ 𝒯2(X)
    • wherein ⊕ represents output pixel-level tensor addition;
    • the spatial attention sub-module is configured to emphasize “where” to improve attention, and pay attention to the locations of regions of interest (ROIs); first, maximum pooling and mean pooling operations are performed on the feature map along the channel direction to obtain two two-dimensional maps, λ1·Uavg_s∈ℝ^(1×W×H) and λ2·Umax_s∈ℝ^(1×W×H), wherein λ1 and λ2 are adjustable hyperparameters weighting the different pooling operations, taken as λ1=2, λ2=1; Uavg_s and Umax_s are calculated by the following formulas, wherein MaxPool and AvgPool represent the maximum pooling operation and the average pooling operation respectively:
Uavg_s(1,i,j) = AvgPool(X) = (1/C)·Σ_(k=1)^C X(k,i,j),  i∈[1,W], j∈[1,H]
Umax_s(1,i,j) = MaxPool(X) = max_(k∈[1,C]) X(k,i,j),  i∈[1,W], j∈[1,H]
Next, a convolution operation is introduced to generate the spatial attention weight Uspa∈ℝ^(1×W×H); the overall calculation process of the spatial attention sub-module is as follows:
𝒯1(X) = Ũ = Uspa⊗X = σ(Conv([λ1·Uavg_s, λ2·Umax_s]))⊗X
which is equivalent to:
𝒯1(X) = σ(Conv([MaxPool(X), AvgPool(X), AvgPool(X)]))⊗X
    • wherein, ⊗ represents pixel-level tensor multiplication, σ represents a sigmoid activation function, Conv represents a convolution operation, and a convolution kernel size is 3×3; and a spatial attention weight is copied along a channel axis;
    • the channel attention sub-module is configured to find the relationship of internal channels, and care about “what” is interesting in a given feature map; first, mean pooling and max pooling are performed along the width and height directions to generate two 1-dimensional vectors, λ3·Uavg_c∈ℝ^(C×1×1) and λ4·Umax_c∈ℝ^(C×1×1), wherein λ3 and λ4 are adjustable hyperparameters weighting the different pooling operations, taken as λ3=2, λ4=1; Uavg_c and Umax_c are calculated by the following formulas:
Uavg_c(k,1,1) = AvgPool(X) = (1/(W×H))·Σ_(i=1)^W Σ_(j=1)^H X(k,i,j),  k∈[1,C]
Umax_c(k,1,1) = MaxPool(X) = max_(i∈[1,W], j∈[1,H]) X(k,i,j),  k∈[1,C]
Subsequently, point-wise convolution (PConv) is introduced as a channel context aggregator to realize point-wise inter-channel interaction; in order to reduce the amount of parameters, PConv is designed in the form of an hourglass, with an attenuation ratio r; finally, the channel attention weight Ucha∈ℝ^(C×1×1) is obtained; the calculation process of this sub-module is as follows:
𝒯2(X) = Û = Ucha⊗X = σ(ΣPConv([λ3·Uavg_c, λ4·Umax_c]))⊗X
which is equivalent to:
𝒯2(X) = σ(ΣPConv2(δ(PConv1([λ3·Uavg_c, λ4·Umax_c]))))⊗X
    • wherein δ represents the ReLU activation function; the size of the convolution kernel of PConv1 is C/r×C×1×1, and the size of the convolution kernel of the inverse transform PConv2 is C×C/r×1×1; the ratio r is selected as 16, and the channel attention weight is copied along the width and height directions;
    • wherein the PAM is a plug-and-play module, which ensures strict consistency of the input tensor and output tensor at the dimension level; PAM is configured to be embedded at any position of any convolutional neural network model as a supplementary module; the method of embedding PAM into Hourglass and ResNet is as follows: for the ResNet network, the PAM is embedded in each residual block after the batch normalization layer and before the residual connection; the Hourglass network is divided into two parts, downsampling and upsampling: the downsampling part embeds the PAM between the residual blocks as a transition module, and the upsampling part embeds the PAM before the residual connection. Details are presented in the drawings.
Further, the LIDAR-based unmanned surface vehicle of the third component comprises four modules: a hull module, a video acquisition module, a lidar navigation module and a ground station module, working together in a cooperative manner.
The hull module includes a trimaran and a power system; the trimaran is configured to be stable, resist level 6 wind and waves, and has an effective remote control distance of 500 meters, adaptable to engineering application scenarios; the size of the hull is 75×47×28 cm, which is convenient for transportation; an effective load of the surface vehicle is 5 kg, and configured to be installed with multiple scientific instruments; in addition, the unmanned surface vehicle has the function of constant speed cruise, which reduces the control burden of personnel.
The video acquisition module is composed of a three-axis camera pan/tilt, a fixed front camera and a fill light; the three-axis camera pan/tilt supports 10× optical zoom, auto focus, photography and 60 FPS video recording, so the video acquisition module is configured to meet the shooting requirements of faults of different scales and locations; the fixed front camera is configured to determine the hull posture; the picture is transmitted back to a ground station in real time through a wireless image transmission device, on the one hand for fault identification, and on the other hand for assisting the control of the USV; a controllable LED fill light board containing 180 high-brightness LED lamp beads is installed to cope with small and medium-sized bridges and other low-light working environments; a 3D-printed pan/tilt carries the LED fill light board to meet the needs of multi-angle fill light; in addition, fixed front-view LED light beads are also installed, providing light source support for the front-view camera.
The lidar navigation module includes a lidar, a mini computer, a transmission system and a control system; the lidar is configured to perform 360° omnidirectional scanning; after it is connected with the mini computer, it can perform real-time mapping of the surrounding environment of the unmanned surface vehicle; through wireless image transmission, the information of the surrounding scene is transmitted back to the ground station in real time, so as to realize the lidar navigation of the unmanned surface vehicle; based on the lidar navigation, the unmanned surface vehicle no longer needs GPS positioning in areas with weak GPS signals such as under bridges and in underground culverts; the wireless transmission system supports real-time transmission of 1080P video, with a maximum transmission distance of 10 kilometers; redundant transmission is used to ensure link stability and strong anti-interference; the control system consists of the wireless image transmission equipment, a Pixhawk 2.4.8 flight control and a SKYDROID T12 receiver, and through the flight control and receiver, the control system effectively controls the equipment on board.
The ground station module includes two remote controls and multiple display devices; a main remote control is used to control the unmanned surface vehicle, and a secondary remote control is used to control the surface vehicle borne scientific instruments, and the display device is used to monitor the real-time information returned by the camera and lidar; on the one hand, the display device displays the picture in real time, and on the other hand, it processes the image in real time to identify the fault; the devices cooperate with each other to realize the intelligent fault detection without a GPS signal.
The beneficial effects of the present invention are described below.
1. In terms of the intelligent detection algorithm, the present invention is the first application of an Anchor-free target detection algorithm in the field of structural faults. The detection results of traditional Anchor-based methods are affected by the setting of the prior frames (that is, the anchor boxes). When such an algorithm deals with structural faults with complex shapes, various sizes and widely varying aspect ratios (for example, the aspect ratio of an exposed steel bar may be large, while the aspect ratio of spalling may be small), the size and aspect ratio of the preset prior frames can differ greatly from the target, which leads to a low recall rate in the detection results. In addition, in order to achieve a better detection effect, a large number of prior frames are often preset; this introduces many hyperparameters and design choices, makes the design of the model more complex, and brings a larger amount of computation. Compared with Anchor-based methods, the method disclosed by the present invention abandons the complex prior frame setting, directly predicts key points and related vectors (i.e., width, height and other information), and composes them into a detection frame. The method of the invention is simpler, more direct and effective, solves the problem fundamentally, and is more suitable for the detection of engineering structure faults with complex features. In addition, the present invention proposes a novel and lightweight attention module by considering the gain effect of the attention mechanism on the expressive ability of neural network models. The experimental results show that the method described in the present invention is superior to multiple widely influential neural network models, and achieves a comprehensively better effect in the two dimensions of efficiency and accuracy. The disclosed attention module can also improve different neural network models at the cost of negligible computation.
2. In terms of intelligent detection equipment, the present invention discloses an unmanned surface vehicle solution that does not rely on GPS signals to detect faults at the bottom of small and medium bridges. Due to the constraints of design and performance, current testing equipment is often impractical when inspecting a large number of small and medium-sized bridges. Taking drones as an example, their flight often requires a wide, interference-free space and GPS-assisted positioning. However, in areas such as the bottoms of small and medium bridges with very low clearance, urban underground culverts and sewers, the space is relatively closed, the GPS signal is often very weak, and the internal situation is very complicated; there are risks such as signal loss and collision damage when a drone flies in. Some areas are also very confined, may contain toxic gases, and are difficult for humans to reach. Therefore, the engineering community urgently needs a new type of intelligent detection equipment for areas that are difficult to inspect manually or with other intelligent equipment. The present invention takes the lead in providing a highly robust unmanned surface system suitable for fault detection in relatively closed areas. The experimental results show that, while improving the detection efficiency, the system can reduce the safety risk and detection difficulty for engineers and save considerable manpower cost, and has strong engineering applicability and broad application prospects. In addition, the system proposed by the present invention is not only suitable for the bottoms of small and medium bridges, but also has great application potential in engineering scenarios such as urban underground culverts and sewers.
DESCRIPTION OF DRAWINGS
FIG. 1 is a schematic diagram of the overall framework in accordance with the aspects of the present invention;
FIG. 2 is a schematic diagram of the CenWholeNet network in accordance with the aspects of the present invention;
FIG. 3 is a detailed diagram of the attention module PAM in accordance with the aspects of the present invention;
FIG. 4 is a schematic diagram of the architecture of the unmanned ship system in accordance with the aspects of the present invention;
FIG. 5 is a schematic diagram of the polar coordinate supplementary information in accordance with the aspects of the present invention;
FIG. 6 is a schematic diagram of the proposed PAM-embedded ResNet network in accordance with the aspects of the present invention;
FIG. 7 is a schematic diagram of the PAM-embedded Hourglass network in accordance with the aspects of the present invention;
FIG. 8 is a schematic diagram of the application of the method in accordance with the aspects of the present invention in a bridge group;
FIG. 9 is a schematic diagram of the real-time mapping of the unmanned surface vehicle in accordance with the aspects of the present invention;
FIG. 10 is a schematic diagram of the detection results of the method in accordance with the aspects of the present invention;
FIG. 11 is a comparison table of the detection results between the algorithm framework in the present invention and other advanced target detection algorithms; and
FIG. 12 is a comparison of the training process of the algorithm framework of the present invention with that of other advanced target detection algorithms.
DESCRIPTION OF THE EMBODIMENTS
The present invention will be further described below with reference to the accompanying drawings and specific embodiments. It should be understood that the following specific embodiments are only used to illustrate the present invention and not to limit the scope of the present invention. After reading the present disclosure, those skilled in the art can make modifications to the various equivalent forms of the present disclosure within the scope defined by the appended claims of the present application.
An intelligent detection method for multi-type faults of near-water bridges. The overall flow chart of the technical solution is shown in FIG. 1 , including the following components:
a first component, an intelligent detection algorithm: CenWholeNet, an infrastructure fault target detection network based on deep learning, described and illustrated in FIG. 2 ;
a second component, a parallel attention module PAM embedded into the target detection network CenWholeNet, the parallel attention module including two sub-modules, a spatial attention sub-module and a channel attention sub-module, as illustrated in FIG. 3 ;
a third component, an intelligent detection equipment assembly: an unmanned surface vehicle system based on lidar navigation, the unmanned surface vehicle includes four modules, a hull module, a video acquisition module, a lidar navigation module and a ground station module. Structural design of the unmanned surface vehicle is illustrated in FIG. 4 .
Wherein the infrastructure fault target detection network CenWholeNet described in the first component comprises the following steps.
    • Step 1: a primary network: using the primary network to extract features of images;
    • Step 2: a detector: converting the extracted image features, by the detector, into the tensor forms required for calculation, and optimizing the result through a loss function;
    • Step 3: result output: converting the tensors into boundary boxes and outputting the prediction results of target detection.
Wherein Step 1 of the infrastructure fault target detection network CenWholeNet in the first component has the primary network, the method of using the primary network is as follows:
giving an input image P∈ℝ^(W×H×3), wherein W is the width of the image, H is the height of the image, and 3 represents the number of channels of the image, that is, the three RGB channels; extracting features of the input image P through the primary network; and using two convolutional neural network models, the Hourglass network and the deep residual network ResNet.
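To make Step 1 concrete, a minimal PyTorch sketch of backbone feature extraction follows; torchvision's ResNet-18, the 512×512 input and the layer slicing are assumptions for illustration only, not the exact Hourglass/ResNet configurations of the invention (which decode back to an output stride of r=4).

import torch
import torchvision

# stand-in backbone: ResNet-18 with its classification head removed
backbone = torchvision.models.resnet18(weights=None)
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-2])

image = torch.randn(1, 3, 512, 512)   # P with W = H = 512 and 3 RGB channels
features = feature_extractor(image)   # -> torch.Size([1, 512, 16, 16])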
Wherein Step 2 of the infrastructure fault target detection network CenWholeNet in the first component has the detector, the method of using the detector is as follows:
    • converting the features extracted by the primary network into an output set consisting of 4 tensors, 𝒪=[{tilde over (H)},{tilde over (D)},Õ,{tilde over (Polar)}], by the detector, as the core of CenWholeNet;
    • using {tilde over (H)}∈[0,1]^(C×W/r×H/r) to represent the heat map of central key points, where C is the number of fault categories, taken as C=3 here, and r is the output step size, that is, the down-sampling ratio; the default step size is 4, and down-sampling improves the calculation efficiency;
    • defining H∈[0,1]^(C×W/r×H/r) as the ground-truth heatmap; for category c, the ground-truth center point of location (i,j) is pcij∈ℝ^(C×W×H); first computing its down-sampled equivalent position {circumflex over (p)}cxy∈ℝ^(C×W/r×H/r), wherein x=⌊i/r⌋, y=⌊j/r⌋; then, through a Gaussian kernel function, mapping {circumflex over (p)}cxy to the tensor Yp∈ℝ^(C×W/r×H/r), wherein Yp is defined by:
Yp(c,x,y) = exp(−[(x−{circumflex over (p)}cxy(x))² + (y−{circumflex over (p)}cxy(y))²]/(2σp²))
    • wherein {circumflex over (p)}cxy(x) and {circumflex over (p)}cxy(y) represent the center point position (x,y), and σp=gaussian_radius/3, wherein gaussian_radius is the maximum radius of the offset of the corner points of a detection frame; the maximum radius ensures that the intersection over union between the offset detection frame and the ground-truth detection frame satisfies IoU≥t, and t=0.7 is taken in all experiments; integrating all the corresponding Yp points to get the ground-truth heat map H:
Hc,x,y = maxp [Yp(c,x,y)],  c∈[1,C], x∈[1,W/r], y∈[1,H/r]
    • wherein Hc,x,y represents the value of H at the position (c,x,y), i.e., the probability that this position is a center point; specifically, Hc,x,y=1 represents a central key point, a positive sample; conversely, Hc,x,y=0 is background, a negative sample; focal loss is used as a metric to measure the distance between {tilde over (H)} and H, according to the following equation (a runnable sketch of these losses follows this step's description):
ℒ_Heat = −(1/N)·Σ_(c=1)^C Σ_(x=1)^(W/r) Σ_(y=1)^(H/r) { (1−{tilde over (H)}c,x,y)^α·log({tilde over (H)}c,x,y)  if Hc,x,y=1;  (1−Hc,x,y)^β·({tilde over (H)}c,x,y)^α·log(1−{tilde over (H)}c,x,y)  otherwise }
    • wherein N is the total count of all central key points, and α and β are hyperparameters configured to control the weights; in all cases, α=2, β=4; by minimizing ℒ_Heat, the neural network model is configured to better predict the position of the center point of the target;
    • obtaining the size information of a prediction box to finally determine the boundary box;
    • defining the size of the ground-truth boundary box corresponding to the kth key point pk to be dk=(wk,hk), and integrating all dk to get the ground-truth boundary box dimension tensor D∈ℝ^(2×W/r×H/r):
D = d1 ⊕ d2 ⊕ … ⊕ dN
    • wherein ⊕ represents pixel-level addition; for all fault categories, the model is configured to give a predicted dimension tensor {tilde over (D)}∈ℝ^(2×W/r×H/r), and smooth L1 loss is configured to measure the similarity of D and {tilde over (D)}, determined by the following equation:
ℒ_D = (1/N)·Σ_(k=1)^N SmoothL1Loss({tilde over (d)}k, dk) = (1/N)·Σ_(k=1)^N { 0.5·∥{tilde over (d)}k−dk∥₂²  if ∥{tilde over (d)}k−dk∥₁<1;  ∥{tilde over (d)}k−dk∥₁−0.5  otherwise }
    • the model obtains a rough width and height of each prediction box by minimizing ℒ_D;
    • correcting the error caused by down-sampling by introducing a position offset, because the image is scaled by r times; recording the coordinates of the kth key point pk as (xk,yk), the mapped coordinates are (⌊xk/r⌋, ⌊yk/r⌋), and the ground-truth offset is:
ok = (xk/r − ⌊xk/r⌋, yk/r − ⌊yk/r⌋)
    • integrating all ok to get the ground-truth offset matrix O∈ℝ^(2×W/r×H/r):
O = o1 ⊕ o2 ⊕ … ⊕ oN
    • wherein the 2 of the first dimension represents the offsets of the key point (x,y) in the W and H directions; correspondingly, the model will give a prediction tensor Õ∈ℝ^(2×W/r×H/r), and smooth L1 loss is used to train the offset:
ℒ_Off = (1/N)·Σ_(k=1)^N SmoothL1Loss(õk, ok) = (1/N)·Σ_(k=1)^N { 0.5·∥õk−ok∥₂²  if ∥õk−ok∥₁<1;  ∥õk−ok∥₁−0.5  otherwise }
    • introducing a new set of tensors to modify the prediction frame and improve the detection accuracy, in order to make the model pay more attention to the overall information of the target; specifically, taking the angle between the connecting line of the diagonal points of the detection frame and the x-axis, and the diagonal length of the detection frame, as the training targets, as shown in FIG. 5 ; defining the coordinates of the upper left corner and lower right corner of the detection frame to be (xk¹,yk¹) and (xk²,yk²), so the diagonal length lk of the detection frame is calculated as:
lk = √[(xk¹−xk²)² + (yk¹−yk²)²]
    • the inclination θk of the connecting line between the upper left and lower right corners is calculated by the following formula:
θk = π − arctan[(yk²−yk¹)/(xk²−xk¹)]
    • constructing a pair of complementary polar coordinates polark = (½lk, θk), and further obtaining the ground-truth polar coordinate matrix Polar∈ℝ^(2×W/r×H/r):
Polar = (½l1, θ1) ⊕ (½l2, θ2) ⊕ … ⊕ (½lN, θN)
    • the model also gives a prediction tensor {tilde over (Polar)}∈ℝ^(2×W/r×H/r); Polar and {tilde over (Polar)} are trained by the same L1 loss:
ℒ_Polar = (1/N)·Σ_(k=1)^N ∥{tilde over (polar)}k − polark∥₁
Finally, for each position, the model will predict C+6 outputs, which form the set 𝒪=[{tilde over (H)},{tilde over (D)},Õ,{tilde over (Polar)}], all sharing the weights of the network; and the loss function of the network is defined by:
ℒ = ℒ_Heat + λ_Off·ℒ_Off + λ_D·ℒ_D + λ_Polar·ℒ_Polar
wherein in all the experiments, λ_Off = 10, and λ_D and λ_Polar are both taken as 0.1.
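To make the detector targets and losses of Step 2 concrete, the following PyTorch sketch builds the ground-truth heatmap H with the Gaussian kernel and evaluates ℒ_Heat and the weighted total loss; the helper names (gaussian_heatmap, heat_loss, total_loss) and the reduction details are illustrative assumptions, not the patented implementation.

import torch

def gaussian_heatmap(C, Wr, Hr, centers, sigma_p):
    # Splat ground-truth center points (c, x, y) into H in [0,1]^(C x Wr x Hr)
    # with the Gaussian kernel Yp; overlapping Gaussians are merged with max,
    # matching Hc,x,y = maxp[Yp(c,x,y)].
    H = torch.zeros(C, Wr, Hr)
    xs = torch.arange(Wr, dtype=torch.float32).view(-1, 1)
    ys = torch.arange(Hr, dtype=torch.float32).view(1, -1)
    for c, px, py in centers:
        Yp = torch.exp(-((xs - px) ** 2 + (ys - py) ** 2) / (2 * sigma_p ** 2))
        H[c] = torch.maximum(H[c], Yp)
    return H

def heat_loss(H_pred, H_gt, alpha=2.0, beta=4.0, eps=1e-6):
    # Penalty-reduced focal loss L_Heat; N counts ground-truth center points.
    pos = H_gt.eq(1.0).float()
    N = pos.sum().clamp(min=1.0)
    pos_term = (1 - H_pred) ** alpha * torch.log(H_pred + eps) * pos
    neg_term = ((1 - H_gt) ** beta) * (H_pred ** alpha) \
        * torch.log(1 - H_pred + eps) * (1 - pos)
    return -(pos_term + neg_term).sum() / N

def total_loss(l_heat, l_off, l_d, l_polar,
               lam_off=10.0, lam_d=0.1, lam_polar=0.1):
    # Weighted sum per the text; l_off and l_d would come from smooth L1
    # terms and l_polar from an L1 term, each evaluated at the center points.
    return l_heat + lam_off * l_off + lam_d * l_d + lam_polar * l_polar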
In Step 3 of the infrastructure fault target detection network CenWholeNet in the first component, the method of outputting a result is as follows:
    • outputting results by extracting possible center keypoint coordinates from the predicted heatmap tensor {tilde over (H)}, and then obtaining a predicted bounding box according to the information in the corresponding {tilde over (D)}, Õ and {tilde over (Polar)}; wherein the greater the value of {tilde over (H)}c,x,y, the more likely it is the center point; for category c, if the point pcxy satisfies the following formula, pcxy is considered a candidate center point:
{tilde over (H)}c,x,y = max_(|i|≤1, |j|≤1) [{tilde over (H)}c,x+i,y+j]
wherein non-maximum suppression (NMS) is not needed; a 3×3 max-pooling convolutional layer is used to extract candidate center points (see the decoding sketch after this step); letting the set of center points be {tilde over (P)}={({tilde over (x)}k,{tilde over (y)}k)}, k=1…Np, wherein Np is the total number of selected center points; for any center point ({tilde over (x)}k,{tilde over (y)}k), extracting the corresponding size information ({tilde over (w)}k,{tilde over (h)}k)=({tilde over (D)}1,{tilde over (x)}k,{tilde over (y)}k, {tilde over (D)}2,{tilde over (x)}k,{tilde over (y)}k), offset information (δ{tilde over (x)}k,δ{tilde over (y)}k)=(Õ1,{tilde over (x)}k,{tilde over (y)}k, Õ2,{tilde over (x)}k,{tilde over (y)}k) and polar coordinate information ({tilde over (l)}k,{tilde over (θ)}k)=({tilde over (Polar)}1,{tilde over (x)}k,{tilde over (y)}k, {tilde over (Polar)}2,{tilde over (x)}k,{tilde over (y)}k); first, calculating the prediction frame size correction value according to ({tilde over (l)}k,{tilde over (θ)}k):
Δ{tilde over (h)}k = {tilde over (l)}k·sin({tilde over (θ)}k);  Δ{tilde over (w)}k = −{tilde over (l)}k·cos({tilde over (θ)}k)
    • defining specific location of the prediction box as
Top = {tilde over (y)}k + δ{tilde over (y)}k − (αy·½{tilde over (h)}k + βy·Δ{tilde over (h)}k);  Bottom = {tilde over (y)}k + δ{tilde over (y)}k + (αy·½{tilde over (h)}k + βy·Δ{tilde over (h)}k)
Left = {tilde over (x)}k + δ{tilde over (x)}k − (αx·½{tilde over (w)}k + βx·Δ{tilde over (w)}k);  Right = {tilde over (x)}k + δ{tilde over (x)}k + (αx·½{tilde over (w)}k + βx·Δ{tilde over (w)}k)
    • wherein the bounding box resizing hyperparameters are taken as αy=αx=0.9, βy=βx=0.1.
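The result-output procedure of Step 3 can likewise be summarized in a short PyTorch sketch for a single image; the top-K cutoff, the function name and the absence of score thresholding are illustrative assumptions, and the returned coordinates remain on the down-sampled W/r×H/r grid.

import torch
import torch.nn.functional as F

def decode_outputs(H, D, O, Polar, K=100, a=0.9, b=0.1):
    # NMS-free decoding: a 3x3 max-pooling layer keeps only heatmap values
    # equal to their local maximum, the top-K peaks become candidate centers,
    # and boxes are assembled from D, O and Polar (layouts as in the text).
    C, Wr, Hr = H.shape
    keep = (H == F.max_pool2d(H, kernel_size=3, stride=1, padding=1)).float()
    scores, idx = (H * keep).flatten().topk(K)
    cls = torch.div(idx, Wr * Hr, rounding_mode="floor")
    x = torch.div(idx % (Wr * Hr), Hr, rounding_mode="floor")
    y = idx % Hr
    boxes = []
    for k in range(K):
        w, h = D[0, x[k], y[k]], D[1, x[k], y[k]]
        dx, dy = O[0, x[k], y[k]], O[1, x[k], y[k]]
        l, theta = Polar[0, x[k], y[k]], Polar[1, x[k], y[k]]
        dh, dw = l * torch.sin(theta), -l * torch.cos(theta)  # size corrections
        cx, cy = x[k].float() + dx, y[k].float() + dy         # corrected center
        boxes.append((int(cls[k]), float(scores[k]),
                      float(cx - (a * w / 2 + b * dw)),       # Left
                      float(cy - (a * h / 2 + b * dh)),       # Top
                      float(cx + (a * w / 2 + b * dw)),       # Right
                      float(cy + (a * h / 2 + b * dh))))      # Bottom
    return boxes  # multiply coordinates by r to map back to the input image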
Further, a method of establishing the parallel attention module in the second component is as follows.
As is well known, attention plays a very important role in human perception. When human eyes, ears and other organs acquire information, they tend to focus on more interesting targets and increase the attention paid to them, while suppressing uninteresting targets and reducing their attention. Inspired by this, some researchers have recently proposed a bionic idea, the attention mechanism: by embedding attention modules in neural networks, the weights of feature tensors in meaningful regions are increased and the weights of regions such as meaningless backgrounds are reduced, which can improve the performance of the network.
The present invention discloses a lightweight, plug-and-play parallel attention module PAM, configured to improve the expressiveness of neural networks; wherein PAM considers two dimensions of feature map attention, spatial attention and channel attention, and combines them in parallel;
giving an input feature map as X∈ℝ^(C×W×H), wherein C, H and W denote channel, height and width, respectively; first, implementing the spatial attention sub-module transformation 𝒯1: X→Ũ∈ℝ^(C×W×H); then, implementing the channel attention sub-module transformation 𝒯2: X→Û∈ℝ^(C×W×H); finally, outputting the feature map U∈ℝ^(C×W×H); the transformations consist essentially of convolution, maximum pooling, mean pooling and the ReLU function; and the overall calculation process is as follows:
U = Ũ ⊕ Û = 𝒯1(X) ⊕ 𝒯2(X)
    • wherein ⊕ represents output pixel-level tensor addition;
    • the spatial attention sub-module is configured to emphasize “where” to improve attention, and pay attention to the locations of regions of interest (ROIs); first, maximum pooling and mean pooling operations are performed on the feature map along the channel direction to obtain two two-dimensional maps, λ1·Uavg_s∈ℝ^(1×W×H) and λ2·Umax_s∈ℝ^(1×W×H), wherein λ1 and λ2 are adjustable hyperparameters weighting the different pooling operations, taken as λ1=2, λ2=1; Uavg_s and Umax_s are calculated by the following formulas, wherein MaxPool and AvgPool represent the maximum pooling operation and the average pooling operation respectively:
Uavg_s(1,i,j) = AvgPool(X) = (1/C)·Σ_(k=1)^C X(k,i,j),  i∈[1,W], j∈[1,H]
Umax_s(1,i,j) = MaxPool(X) = max_(k∈[1,C]) X(k,i,j),  i∈[1,W], j∈[1,H]
Next, a convolution operation is introduced to generate the spatial attention weight Uspa∈ℝ^(1×W×H); the overall calculation process of the spatial attention sub-module is as follows:
𝒯1(X) = Ũ = Uspa⊗X = σ(Conv([λ1·Uavg_s, λ2·Umax_s]))⊗X
which is equivalent to:
𝒯1(X) = σ(Conv([MaxPool(X), AvgPool(X), AvgPool(X)]))⊗X
    • wherein, ⊗ represents pixel-level tensor multiplication, σ represents a sigmoid activation function, Conv represents a convolution operation, and a convolution kernel size is 3×3; and a spatial attention weight is copied along a channel axis;
    • the channel attention sub-module is configured to find the relationship of internal channels, and care about “what” is interesting in a given feature map; first, mean pooling and max pooling are performed along the width and height directions to generate two 1-dimensional vectors, λ3·Uavg_c∈ℝ^(C×1×1) and λ4·Umax_c∈ℝ^(C×1×1), wherein λ3 and λ4 are adjustable hyperparameters weighting the different pooling operations, taken as λ3=2, λ4=1; Uavg_c and Umax_c are calculated by the following formulas:
Uavg_c(k,1,1) = AvgPool(X) = (1/(W×H))·Σ_(i=1)^W Σ_(j=1)^H X(k,i,j),  k∈[1,C]
Umax_c(k,1,1) = MaxPool(X) = max_(i∈[1,W], j∈[1,H]) X(k,i,j),  k∈[1,C]
Subsequently, point-wise convolution (PConv) is introduced as a channel context aggregator to realize point-wise inter-channel interaction; in order to reduce the amount of parameters, PConv is designed in the form of an hourglass, with an attenuation ratio r; finally, the channel attention weight Ucha∈ℝ^(C×1×1) is obtained; the calculation process of this sub-module is as follows:
𝒯2(X) = Û = Ucha⊗X = σ(ΣPConv([λ3·Uavg_c, λ4·Umax_c]))⊗X
which is equivalent to:
𝒯2(X) = σ(ΣPConv2(δ(PConv1([λ3·Uavg_c, λ4·Umax_c]))))⊗X
    • wherein δ represents the ReLU activation function; the size of the convolution kernel of PConv1 is C/r×C×1×1, and the size of the convolution kernel of the inverse transform PConv2 is C×C/r×1×1; the ratio r is selected as 16, and the channel attention weight is copied along the width and height directions;
    • wherein the PAM is a plug-and-play module, which ensures strict consistency of the input tensor and output tensor at the dimension level; PAM is configured to be embedded at any position of any convolutional neural network model as a supplementary module; the method of embedding PAM into Hourglass and ResNet is as follows: for the ResNet network, the PAM is embedded in each residual block after the batch normalization layer and before the residual connection; the Hourglass network is divided into two parts, downsampling and upsampling: the downsampling part embeds the PAM between the residual blocks as a transition module, and the upsampling part embeds the PAM before the residual connection. Details of the embedding are illustrated in FIG. 6 and FIG. 7 .
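As a concrete illustration of the module, PAM can be sketched as a PyTorch nn.Module; the class name, the reading of Σ as summing the shared hourglass PConv over the two pooled vectors, and the exact layer ordering are assumptions consistent with the formulas above, not the exact patented implementation.

import torch
import torch.nn as nn

class PAM(nn.Module):
    # Parallel attention: a spatial branch and a channel branch computed
    # from the same input X and fused by element-wise addition; lambda1 =
    # lambda3 = 2, lambda2 = lambda4 = 1 and r = 16 follow the text.
    def __init__(self, channels, r=16):
        super().__init__()
        self.conv_spatial = nn.Conv2d(2, 1, kernel_size=3, padding=1)
        self.pconv1 = nn.Conv2d(channels, channels // r, kernel_size=1)
        self.pconv2 = nn.Conv2d(channels // r, channels, kernel_size=1)
        self.relu = nn.ReLU()

    def forward(self, x):                              # x: (N, C, W, H)
        # spatial branch: channel-wise mean/max pooling -> 3x3 conv -> sigmoid
        avg_s = 2.0 * x.mean(dim=1, keepdim=True)      # lambda1 * U_avg_s
        max_s = 1.0 * x.amax(dim=1, keepdim=True)      # lambda2 * U_max_s
        u_spa = torch.sigmoid(self.conv_spatial(torch.cat([avg_s, max_s], 1)))
        u_tilde = u_spa * x                            # weight copied along C
        # channel branch: global mean/max pooling -> shared hourglass PConv
        avg_c = 2.0 * x.mean(dim=(2, 3), keepdim=True) # lambda3 * U_avg_c
        max_c = 1.0 * x.amax(dim=(2, 3), keepdim=True) # lambda4 * U_max_c
        hourglass = lambda v: self.pconv2(self.relu(self.pconv1(v)))
        u_cha = torch.sigmoid(hourglass(avg_c) + hourglass(max_c))
        u_hat = u_cha * x                              # copied along W and H
        return u_tilde + u_hat                         # parallel fusion U

Because the output shape strictly equals the input shape, such a module could be dropped in after a residual block's batch normalization layer and before the residual connection, as in the embedding schemes of FIG. 6 and FIG. 7 .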
Further, the LIDAR-based unmanned surface vehicle of the third component comprises four modules: a hull module, a video acquisition module, a lidar navigation module and a ground station module, working together in a cooperative manner.
The hull module includes a trimaran and a power system; the trimaran is configured to be stable, resist level 6 wind and waves, and has an effective remote control distance of 500 meters, adaptable to engineering application scenarios; the size of the hull is 75×47×28 cm, which is convenient for transportation; an effective load of the surface vehicle is 5 kg, and configured to be installed with multiple scientific instruments; in addition, the unmanned surface vehicle has the function of constant speed cruise, which reduces the control burden of personnel.
The video acquisition module is composed of a three-axis camera pan/tilt, a fixed front camera and a fill light; the three-axis camera pan/tilt supports 10× optical zoom, auto focus, photography and 60 FPS video recording, so the video acquisition module is configured to meet the shooting requirements of faults of different scales and locations; the fixed front camera is configured to determine the hull posture; the picture is transmitted back to a ground station in real time through a wireless image transmission device, on the one hand for fault identification, and on the other hand for assisting the control of the USV; a controllable LED fill light board containing 180 high-brightness LED lamp beads is installed to cope with small and medium-sized bridges and other low-light working environments; a 3D-printed pan/tilt carries the LED fill light board to meet the needs of multi-angle fill light; in addition, fixed front-view LED light beads are also installed, providing light source support for the front-view camera.
The lidar navigation module includes a lidar, a mini computer, a transmission system and a control system; the lidar is configured to perform 360° omnidirectional scanning; after it is connected with the mini computer, it can perform real-time mapping of the surrounding environment of the unmanned surface vehicle; through wireless image transmission, the information of the surrounding scene is transmitted back to the ground station in real time, so as to realize the lidar navigation of the unmanned surface vehicle; based on the lidar navigation, the unmanned surface vehicle no longer needs GPS positioning in areas with weak GPS signals such as under bridges and in underground culverts; the wireless transmission system supports real-time transmission of 1080P video, with a maximum transmission distance of 10 kilometers; redundant transmission is used to ensure link stability and strong anti-interference; the control system consists of the wireless image transmission equipment, a Pixhawk 2.4.8 flight control and a SKYDROID T12 receiver, and through the flight control and receiver, the control system effectively controls the equipment on board.
The ground station module includes two remote controls and multiple display devices; a main remote control is used to control the unmanned surface vehicle, and a secondary remote control is used to control the surface vehicle borne scientific instruments, and the display device is used to monitor the real-time information returned by the camera and lidar; on the one hand, the display device displays the picture in real time, and on the other hand, it processes the image in real time to identify the fault; the devices cooperate with each other to realize the intelligent fault detection without a GPS signal.
Embodiment 1
The inventors tested the proposed technical solutions of the present invention on a water-system bridge group (for example, the Jiulong Lake water-system bridge group in Nanjing, Jiangsu Province, China), as shown in FIG. 8 . The 3D lidar carried by the unmanned surface vehicle is combined with the SLAM algorithm, and the real-time mapping effect is shown in FIG. 9 . There are 5 small and medium-sized bridges in the bridge group. The collected images include three types of faults: cracking, flaking and rebar exposure. The pixel resolution of the fault images is 512×512. Model building, training and testing were based on the PyTorch deep learning framework. The batch size during training is taken as 2, the batch size during testing is taken as 1, and the learning rate is taken as 5×10⁻⁴. The detection results of the solution proposed by the present invention are shown in FIG. 10 , and the heat map is the visual result directly output by the network, which can provide evidence for the result of target detection.
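For reference, the reported configuration can be written as a short PyTorch sketch; the dummy tensors, the stand-in model and the choice of the Adam optimizer are assumptions for illustration, since the embodiment only specifies the framework, the batch sizes and the learning rate.

import torch
from torch.utils.data import DataLoader, TensorDataset

images = torch.randn(8, 3, 512, 512)                     # dummy 512x512 images
dataset = TensorDataset(images)
train_loader = DataLoader(dataset, batch_size=2, shuffle=True)
test_loader = DataLoader(dataset, batch_size=1, shuffle=False)

model = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)  # stand-in for CenWholeNet
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)  # lr = 5x10^-4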
The detection method disclosed in the present invention is also compared with state-of-the-art object detection models on the same dataset, including Faster R-CNN, the widely influential Anchor-based object detection method; YOLOv5, the latest model of the widely used YOLO family in industry; and CenterNet, the acclaimed Anchor-free method. In addition, the attention module PAM of the present invention is also compared with SENet and CBAM, the excellent and classic attention modules recognized by the deep learning community.
The chosen evaluation metrics are the average precision AP and the average recall AR, which are commonly used in the deep learning field; they are averaged over different categories and different images. The calculation process is briefly described below. First, a key concept, the intersection over union (IoU), is introduced. It is a common concept in the field of target detection: it measures the degree of overlap between the candidate box, that is, the prediction result of the model, and the ground-truth bounding box, as the ratio of their intersection to their union, which can be calculated by the following formula:
IoU = area(Prediction ∩ GroundTruth) / area(Prediction ∪ GroundTruth)
For each prediction box, three relationships with the ground-truth bounding boxes are considered: the number of prediction boxes whose IoU with a ground-truth bounding box is greater than the specified threshold is recorded as the true positive class TP; the number of prediction boxes whose IoU with a ground-truth bounding box is less than the threshold is recorded as the false positive class FP; and the number of undetected ground-truth bounding boxes is denoted as the false negative class FN. Then the precision can be calculated as
Precision = TP/(TP+FP) = TP/(all detections)
The recall rate can be calculated as
Recall = TP/(TP+FN) = TP/(all ground truths)
Therefore, depending on the IoU threshold, different precisions can be calculated. The IoU threshold is usually divided into 10 levels, 0.50:0.05:0.95. AP50 used in this example is the precision when the IoU threshold is 0.50, AP75 is the precision when the IoU threshold is 0.75, and the average precision AP represents the average precision under the 10 IoU thresholds, that is,
AP = (1/10)·(AP50 + AP55 + AP60 + … + AP95)
This is the most important metric for measuring model detection performance. The average recall AR is the maximum recall per image given 1, 10 and 100 detections, averaged over the categories and the 10 IoU thresholds, yielding the 3 sub-indicators AR1, AR10 and AR100. Obviously, the closer the values of AP and AR are to 1, the better the detection results and the closer they are to the labels.
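The metric definitions above translate directly into a short Python sketch; the function names and the per-threshold aggregation interface are illustrative assumptions.

def iou(box_a, box_b):
    # intersection over union of boxes given as (left, top, right, bottom)
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def precision(tp, fp):
    return tp / (tp + fp)              # TP / all detections

def recall(tp, fn):
    return tp / (tp + fn)              # TP / all ground truths

def average_precision(ap_at_threshold):
    # AP averages the precision over the 10 IoU thresholds 0.50:0.05:0.95
    thresholds = [round(0.50 + 0.05 * i, 2) for i in range(10)]
    return sum(ap_at_threshold[t] for t in thresholds) / 10.0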
The comparison of prediction results between different methods is shown in FIG. 11 , where the parameter quantity measures the “volume” of a deep learning model, and FPS (frames per second) represents the number of images processed by the algorithm in one second, that is, the running speed of the algorithm. The method proposed by the present invention is significantly better than Faster R-CNN in the two dimensions of efficiency and accuracy. Compared with the 4 sub-versions of YOLO v5 (YOLO v5s, YOLO v5m, YOLO v5l and YOLO v5x), the detection results of YOLO v5 are surprisingly poor; comparable performance can only be achieved by training the best sub-version, YOLO v5x, for more epochs. Although YOLO v5 is slightly faster in running speed, its accuracy is far inferior to the method proposed herein. Compared with the CenterNet method, the running speed is the same, but the detection effect is much better than that of CenterNet. Two conclusions can be drawn from the comparison at the attention module level: (1) the PAM proposed by the present invention can achieve a general and substantial enhancement effect on different deep learning models at the cost of a small amount of computation; (2) compared with SENet and CBAM, PAM obtains more enhancement and is obviously better than SENet and CBAM.
The comparison of the training process between different methods is shown in FIG. 12 , and the method proposed in the present invention is marked with a circle. It can be clearly seen that although the training results oscillate to different degrees, our method can generally achieve higher AP and AR than the traditional methods; that is, a better target detection effect can be obtained.
To sum up, this specific embodiment verifies the effectiveness of the technical solution of the present invention and its applicability to complex engineering. Compared with traditional deep learning methods, the proposed intelligent detection method is more suitable for multi-type fault detection with variable slenderness ratios and complex shapes. The proposed unmanned surface vehicle system also has high robustness and high practicability.
The above disclosure is only a typical embodiment of the present invention; however, the embodiments of the present invention are not limited thereto. Any equivalent modification made by a person skilled in the art after reading this disclosure shall fall within the protection scope of the present invention.

Claims (6)

The invention claimed is:
1. A method of using an intelligent detection system for detecting multiple types of faults for near-water bridges, comprising
providing the intelligent detection system, comprised of
a first component, an intelligent detection algorithm: CenWholeNet, an infrastructure fault target detection network based on deep learning, being electrically coupled to a second component;
the second component, an embedded parallel attention module PAM into the target detection network CenWholeNet, the parallel attention module includes two sub-modules: a spatial attention sub-module and a channel attention sub-module, being electrically coupled to a third component; and
the third component, an intelligent detection equipment assembly: an unmanned surface vehicle system based on lidar navigation, the unmanned surface vehicle includes four modules, a hull module, a video acquisition module, a lidar navigation module and a ground station module;
a computer readable storage medium, having stored thereon a computer program, said program arranged to:
Step 1: using a primary network to extract features of images;
Step 2: converting the extracted image features, by a detector, into tensor forms required for calculation, and optimizing a result through a loss function;
Step 3: outputting results includes converting the tensor forms into a boundary box and outputting of prediction results of target detection.
2. The method of claim 1, wherein
Step 1 of the infrastructure fault target detection network CenWholeNet in the first component having the primary network, the method of using the primary network is as follows:
giving an input image P∈ℝ^(W×H×3), wherein W is the width of the image, H is the height of the image, and 3 represents the number of channels of the image, that is, three RGB channels;
extracting features of the input image P through the primary network;
using two convolutional neural network models, Hourglass network and deep residual network ResNet.
3. The method of claim 1, wherein Step 2 of the infrastructure fault target detection network CenWholeNet in the first component having the detector, the method of using the detector is as follows:
converting the features extracted by the primary network into an output set consisting of 4 tensors 𝒪=[{tilde over (H)},{tilde over (D)},Õ,{tilde over (Polar)}], by the detector, as a core of CenWholeNet;
using {tilde over (H)}∈[0,1]^(C×W/r×H/r) to represent a heat map of a central key point, where C is a category of the fault, which is taken as C=3 here, and r is an output step size, that is, the down sampling ratio, wherein a default step size is 4 and, by down sampling, the calculation efficiency is improved;
defining H∈[0,1]^(C×W/r×H/r) as a ground-truth heatmap; for category c, the ground-truth center point of location (i, j) is pcij∈ℝ^(C×W×H);
first, computing the down-sampled equivalent position {circumflex over (p)}cxy∈ℝ^(C×W/r×H/r) of the ground-truth center point of location (i, j), wherein x=⌊i/r⌋, y=⌊j/r⌋; then, through a Gaussian kernel function, mapping {circumflex over (p)}cxy to the tensor Yp∈ℝ^(C×W/r×H/r), wherein Yp is defined by:
Yp(c,x,y) = exp(−[(x−{circumflex over (p)}cxy(x))² + (y−{circumflex over (p)}cxy(y))²]/(2σp²))
wherein {circumflex over (p)}cxy(x) and {circumflex over (p)}cxy(y) represent center point position (x,y),
σp=gaussian_radius/3; and gaussian_radius represent a maximum radius representing an offset of the corner points of a detection frame, wherein the maximum radius ensures that the intersection ratio between the offset detection frame and the ground-truth detection frame is IoU≥t, and t=0.7 is taken in all experiments; integrating all the corresponding Yp points to get the ground-truth heat map H:
Hc,x,y = maxp [Yp(c,x,y)],  c∈[1,C], x∈[1,W/r], y∈[1,H/r]
wherein, Hc,x,y represents a value of H at the position (c,x,y), a probability that this position is a center point; specifically, Hc,x,y=1 represents a central key point, a positive sample; conversely, Hc,x,y=0 is a background and the negative sample; focal loss is used as a metric to measure a distance between {tilde over (H)} and H, according to the following equation:
ℒ_Heat = −(1/N)·Σ_(c=1)^C Σ_(x=1)^(W/r) Σ_(y=1)^(H/r) { (1−{tilde over (H)}c,x,y)^α·log({tilde over (H)}c,x,y)  if Hc,x,y=1;  (1−Hc,x,y)^β·({tilde over (H)}c,x,y)^α·log(1−{tilde over (H)}c,x,y)  otherwise }
wherein N is a total count of all central key points, and α and β are hyperparameters configured to control the weights; in all cases, α=2, β=4; by minimizing ℒ_Heat, the neural network model is configured to better predict a position of a center point of the target;
obtaining a size information W×H of a prediction box to finally determine the boundary box;
defining a size of a ground-truth boundary box corresponding to the kth key point pk be dk=(wk,hk), and integrate all dk to get a ground-truth boundary box dimension tensor
D∈ℝ^(2×W/r×H/r):
D = d1 ⊕ d2 ⊕ … ⊕ dN
wherein ⊕ represents pixel-level addition; for all fault categories, the model is configured to give a predicted dimension tensor
{tilde over (D)}∈ℝ^(2×W/r×H/r), and smooth L1 loss is configured to measure D and {tilde over (D)} similarity, determined by the following equation:
ℒ_D = (1/N)·Σ_(k=1)^N SmoothL1Loss({tilde over (d)}k, dk) = (1/N)·Σ_(k=1)^N { 0.5·∥{tilde over (d)}k−dk∥₂²  if ∥{tilde over (d)}k−dk∥₁<1;  ∥{tilde over (d)}k−dk∥₁−0.5  otherwise }
obtaining a rough width and height of each prediction box by minimizing ℒ_D, by the model;
correcting an error caused by down sampling by introducing a position offset, because the image is scaled by r times; recording the coordinates of the kth key point pk as (xk,yk), then the mapped coordinates are
(⌊xk/r⌋, ⌊yk/r⌋), and then get the ground-truth offset:
ok = (xk/r − ⌊xk/r⌋, yk/r − ⌊yk/r⌋)
integrating all ok to get the ground-truth offset matrix
O∈ℝ^(2×W/r×H/r):
O = o1 ⊕ o2 ⊕ … ⊕ oN
wherein, the 2 of a first dimension represents the offset of the key point (x, y) in the W and H directions; correspondingly, the model will give a prediction tensor
Õ∈ℝ^(2×W/r×H/r), and smooth L1 loss is used to train the offset loss:
ℒ_Off = (1/N)·Σ_(k=1)^N SmoothL1Loss(õk, ok) = (1/N)·Σ_(k=1)^N { 0.5·∥õk−ok∥₂²  if ∥õk−ok∥₁<1;  ∥õk−ok∥₁−0.5  otherwise }
introducing a new set of tensors to modify the prediction frame and improve the detection accuracy, in order to make the model pay more attention to the overall information of the target; specifically, taking an angle between a connecting line of diagonal points of the detection frame and the x-axis, and the diagonal length of the detection frame as the training targets; defining coordinates of an upper left corner and lower right corner of the detection frame to be (xk¹,yk¹) and (xk²,yk²), so the diagonal length of the detection frame lk is calculated as:

lk = √[(xk¹−xk²)² + (yk¹−yk²)²]
an inclination of the connecting line between the upper left and lower right corners θk is calculated by the following formula:
θk = π − arctan[(yk²−yk¹)/(xk²−xk¹)]
constructing a pair of complementary polar coordinates polark = (½lk, θk), and further obtaining a ground-truth polar coordinate matrix Polar∈ℝ^(2×W/r×H/r):
Polar = (½l1, θ1) ⊕ (½l2, θ2) ⊕ … ⊕ (½lN, θN)
the model also gives a prediction tensor {tilde over (Polar)}∈ℝ^(2×W/r×H/r); Polar and {tilde over (Polar)} are trained by the same L1 loss:
ℒ_Polar = (1/N)·Σ_(k=1)^N ∥{tilde over (polar)}k − polark∥₁
for each position, the model will predict C+6 outputs, which form the set 𝒪=[{tilde over (H)},{tilde over (D)},Õ,{tilde over (Polar)}], all sharing the weights of the network; and the loss function of the network is defined by:

ℒ = ℒ_Heat + λ_Off·ℒ_Off + λ_D·ℒ_D + λ_Polar·ℒ_Polar
wherein in all the experiments, λ_Off = 10, and λ_D and λ_Polar are both taken as 0.1.
4. The method of claim 1, wherein Step 3 of the infrastructure fault target detection network CenWholeNet in the first component, the method of outputting result is as follows:
outputting results by extracting possible center keypoint coordinates from a predicted heatmap tensor {tilde over (H)}, and then obtaining a predicted bounding box according to the information in the corresponding {tilde over (D)}, Õ and {tilde over (Polar)}; wherein the greater the value of {tilde over (H)}c,x,y, the more likely the point pcxy is the center point; for category c, if the point pcxy satisfies the following formula, then pcxy is a candidate center point:
{tilde over (H)}c,x,y = max_(|i|≤1, |j|≤1) [{tilde over (H)}c,x+i,y+j]
wherein non-maximum suppression (NMS) is not needed; a 3×3 max-pooling convolutional layer is used to extract candidate center points; letting the set of center points be {tilde over (P)}={({tilde over (x)}k,{tilde over (y)}k)}, k=1…Np, wherein Np is the total number of selected center points; for any center point ({tilde over (x)}k,{tilde over (y)}k), extracting the corresponding size information ({tilde over (w)}k,{tilde over (h)}k)=({tilde over (D)}1,{tilde over (x)}k,{tilde over (y)}k, {tilde over (D)}2,{tilde over (x)}k,{tilde over (y)}k), offset information (δ{tilde over (x)}k,δ{tilde over (y)}k)=(Õ1,{tilde over (x)}k,{tilde over (y)}k, Õ2,{tilde over (x)}k,{tilde over (y)}k) and polar coordinate information ({tilde over (l)}k,{tilde over (θ)}k)=({tilde over (Polar)}1,{tilde over (x)}k,{tilde over (y)}k, {tilde over (Polar)}2,{tilde over (x)}k,{tilde over (y)}k); first, calculating the prediction frame size correction value according to ({tilde over (l)}k,{tilde over (θ)}k):
Δ{tilde over (h)}k = {tilde over (l)}k·sin({tilde over (θ)}k);  Δ{tilde over (w)}k = −{tilde over (l)}k·cos({tilde over (θ)}k)
defining specific location of the prediction box as
Top = {tilde over (y)}k + δ{tilde over (y)}k − (αy·½{tilde over (h)}k + βy·Δ{tilde over (h)}k);  Bottom = {tilde over (y)}k + δ{tilde over (y)}k + (αy·½{tilde over (h)}k + βy·Δ{tilde over (h)}k)
Left = {tilde over (x)}k + δ{tilde over (x)}k − (αx·½{tilde over (w)}k + βx·Δ{tilde over (w)}k);  Right = {tilde over (x)}k + δ{tilde over (x)}k + (αx·½{tilde over (w)}k + βx·Δ{tilde over (w)}k)
wherein the bounding box resizing hyperparameters are taken as αy=αx=0.9, βy=βx=0.1.
5. The method of claim 1, wherein the method of establishing the parallel attention module in the second component is as follows:
providing a lightweight, plug-and-play parallel attention module PAM, configured to improve the expressiveness of neural networks; wherein PAM considers two dimensions of feature map attention, spatial attention and channel attention, and combines them in parallel;
giving an input feature map as X∈ℝ^(C×W×H), wherein C, H and W denote channel, height and width, respectively; first, implementing the spatial attention sub-module transformation 𝒯1: X→Ũ∈ℝ^(C×W×H); then, implementing the channel attention sub-module transformation 𝒯2: X→Û∈ℝ^(C×W×H); finally, outputting the feature map U∈ℝ^(C×W×H); the transformations consist essentially of convolution, maximum pooling, mean pooling and the ReLU function; and the overall calculation process is as follows:
U = Ũ ⊕ Û = 𝒯1(X) ⊕ 𝒯2(X)
wherein ⊕ represents output pixel-level tensor addition;
wherein the spatial attention sub-module is configured to emphasize “where” to improve attention, and pay attention to the locations of regions of interest (ROIs); first, maximum pooling and mean pooling operations are performed on the feature map along the channel direction to obtain two two-dimensional maps, λ1·Uavg_s∈ℝ^(1×W×H) and λ2·Umax_s∈ℝ^(1×W×H), wherein λ1 and λ2 are adjustable hyperparameters weighting the different pooling operations, taken as λ1=2, λ2=1; Uavg_s and Umax_s are calculated by the following formulas, wherein MaxPool and AvgPool represent the maximum pooling operation and the average pooling operation respectively:
Uavg_s(1,i,j) = AvgPool(X) = (1/C)·Σ_(k=1)^C X(k,i,j),  i∈[1,W], j∈[1,H]
Umax_s(1,i,j) = MaxPool(X) = max_(k∈[1,C]) X(k,i,j),  i∈[1,W], j∈[1,H]
next, introducing a convolution operation to generate the spatial attention weight Uspa∈ℝ^(1×W×H); the overall calculation process of the spatial attention sub-module is as follows:
𝒯1(X) = Ũ = Uspa⊗X = σ(Conv([λ1·Uavg_s, λ2·Umax_s]))⊗X
which is equivalent to:
𝒯1(X) = σ(Conv([MaxPool(X), AvgPool(X), AvgPool(X)]))⊗X
wherein, ⊗ represents pixel-level tensor multiplication, σ represents a sigmoid activation function, Conv represents a convolution operation, and a convolution kernel size is 3×3; and a spatial attention weight is copied along a channel axis;
the channel attention sub-module is configured to find the relationship of internal channels, and care about “what” is interesting in a given feature map; first, mean pooling and max pooling are performed along the width and height directions to generate two 1-dimensional vectors, λ3·Uavg_c∈ℝ^(C×1×1) and λ4·Umax_c∈ℝ^(C×1×1), wherein λ3 and λ4 are adjustable hyperparameters weighting the different pooling operations, taken as λ3=2, λ4=1; Uavg_c and Umax_c are calculated by the following formulas:
Uavg_c(k,1,1) = AvgPool(X) = (1/(W×H))·Σ_(i=1)^W Σ_(j=1)^H X(k,i,j),  k∈[1,C]
Umax_c(k,1,1) = MaxPool(X) = max_(i∈[1,W], j∈[1,H]) X(k,i,j),  k∈[1,C]
subsequently, introducing point-wise convolution (PConv) as a channel context aggregator to realize point-wise inter-channel interaction; in order to reduce the amount of parameters, PConv is designed in the form of an hourglass, with an attenuation ratio r; finally, the channel attention weight Ucha∈ℝ^(C×1×1) is obtained; the calculation process of this sub-module is as follows:
𝒯2(X) = Û = Ucha⊗X = σ(ΣPConv([λ3·Uavg_c, λ4·Umax_c]))⊗X
which is equivalent to:
𝒯2(X) = σ(ΣPConv2(δ(PConv1([λ3·Uavg_c, λ4·Umax_c]))))⊗X
wherein δ represents the ReLU activation function; the size of the convolution kernel of PConv1 is C/r×C×1×1, and the size of the convolution kernel of the inverse transform PConv2 is C×C/r×1×1; the ratio r is selected as 16, and the channel attention weight is copied along the width and height directions;
wherein the PAM is a plug-and-play module, which ensures strict consistency of the input tensor and output tensor at the dimension level; PAM is configured to be embedded at any position of any convolutional neural network model as a supplementary module; the method of embedding PAM into Hourglass and ResNet includes: for the ResNet network, the PAM is embedded in each residual block after the batch normalization layer and before the residual connection; the Hourglass network is divided into two parts, downsampling and upsampling: the downsampling part embeds the PAM between the residual blocks as a transition module, and the upsampling part embeds the PAM before the residual connection.
6. The method of claim 1, wherein the LIDAR-based unmanned surface vehicle of the third component comprises
four modules: the hull module, the video acquisition module, the lidar navigation module and the ground station module, working together in a cooperative manner;
the hull module includes a trimaran and a power system; the trimaran is configured to be stable, to resist level 6 wind and waves, and to provide an effective remote control distance of 500 meters, making it adaptable to engineering application scenarios; the hull measures 75×47×28 cm, which is convenient for transportation; the effective payload of the surface vehicle is 5 kg, allowing multiple scientific instruments to be installed; in addition, the unmanned surface vehicle has a constant-speed cruise function, which reduces the control burden on personnel;
the video acquisition module is composed of a three-axis camera pan/tilt, a fixed front camera and a fill light; the three-axis camera pan/tilt supports 10× optical zoom, auto focus, photography and 60 FPS video recording, and is configured to meet the shooting requirements for faults of different scales and locations; the fixed front camera is configured to determine the hull posture; the picture is transmitted back to a ground station in real time through a wireless image transmission device, on the one hand for fault identification and on the other hand for assisting control of the USV; a controllable LED fill light board containing 180 high-brightness LED lamp beads is installed to cope with small and medium-sized bridges and other low-light working environments; a 3D-printed pan/tilt carries the LED fill light board to meet the needs of multi-angle fill light; in addition, fixed front-view LED lamp beads are also installed, providing light source support for the front-view camera;
the lidar navigation module includes a lidar, a mini computer, a transmission system and a control system; the lidar is configured to perform 360° omnidirectional scanning; once connected with the mini computer, the lidar can perform real-time mapping of the surroundings of the unmanned surface vehicle; through wireless image transmission, information on the surrounding scene is transmitted back to the ground station in real time, realizing lidar navigation of the unmanned surface vehicle; based on the lidar navigation, the unmanned surface vehicle no longer needs GPS positioning and can operate in areas with weak GPS signals, such as under bridges and in underground culverts; the wireless transmission system supports real-time transmission of 1080P video with a maximum transmission distance of 10 kilometers, and uses redundant transmission to ensure link stability and strong anti-interference; the control system consists of wireless image transmission equipment, a Pixhawk 2.4.8 flight controller and a SKYDROID T12 receiver, through which the equipment on board is effectively controlled;
the ground station module includes two remote controls and multiple display devices; the main remote control is used to control the unmanned surface vehicle, the secondary remote control is used to control the scientific instruments carried by the surface vehicle, and the display devices are used to monitor in real time the pictures and information returned by the camera and lidar; the devices cooperate with each other to realize intelligent fault detection without a GPS signal.
US17/755,086 2021-03-17 2021-05-08 Intelligent detection method and unmanned surface vehicle for multiple type faults of near-water bridges Active US12223632B2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN202110285996.5A CN112884760B (en) 2021-03-17 2021-03-17 Intelligent detection method for multiple types of diseases near water bridges and unmanned ship equipment
CN202110285996.5 2021-03-17
PCT/CN2021/092393 WO2022193420A1 (en) 2021-03-17 2021-05-08 Intelligent detection method for multiple types of diseases of bridge near water, and unmanned surface vessel device

Publications (2)

Publication Number Publication Date
US20230351573A1 US20230351573A1 (en) 2023-11-02
US12223632B2 true US12223632B2 (en) 2025-02-11

Family

ID=76041072

Family Applications (1)

Application Number Title Priority Date Filing Date
US17/755,086 Active US12223632B2 (en) 2021-03-17 2021-05-08 Intelligent detection method and unmanned surface vehicle for multiple type faults of near-water bridges

Country Status (3)

Country Link
US (1) US12223632B2 (en)
CN (1) CN112884760B (en)
WO (1) WO2022193420A1 (en)

Families Citing this family (81)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021246217A1 (en) * 2020-06-05 2021-12-09 コニカミノルタ株式会社 Object detection method, object detection device, and program
CN113256601B (en) * 2021-06-10 2022-09-13 北方民族大学 Pavement damage detection method and system
CN113627245B (en) * 2021-07-02 2024-01-19 武汉纺织大学 CRTS target detection method
CN113808077A (en) * 2021-08-05 2021-12-17 西人马帝言(北京)科技有限公司 A target detection method, device, equipment and storage medium
CN113989614B (en) * 2021-10-19 2024-10-15 南京航空航天大学 A post-processing and diagonal evaluation method for road damage target detection
CN113870265B (en) * 2021-12-03 2022-02-22 绵阳职业技术学院 Industrial part surface defect detection method
CN114266299A (en) * 2021-12-16 2022-04-01 京沪高速铁路股份有限公司 Method and system for detecting defects of steel structure of railway bridge based on unmanned aerial vehicle operation
CN114358054B (en) * 2021-12-16 2024-11-08 中国人民解放军战略支援部队信息工程大学 Broadband wireless communication signal detection method and system under complex environment
CN114266892B (en) * 2021-12-20 2024-11-29 江苏燕宁工程科技集团有限公司 Pavement disease recognition method and system for multi-source data deep learning
CN114332664B (en) * 2022-01-04 2025-09-16 西南大学柑桔研究所 Method and device for identifying plant diseases and insect pests, electronic equipment and storage medium
CN114663774B (en) * 2022-05-24 2022-12-02 之江实验室 Lightweight salient object detection system and method
CN114820620B (en) * 2022-06-29 2022-09-13 中冶建筑研究总院(深圳)有限公司 Bolt loosening defect detection method, system and device
CN115061113B (en) * 2022-08-19 2022-11-01 南京隼眼电子科技有限公司 Target detection model training method and device for radar and storage medium
US12423786B2 (en) * 2022-08-22 2025-09-23 Nanjing University Of Posts And Telecommunications Multi-scale fusion defogging method based on stacked hourglass network
CN115305808A (en) * 2022-08-30 2022-11-08 武汉工程大学 Integrated application method and system of multi-type bridge detection equipment based on unmanned platform
CN115661032A (en) * 2022-09-22 2023-01-31 北京工业大学 An Intelligent Detection Method for Pavement Diseases Applicable to Complex Background
CN115393655A (en) * 2022-09-28 2022-11-25 南京华苏科技有限公司 Method for detecting industrial carrier loader based on YOLOv5s network model
CN115681736A (en) * 2022-10-31 2023-02-03 中国科学院沈阳自动化研究所 Modular large-load underwater electric pan-tilt device
CN115909072B (en) * 2022-11-29 2025-09-30 中国人民解放军海军工程大学 A water column detection method for impact point based on improved YOLOv4 algorithm
CN115574785B (en) * 2022-12-12 2023-02-28 河海大学 Water conservancy project safety monitoring method and platform based on data processing
CN116008285A (en) * 2023-01-04 2023-04-25 上海市建筑科学研究院有限公司 A bridge detection system and method based on an unmanned ship
CN116597261B (en) * 2023-03-06 2025-11-11 江苏科技大学 Unmanned ship electronic image stabilization and target detection method based on space-time context fusion
CN115953408B (en) * 2023-03-15 2023-07-04 国网江西省电力有限公司电力科学研究院 YOLOv 7-based lightning arrester surface defect detection method
CN116228740B (en) * 2023-04-07 2025-11-21 河海大学 Small sample chip appearance defect detection method and detection system based on improvement YOLOv5
CN116295020B (en) * 2023-05-22 2023-08-08 山东高速工程检测有限公司 Method and device for locating bridge defects
CN116740704B (en) * 2023-06-16 2024-02-27 安徽农业大学 Method and device for monitoring the change rate of wheat leaf phenotypic parameters based on deep learning
CN117036470B (en) * 2023-06-28 2025-12-26 南京信息工程大学 A Method for Object Recognition and Pose Estimation in a Grasping Robot
CN116777895B (en) * 2023-07-05 2024-05-31 重庆大学 Concrete bridge Liang Biaoguan disease intelligent detection method based on interpretable deep learning
CN116895007B (en) * 2023-07-18 2025-08-15 西南石油大学 Small target detection method based on improvement YOLOv n
CN117036902B (en) * 2023-07-20 2025-12-30 中国人民解放军国防科技大学 A Vehicle Target Recognition Method Based on Hybrid Attention Mechanism in SAR Images
CN116958696B (en) * 2023-07-31 2025-10-03 长安大学 A method and system for classifying objects using hyperspectral remote sensing technology from unmanned aerial vehicles
CN116882459B (en) * 2023-08-02 2025-11-28 南通大学 Neural network construction method for identifying crop foliar diseases
CN117115688B (en) * 2023-08-17 2026-01-30 广东海洋大学 A Deep Learning-Based System and Method for Detecting and Counting Dead Fish in Low-Light Environments
CN117054891A (en) * 2023-10-11 2023-11-14 中煤科工(上海)新能源有限公司 Battery life prediction method and prediction device
CN117333845B (en) * 2023-11-03 2025-02-07 东北电力大学 A real-time detection method for small target traffic signs based on improved YOLOv5s
CN117541922B (en) * 2023-11-09 2024-08-06 国网宁夏电力有限公司建设分公司 SF-YOLOv-based power station roofing engineering defect detection method
CN117218329B (en) * 2023-11-09 2024-01-26 四川泓宝润业工程技术有限公司 Wellhead valve detection method and device, storage medium and electronic equipment
CN117523447A (en) * 2023-11-13 2024-02-06 常州工学院 A lightweight ship real-time video detection method based on YOLO-v5
CN117724137B (en) * 2023-11-21 2024-08-06 江苏北斗星通汽车电子有限公司 Automobile accident automatic detection system and method based on multi-mode sensor
CN117272245B (en) * 2023-11-21 2024-03-12 陕西金元新能源有限公司 Fan gear box temperature prediction method, device, equipment and medium
CN117705800A (en) * 2023-11-23 2024-03-15 华南理工大学 Mechanical arm vision bridge detection system based on guide rail sliding and control method thereof
CN117390407B (en) * 2023-12-13 2024-04-05 国网山东省电力公司济南供电公司 Fault identification method, system, medium and equipment of substation equipment
CN117830965B (en) * 2023-12-13 2025-10-21 华南理工大学 A vehicle detection method based on guiding spatial attention based on road semantic information
CN117456610B (en) * 2023-12-21 2024-04-12 浪潮软件科技有限公司 Climbing abnormal behavior detection method and system and electronic equipment
CN118034269B (en) * 2024-01-12 2024-09-24 淮阴工学院 An adaptive control method for ship intelligent maneuvering
CN117689731B (en) * 2024-02-02 2024-04-26 陕西德创数字工业智能科技有限公司 Lightweight new energy heavy-duty battery pack identification method based on improved YOLOv model
CN117710755B (en) * 2024-02-04 2024-05-03 江苏未来网络集团有限公司 A vehicle attribute recognition system and method based on deep learning
CN117727104B (en) * 2024-02-18 2024-05-07 厦门瑞为信息技术有限公司 Near-infrared living body detection device and method based on bilateral attention
CN117788471B (en) * 2024-02-27 2024-04-26 南京航空航天大学 YOLOv 5-based method for detecting and classifying aircraft skin defects
CN117805658B (en) * 2024-02-29 2024-05-10 东北大学 Data-driven electric vehicle battery remaining life prediction method
CN118014997A (en) * 2024-04-08 2024-05-10 湖南联智智能科技有限公司 A pavement disease recognition method based on improved YOLOV5
CN118050729B (en) * 2024-04-15 2024-07-09 南京信息工程大学 A radar echo time downscaling correction method based on improved U-Net
WO2025241215A1 (en) * 2024-05-24 2025-11-27 杭州电子科技大学 Improved deep learning model-based refrigeration unit fault detection method
CN118298340B (en) * 2024-06-06 2024-07-30 北京理工大学长三角研究院(嘉兴) A method for dense target detection in UAV aerial photography based on prior knowledge
CN118799622B (en) * 2024-06-14 2025-05-02 长江宜昌航道局 Channel ship based on improved YOLOv s algorithm and navigation mark detection method
CN118396071B (en) * 2024-07-01 2024-09-03 山东科技大学 A boundary-driven neural network architecture for unmanned vessel environment understanding
CN119067919A (en) * 2024-07-31 2024-12-03 沈阳工业大学 A metal surface defect detection method based on improved YOLOv5
CN118587733B (en) * 2024-08-06 2024-10-22 安徽省交通规划设计研究总院股份有限公司 A bridge structure identification and parameter extraction method for bridge PDF design drawings
CN118604006B (en) * 2024-08-09 2024-10-29 佛山大学 Building wall safety detection method and system
CN119151974B (en) * 2024-08-28 2025-04-01 国家海洋局北海预报中心((国家海洋局青岛海洋预报台)(国家海洋局青岛海洋环境监测中心站)) A method, medium and system for detecting wave height based on semantic segmentation
CN118735914B (en) * 2024-08-30 2025-01-21 宁波未知数字信息技术有限公司 Ship wall surface treatment method
CN118762301B (en) * 2024-09-06 2025-03-28 北京智弘通达科技有限公司 Efficient detection and treatment of track damage based on deep learning
CN119323665B (en) * 2024-09-25 2025-09-05 济南浪潮数据技术有限公司 Optical remote sensing image target detection method, device and medium
CN119027069B (en) * 2024-10-29 2025-02-25 中国电建集团华东勘测设计研究院有限公司 Photovoltaic project construction progress recognition method based on UAV images and neural network
CN119515817B (en) * 2024-11-05 2025-07-29 汇通建设集团股份有限公司 Strip steel surface defect detection method based on lightweight dual-enhancement network
CN119068446B (en) * 2024-11-06 2025-03-11 洛阳理工学院 Intelligent driving visual navigation method based on infrared target detection
CN119442910B (en) * 2024-11-10 2025-11-21 同济大学 Active expansion end device optimization control method based on deep learning model
CN119169207B (en) * 2024-11-21 2025-06-06 无锡车联天下信息技术有限公司 Vehicle-mounted monitoring dynamic linear radar wall visualization method based on deep learning
CN119600265B (en) * 2024-11-22 2025-10-28 广东工业大学 Light-weight fire hazard multi-target detection method in electric power environment
CN119323572B (en) * 2024-12-19 2025-07-01 国网江西省电力有限公司电力科学研究院 Improved insulator defect detection method and system based on RT-DETR
CN120070361A (en) * 2025-02-07 2025-05-30 北京科技大学 Steel surface defect detection method based on ACW-YOLO algorithm
CN119672541B (en) * 2025-02-19 2025-09-19 贵州黔通工程技术有限公司 Bridge bolt monitoring image recognition method and system based on deep learning
CN119810154B (en) * 2025-03-13 2025-06-03 湖南云箭科技有限公司 Method and system for extracting moving target velocity vector
CN119851100B (en) * 2025-03-21 2025-06-17 中国人民解放军国防科技大学 Method and device for detecting surface damage precursor of additive manufacturing device
CN120411824B (en) * 2025-04-21 2025-12-23 淮南师范学院 Unmanned aerial vehicle low altitude target location and recognition system
CN120070429B (en) * 2025-04-27 2025-07-22 华东交通大学 Method and system for detecting surface defects of new vehicle based on FamsYOLO network
CN120259290B (en) * 2025-06-04 2025-08-22 中数智科(杭州)科技有限公司 A method and system for detecting loose bolts in rail vehicles
CN120259982B (en) * 2025-06-04 2025-09-09 中国铁塔股份有限公司 Road construction detection methods, systems, equipment, media and program products
CN120384851B (en) * 2025-06-27 2025-09-05 水利部交通运输部国家能源局南京水利科学研究院 Wind power pile structure monitoring method and system
CN120472494B (en) * 2025-07-10 2025-09-16 华雁智能科技(集团)股份有限公司 Method, device, equipment and medium for identifying graphic elements of power grid plant wiring diagram
CN120874217A (en) * 2025-09-29 2025-10-31 中核四川环保工程有限责任公司 Nuclear retired building multi-mode self-adaptive modeling method and device based on scene driving

Citations (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107839845A (en) 2017-11-14 2018-03-27 江苏领安智能桥梁防护有限公司 A kind of unmanned monitoring ship
CN108288269A (en) 2018-01-24 2018-07-17 东南大学 Bridge pad disease automatic identifying method based on unmanned plane and convolutional neural networks
CN109300126A (en) 2018-09-21 2019-02-01 重庆建工集团股份有限公司 A high-precision intelligent detection method for bridge diseases based on spatial location
CN109978847A (en) 2019-03-19 2019-07-05 东南大学 Drag-line housing disease automatic identifying method based on transfer learning and drag-line robot
US10354169B1 (en) * 2017-12-22 2019-07-16 Motorola Solutions, Inc. Method, device, and system for adaptive training of machine learning models via detected in-field contextual sensor events and associated located and retrieved digital audio and/or video imaging
US20200043229A1 (en) * 2017-08-11 2020-02-06 Jing Jin Incident site investigation and management support system based on unmanned aerial vehicles
CN111021244A (en) 2019-12-31 2020-04-17 川南城际铁路有限责任公司 An intelligent orthotropic steel bridge deck fatigue crack detection robot
CN111062437A (en) 2019-12-16 2020-04-24 交通运输部公路科学研究所 An automatic target detection model for bridge structural diseases based on deep learning
CN111127399A (en) 2019-11-28 2020-05-08 东南大学 An underwater bridge pier disease identification method based on deep learning and sonar imaging
CN111260615A (en) 2020-01-13 2020-06-09 重庆交通大学 Detection method for apparent damage of UAV bridge based on fusion of laser and machine vision
CN111310558A (en) 2019-12-28 2020-06-19 北京工业大学 Pavement disease intelligent extraction method based on deep learning and image processing method
CN111413353A (en) 2020-04-03 2020-07-14 中铁隧道局集团有限公司 Tunnel lining disease comprehensive detection vehicle
US10719641B2 (en) * 2017-11-02 2020-07-21 Airworks Solutions, Inc. Methods and apparatus for automatically defining computer-aided design files using machine learning, image analytics, and/or computer vision
CN111651916A (en) 2020-05-15 2020-09-11 北京航空航天大学 A material property prediction method based on deep learning
CN111862112A (en) 2020-07-08 2020-10-30 哈尔滨工业大学(深圳) A medical image segmentation method based on deep learning and level set method
CN112171692A (en) 2020-10-15 2021-01-05 吉林大学 Intelligent detection device and method for bridge deflection
CN112465748A (en) 2020-11-10 2021-03-09 西南科技大学 Neural network based crack identification method, device, equipment and storage medium
CN112488990A (en) 2020-11-02 2021-03-12 东南大学 Bridge bearing fault identification method based on attention regularization mechanism
US11521357B1 (en) * 2020-11-03 2022-12-06 Bentley Systems, Incorporated Aerial cable detection and 3D modeling from images
US11769052B2 (en) * 2018-12-28 2023-09-26 Nvidia Corporation Distance estimation to objects and free-space boundaries in autonomous machine applications
US20240020953A1 (en) * 2022-07-15 2024-01-18 Nvidia Corporation Surround scene perception using multiple sensors for autonomous systems and applications

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110222701B (en) * 2019-06-11 2019-12-27 北京新桥技术发展有限公司 Automatic bridge disease identification method
CN111324126B (en) * 2020-03-12 2022-07-05 集美大学 Vision unmanned ship
CN111507271B (en) * 2020-04-20 2021-01-12 北京理工大学 A method for intelligent detection and identification of airborne optoelectronic video targets
CN112069868A (en) * 2020-06-28 2020-12-11 南京信息工程大学 Unmanned aerial vehicle real-time vehicle detection method based on convolutional neural network
CN112329827B (en) * 2020-10-26 2022-08-23 同济大学 Increment small sample target detection method based on meta-learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Fu, Jun, et al., "Dual Attention Network for Scene Segmentation," arXiv:1809.02983v4, Apr. 21, 2019.

Also Published As

Publication number Publication date
WO2022193420A1 (en) 2022-09-22
US20230351573A1 (en) 2023-11-02
CN112884760B (en) 2023-09-26
CN112884760A (en) 2021-06-01

Similar Documents

Publication Publication Date Title
US12223632B2 (en) Intelligent detection method and unmanned surface vehicle for multiple type faults of near-water bridges
US12313732B2 (en) Contextual visual-based SAR target detection method and apparatus, and storage medium
US20240013505A1 (en) Method, system, medium, equipment and terminal for inland vessel identification and depth estimation for smart maritime
CN111461291A (en) Long-distance pipeline inspection method based on YOLOv3 pruning network and deep learning dehazing model
CN113240688A (en) Integrated flood disaster accurate monitoring and early warning method
CN111985376A (en) A deep learning-based method for extracting ship contours from remote sensing images
CN112347895A (en) Ship remote sensing target detection method based on boundary optimization neural network
CN116343077A (en) A fire detection and early warning method based on attention mechanism and multi-scale features
CN117911760B (en) Ship detection method in multi-scale SAR images based on attention mechanism
CN116844055A (en) Lightweight SAR ship detection method and system
CN111414807A (en) A Tide Identification and Crisis Early Warning Method Based on YOLO Technology
CN116912675B (en) Underwater target detection method and system based on feature migration
CN118115893A (en) A Small Target Detection Method for Remote Sensing Images
CN118212464A (en) A context-based remote sensing image scene classification method and system
Geng et al. An efficient detector for maritime search and rescue object based on unmanned aerial vehicle images
CN110826478A (en) A method for identifying illegal construction in aerial photography based on adversarial network
CN116721359A (en) A method and system for standardized deployment and inspection of transmission lines based on multi-source data
Zhao Research on adaptive weight and frequency domain enhancement fusion method for small target detection
CN117372910A (en) A multi-scale-attention-oriented target detection method suitable for UAVs
He et al. LSIDA-YOLOV7: An optimized YOLOv7 based on local sensitive information data augmentation for sewer pipeline defect detection
Ding et al. Building detection algorithm in multi-scale remote sensing images based on attention mechanism
CN118230253B (en) Iron tower video image farmland extraction method and device based on attention mechanism
CN111898702B (en) Unmanned ship environment intelligent sensing method based on deep learning
Tan et al. VisLanding: Monocular 3D Perception for UAV Safe Landing via Depth-Normal Synergy
Liu et al. VOS-net: real-time oil spill detection in uav videos via lightweight adapter tuning

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

AS Assignment

Owner name: SOUTHEAST UNIVERSITY, CHINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ZHANG, JIAN;HE, ZHILI;JIANG, SHANG;REEL/FRAME:059891/0083

Effective date: 20220414

FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO SMALL (ORIGINAL EVENT CODE: SMAL); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE