CN117952898A

CN117952898A - Water delivery tunnel crack detection method based on UNet network

Info

Publication number: CN117952898A
Application number: CN202311369727.2A
Authority: CN
Inventors: 孙玉山; 李岳明; 曹建; 王旭; 邵卓青; 绳册
Original assignee: Harbin Engineering University
Current assignee: Harbin Engineering University
Priority date: 2023-10-23
Filing date: 2023-10-23
Publication date: 2024-04-30

Abstract

The invention relates to the technical field of crack detection in water delivery tunnels, and in particular to a water delivery tunnel crack detection method based on a UNet network. The method improves the UNet network in three ways: an SE-Res2Net module is added, which deepens the network, enables multi-scale feature extraction, and reduces the likelihood of vanishing gradients; a context enhancement module is added between the encoder and the decoder, which enlarges the receptive field while capturing more context information; and adaptive fusion replaces the skip connections to fuse shallow and deep features. Ablation experiments show that each of these improvements raises the performance of the UNet network to some extent on the underwater crack segmentation task for water delivery tunnels.

Description

Water delivery tunnel crack detection method based on UNet network

Technical Field

The invention relates to the technical field of crack detection of a water delivery tunnel, in particular to a crack detection method of the water delivery tunnel based on a UNet network.

Background

The lining of a water delivery tunnel is mostly a reinforced concrete structure. During year-round operation it is affected by factors such as water scouring, internal water pressure, external geological pressure, rock stratum changes, and temperature changes, so the lining can develop defects such as spalling, cracking, deformation, exposed reinforcement, and collapse.

The most typical defect of a water delivery tunnel is cracking. Cracks reduce the integrity and strength of the lining and greatly shorten the service life of the tunnel, so serious accidents can easily occur if detection and maintenance are not carried out in time. At present, crack detection in water delivery tunnels relies mainly on instrument monitoring and manual inspection. Instrument monitoring tracks the safety state of the tunnel through embedded sensors such as stress meters, rebar meters, flowmeters, and inclinometers, but because tunnels are long and large in diameter, the sensors cannot cover the tunnel completely, and manual inspections at irregular intervals are needed to avoid omissions. In one manual approach, the water supply is cut off and the tunnel is drained; workers then enter the tunnel, photograph any cracks they find, mark the important ones, draw crack distribution maps, and send the data to experts for analysis. Alternatively, after the tunnel is drained, workers detect cracks by means of ultrasonic testing, ground-penetrating radar, infrared imaging, three-dimensional laser scanning, and similar techniques. Sometimes, to obtain a more accurate result, an expert must carry out a second on-site investigation, sample and verify the cracks, and collect more detection data. This approach requires a long inspection period, and some tunnels have harsh environments that are difficult for workers to reach, which makes the results unreliable. Another manual approach sends professional divers carrying underwater detection equipment into the tunnel, but it has many limitations because professional divers are not familiar with the operating characteristics of hydraulic structures.

With the continuing development of information technology, defect detection based on computer vision offers higher accuracy, better real-time performance, more comprehensive coverage, and lower cost than manual inspection, and it is widely used in modern industry. In deep learning, convolutional neural networks (Convolutional Neural Networks, CNN) learn data features through training and have good nonlinear expressive power for complex structures, so they can effectively handle problems such as the indistinct image features and random locations of cracks in water delivery tunnels. Combined with an autonomous underwater vehicle (Autonomous Underwater Vehicle, AUV), detection can be carried out without interrupting the water supply, which overcomes the limitations of manual inspection. Research on deep-learning-based crack detection for water delivery tunnels, enabling efficient, fast, accurate, and comprehensive detection without interrupting water delivery, therefore has important engineering significance.

Region-based object detection methods detect cracks well, but they can only enclose a crack in a bounding box and classify it coarsely; they cannot extract the crack contour accurately and completely. To give tunnel maintenance personnel better crack information for subsequent repairs, cracks must be segmented at the pixel level to obtain more information from the crack image.

Fully convolutional neural networks can effectively solve this problem, and most current crack detection methods are improvements on the UNet network. Although the encoder-decoder structure of UNet classifies every pixel and thereby captures the global information of the image, detail is easily lost after max pooling, which causes false detections and missed detections of small cracks. Underwater images of water delivery tunnels are noisy, have low contrast between crack and background, are bright in the center and dark toward the edges, and contain many small cracks, all of which makes crack segmentation more difficult.

Disclosure of Invention

Underwater images of water delivery tunnels are noisy, have low contrast between crack and background, are bright in the center and dark toward the edges, and contain many fine cracks, all of which makes crack segmentation more difficult. To address these characteristics, the invention provides a water delivery tunnel crack detection method based on a UNet network.

It is noted that in the present invention, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.

The invention provides a water delivery tunnel crack detection method based on a UNet network, which provides the following technical scheme:

A method for detecting cracks in a water delivery tunnel based on a UNet network comprises the following steps:

Step 1: collecting crack images, then screening and merging them;

Step 2: preprocessing the crack images, annotating the data set with LabelImg, and randomly dividing it into a training set and a validation set;

Step 3: building the UNet network, setting and adjusting the training parameters, and training and validating the model on the randomly divided sets;

Step 4: plotting the loss decay curve from the loss values, judging from the curve whether training has converged, testing on crack images, and carrying out quantitative crack analysis.

Preferably, the step 1 specifically includes:

Public data sets containing only cracks are collected, including CrackDetection, concreteCrack, and SDNET2018. CrackDetection contains 6,069 bridge crack images of size 224×224, including positive and negative samples; concreteCrack contains 40,000 pavement crack images of size 227×227, half of which are positive samples and half negative; SDNET2018 is a large concrete crack data set covering several categories such as walls, bridges, and pavements, with more than 56,000 images of size 256×256 in total;

9,300 suitable crack images are selected from the above public data sets, and 500 pavement and wall concrete crack images taken with a mobile phone and 200 underwater dam crack images are added, giving an original data set of 10,000 images in total;

A CycleGAN style transfer model is trained on the ground crack data set and the water delivery tunnel underwater style data set to generate a large number of underwater crack images of the water delivery tunnel, from which 3,500 crack images that match real conditions are selected as the original data set.

Preferably, the step 2 specifically includes:

The wavelet-transform-based denoising method for underwater crack images of the water delivery tunnel replaces the infinitely long trigonometric basis of the Fourier transform with a wavelet basis of finite length that decays, and the formula is as follows:

where α is the scale, which controls the dilation and contraction of the wavelet function, and τ is the translation, which controls the shift of the wavelet function;
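The transform formula referred to above is not reproduced in this text. As an assumption, the standard continuous wavelet transform for a real-valued wavelet, which uses the same scale α and translation τ, can be written as follows; the symbols f(t) for the signal and ψ for the mother wavelet are introduced here for illustration only:

```latex
WT(\alpha, \tau) = \frac{1}{\sqrt{\alpha}} \int_{-\infty}^{\infty} f(t)\, \psi\!\left(\frac{t - \tau}{\alpha}\right) \mathrm{d}t
```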

The mean gradient (MG), mean square error (MSE), and peak signal-to-noise ratio (PSNR) are adopted as evaluation criteria for the denoised image;

The MG expresses the rate of change in the fine details of the image and characterizes how well detail is rendered; the larger the value, the stronger the contrast of the image. The calculation formula is as follows:

where M×N is the image size, and the two gradient terms denote the horizontal and vertical gradients of the image;

The MSE measures the change in image gray level; the smaller the value, the better the noise suppression. The calculation formula is as follows:

where M×N is the image size, and I(i, j) and I'(i, j) are the gray levels of the pixel before and after filtering;

The PSNR evaluates the degree of distortion between the image before and after filtering; the larger the value, the less the distortion. The calculation formula is as follows:
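The three evaluation formulas are not reproduced in this text. The following is a minimal, dependency-free sketch that assumes the commonly used definitions: MG as the mean of sqrt((dx² + dy²)/2) over forward differences, MSE as the mean squared gray-level difference, and PSNR as 10·log10(peak²/MSE) with a peak of 255 for 8-bit images. The function names and the exact MG normalization are assumptions made for illustration, not taken from the patent.

```python
import math

def mean_gradient(img):
    """Mean gradient: average of sqrt((dx^2 + dy^2) / 2) over forward differences."""
    rows, cols = len(img), len(img[0])
    total = 0.0
    for i in range(rows - 1):
        for j in range(cols - 1):
            dx = img[i][j + 1] - img[i][j]  # horizontal gradient
            dy = img[i + 1][j] - img[i][j]  # vertical gradient
            total += math.sqrt((dx * dx + dy * dy) / 2.0)
    return total / ((rows - 1) * (cols - 1))

def mse(before, after):
    """Mean squared gray-level difference between two images of equal size."""
    rows, cols = len(before), len(before[0])
    squared = sum((before[i][j] - after[i][j]) ** 2
                  for i in range(rows) for j in range(cols))
    return squared / (rows * cols)

def psnr(before, after, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(before, after)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / err)

# Tiny hand-made example (values are illustrative, not experimental data).
original = [[0, 0], [0, 0]]
filtered = [[2, 2], [2, 2]]
print(mse(original, filtered))            # 4.0
print(mean_gradient([[0, 2], [2, 4]]))    # 2.0
print(round(psnr(original, filtered), 2))
```

A higher MG and PSNR together with a lower MSE would indicate, under these definitions, that the denoised image keeps detail while suppressing noise.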

Preferably, 3,500 underwater crack images of the water delivery tunnel are selected as the simulation experiment data set. All images are resized to a resolution of 416×416 to reduce training time, and to ensure the generalization performance of the model, the data set is randomly divided into a training set, a validation set, and a test set in the ratio 8:1:1;

The data set is annotated with the LabelImg annotation tool; each bounding box is labeled with the class Crack, and a corresponding xml file is generated. Because most cracks in the data set used in the invention run across the whole image, a single bounding box would cover a large area in which the proportion of background is too high, which is unfavorable for network learning; several moderately sized bounding boxes are therefore used to cover the crack in a staggered manner.
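The 8:1:1 random division can be sketched as follows. The file names, the fixed seed, and the function name `split_dataset` are hypothetical and serve only to make the example reproducible.

```python
import random

def split_dataset(items, ratios=(8, 1, 1), seed=0):
    """Shuffle the items and split them into train/validation/test parts by ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]      # the remainder goes to the test set
    return train, val, test

# Hypothetical file names standing in for the 3,500 selected images.
image_ids = [f"crack_{i:04d}.jpg" for i in range(3500)]
train, val, test = split_dataset(image_ids)
print(len(train), len(val), len(test))  # 2800 350 350
```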

Preferably, the step 3 specifically includes:

The UNet network adds three layers of SE-Res2Net modules in the encoder, adds a context enhancement module between the encoder and the decoder, and performs feature fusion by adaptive mixing. Res2Net is used as the backbone of the encoder, which deepens the network, reduces the likelihood of vanishing gradients during training, and enables fine-grained, multi-scale feature extraction without increasing the number of parameters or the computational complexity; an SE module is added after the Res2Net module to strengthen the correlation between channels; a context enhancement module is added between the encoder and the decoder, where dilated convolutions with different dilation rates are connected in parallel to enlarge the receptive field and retain more shallow feature information; and adaptive feature fusion replaces the skip connections, so that shallow and deep features are fused adaptively.

Preferably, the UNet network structure has 33 layers in total, including 16 SE-Res2Net modules and a context enhancement module, and the network structure is still U-shaped;

First, a 3×3 convolution increases the number of feature maps, and max pooling then reduces the feature map size; after each max pooling the feature map size is halved and the number of channels is doubled, and the improved UNet network reduces the original input image by a factor of 16. To extract features at different scales, an SE-Res2Net module is added after each pooling to increase the network depth. At the end of the encoder, three parallel dilated convolutions form a context enhancement module that enlarges the receptive field of feature sampling and captures the multi-scale and context information of the image. The feature map then passes through a 1×1 convolution that adjusts the number of channels and enters the decoding stage, where transposed convolution up-samples it to recover the image size; four up-sampling steps restore the original size, and a final 1×1 convolution adjusts the number of channels to output the segmented image.
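The halving and doubling described above can be traced with a short calculation. This sketch assumes a 416×416 input and four pooling steps (which gives the stated factor of 16); the initial channel count of 64 is an assumption borrowed from the original UNet and is not stated in this text.

```python
def encoder_shapes(size=416, base_channels=64, num_pools=4):
    """Return (channels, height, width) after the first convolution and each max pooling."""
    shapes = [(base_channels, size, size)]
    channels, side = base_channels, size
    for _ in range(num_pools):
        channels *= 2   # channel count doubles after each pooling stage
        side //= 2      # spatial size is halved by each max pooling
        shapes.append((channels, side, side))
    return shapes

for stage, (c, h, w) in enumerate(encoder_shapes()):
    print(f"stage {stage}: {c} channels, {h}x{w}")
# The last stage is 26x26, i.e. 416 / 16.
```

Four transposed-convolution steps that each double the size would bring the 26×26 map back to 416×416 under the same assumptions.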

Preferably, the step 4 specifically includes:

Precision, recall, mean intersection over union (MIoU), and the F1 score, which are commonly used evaluation indexes, are selected as the evaluation indexes, defined as follows:

Precision: the proportion of pixels that are truly positive samples among all pixels predicted to be positive samples; the higher the precision, the more accurate the detection result. The formula is as follows:

Recall: the proportion of pixels predicted to be positive samples among all pixels that are truly positive samples; the higher the recall, the more of the labeled crack pixels are covered by the crack pixels in the detection result. The formula is as follows:

Mean intersection over union (MIoU): the average of the IoU values of each category on the data set, where IoU expresses how close the detection result is to the label; the larger the IoU, the greater the overlap between the crack detection result and the label. The formula is as follows:

F1 score: the weighted harmonic mean of precision and recall, which evaluates algorithm performance comprehensively; the larger the value, the better. The formula is as follows:

TP is true positive, that is, a sample predicted to be positive where the prediction is correct; in the detection result of the network model it is the number of correctly predicted crack pixels. FP is false positive, that is, a sample predicted to be positive where the prediction is wrong; it is the number of background pixels wrongly predicted as crack. TN is true negative, that is, a sample predicted to be negative where the prediction is correct; it is the number of correctly predicted background pixels. FN is false negative, that is, a sample predicted to be negative where the prediction is wrong; it is the number of crack pixels wrongly predicted as background.
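The formulas for these indexes are not reproduced in this text. The sketch below assumes their standard definitions in terms of TP, FP, TN, and FN, with MIoU averaged over the two classes crack and background. The function name and the pixel counts in the example are hypothetical.

```python
def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0

def segmentation_metrics(tp, fp, tn, fn):
    """Compute precision, recall, MIoU and F1 from pixel-level confusion counts."""
    precision = _ratio(tp, tp + fp)              # TP / (TP + FP)
    recall = _ratio(tp, tp + fn)                 # TP / (TP + FN)
    iou_crack = _ratio(tp, tp + fp + fn)         # IoU of the crack class
    iou_background = _ratio(tn, tn + fn + fp)    # IoU of the background class
    miou = (iou_crack + iou_background) / 2      # mean over the two classes
    f1 = _ratio(2 * precision * recall, precision + recall)
    return {"precision": precision, "recall": recall, "miou": miou, "f1": f1}

# Hypothetical pixel counts for one 1,000-pixel prediction.
metrics = segmentation_metrics(tp=80, fp=20, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```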

A water conveyance tunnel crack detection system based on a UNet network, the system comprising:

The data acquisition module collects crack images and screens and merges them;

The preprocessing module preprocesses the crack images, annotates the data set with LabelImg, and randomly divides it into a training set and a validation set;

The network building module builds the UNet network, sets and adjusts the training parameters, and trains and validates the model on the randomly divided sets;

The test module plots the loss decay curve from the loss values, judges from the curve whether training has converged, tests on crack images, and carries out quantitative crack analysis.

A computer-readable storage medium stores a computer program which, when executed by a processor, implements the water delivery tunnel crack detection method based on a UNet network.

A computer device comprises a memory and a processor, wherein the memory stores a computer program, and the processor implements the water delivery tunnel crack detection method based on a UNet network when executing the computer program.

The invention has the following beneficial effects:

compared with the prior art, the invention has the advantages that:

First, in the water delivery tunnel crack detection method based on a UNet network, the network is improved on the basis of UNet. An SE-Res2Net module is added, which deepens the network, enables multi-scale feature extraction, and reduces the likelihood of vanishing gradients; a context enhancement module is added between the encoder and the decoder, which enlarges the receptive field while capturing more context information; and adaptive fusion replaces the skip connections to fuse shallow and deep features.

Secondly, the improved network is trained; the experimental environment, parameter settings, data set, and evaluation indexes are described; and the network is compared with other segmentation networks and subjected to ablation experiments. The experiments show that the invention segments underwater cracks in water delivery tunnels well: precision reaches 81%, recall reaches 85%, and the F1 score reaches 84%, which meets the requirements of engineering crack detection. Finally, the cracks are analyzed quantitatively: the skeleton of the segmented binary crack image is extracted, the length, width, and area of the crack are calculated, and the accuracy is verified in a pool experiment, which demonstrates the effectiveness of the invention and of its feature information extraction.

To obtain finer crack information, a semantic segmentation network must be used for segmentation, and quantitative analysis based on the segmented image then yields feature information such as the length, width, and area of the crack. To address false detections and missed detections of fine underwater cracks in water delivery tunnels, the invention provides a water delivery tunnel crack segmentation method based on an improved UNet network. Res2Net is used as the backbone of the encoder, which deepens the network and mitigates vanishing gradients, and an SE module is added at the end of the Res2Net module to prevent the correlation between channels from weakening; a context enhancement module is added between the encoder and the decoder to enlarge the receptive field and retain more shallow feature information; and adaptive feature fusion replaces the skip connections so that shallow and deep features are fused better. Comparison with classical semantic segmentation networks demonstrates the effectiveness of the improved UNet algorithm on the underwater crack segmentation task for water delivery tunnels, and ablation experiments show that each improvement raises the performance of the UNet network on this task to some extent.

To help maintenance personnel judge the damage condition of the tunnel, the cracks must be analyzed quantitatively. To verify the measurement performance of the method, a pool experiment was carried out: cracks were made by hand in cast concrete slabs, data were then collected with a water delivery tunnel inspection AUV, and the data were finally processed. Analysis of the detection, segmentation, and feature information calculation results shows that the crack detection method and the feature information calculation method provided by the invention perform well.

Drawings

To illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the invention, and a person skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a flow chart of water delivery tunnel crack detection based on a UNet network;

FIG. 2 is the residual structure;

FIG. 3 is a diagram of the SE-Res2Net basic structural unit;

FIG. 4 is a schematic diagram of dilated convolution;

FIG. 5 is a schematic diagram of the context enhancement module;

FIG. 6 is a schematic diagram of feature fusion;

FIG. 7 is the UNet network architecture;

FIG. 8 shows Labelme data annotation;

FIG. 9 is a label image;

FIG. 10 is the label data set;

FIG. 11 compares the loss function curves of UNet and the improved UNet (unfrozen);

FIG. 12 compares the loss function curves of UNet and the improved UNet (frozen);

FIG. 13 shows the crack segmentation results.

Detailed Description

The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the invention. All other embodiments obtained by a person skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the invention.

In the description of the present invention, it should be noted that the directions or positional relationships indicated by the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. are based on the directions or positional relationships shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.

In the description of the present invention, it should be noted that, unless explicitly specified and limited otherwise, the terms "mounted," "coupled," and "connected" are to be construed broadly: a connection may, for example, be fixed, detachable, or integral; it may be mechanical or electrical; and it may be direct, indirect through an intermediate medium, or an internal communication between two elements. The specific meaning of the above terms in the present invention will be understood by those of ordinary skill in the art according to the specific case.

In addition, the technical features of the different embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.

The present invention will be described in detail with reference to specific examples.

First embodiment:

According to the embodiments shown in fig. 1 to fig. 13, the specific technical solution adopted by the invention to solve the above technical problems is a water delivery tunnel crack detection method based on a UNet network.

Step 1: collecting crack images, screening and merging the crack images;

The step 1 specifically comprises the following steps:

the step 2 specifically comprises the following steps:

3,500 underwater crack images of the water delivery tunnel are selected as the simulation experiment data set. All images are resized to a resolution of 416×416 to reduce training time, and to ensure the generalization performance of the model, the data set is randomly divided into a training set, a validation set, and a test set in the ratio 8:1:1;

the step 3 specifically comprises the following steps:

The UNet network structure has 33 layers in total, including 16 SE-Res2Net modules and a context enhancement module, and the network structure is still U-shaped;

Second embodiment:

the second embodiment of the present application differs from the first embodiment only in that:

The invention provides a water delivery tunnel crack detection system based on a UNet network, which is characterized in that: the system comprises:

Third embodiment:

the difference between the third embodiment and the second embodiment of the present application is that:

the invention is a network improvement based on UNet, comprising:

SE-Res2Net module

For underwater crack segmentation in a complex environment, the UNet network structure has several shortcomings in extracting deep information from underwater crack images, so the network must be deepened to strengthen its feature extraction capability. However, when GPU memory is insufficient, a deeper UNet network may suffer from vanishing gradients, so that the network fails to converge during training. A residual structure can reduce vanishing gradients to some extent; the residual structure is shown in fig. 2.

As the network deepens, skip connections continuously carry the more complete information of the shallow layers into the deep layers, which prevents loss of feature information as it propagates through the network and avoids the vanishing gradient problem. To extract finer-grained features, Res2Net improves the residual unit: the extracted features are first divided into groups, and each group is then convolved separately to extract features. This extracts feature information more finely without increasing the number of parameters or the computational complexity. As shown in fig. 3, Res2Net introduces grouping and multi-scale processing into the residual unit, replacing the single 3×3 convolution in the residual unit with a filter bank of several 3×3 convolution kernels with smaller channel dimensions, which are connected to one another hierarchically in a residual-like manner. In addition, an SE module is added at the end of the structure, which strengthens feature extraction and prevents the channel correlation from weakening when there are many filters.

The specific process is as follows: after a 1×1 convolution, the feature map is divided into N groups of equal feature size; a larger N gives a larger receptive field, and N = 4 is adopted in the invention. To limit the number of parameters, X₁ is not convolved, while X₂, X₃, and X₄ each extract feature information through a 3×3 convolution. The outputs Y_i are linked across levels to obtain receptive fields of different sizes. The process is expressed as:

where Conv_i(·) represents the convolution operation.
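The expression itself is not reproduced in this text. Assuming the invention follows the standard Res2Net formulation with N = 4 groups, in which each convolved group also receives the output of the previous group, the process would read as follows; this form is an assumption consistent with the description of X₁ to X₄, not a quotation from the patent:

```latex
Y_i =
\begin{cases}
X_i, & i = 1 \\
\mathrm{Conv}_i(X_i), & i = 2 \\
\mathrm{Conv}_i(X_i + Y_{i-1}), & i = 3, 4
\end{cases}
```

Under this form, Y₃ sees the input through two stacked 3×3 convolutions and Y₄ through three, which is how the groups obtain receptive fields of different sizes.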

Context enhancement module CAM

During feature extraction, pooling reduces the feature dimensions of the image while keeping the useful information; it shrinks the size and enlarges the receptive field. For pixel-level image segmentation, the feature map is up-sampled after down-sampled feature extraction to restore it to the input size, and some important feature information is lost in the process. To reduce this loss, the invention uses dilated convolution in place of pooling operations, adding dilated convolutions between the encoder and the decoder, which enlarges the receptive field without reducing resolution or losing feature information. Dilated convolution has an additional parameter, the dilation rate, that adjusts the receptive field of the convolution; the larger the dilation rate, the larger the receptive field.

Fig. 4 is a schematic diagram of dilated convolution with dilation rates of 1, 2, and 4. Using dilated convolution avoids both an excessive number of parameters and the information loss caused by pooling. With the same convolution kernel size, stacked dilated convolutions obtain a larger receptive range than conventional convolutions and therefore capture context information better.

The context enhancement module designed in the invention is shown in fig. 5. Three dilated convolutions with dilation rates of 1, 2, and 4 are connected in parallel, and each dilated convolution is followed by a 1×1 standard convolution. The pooled feature map is passed through each of the three branches, the feature maps produced by the branches are fused by weighted pixel-wise addition, and the result is passed to the decoder for up-sampling.
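The relationship between rate and receptive field can be made concrete with the usual formula for the effective extent of a single dilated kernel, k + (k − 1)(d − 1). The short sketch below applies it to a 3×3 kernel at the three rates mentioned; the function name is chosen here for illustration.

```python
def effective_kernel_size(kernel_size, dilation):
    """Extent covered by one dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

for rate in (1, 2, 4):
    k = effective_kernel_size(3, rate)
    print(f"rate {rate}: a 3x3 kernel covers {k}x{k} pixels")
# rate 1 -> 3x3, rate 2 -> 5x5, rate 4 -> 9x9
```

The number of weights stays at nine in all three cases, which is why the receptive field grows without adding parameters.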
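The parallel structure can be illustrated on a single-channel feature map. This is a simplified sketch, not the patented module: it assumes zero padding so that every branch keeps the input size, and it models each 1×1 convolution and fusion weight as one scalar per branch, which is what a 1×1 convolution reduces to for a single channel. All names and values are hypothetical.

```python
def dilated_conv2d(image, kernel, dilation):
    """3x3 dilated convolution with zero padding; output size equals input size."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for ki in range(3):
                for kj in range(3):
                    r = i + (ki - 1) * dilation  # taps are spaced by the dilation rate
                    c = j + (kj - 1) * dilation
                    if 0 <= r < rows and 0 <= c < cols:  # outside counts as zero
                        acc += kernel[ki][kj] * image[r][c]
            out[i][j] = acc
    return out

def context_enhancement(image, kernels, branch_weights, dilations=(1, 2, 4)):
    """Run three parallel dilated branches and fuse them by weighted pixel-wise addition."""
    rows, cols = len(image), len(image[0])
    fused = [[0.0] * cols for _ in range(rows)]
    for kernel, weight, rate in zip(kernels, branch_weights, dilations):
        branch = dilated_conv2d(image, kernel, rate)
        for i in range(rows):
            for j in range(cols):
                fused[i][j] += weight * branch[i][j]  # scalar stands in for the 1x1 conv
    return fused

# Hypothetical 5x5 feature map and an averaging kernel shared by all branches.
feature_map = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
average = [[1 / 9] * 3 for _ in range(3)]
fused = context_enhancement(feature_map, [average] * 3, (0.5, 0.3, 0.2))
print(len(fused), len(fused[0]))  # 5 5: resolution is preserved
```

In a trained network the kernels and weights would be learned per channel; the point of the sketch is only that the three branches see different ranges of context and are merged without changing the resolution.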

Adaptive feature fusion Mixup

In the shallow stages, a convolutional neural network captures a large amount of feature information such as target edges, contours, and positions, and this information is relatively complete and rich. As the network deepens, the shallow feature information gradually degrades, so the restored feature map contains less feature information and the segmentation accuracy suffers. To solve this problem, UNet uses skip connections to fuse the local features and context information of the shallow layers in the encoder with the global and abstract features of the deep layers in the decoder, generating new features. However, the skip connection in UNet simply concatenates the shallow and deep features along the channel dimension, and feature maps of different sizes must be cropped before fusion, which is a limitation. The crack segmentation network designed in the invention replaces the skip connection with adaptive mixing to fuse shallow and deep feature information. Adaptive mixing requires no cropping of feature sizes, and it adds adaptive learning factors that assign different weights to the deep and shallow features, so the fusion effect is better. Three down-sampling layers and three up-sampling layers are involved, and the mixing formulas are as follows:

f_↑2＝Mix(f_↓1,f_↑1)＝σ(θ₁)*f_↓1+(1-σ(θ₁))*f_↑1,

f_↑3＝Mix(f_↓2,f_↑2)＝σ(θ₂)*f_↓2+(1-σ(θ₂))*f_↑2,

f_↑O＝Mix(f_↓3,f_↑3)＝σ(θ₃)*f_↓3+(1-σ(θ₃))*f_↑3,

where f_↑i and f_↓i denote the feature maps of the i-th up-sampling layer and down-sampling layer, respectively, and f_↑O denotes the final output map; the index i is numbered from left to right according to the adaptive feature fusion scheme (fig. 6). σ(θ_i) is the learning factor of the i-th fusion part, obtained by applying σ to the parameter θ_i, where σ is the Sigmoid function.
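The mixing operation can be sketched in a few lines. This is an illustration only, assuming feature maps stored as nested lists of equal size and a fixed value of θ; in the network θ would be a parameter updated during training. The names `sigmoid` and `mix` are hypothetical.

```python
import math


def sigmoid(x):
    """Sigmoid function used to map theta into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def mix(f_down, f_up, theta):
    """Mix(f_down, f_up) = sigma(theta) * f_down + (1 - sigma(theta)) * f_up."""
    w = sigmoid(theta)
    return [[w * d + (1.0 - w) * u for d, u in zip(row_d, row_u)]
            for row_d, row_u in zip(f_down, f_up)]


# Toy 2x2 feature maps; theta = 0 gives equal weights of 0.5.
f_down = [[1.0, 2.0], [3.0, 4.0]]
f_up = [[0.0, 0.0], [0.0, 0.0]]
fused = mix(f_down, f_up, theta=0.0)
print(fused)
```

A large positive θ pushes the weight toward the shallow (down-sampling) feature, and a large negative θ toward the deep (up-sampling) feature, so the balance is learned rather than fixed.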

Improved UNet network structure

In the problem addressed by the invention, underwater crack images of a water delivery tunnel are characterized by heavy noise, low contrast between crack and background, and a bright centre with dark surroundings, and most cracks are fine. The segmentation capability of the network is therefore improved on the basis of the UNet network so that it can extract crack features accurately and completely, in preparation for subsequent crack quantification.

To meet the task requirements, the improved UNet network adds three layers of SE-Res2Net modules to the encoder, adds a context enhancement module between the encoder and the decoder, and performs feature fusion by adaptive mixing. Using Res2Net as the backbone of the encoder deepens the network, reduces the possibility of vanishing gradients during training, and achieves fine-grained multi-scale feature extraction without increasing the parameter count or computational complexity. An SE module is added after the Res2Net module to strengthen the correlation among channels. The context enhancement module between the encoder and the decoder enlarges the receptive field by connecting dilated convolutions with different dilation rates in parallel, so that more shallow feature information is retained. The adaptive feature fusion operation replaces the skip connection, realizing adaptive fusion of shallow and deep features. The improved UNet network structure is shown in fig. 7.

The improved UNet network has 33 layers in total, including 16 SE-Res2Net modules and a context enhancement module, and the architecture remains U-shaped.

The input image first passes through one 3×3 convolution to increase the number of feature maps, and the feature map size is then reduced by max pooling. After each max pooling the feature map size is halved and the number of channels is doubled; overall, the improved UNet network reduces the original input image by a factor of 16. To extract features at different scales, an SE-Res2Net module is added after each pooling to increase network depth. At the end of the encoder, three parallel dilated convolutions form a context enhancement module that enlarges the receptive field and captures the multi-scale and context information of the image. The feature map then has its channel number adjusted by a 1×1 convolution and enters the decoding stage, where transposed convolution is used for up-sampling; after four up-sampling steps the image is restored to its original size, and a final 1×1 convolution adjusts the number of channels to output the segmented image. In the original UNet structure, skip connections only fuse same-level features of the down-sampling and up-sampling paths, so features extracted by the shallow layers and features extracted by the deep layers cannot communicate directly. The adaptive feature fusion scheme is therefore designed to realize cross-layer connection of context information, enhance the flow of feature information and retain more of it.
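The size bookkeeping of the encoder can be traced with a short sketch. It assumes four pooling steps (consistent with an overall reduction by a factor of 16), the 416×416 input size mentioned in the claims, and a base width of 64 channels after the first 3×3 convolution; the base width and the function name `trace_encoder` are assumptions, not values given in the description.

```python
def trace_encoder(size, channels, num_pools=4):
    """Return (spatial size, channel count) after each max-pooling step."""
    stages = [(size, channels)]
    for _ in range(num_pools):
        size //= 2       # each max pooling halves the feature map size
        channels *= 2    # and the channel count is doubled
        stages.append((size, channels))
    return stages


stages = trace_encoder(size=416, channels=64)
for level, (s, c) in enumerate(stages):
    print(f"level {level}: {s} x {s}, {c} channels")
```

The decoder reverses this sequence, so the same number of up-sampling steps returns the feature map to the input resolution.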

In summary, the improved UNet differs from the original UNet in the following points: (1) An SE-Res2Net module is added after each down-sampling step of the encoding structure, which deepens the network, realizes multi-scale feature extraction and reduces the possibility of vanishing gradients. (2) A context enhancement module is added at the end of the encoding structure, which enlarges the receptive field, retains more shallow image feature information and acquires more context information. (3) Shallow and deep features are fused adaptively, which strengthens the network's acquisition of feature-map context and improves network accuracy.

Underwater crack detection experimental result and analysis of water delivery tunnel based on improved UNet network

The flow chart of the water delivery tunnel crack detection simulation experiment based on the improved UNet network is shown in fig. 1. The specific implementation steps are as follows:

(1) Generating the data set: with underwater images taken in a real water delivery tunnel as the background and concrete cracks on land as the base samples, style transfer is performed by a CycleGAN network to generate the data set for the simulation experiment.

(2) Preprocessing the underwater crack images of the water delivery tunnel: ① The synthesized underwater crack images are denoised by a wavelet-transform-based method, which reduces background noise and its influence on crack feature extraction. ② The denoised images are dodged (illumination-equalized) by an improved Mask dodging algorithm, which prevents crack features from being masked by excessive illumination and enhances the contrast between crack and background.

(3) Building the improved UNet network model: the network is built using functions such as two-dimensional convolution (Conv2d), max pooling (MaxPool2d) and up-sampling (Upsample) in PyTorch.

(4) Loading training samples and labels for network training: the training samples and labels are loaded by the sample-processing code, and pictures and labels are automatically resized before being passed into the network, so a fixed input size is not required. The network model parameters are continuously optimized and updated by the Adam method; once the loss function has decreased to a certain level and become stable, iteration stops and the model parameters are retained.

(5) Testing the network model: the trained network model is applied to the test set data to obtain binarized images of the underwater cracks of the water delivery tunnel.

(6) Quantifying crack features: the area is obtained directly from the binary image output by the model; the binary image is then skeletonized, and the length and width of the crack are calculated from the skeleton image.

Experimental environment and parameter settings

To ensure the authenticity and accuracy of the simulation experiment data and to evaluate the performance of the improved UNet network accurately, the experimental environment and model parameters must be configured before model training. The specific configuration is as follows.

(1) Experimental environment: the experimental environment comprises the hardware configuration and the software environment. To ensure the accuracy and validity of the simulation experiment, all experiments are performed in the same environment, which is shown in table 1.

Table 1 experimental environment

(2) Model parameter setting: the network is optimized with the Adam algorithm, with the learning rate set to 1e-4 and the momentum set to 0.9; the batch_size is set to 12; the activation function is Sigmoid; and the number of epochs is set to 100, based on an analysis of how the loss value and accuracy vary with the number of training iterations and on the results of preliminary training.
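To make the role of these settings concrete, a single Adam update for one scalar parameter can be sketched as follows. The sketch reads the stated momentum of 0.9 as Adam's first-moment coefficient `beta1`, and takes `beta2 = 0.999` and `eps = 1e-8` as commonly used defaults; these two values and the function name `adam_step` are assumptions, not settings given in the description.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step index."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# First step from zero-initialised moments with a gradient of 2.0.
new_param, m, v = adam_step(param=1.0, grad=2.0, m=0.0, v=0.0, t=1)
print(new_param)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly the learning rate regardless of the gradient magnitude.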

Data set: underwater crack images of water delivery tunnels are difficult to collect, and essentially no data are publicly available online. However, underwater cracks in a water delivery tunnel are highly similar in their features to concrete cracks on land, so the data set required for the experiment is synthesized by style transfer, taking the underwater environment of the water delivery tunnel into account.

A total of 9300 suitable images are selected from the public datasets CrackDetection, concreteCrack and SDNET2018, and 500 images of pavement and wall concrete cracks taken with a mobile phone and 200 dam crack images are added. These 10000 images serve as the base images for style transfer, and underwater environment images of the water delivery tunnel serve as the style images. Underwater crack images of the water delivery tunnel are generated by a CycleGAN network, and 3000 high-quality images consistent with the water delivery tunnel environment are selected. Adding 500 images of real underwater dam cracks gives 3500 original images in total; the images are preprocessed and the enhanced images are placed in the same folder. The data set is annotated with Labelme, whose annotation interface is shown in fig. 8.

After the designated file is imported, the labeling button on the left is selected to annotate the image. As shown in the figure, the annotated regions are assigned to the Crack category once labeling is complete, and the remaining background is automatically assigned to the other category.

The annotation information is saved in json format, and the json file is then converted into a label image. The label image is a binary image in which red denotes crack and black denotes background. A label image is shown in fig. 9 and the labeled data set in fig. 10.

Evaluation indices: crack segmentation results are usually assessed by both subjective and objective evaluation. Subjective evaluation relies on human judgment of the difference between the segmentation result and the label and original image, while objective evaluation quantitatively analyses the segmentation result using evaluation indices common in the segmentation field. Precision, Recall, mean intersection over union (MIoU) and the F1 coefficient are selected as the evaluation indices, defined as follows:

(1) Precision. The proportion of pixels that are actually positive samples among all pixels predicted as positive samples. The higher the precision, the more accurate the detection result. The formula is as follows:

Precision = TP/(TP+FP)

(2) Recall. The proportion of pixels correctly predicted as positive samples among all actual positive samples. The higher the recall, the larger the share of label crack pixels covered by the detected crack pixels. The formula is as follows:

Recall = TP/(TP+FN)

(3) Mean intersection over union (MIoU). The average of the IoU values of all categories on the data set. IoU measures how closely the detection result matches the label; the larger the value, the greater the overlap between the crack detection result and the label. The formula is as follows:

IoU = TP/(TP+FP+FN), MIoU = (1/k) × Σ IoU_c, where the sum runs over the k categories c

(4) F1 coefficient. The harmonic mean of precision and recall, which evaluates the algorithm performance comprehensively; the larger the value, the better the performance. The formula is as follows:

F1 = 2×Precision×Recall/(Precision+Recall)

In the formulas, TP is true positive, i.e. the pixel is predicted as a positive sample and the prediction is correct; in the detection result it is the number of correctly predicted crack pixels. FP is false positive, i.e. the pixel is predicted as a positive sample but the prediction is wrong; it is the number of background pixels mispredicted as crack. TN is true negative, i.e. the pixel is predicted as a negative sample and the prediction is correct; it is the number of correctly predicted background pixels. FN is false negative, i.e. the pixel is predicted as a negative sample but the prediction is wrong; it is the number of crack pixels mispredicted as background.
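The four indices can be computed from the TP, FP, TN and FN counts as in the following sketch. It uses the standard definitions of the indices for a two-category problem (crack and background) and assumes masks stored as nested lists of 0 and 1; the function names are hypothetical.

```python
def confusion_counts(pred, label):
    """Count TP, FP, TN, FN over two binary masks (1 = crack, 0 = background)."""
    tp = fp = tn = fn = 0
    for p_row, l_row in zip(pred, label):
        for p, l in zip(p_row, l_row):
            if p and l:
                tp += 1
            elif p and not l:
                fp += 1
            elif l:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def segmentation_metrics(tp, fp, tn, fn):
    """Precision, Recall, F1 and two-category MIoU; empty denominators give 0."""
    def ratio(a, b):
        return a / b if b else 0.0
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    f1 = ratio(2 * precision * recall, precision + recall)
    iou_crack = ratio(tp, tp + fp + fn)
    iou_background = ratio(tn, tn + fn + fp)
    miou = (iou_crack + iou_background) / 2
    return {"precision": precision, "recall": recall, "f1": f1, "miou": miou}


pred = [[1, 1, 0], [0, 1, 0]]
label = [[1, 0, 0], [0, 1, 1]]
counts = confusion_counts(pred, label)
scores = segmentation_metrics(*counts)
print(counts, scores)
```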

Training results and algorithm performance analysis

Training the network from scratch starts from weights that are too random and gives poor results, so training uses the pre-trained weights of the backbone network. Fig. 11 shows the loss function curves without a freezing stage: both UNet and the improved UNet essentially converge within 100 epochs, and the loss of the improved UNet drops rapidly in the first 50 epochs and converges sooner. Fig. 12 shows the loss function curves with a freezing stage: the early loss of both UNet and the improved UNet drops more rapidly, and convergence is faster, than without the freezing stage.

The MIoU curves are shown in fig. 13; the MIoU of both UNet and the improved UNet eventually exceeds 80%.

In the crack segmentation comparison, the columns from left to right are the input image, the label image, the segmentation by UNet before improvement and the segmentation by the improved UNet, for a selection of representative cracks. The comparison shows that the cracks segmented by the original UNet are relatively complete, but its detection of fine cracks is poor: some pixels are easily lost and background noise is easily mistaken for crack features. The improved UNet network segments better, extracting the cracks completely with essentially no noise points, which improves the accuracy of subsequent crack quantification.

In addition to subjective evaluation, the segmentation results are compared with three representative semantic segmentation algorithms, FCN, SegNet and DeepLabv3, using the evaluation indices selected in the previous section. The comparison results are shown in table 2.

Table 2 comparison of detection indicators

As can be seen from the table, on the underwater crack data set of the water delivery tunnel, the proposed algorithm obtains the highest score on the three indices Precision, MIoU and F1. Compared with FCN, SegNet and DeepLabv3, its F1 score is improved by 35%, 27% and 6% respectively, which demonstrates the effectiveness of the algorithm in the underwater crack segmentation task of the water delivery tunnel.

Ablation experiments

To verify the effectiveness of each improved part of the algorithm, ablation experiments are performed in this section for comparison. The SE-Res2Net module, the context enhancement module (CAM) and the adaptive feature fusion module (Mixup) are added step by step to the UNet network, and the results are shown in table 3.

Table 3 comparison of ablation experiments

As can be seen from the table, adding each of the modules improves the F1 score. Adding SE-Res2Net improves the segmentation precision of the network by 7% over the original network; adding CAM improves the recall of the network model by 13%; Mixup, by fusing shallow and deep features, improves feature communication between the deep and shallow layers of the network and raises precision and recall by 10% and 7% respectively. The network with SE-Res2Net, CAM and Mixup together has a precision 12% higher, a recall 15% higher and an F1 score 15% higher than the original network. These data show that each of the improvements raises the performance of the UNet network in the underwater crack segmentation task of the water delivery tunnel to some extent.

Quantitative analysis of cracks

After preprocessing and segmentation of the underwater crack image of the water delivery tunnel, the overall shape of the crack has been extracted. In actual engineering, however, a rough crack outline alone does little to support analysis of the tunnel's condition, and more feature information must be extracted. The length, width and area of the crack are obtained by processing the segmented binary image, which is important for tunnel maintenance personnel to further judge the damage and formulate a maintenance scheme. In actual engineering, because of regular inspection and maintenance, the cracks that occur are mainly linear, and the probability of net-shaped cracks is relatively low, so linear cracks are the main research target.

Skeleton extraction of a crack is crack thinning: by eliminating redundant information from the binary image, a clearly visualized crack topology is obtained. Thinning turns the multi-pixel parts of the crack into a single-pixel structure, which displays the crack features clearly and makes the feature information easier to calculate. Image thinning must satisfy the following conditions:

(1) Convergence: the iteration converges well;

(2) Continuity: the connectivity of the thinned profile is good;

(3) Topology: after refinement, the topological structure of the crack is not changed;

(4) Thinning property: the width of the thinned crack is single pixel;

(5) Center axiality: the thinned skeleton is positioned at the center position;

(6) Rapidity: the refinement algorithm has high calculation speed.

The Hilditch thinning algorithm first traverses the binary image and examines the neighbourhood of each pixel; pixels that satisfy the deletion conditions are marked and converted into background pixels, and what remains after the traversal is the skeleton. As can be seen from the figure, the Hilditch algorithm thins well, but some positions are not guaranteed to be a single pixel wide, which can affect subsequent calculation.

The Rosenfeld thinning algorithm marks pixels according to their distribution characteristics and removes the pixels that meet the conditions over multiple iterations. As can be seen from the figure, its result contains a large number of burrs, so further deburring and false-branch removal are needed afterwards; the process is complex and inefficient.

The Zhang fast parallel thinning algorithm removes specific pixels over multiple iterations. It is fast, the thinned image has good connectivity, and few burrs are produced.
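As an illustration of iterative parallel thinning, the following sketch implements the widely published Zhang-Suen two-subiteration scheme. It is assumed here to correspond to the Zhang algorithm named above, and it expects a binary image stored as nested lists of 0 and 1 with a background border at least one pixel wide; the function name `zhang_suen_thin` is hypothetical.

```python
def zhang_suen_thin(image):
    """Thin a 0/1 image by repeatedly deleting removable boundary pixels."""
    img = [row[:] for row in image]
    rows, cols = len(img), len(img[0])

    def neighbours(r, c):
        # P2..P9: clockwise, starting from the pixel directly above.
        return [img[r - 1][c], img[r - 1][c + 1], img[r][c + 1], img[r + 1][c + 1],
                img[r + 1][c], img[r + 1][c - 1], img[r][c - 1], img[r - 1][c - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(1, rows - 1):
                for c in range(1, cols - 1):
                    if img[r][c] != 1:
                        continue
                    n = neighbours(r, c)
                    p2, _, p4, _, p6, _, p8, _ = n
                    b = sum(n)  # number of foreground neighbours
                    # number of 0 -> 1 transitions around the neighbourhood
                    a = sum(1 for x, y in zip(n, n[1:] + n[:1]) if x == 0 and y == 1)
                    if step == 0:
                        cond = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        cond = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_clear.append((r, c))
            # Delete in parallel: decisions above all used the same image state.
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return img


# A 3-pixel-thick horizontal bar inside a 5 x 8 image.
image = [[0] * 8 for _ in range(5)]
for r in range(1, 4):
    for c in range(1, 7):
        image[r][c] = 1
skeleton = zhang_suen_thin(image)
for row in skeleton:
    print("".join("#" if v else "." for v in row))
```

Collecting the pixels to delete before clearing them is what makes each subiteration parallel: every decision in a pass is based on the same image state.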

The algorithm run times are shown in table 4.

Table 4 Thinning algorithm run time (unit: s)

As can be seen from the table, the Rosenfeld thinning algorithm has the longest run time and the lowest efficiency, while the Hilditch and Zhang thinning algorithms have similar run times. Considering both run time and thinning quality, the Zhang thinning algorithm is selected to extract the crack skeleton.

The length of a linear crack could be taken directly as the straight-line distance between the two ends of its circumscribed rectangle or contour, but most cracks run irregularly and the length calculated this way is smaller than the actual value. The number of valid single pixels in the crack skeleton is therefore used as the crack length.

The Zhang thinning algorithm used in the previous section eliminates most redundant information and gives an initial crack skeleton, but because cracks are complex, some burrs and false branches remain in the skeleton and some parts are not strictly a single pixel wide, which makes subsequent calculation inaccurate. The skeleton is therefore trimmed by a template-based deburring method, whose main steps are as follows:

(1) Traverse the pixels of the binary image. If the value of a pixel is 255, count the pixels with value 255 in its eight-neighbourhood and denote the count N;

(2) If N=1, the pixel is an end point of the skeleton and is denoted A(i); if N=2, the pixel is denoted B(i); if N≥3, the pixel lies at the intersection of the skeleton and a branch and is denoted C(i);

(3) Set the pixel values of C(i) to 0, which splits the original binary image into several separate connected domains, including the false branches, giving a labeled image;

(4) In the labeled image, count the length of each connected domain containing an end point A(i) and denote it L(i);

(5) Set a threshold M; if the minimum value minL(i) of L(i) is less than M, set the pixel values of the connected domain containing minL(i) to 0;

(6) Restore the pixel values of the intersection points C(i) to 255, and repeat steps (1)-(5) until the false-branch processing is complete.
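Steps (1) and (2) of the procedure above can be sketched as follows. The sketch covers only the neighbour counting and point classification, not the branch removal of steps (3)-(6), and assumes a skeleton image stored as nested lists with foreground value 255; the function name `classify_skeleton_points` is hypothetical.

```python
def classify_skeleton_points(skel, fg=255):
    """Split skeleton pixels into end points A, ordinary points B and junctions C."""
    rows, cols = len(skel), len(skel[0])
    ends, ordinary, junctions = [], [], []
    for r in range(rows):
        for c in range(cols):
            if skel[r][c] != fg:
                continue
            # N: number of foreground pixels in the eight-neighbourhood.
            n = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and skel[rr][cc] == fg:
                        n += 1
            if n == 1:
                ends.append((r, c))
            elif n == 2:
                ordinary.append((r, c))
            elif n >= 3:
                junctions.append((r, c))
    return ends, ordinary, junctions


# A horizontal skeleton with a short vertical branch hanging from its middle.
skel = [
    [0, 0, 0, 0, 0, 0, 0],
    [255, 255, 255, 255, 255, 255, 255],
    [0, 0, 0, 255, 0, 0, 0],
    [0, 0, 0, 255, 0, 0, 0],
]
ends, ordinary, junctions = classify_skeleton_points(skel)
print("A:", ends)
print("B:", ordinary)
print("C:", junctions)
```

Because diagonal neighbours are counted, several pixels around a branch point can have N≥3, so the junction set C is usually a small cluster rather than a single pixel.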

After the false-branch treatment, the false branches of the crack are effectively removed and the skeleton body is retained, which demonstrates the effectiveness of the method in handling false branches. The crack length is calculated next.

After skeletonization the image is in fact a two-dimensional array of pixels, with the skeleton pixels denoted (x_i, y_i), i=0, 1, 2, .... Summing along the length direction of the crack skeleton gives the crack length. Let the crack length be L; the calculation formula is as follows:
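One possible reading of this summation, assumed here rather than taken from the text, adds the Euclidean distances between consecutive skeleton pixels, so that diagonal steps contribute √2 instead of 1. The sketch assumes the skeleton pixels are already ordered from one end of the crack to the other; the function name `skeleton_length` is hypothetical.

```python
import math


def skeleton_length(points):
    """Sum the distances between consecutive, ordered skeleton pixels."""
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Ordered skeleton pixels: one horizontal step, one diagonal step, one horizontal step.
points = [(0, 0), (1, 0), (2, 1), (3, 1)]
length = skeleton_length(points)
pixel_count = len(points)  # the simpler measure: number of skeleton pixels
print(length, pixel_count)
```

The plain pixel count treats every step as length 1, so it tends to underestimate cracks that run diagonally; the distance sum corrects for this at little extra cost.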

As important feature information, the crack width directly reflects the degree of damage to the water delivery tunnel and is of great significance for assessing engineering defects. The segmented binary image of the crack is relatively clean, so the distance between the two crack boundaries, measured perpendicular to the skeleton line, can be calculated directly as the crack width.

Pixel positions are represented by pixel coordinates. Assume that in the k-th column the pixel coordinates of the upper edge of the crack are (j, Z(j,k)) and those of the lower edge are (i, Z(i,k)), that the vertical crack width of the column is P(k), and that the inclination angle is θ. The calculation formulas of P(k) and θ are as follows:

P(k)＝Z(i,k)-Z(j,k)

From the formulas for P(k) and θ, the pixel width W(k) of the crack is:

W(k)＝P(k)×cos(θ)

Finally, all calculated pixel widths are collected; the maximum is taken as the maximum width and the mean as the width of the crack.
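The width calculation can be sketched as follows, assuming that the upper and lower edge positions of each column and the inclination angle θ of each column (in radians) have already been measured; how θ is obtained is not covered here. The function name `crack_widths` is hypothetical.

```python
import math


def crack_widths(upper, lower, thetas):
    """Return per-column widths W(k) = P(k) * cos(theta), their maximum and mean."""
    widths = []
    for up, lo, theta in zip(upper, lower, thetas):
        p = lo - up                       # vertical width P(k) of column k
        widths.append(p * math.cos(theta))
    return widths, max(widths), sum(widths) / len(widths)


# Three columns of a horizontal crack (theta = 0, so W(k) equals P(k)).
upper = [10, 10, 11]
lower = [14, 16, 15]
widths, max_width, mean_width = crack_widths(upper, lower, [0.0, 0.0, 0.0])
print(widths, max_width, mean_width)
```

For an inclined crack the vertical span P(k) overstates the true width, and multiplying by cos(θ) projects it onto the direction perpendicular to the skeleton.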

The segmented image contains little noise, so the crack area can be expressed directly as the number of pixels with value f(i,j)=255 in the image before thinning. Assuming the image size is M×N and the pixel area is A, the calculation formula is as follows:

In the previous sections, the invention improved the UNet network and calculated the length, width and area of cracks through feature analysis. To verify the accuracy of the method, a pool experiment was performed. Because the pool wall has no cracks, cement boards were first cast in concrete and crack patterns were drawn on them, including transverse, longitudinal and net-shaped cracks.

An underwater tunnel inspection AUV carrying an underwater high-definition camera and underwater lights is used for experimental data acquisition. The experiments are carried out at night to simulate the environment of the underwater tunnel; the crack images have an obvious bright region, consistent with the characteristics of underwater crack images taken in the dark environment of a tunnel.

Accuracy analysis is performed on the acquired data. The images are first denoised using the wavelet transform and then dodged using the improved Mask dodging algorithm.

Once the pixel-level length, width and other information of the crack are known, the crack length, average width, maximum width and area in pixels can be converted into the actual length, width and area of the water delivery tunnel crack, provided that the resolution of the image or the distance between adjacent images is known. In addition, the ratio of the number of crack pixels to the total number of pixels in the result gives a measure of the crack coverage of the image, which is a very important reference for evaluating the water delivery tunnel.
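The pixel area, the coverage ratio and the conversion to physical units can be sketched as follows. The sketch assumes a uniform scale expressed as millimetres per pixel, which is a hypothetical parameter that would have to come from camera calibration; the function names are also hypothetical.

```python
def crack_area_pixels(binary, fg=255):
    """Number of pixels with value fg in the segmented (unthinned) image."""
    return sum(1 for row in binary for value in row if value == fg)


def crack_coverage(binary, fg=255):
    """Ratio of crack pixels to all pixels in the image."""
    total = sum(len(row) for row in binary)
    return crack_area_pixels(binary, fg) / total


def to_physical(length_px, width_px, area_px, mm_per_pixel):
    """Scale lengths linearly and areas quadratically by the pixel size."""
    return (length_px * mm_per_pixel,
            width_px * mm_per_pixel,
            area_px * mm_per_pixel ** 2)


binary = [
    [0, 255, 0, 0],
    [0, 255, 255, 0],
]
area_px = crack_area_pixels(binary)
coverage = crack_coverage(binary)
length_mm, width_mm, area_mm2 = to_physical(10, 2, area_px, mm_per_pixel=0.5)
print(area_px, coverage, length_mm, width_mm, area_mm2)
```

The area is scaled by the square of the pixel size because each pixel represents a small square patch of the lining surface.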

Fourth embodiment:

The fourth embodiment of the present application differs from the third embodiment only in that:

The present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the water delivery tunnel crack detection method based on a UNet network.

Fifth embodiment:

The fifth embodiment of the present application differs from the fourth embodiment only in that:

The invention provides computer equipment comprising a memory and a processor, wherein the memory stores a computer program and the processor implements the water delivery tunnel crack detection method based on a UNet network when executing the computer program.

The above description is only a preferred implementation of the water delivery tunnel crack detection method based on the UNet network. The protection scope of the method is not limited to the above embodiments, and all technical schemes under the same concept belong to the protection scope of the invention. It should be noted that modifications and variations made by those skilled in the art without departing from the principles of the present invention are also considered to be within the protection scope of the present invention.

Claims

1. A water delivery tunnel crack detection method based on a UNet network, characterized in that the method comprises the following steps:

Step 1: collecting crack images, screening and merging the crack images;

2. The method according to claim 1, characterized in that: the step 1 specifically comprises the following steps:

3. The method according to claim 2, characterized in that: the step 2 specifically comprises the following steps:

4. A method according to claim 3, characterized in that: 3500 underwater crack images of the water delivery tunnel are selected as the simulation experiment data set; all pictures are resized to a resolution of 416×416 to reduce training time; and, to ensure the generalization performance of the model, the data set is randomly divided into a training set, a validation set and a test set in the ratio 8:1:1;

The data set is labeled with the LabelImg labeling tool, the labeling box is named as the Crack class, and the corresponding xml file is then generated; because most cracks in the data set run through the whole picture, a single labeling box would cover a large area with too high a proportion of background, which is not conducive to network learning, so several moderately sized labeling boxes are used in turn to cover the crack.

5. The method according to claim 4, characterized in that: the step 3 specifically comprises the following steps:

6. The method according to claim 5, characterized in that:

Firstly, the number of feature maps is increased by one 3×3 convolution, and the feature map size is then reduced by max pooling; after each max pooling the feature map size is halved and the number of channels is doubled, and overall the improved UNet network reduces the original input image by a factor of 16; to extract features at different scales, an SE-Res2Net module is added after each pooling to increase network depth; at the end of the encoder, three dilated convolutions connected in parallel form a context enhancement module that enlarges the receptive field and captures the multi-scale and context information of the image; the feature map then has its channel number adjusted by a 1×1 convolution and enters the decoding stage, where transposed convolution is used for up-sampling; after four up-sampling steps the image is restored to its original size, and a final 1×1 convolution adjusts the number of channels to output the segmented image.

7. The method according to claim 6, characterized in that: the step 4 specifically comprises the following steps:

8. A water delivery tunnel crack detecting system based on a UNet network is characterized in that: the system comprises:

9. A computer readable storage medium having stored thereon a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1-7.

10. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that: the processor, when executing the computer program, implements the method according to any one of claims 1-7.