CN115345859A - Intelligent detection method, device and equipment for tunnel leakage water image and storage medium - Google Patents

Intelligent detection method, device and equipment for tunnel leakage water image and storage medium Download PDF

Info

Publication number
CN115345859A
CN115345859A CN202210998270.0A CN202210998270A CN115345859A CN 115345859 A CN115345859 A CN 115345859A CN 202210998270 A CN202210998270 A CN 202210998270A CN 115345859 A CN115345859 A CN 115345859A
Authority
CN
China
Prior art keywords
leakage
decoder
module
encoder
water
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210998270.0A
Other languages
Chinese (zh)
Inventor
尚艳亮
罗俊
耿鹏
吴薇娜
黄帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shijiazhuang Tiedao University
Shijiazhuang Institute of Railway Technology
National Institute of Natural Hazards
Original Assignee
Shijiazhuang Tiedao University
Shijiazhuang Institute of Railway Technology
National Institute of Natural Hazards
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shijiazhuang Tiedao University, Shijiazhuang Institute of Railway Technology, National Institute of Natural Hazards filed Critical Shijiazhuang Tiedao University
Priority to CN202210998270.0A priority Critical patent/CN115345859A/en
Publication of CN115345859A publication Critical patent/CN115345859A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an intelligent detection method, a device, equipment and a storage medium for a tunnel water leakage image, wherein the method comprises the following steps: acquiring a plurality of leakage water image samples in a preset sample library; marking each water leakage image sample to obtain a water leakage label graph containing marks; training a pre-constructed leakage water image detection model based on each leakage water image sample and a leakage water label graph thereof to obtain a trained leakage water image detection model; the water leakage image detection model at least comprises an encoder, a decoder, a central layer and a depth supervision network module which is respectively connected with the decoder and the output end of the central layer, attention modules are arranged in the encoder and the decoder, and each attention module comprises a plurality of parallel convolutions with different void ratios; and detecting the leakage water image to be detected based on the trained leakage water image detection model. The invention can accurately detect the leakage water image in the complex environment.

Description

Intelligent detection method, device and equipment for tunnel leakage water image and storage medium
Technical Field
The invention relates to the technical field of tunnel water leakage detection, in particular to an intelligent detection method, device, equipment and storage medium for a tunnel water leakage image.
Background
Along with the development of underground projects such as highway tunnels, subway tunnels, comprehensive underground pipe galleries and the like in China, tunnel leakage water becomes one of common diseases in the tunnels, and long-term leakage water can influence the operation safety of the tunnels.
At present, most of tunnel water leakage detection is mainly manual inspection, and the manual inspection method is mainly used for dispatching personnel to regularly go into a tunnel to manually visually inspect or acquire data by using a camera and the like. With the continuous development of information technology, computer vision methods are gradually used in tunnel water leakage detection. The computer vision method includes a conventional image processing method, a deep learning method, and the like. The traditional image processing method utilizes methods such as threshold segmentation, edge detection, morphological analysis and the like to detect and identify the water leakage diseases on the surface of the tunnel. The deep learning method utilizes a multilayer neural network to mine multilayer characteristics of images from massive image data information and continuously collect the multilayer characteristics of the images into a network model, and then completes tasks such as classification, positioning, segmentation and the like of input image data by training a specific network model.
However, the internal environment of the tunnel is complex, the number of interference factors is large, manual inspection is easily affected by subjective factors, inspection efficiency is low, and a large amount of manpower is consumed. At present, the traditional computer vision rule is difficult to overcome the interferences of low contrast ratio of diseases on the surface of the subway tunnel, uneven illumination, serious background noise pollution and the like, and can not accurately detect the leakage water image in the complex environment. Therefore, a method for accurately detecting a leakage water image even in a complicated environment is required.
Disclosure of Invention
The embodiment of the invention provides an intelligent detection method, device, equipment and storage medium for tunnel leakage water images, which aim to solve the problem that the leakage water images in a complex environment cannot be accurately detected at present.
In a first aspect, an embodiment of the present invention provides an intelligent detection method for a tunnel leakage water image, including:
acquiring a plurality of leakage water image samples in a preset sample library;
marking each leakage water image sample respectively to obtain a leakage water label graph containing marks;
training a pre-constructed leakage water image detection model based on each leakage water image sample and a leakage water label graph thereof to obtain a trained leakage water image detection model; the water leakage image detection model at least comprises an encoder, a decoder in jumping connection with the encoder, a central layer for connecting the encoder and the decoder, and a depth supervision network module respectively connected with the decoder and the output end of the central layer, wherein attention modules are arranged in the encoder and the decoder, and each attention module comprises a plurality of parallel convolutions with different void ratios;
and detecting the leakage water image to be detected based on the trained leakage water image detection model.
In one possible implementation, the attention module transforms a one-dimensional convolution block of size k in the ECA module to include multiple parallel convolution blocks with different void rates; and the output of the attention module and the output of the central layer in the decoder serve as the input of the deep supervisory network.
In one possible implementation, the encoder and decoder each comprise a plurality of levels, each level of the encoder comprising a volume block, an attention module and a downsampling layer connected in sequence; each level of the decoder comprises a nearest neighbor upsampling layer, a rolling block and an attention module which are connected in sequence;
the central layer is used for connecting the lowest layer level of the encoder and the uppermost layer level of the decoder;
and the output end of the central layer and the output end of each layer of the decoder are respectively provided with a deep supervision network module, and the deep supervision network module is used for predicting the output of each layer of the decoder and using the output result for deep supervision learning.
In a possible implementation manner, training a pre-constructed leakage water image detection model based on each leakage water image sample and a leakage water label graph thereof includes:
adjusting a mixed loss function of each level of the leakage water image detection model based on the first segmentation result and the leakage water label graph containing the marks; the first segmentation result is a prediction result obtained after the water leakage image sample is processed by an encoder, a decoder and a depth supervision network module of each level, and the mixed loss function of each level comprises a binary cross entropy loss function and a Dice loss function;
training a pre-constructed leakage water image detection model based on an overall loss function of the leakage water image detection model, each leakage water image sample and a leakage water label graph thereof; wherein the overall loss function is a linear combination of the mixed loss functions of each layer.
In one possible implementation, the mixing loss function/is per layer seg Comprises the following steps:
l seg =αl bce +l dice
the global loss function L is:
Figure BDA0003806271000000031
wherein l bce As a binary cross entropy loss function, l dice The method is a Dice loss function, M is the total layer number of the deep supervision network module, M is a positive integer, alpha is a proportionality coefficient, and alpha is more than 0 and less than 1.
In one possible implementation, the hole rate of the attention module of each level of the encoder and decoder is the same, and the hole rate of the attention module of the first level in the encoder and decoder is different from the hole rates of the attention modules of the other levels.
In one possible implementation, the encoder and decoder each comprise 4 levels, each level of the encoder comprising two convolution blocks, an attention layer and a down-sampling layer, connected in sequence, each level of the decoder comprising a nearest neighbor up-sampling layer, two convolution blocks and an attention module, connected in sequence, the central layer comprising two convolution blocks, wherein the convolution blocks comprise a 3 x 3 convolution layer, a bulk normalization layer and a correction linear unit; the attention module comprises 4 parallel one-dimensional hole convolutions, and the first level of the encoder and decoder has a hole rate of (1, 2,4, 8) for the attention model of the first level, and the other levels have a hole rate of (1, 4,8, 16) for the attention model of the other levels;
the deep supervision Network module is a Side Network and is used for predicting the output of each level of the decoder and using the output of each level of the decoder for deep supervision learning; wherein, the Side Network comprises a convolution layer of 1 multiplied by 1, a nearest neighbor up-sampling layer and a Sigmoid function layer;
and the output of the central layer and a characteristic diagram generated by a nearest upper sampling layer of each level of the decoder are used as the input of the Side Network, and the output of the last level of the decoder is processed by the Side Network to obtain a result which is the output result of the leakage water image detection model.
In a second aspect, an embodiment of the present invention provides an intelligent detection apparatus for a tunnel leakage water image, including:
the acquisition sample module is used for acquiring a plurality of leakage water image samples in a preset sample library;
the sample marking module is used for marking each water leakage image sample respectively to obtain a water leakage label image containing marks;
the training model module is used for training a pre-constructed leakage water image detection model based on each leakage water image sample and a leakage water label graph thereof to obtain a trained leakage water image detection model; the water leakage image detection model at least comprises an encoder, a decoder in jumping connection with the encoder, a central layer for connecting the encoder and the decoder, and a deep supervision network module respectively connected with the decoder and the output end of the central layer, wherein attention modules are arranged in the encoder and the decoder, and each attention module comprises a plurality of parallel convolutions with different void rates;
and the detection module is used for detecting the leakage water image to be detected based on the trained leakage water image detection model.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the steps of the method according to the first aspect or any one of the possible implementation manners of the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored, and the computer program, when executed by a processor, implements the steps of the method according to the first aspect or any one of the possible implementation manners of the first aspect.
The embodiment of the invention provides an intelligent detection method, device, equipment and storage medium for tunnel leakage water images. And then, training a pre-constructed leakage water image detection model based on each leakage water image sample and a leakage water label graph thereof to obtain the trained leakage water image detection model. And finally, detecting the leakage water image to be detected based on the trained leakage water image detection model. The water leakage image detection model constructed by the invention comprises an existing encoder and an existing decoder, and is additionally provided with an attention module and a depth supervision network module, wherein the attention module can help the water leakage image detection model to effectively learn semantic features, and the depth supervision network module can effectively improve the robustness of the water leakage image detection model, so that the accurate detection of the water leakage image in a complex environment can be realized.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
Fig. 1 is a flowchart of an implementation of an intelligent detection method for a tunnel leakage water image according to an embodiment of the present invention;
FIG. 2 is a sample of a water leakage image and a corresponding label chart according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an attention module according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a model for detecting a leakage image according to an embodiment of the present invention;
fig. 5 is a comparison diagram of detection results after different modules are added to a UNet network according to an embodiment of the present invention;
FIG. 6 is a diagram comparing the detection results of ACPA-Net provided by the embodiment of the present invention with the prior art;
FIG. 7 is a diagram comparing the detection results of ACPA-Net and the existing deep learning network provided by the embodiment of the present invention;
fig. 8 is a schematic structural diagram of an intelligent detection device for images of tunnel leakage water according to an embodiment of the present invention;
fig. 9 is a schematic diagram of an electronic device provided in an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following description is made by way of specific embodiments with reference to the accompanying drawings.
The tunnel leakage detection can be a pixel-level binary task, which assigns the pixel position of the leakage area to label 1 and the pixel position of the background area to label 0. When a water leakage picture is input into the water leakage image detection model, a picture with the same resolution as the input picture and only one channel is output finally. The value of each pixel position in the picture represents the probability that the pixel position belongs to the water leakage area, so that the pixel position actually belonging to the water leakage area in the output result of the last layer is expected to have a higher probability, and the pixel position belonging to the background area has a smaller probability. However, when detecting the leakage water image in the complex environment, the leakage water image in the complex environment cannot be accurately detected due to the interferences such as uneven lighting, serious background noise pollution, etc.
In order to solve the problem of the prior art, the embodiment of the invention provides an intelligent detection method, an intelligent detection device, an intelligent detection equipment and a storage medium for a tunnel leakage water image. First, the intelligent detection method for the tunnel leakage water image provided by the embodiment of the invention is described below.
The execution subject of the intelligent detection method for the tunnel water leakage image can be an intelligent detection device for the tunnel water leakage image, and the intelligent detection device for the tunnel water leakage image can be an electronic device with a processor and a memory, such as a mobile electronic device or a non-mobile electronic device. The embodiments of the present invention are not particularly limited.
Referring to fig. 1, it shows a flowchart of an implementation of the method for intelligently detecting a tunnel leakage water image according to an embodiment of the present invention, which is detailed as follows:
step S110, obtaining a plurality of water leakage image samples in a preset sample library.
Because of the relatively small data sets disclosed for tunnel leak detection, the present invention uses a self-made sample library for experimentation. Pictures used for making the sample library are all acquired from water leakage images in the railway tunnels in a plurality of different environments by using a digital camera.
And step S120, marking each water leakage image sample respectively to obtain a water leakage label graph containing marks.
Each water leakage image sample can be labeled separately using Labelme, with the target area with water leakage labeled as 1 and the background area without water leakage labeled as 0. In the invention, 1370 pairs of pictures are finally obtained through operations such as cropping, zooming and the like. Due to the memory constraints of the GPU, all images are scaled to 256 × 256 in the present invention, where 1100 are used for training and 270 are used for testing.
As shown in fig. 2, 2 sets of leakage water images and corresponding label maps thereof, wherein the 1 st and 3 rd columns are leakage water images, and the 2 nd and 4 th columns are corresponding label maps, it can be seen from the figure that the environment inside the tunnel is complex, and some leakage water areas are covered by pipes, cables, shadows, or the like. Some artifacts like artifacts, mosses, shadows, etc. that are similar in color to the leaking water area can make the segmentation more difficult. In addition, there are some blurred images and low grayscale images in the leakage water image sample, which also can present challenges to leakage water detection.
Step S130, training a pre-constructed leakage water image detection model based on each leakage water image sample and the leakage water label graph thereof to obtain the trained leakage water image detection model.
The water leakage image detection model at least comprises an encoder, a decoder in jumping connection with the encoder, a central layer for connecting the encoder and the decoder, and a depth supervision network module respectively connected with the decoder and the output end of the central layer, attention modules are arranged in the encoder and the decoder, and each attention module comprises a plurality of parallel convolutions with different void ratios.
In order to obtain an accurate segmentation result, the water leakage image detection model needs to obtain high-quality semantic features and sufficient spatial detail information. In order to obtain high-quality semantic features, in the process of extracting the image features, the water leakage image detection model generally performs down-sampling operation on the feature map, and the operation not only can reduce the resolution of the feature map and reduce the calculation amount, but also is beneficial to increasing the receptive field of a network. But the loss of spatial detail information is caused in the process of down-sampling the feature map. UNet integrates a feature map similar to the decoder resolution in the encoder into the decoder through a skip connection. By fusing features of the same level in the encoder, the decoder can fill in some missing detail information when gradually increasing the spatial resolution. On the basis of UNet, the invention integrates an attention module and a deep supervision network module.
In some embodiments, the attention module is a module that transforms a one-dimensional convolution block of size k in an ECA module to include multiple convolution blocks in parallel and with different hole rates; and the output of the attention module and the output of the central layer in the decoder serve as the input of the deep supervisory network.
In this embodiment, the attention module can make the water leakage image detection model focus on the area or target to which they should focus more, which is a method that can effectively improve the detection accuracy. An ECA (Efficient Channel Attention) module, which changes two layers of full connection in the SE module into one-dimensional convolution with a convolution kernel size of k, so that a direct corresponding relation exists between a Channel and corresponding parameters thereof, and the parameters are further reduced. The inventor finds that the ECA module uses one-dimensional convolution with the size of k, and only channel interaction can be performed between adjacent k channels, and interaction between channels with longer distance cannot be considered, so that the detection accuracy is low. For the convenience of the subsequent description, the attention module adopted in the present invention is named as ACPA (advanced channel planning) module, which is a module that converts a one-dimensional convolution with a size of k into a plurality of parallel one-dimensional convolutions with different void rates on the basis of an ECA module, and is used for performing interaction between channels at greater distances from a plurality of scales.
The structure of the ACPA module is described in detail below, and fig. 3 is a schematic structural diagram of the ACPA module:
taking a feature graph F with the height, the width and the channel number of H, W and C respectively as the input of the attention module, namely F epsilon R H ×W×C It can be expressed as F = [ F = [ ] 1 ,f 2 ,f 3 ,…f c ]Wherein f is i ∈R H×W The ith channel profile in F is shown. First, using the spatial feature information of the Global Average Pooling (GAP) aggregated feature map F, at this time, the feature vector F can be obtained GAP ∈R 1×1×C The kth element of the feature vector may be represented as follows:
Figure BDA0003806271000000081
wherein f is k (i, j) is represented by f k Coordinate (i, j) position in (a). To facilitate the one-dimensional convolution operation, F is added GAP Remove one dimension and obtain F GAP ∈R 4×C At this time F GAP Is represented as follows:
F GAP =[f GAP(1) ,f GAP(2) ,f GAP(3) ,...,f GAP(C) ];
next, in order to allow interaction between feature map channels over multiple scale distances, F GAP Is input into 4 parallel one-dimensional hole convolutions, and the hole rate of the four hole convolutions can be expressed as rates = (r) 1 ,r 2 ,r 3 ,r 4 ) (for example, rates = (1, 4,8, 12) may be set). The one-dimensional convolution can enable all channels to share the learned parameters, and can effectively enable adjacent K characteristic channels to carry out cross-channel interaction. Convolution using four parallel one-dimensional holesThe cross-channel interaction of K adjacent channels may be extended to cross-channel interaction of K channels with multiple scale distances. Then adding the obtained results of the four parallel one-dimensional convolutions element by element to obtain Y epsilon R 1×C . Thus F will be GAP Input to parallel cavity convolution to obtain Y [ i ]]Can be formulated as follows:
Figure BDA0003806271000000091
in the above formula, w [ k ]]Representing the kth element of a convolution kernel of length K. rates [ j ] of]Represents the jth hole rate in rates, and rates j before convolution]-1 0 s are inserted into F respectively GAP To ensure that the whole process does not change the resolution of the original feature map. And finally, normalizing Y through a sigmoid function to obtain a final channel attention diagram. So far, all the operations described above can be simply expressed as follows:
Figure BDA0003806271000000092
in the above formula, F represents the input to the entire ACPA module.
Figure BDA0003806271000000095
Indicates a void rate of rates [ i ]]And the convolution kernel length is K. GAP and sigma respectively represent a global average pooling layer and a sigmoid layer. The resulting channel attention weight map W may be used to emphasize feature channels associated with the target region and de-emphasize feature channels not associated with the target region by assigning different weights to each channel in F, as follows:
Figure BDA0003806271000000093
in the above-mentioned formula,
Figure BDA0003806271000000094
this shows the element-by-element multiplication of the channel attention weight graph W with the input feature graph F, but before this operation it is necessary to extend each element in W into a feature graph of the same spatial dimension as F.
Figure BDA0003806271000000101
Represents the output of the entire ACPA module whose individual channel information has been selectively enhanced or suppressed by the ACPA module. So far, it is all the operations of the ACPA module.
The ACPA module is a lightweight module and can be inserted at will. In ACPA-Net, ACPA modules are inserted at the end of two volume blocks in each step. Given an input X, the output Y of a step can be expressed as:
Y=ACPA(ConvB(ConvB(X)));
ConvB denotes a volume block, which is a combination of a 3 × 3 volume layer, a batch normalization layer, and a ReLU activation function.
In some embodiments, to further improve the accuracy of detection, the water leakage image detection model is made more discriminative by using a deep supervision network module in the water leakage image detection model. The depth monitoring network module adds some extra constraints to the former layers except the last layer of the water leakage image detection model, and can improve the convergence speed and the accuracy of the network under certain conditions.
In order to use a deep supervision Network module in a model, the invention adds a Side Network (Side Network) in a decoder of the model. The role of the Side network is to predict the outputs of each level in each decoder and then use these outputs for deep supervised learning. The Side network consists of a 1 multiplied by 1 convolutional layer, a nearest neighbor upsampling layer and a sigmoid layer. The output of the Central layer and the feature map generated by each up-sampling step in the decoder are used as the input of the Side network. First, a 1 × 1 convolution is used to predict the result and get the output. Then, except for the output of the last layer, the rest will be restored to the same resolution as the input image by the up-sampling layer. Finally, the output of the Side Network is obtained through the sigmoid layer, and the output is supervised by the label graph.
In some embodiments, the encoder and decoder each include a plurality of levels, each level of the encoder including a volume block, an attention module, and a downsampling layer connected in series. Each level of the decoder includes a nearest neighbor upsampling layer, a rolling block, and an attention module connected in sequence. The center layer is used to connect the lowermost level of the encoder and the uppermost level of the decoder. And the output end of the central layer and the output end of each layer of the decoder are respectively provided with a deep supervision network module, and the deep supervision network module is used for predicting the output of each layer of the decoder and using the output result for deep supervision learning.
In this embodiment, the hole rates of the attention modules at the same level of the encoder and decoder may be the same, while the hole rates of the attention modules at the first level of the encoder and decoder differ from those of the other levels.
The water leakage image detection model at least comprises an encoder, a decoder in jumping connection with the encoder, a central layer for connecting the encoder and the decoder, and a depth supervision network module respectively connected with the decoder and the output end of the central layer, wherein attention modules are arranged in the encoder and the decoder. For convenience of subsequent description, the leakage water image detection model constructed in the invention is referred to as an ACPA-Net network for short.
Specifically, as shown in the ACPA-Net network structure of fig. 4, the encoder and decoder in the ACPA-Net network each include 4 levels, and each level of the encoder includes two convolution blocks, an attention module and a downsampling layer connected in sequence. Each level of the decoder comprises a nearest neighbor upsampling layer, two convolution blocks, and an attention module connected in sequence. The central layer comprises two convolution blocks. A convolution block includes a 3×3 convolution layer, a batch normalization layer and a rectified linear unit. The attention module (ACPA module) includes a plurality of parallel one-dimensional hole convolutions. The deep supervision network module is a Side Network, which predicts the output of each level of the decoder and uses that output for deep supervised learning; the Side Network comprises a 1×1 convolution layer, a nearest neighbor upsampling layer and a Sigmoid function layer. The output of the central layer and the feature map generated by the nearest neighbor upsampling layer of each level of the decoder are used as inputs of the Side Network, and the result obtained by passing the output of the last level of the decoder through the Side Network is the output of the leakage water image detection model. Note that "step" in fig. 4 represents a hierarchy level.
The attention module (ACPA module) may consist of 4 parallel one-dimensional hole convolutions; the attention modules at the first level of the encoder and decoder use hole rates of (1, 2, 4, 8), and the attention modules at the other levels use hole rates of (1, 4, 8, 16).
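A rough NumPy sketch of this ECA-style mechanism, where the single 1D convolution over the pooled channel descriptor is replaced by several parallel dilated (hole) 1D convolutions whose outputs are summed before the sigmoid gate (the function names, zero "same" padding, and summation as the fusion rule are assumptions for illustration):

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    # 1D convolution of x (length C) with kernel w, zero-padded to 'same' length
    k, pad = len(w), dilation * (len(w) // 2)
    xp = np.pad(x, pad)
    return np.array([sum(w[j] * xp[i + j * dilation] for j in range(k))
                     for i in range(len(x))])

def acpa_attention(feat, weights, dilations=(1, 4, 8, 16)):
    # feat: (C, H, W); weights: one kernel (length 3) per parallel hole convolution
    desc = feat.reshape(feat.shape[0], -1).mean(axis=1)         # global average pooling
    s = sum(dilated_conv1d(desc, w, d) for w, d in zip(weights, dilations))
    gate = 1.0 / (1.0 + np.exp(-s))                             # per-channel sigmoid weight
    return feat * gate[:, None, None]                           # reweight the channels

out = acpa_attention(np.ones((8, 4, 4)), [np.zeros(3)] * 4)
```

Each dilation reaches a different channel distance, which is how the module performs multi-scale, longer-range cross-channel interaction with only a handful of kernel parameters.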
In some embodiments, the ACPA-Net constructed in the invention is implemented on the deep learning framework PyTorch 1.4.0, with PyCharm as the software platform. Limited by GPU memory, the batch size for the whole training process is finally set to 4. RMSprop is selected as the optimizer, with an initial learning rate of 1×10⁻⁴, a momentum of 0.9 and a weight decay of 1×10⁻⁸. In this embodiment, ReduceLROnPlateau is used as the learning rate adjustment strategy: when the loss has not decreased after 3 rounds of training, the current learning rate is multiplied by 0.1, and once the learning rate reaches 1×10⁻⁸ it is not reduced further. Meanwhile, an early stopping mechanism is adopted: the total number of training rounds is set to 200, but when the F1-score on the validation set has not improved for 10 rounds, the whole training process is terminated early.
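The configuration above corresponds roughly to the following PyTorch setup (the stand-in single-layer model and variable names are assumptions; the hyperparameter values follow the text):

```python
import torch

model = torch.nn.Conv2d(3, 1, 3, padding=1)   # stand-in for ACPA-Net

# RMSprop: initial learning rate 1e-4, momentum 0.9, weight decay 1e-8
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-4,
                                momentum=0.9, weight_decay=1e-8)

# multiply the learning rate by 0.1 when the monitored loss has not
# improved for 3 epochs, never going below 1e-8
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=3, min_lr=1e-8)
```

Early stopping (200 epochs maximum, stop after 10 epochs without validation F1-score improvement) would be implemented in the training loop itself.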
All experiments described below are performed on the same training set and test set, and the network in this embodiment does not use pre-training in the experiments; instead, the parameters are directly initialized with Kaiming normal initialization.
In some embodiments, a loss function is introduced to calculate the gap between the predicted output and the label map throughout the supervision of the network. Binary cross-entropy loss (BCE loss) and Dice-coefficient loss (Dice loss) are two loss functions commonly used in image detection: the former is a pixel-level loss function and the latter a region-level loss function. The Dice coefficient is an important index for measuring the quality of a segmentation result.
Using the Dice loss function as the loss function allows the model to be optimized directly toward the optimum of the Dice coefficient. As a region-level loss function, the Dice loss performs well in scenes with severely imbalanced positive and negative samples, but small targets in the training samples make the loss unstable during training. Compared with the Dice loss function, the BCE loss function behaves more stably. To improve the supervision effect, the embodiment of the invention combines the advantages of the Dice loss function and the BCE loss function to construct a mixed loss function.
And adjusting the mixing loss function of each level of the water leakage image detection model based on the first segmentation result and the water leakage label graph containing the marks. The first segmentation result is a prediction result obtained after the water leakage image sample is processed by an encoder, a decoder and a deep supervision network module of each level, and the mixed loss function of each level comprises a binary cross entropy loss function and a Dice loss function.
Specifically, the mixed loss function l_seg of each layer is:

l_seg = α·l_bce + l_dice
where l_bce is the binary cross-entropy loss function, l_dice is the Dice loss function, and α is a scale coefficient (0 < α < 1) that adjusts the proportion of the BCE loss in the mixed loss function. In the subsequent experiments of the invention, α is set to 0.5, so that the BCE loss function smooths the gradient while the Dice loss function makes the network focus more on the foreground.
The overall loss function L of the water leakage image detection model is:

L = Σ_{m=1}^{M} l_seg^(m)

where M is the total number of layers of the deep supervision network module, M is a positive integer, and l_seg^(m) is the loss calculated on the m-th output of the deep supervision network module (Side Network).
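The per-layer mixed loss and the deep-supervision total above can be sketched in NumPy as follows (the smoothing constant in the Dice term and the clipping in the BCE term are numerical-stability assumptions, not part of the patent's formulas):

```python
import numpy as np

def bce_loss(pred, target, eps=1e-7):
    # pixel-level binary cross-entropy, l_bce
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

def dice_loss(pred, target, smooth=1.0):
    # region-level Dice loss, l_dice = 1 - Dice coefficient
    inter = (pred * target).sum()
    return 1 - (2 * inter + smooth) / (pred.sum() + target.sum() + smooth)

def mixed_loss(pred, target, alpha=0.5):
    # l_seg = alpha * l_bce + l_dice, with 0 < alpha < 1
    return alpha * bce_loss(pred, target) + dice_loss(pred, target)

def total_loss(side_outputs, target, alpha=0.5):
    # deep supervision: sum the mixed loss over all M Side Network outputs
    return sum(mixed_loss(p, target, alpha) for p in side_outputs)
```

With α = 0.5 the BCE term stabilizes the gradient while the Dice term keeps the optimization focused on the (often small) leakage foreground.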
To demonstrate the effectiveness of the mixed loss function, the sensitivity of the parameter α in the mixed loss function was analyzed and the performance of different loss functions on ACPA-Net was tested. The loss functions include the BCE loss function, the Dice loss function and the mixed loss function, denoted "BCE", "DL" and "αBCE + DL", respectively, with α in "αBCE + DL" taken as 0.25, 0.5, 1, 1.5 and 2 in turn. The training process fluctuates strongly when the Dice loss function is used, while the BCE loss function yields a more stable training process. Compared with the Dice loss, the mixed loss function makes training more stable; more importantly, it makes the network perform better than both the Dice loss and the BCE loss. It is worth noting that the closer α is to 1, the better the network performance, which means the network achieves better results when the proportions of the BCE loss and the Dice loss in the mixed loss are closer to each other.
In some embodiments, in order to evaluate the performance of the model, evaluation indexes need to be used in the experiments.
The evaluation indexes used in the invention are: Precision, Recall, Intersection-over-Union (IoU), F1-score and Accuracy. Precision indicates how much of the predicted leakage water area is predicted correctly; specifically, it is the proportion of pixels correctly predicted as the leakage water area among all pixels predicted as the leakage water area. Recall indicates how much of the leakage water area in the label map is correctly predicted; specifically, it is the proportion of pixels correctly predicted as the leakage water area among all real leakage water pixels in the label map. A high Precision or Recall alone does not mean that the model performs well, so Precision and Recall are often used together to measure the effectiveness of a network. F1-score, which is mathematically equivalent to the Dice coefficient, takes both Precision and Recall into account.
A good model needs to find a balance point between Precision and Recall, so F1-score is a reliable index for evaluating network performance. IoU is also an important index: it is the intersection-over-union ratio between the leakage water area in the segmentation result and the leakage water area in the label map. Accuracy is defined as the proportion of correctly predicted pixels among all pixels, covering both the leakage area and the background.
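All five indexes follow from the pixel-level confusion counts; a minimal NumPy sketch (function and key names are illustrative):

```python
import numpy as np

def segmentation_metrics(pred, target):
    # pred, target: binary arrays (1 = leakage water pixel, 0 = background)
    tp = np.sum((pred == 1) & (target == 1))   # leakage correctly predicted
    fp = np.sum((pred == 1) & (target == 0))   # background predicted as leakage
    fn = np.sum((pred == 0) & (target == 1))   # leakage missed
    tn = np.sum((pred == 0) & (target == 0))   # background correctly predicted
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),  # equals the Dice coefficient
        "iou": tp / (tp + fp + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

m = segmentation_metrics(np.array([1, 1, 0, 0]), np.array([1, 0, 1, 0]))
```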
And S140, detecting the leakage water image to be detected based on the trained leakage water image detection model.
After the leakage water image detection model, i.e., the ACPA-Net network constructed in step S130, is trained, it can be used to detect the leakage water image to be detected.
The intelligent detection method provided by the invention comprises the steps of firstly, obtaining a plurality of leakage water image samples in a preset sample library, and then, respectively marking each leakage water image sample to obtain a leakage water label map containing marks. And then, training a pre-constructed leakage water image detection model based on each leakage water image sample and a leakage water label graph thereof to obtain the trained leakage water image detection model. And finally, detecting the leakage water image to be detected based on the trained leakage water image detection model. The water leakage image detection model constructed by the invention comprises an existing encoder and an existing decoder, and is additionally provided with an attention module and a depth supervision network module, wherein the attention module can help the water leakage image detection model to effectively learn semantic features, and the depth supervision network module can effectively improve the robustness of the water leakage image detection model, so that the water leakage image under the complex environment can be accurately detected.
The following are a series of comparative experiments performed to verify the detection effect of the ACPA-Net network provided by the present invention.
Firstly, in order to verify the effectiveness of the attention module ACPA and the deep supervision network module (Side Network) for the detection of leakage water images, the invention takes a UNet network as the reference network (baseline) and compares its results with those obtained after integrating the ACPA module and the deep supervision network module. To ensure fairness, the same batch size and other hyperparameter values were used in all experiments of the invention. The experiments are carried out on the railway tunnel water leakage pictures provided by the invention, with F1-score, IoU and Accuracy used as evaluation indexes. The hole rates of the ACPA module in the first level of the encoder and decoder are set to (1, 2, 4, 8), and the hole rates of the ACPA modules in the other levels are set to (1, 4, 8, 16). The convolution kernel length of the one-dimensional convolution is set to 3.
The experimental results are shown in table 1, wherein "+ ACPA" and "+ ACPA + DS" respectively indicate the incorporation of the ACPA module into UNet and, on the basis thereof, the further incorporation of the deep supervised network module. The introduction of the ACPA module increased F1-score, ioU and Accuracy by 1.55%, 2.1% and 0.37%, respectively, compared to UNet. Particularly, after the deep supervision network module is further added into the model for learning, the indexes F1-score, ioU and Accuracy all obtain optimal values which are respectively 90.75%, 83.62% and 96.68%. Compared with a reference network, F1-score is improved by 3.9%, and IoU and Accuracy are also greatly improved.
TABLE 1ACPA Module and deep supervision network Module effectiveness on models
Method F1-score(%) IoU(%) Accuracy(%)
UNet 86.85 77.86 95.43
+ACPA 88.4(1.55↑) 79.96(2.1↑) 95.80(0.37↑)
+ACPA+DS 90.75(3.9↑) 83.62(5.76↑) 96.68(1.25↑)
FIG. 5 shows the detection results of the above 3 methods, where GT (ground truth) is the label map. UNet gives poor results and is susceptible to noise, which makes it easy to misidentify low-gray-level interfering objects, such as pipes or shadows, as leakage water areas. As can be seen from fig. 5, the ACPA module greatly reduces the misidentification rate, indicating that the ACPA module can help the network efficiently learn better semantic feature representations. However, as shown in the fourth row of fig. 5, the network is still affected by some noise, which makes the segmentation result internally discontinuous. After the deep supervision strategy is further added to the network, the result improves further.
The prediction results of the proposed model are more uniform within the leakage water area. The overall experiments show that the ACPA module helps the network extract effective leakage water features, while adding appropriate supervision to the earlier layers of the network makes the network more robust and yields a more consistent representation inside the target region.
Secondly, in order to verify the effectiveness of the ACPA module for detecting tunnel leakage water images, the invention uses railway tunnel leakage water images to evaluate the performance of the ACPA module against other attention modules. Given the lightweight nature of the ACPA module, the invention compares it with four similar lightweight modules: the SE module, the CBAM module, the spatial and channel SE (scSE) module and the ECA module. UNet combined with deep supervised learning is taken as the reference network (baseline), denoted "UNet_DS". Then "+ SE", "+ CBAM", "+ scSE", "+ ECA" and "+ ACPA" indicate adding the SE, CBAM, scSE, ECA and ACPA attention modules, respectively, to the reference network. The experimental results are shown in table 2, where "ΔParam" represents the number of parameters added to the reference network by the attention module. It can be seen that each attention-augmented network outperforms the reference network, which indicates that interaction between channels is necessary.
Table 2 ACPA module versus other attention modules
Methods F1-score(%) IoU(%) Accuracy(%) ΔParam
UNet_DS 87.81 79.2 95.77
+SE 89.68(1.87↑) 81.76(2.56↑) 96.17(0.40↑) 13.95K
+CBAM 89.50(1.69↑) 81.69(2.49↑) 96.36(0.59↑) 30.28K
+scSE 88.14(0.33↑) 79.77(0.57↑) 95.64(0.13↓) 110.30K
+ECA 89.92(2.11↑) 82.39(3.19↑) 96.49(0.72↑) 24
+ACPA 90.75(2.94↑) 83.62(4.42↑) 96.68(0.91↑) 96
The CBAM, scSE and SE modules each compress the feature channels in the fully connected layer to reduce the number of parameters and computations in the channel attention module. This strategy results in an indirect rather than direct mapping between channels and weights. It is worth noting that "+ ECA" adds very few parameters to the reference network, yet achieves performance comparable to or even better than "+ SE", "+ CBAM" and "+ scSE". This result demonstrates that the ECA module's strategy of avoiding channel compression allows cross-channel interaction to be performed more efficiently. As can be seen from Table 2, "+ ACPA" performs best, being 0.83%, 1.23% and 0.19% higher than "+ ECA" on the indexes F1-score, IoU and Accuracy, respectively, while adding only 72 more parameters than "+ ECA". The results indicate that the parallel hole convolutions used in the ACPA module can effectively perform longer-distance channel interaction.
Then, the ACPA-Net provided by the invention is compared with three existing image detection methods. The first is the Otsu algorithm (OA), a representative threshold-based segmentation method. The second is the region growing algorithm (RGA), a representative region-based segmentation method. The third is the GrabCut algorithm, which is based on graph theory. The results of the different methods on the test set are listed in table 3.
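For reference, the Otsu algorithm picks the gray-level threshold that maximizes the between-class variance of the image histogram; a minimal NumPy sketch of the idea (not the exact baseline implementation used in these experiments):

```python
import numpy as np

def otsu_threshold(gray):
    # gray: uint8 image; returns the threshold t maximizing between-class variance
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    hist /= hist.sum()
    cum_p = np.cumsum(hist)                      # class-0 probability up to t
    cum_mean = np.cumsum(hist * np.arange(256))  # partial mean gray level up to t
    mu = cum_mean[-1]                            # global mean gray level
    best_t, best_var = 0, -1.0
    for t in range(255):
        w0, w1 = cum_p[t], 1.0 - cum_p[t]
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = cum_mean[t] / w0, (mu - cum_mean[t]) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2         # between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t

t = otsu_threshold(np.array([10] * 50 + [200] * 50, dtype=np.uint8))
```

On a cleanly bimodal histogram like this the threshold lands between the two modes; on real tunnel images the leakage and background gray levels overlap, which is why OA tends to merge background into the leakage class.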
TABLE 3 ACPA-Net comparison with existing methods
Methods F1-score(%) Recall(%) Precision(%) IoU(%) Accuracy(%)
OA 48.69 97.67 34.32 33.84 63.30
RGA 63.09 73.76 63.64 48.40 82.11
GrabCut 89.29 90.68 88.75 81.34 96.25
ACPA-Net 90.75 90.47 91.56 83.62 96.68
It can be seen that, except for Recall, all evaluation indexes predicted by OA and RGA are much lower than those obtained by the method proposed by the invention. Although OA gives the highest Recall of 97.67%, the reason is that OA has difficulty automatically finding an appropriate threshold to separate the leakage water region from the background, and tends to group the background area together with the leakage water area; this yields a very high Recall but a very small Precision, so the F1-score, IoU and Accuracy values are all low. Judging from the overall indexes, OA gives the worst results, RGA performs better than OA, and GrabCut is the best of the three traditional image segmentation methods, far outperforming the other two. The method provided by the invention is superior to all three traditional image segmentation methods.
Fig. 6 compares the detection results of ACPA-Net and the existing methods; it is obvious that the detection results of ACPA-Net are better, while the segmentation results of the existing methods are easily affected by various factors. As can be seen from the third row of fig. 6, the leakage water area obtained by the traditional image segmentation methods has many holes due to light reflection, and RGA can segment only a small portion of the leakage water area. For images under low-light conditions, such as the last row of fig. 6, the RGA and OA algorithms have difficulty distinguishing foreground from background. GrabCut performs better than the first two methods, but it requires manual interaction during segmentation, i.e., manually framing the leakage regions and appropriately labeling mispredicted regions. Since GrabCut trades manual interaction for a better segmentation result, it is not an efficient method, and it is still inferior to the method provided by the invention. In contrast, as a data-driven deep learning method, the method provided by the invention is more efficient and more robust than the traditional image segmentation methods.
Finally, the ACPA-Net provided by the present invention is compared with existing detection methods based on deep learning networks, and these methods include: PSPNet, deepLabv3+, linkNet, and SFNet. Because the data set is small in scale, in the experiment, resNet-18 is used as a feature extraction network in all the networks, and ImageNet data sets are used for pre-training in all the networks, so that overfitting can be reduced to a certain extent, the performance of the model is improved, and the accuracy of the model is improved. Table 4 shows the comparison results of the method of the present invention and the five deep learning methods described above on the evaluation index.
TABLE 4 detection and comparison results of ACPA-Net and the existing deep learning network
Method Param F1-score(%) Recall(%) Precision(%) IOU(%) Accuracy(%)
PSPNet 11.51M 87.42 88.17 87.57 78.24 95.63
DeepLabv3 15.90M 88.84 88.66 89.85 80.53 95.73
DeepLabv3+ 12.33M 88.37 89.89 87.98 79.82 95.83
LinkNet 11.66M 89.32 88.65 91.11 81.5 96.25
SFNet 13.75M 90.02 90.05 90.77 82.46 96.51
Ours 4.32M 90.75 90.47 91.56 83.62 96.68
As can be seen from table 4, PSPNet gives the worst results; in contrast, LinkNet has a similar number of parameters to PSPNet but gives far better results. SFNet gives results slightly lower than the method of the invention. The model of the invention has the fewest parameters and the best performance on all other indexes.
Fig. 7 is a comparison of the detection results of ACPA-Net and the existing deep learning networks. The edges of the leakage areas predicted by PSPNet are rough, and PSPNet cannot effectively distinguish pipes from leakage areas. This result shows that the Pyramid Pooling Module (PPM) proposed in PSPNet for multi-scale context modeling is not suitable for leakage water segmentation in complex environments. DeepLabv3 and DeepLabv3+ enlarge the receptive field of the network and maintain image resolution in the encoder by using hole convolution, and then capture multi-scale context information using the ASPP module. As can be seen from fig. 7, DeepLabv3 and DeepLabv3+ give slightly better results than PSPNet, but the leakage areas in their predictions still have very poor edges. As shown in the first row of fig. 7, DeepLabv3 and DeepLabv3+ have difficulty making accurate predictions in a dark environment. LinkNet, SFNet and the proposed ACPA-Net supplement detail information by fusing feature maps from the encoder, and SFNet aligns the feature maps of two adjacent levels by predicting a flow field with a Flow Alignment Module (FAM); therefore LinkNet, SFNet and ACPA-Net predict the boundaries of the target region more clearly and accurately than PSPNet, DeepLabv3 and DeepLabv3+. Compared with the other methods, the ACPA-Net provided by the invention still obtains better results. As can be seen from the first two rows of fig. 7, the other 5 deep-learning-based predictions contain severely misclassified regions. By means of the ACPA module and the deep supervision strategy, the proposed method obtains the most accurate segmentation results. The ACPA module helps the network strengthen useful feature channels and suppress useless ones, which helps the network learn a more robust characterization of the leakage water region.
A skip connection between the encoder and the decoder can enhance some details of a picture by restoring local spatial information. Deep supervision can effectively improve the robustness of the network and also yields a more consistent representation inside the target area. The method proposed by the invention still achieves better detection results than the other methods in the remaining rows of fig. 7.
In the water leakage image detection model ACPA-Net provided by the invention, the added attention module ACPA can effectively perform multi-scale, long-distance cross-channel interaction, suppressing useless features and enhancing useful ones, helping the network learn a stronger feature representation of the leakage water area while introducing few parameters. The added deep supervision network module can effectively improve the robustness of the network and yields a more consistent representation inside the target area. A series of experimental results show that the proposed ACPA-Net achieves more accurate detection in complex environments.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
Based on the intelligent detection method for the tunnel leakage water image provided by the embodiment, correspondingly, the invention also provides a specific implementation mode of the detection device applied to the intelligent detection method for the tunnel leakage water image. Please see the examples below.
As shown in fig. 8, there is provided an intelligent detection device 800 for images of tunnel leakage water, comprising:
an acquiring sample module 810, configured to acquire a plurality of leakage water image samples in a preset sample library;
a marking sample module 820, configured to mark each leakage water image sample respectively to obtain a leakage water label map with a mark;
the training model module 830 is configured to train a pre-constructed leakage water image detection model based on each leakage water image sample and a leakage water label map thereof, so as to obtain a trained leakage water image detection model; the water leakage image detection model at least comprises an encoder, a decoder in jumping connection with the encoder, a central layer for connecting the encoder and the decoder, and a depth supervision network module respectively connected with the decoder and the output end of the central layer, wherein attention modules are arranged in the encoder and the decoder, and each attention module comprises a plurality of parallel convolutions with different void ratios;
the detection module 840 is configured to detect a leakage water image to be detected based on the trained leakage water image detection model.
In one possible implementation, the attention module is obtained by replacing the one-dimensional convolution of size k in the ECA module with a plurality of parallel convolution blocks with different hole rates; the output of the attention module and the output of the central layer in the decoder serve as inputs of the deep supervision network.
In one possible implementation, the encoder and decoder each comprise a plurality of levels, each level of the encoder comprising a convolution block, an attention module and a downsampling layer connected in sequence; each level of the decoder comprises a nearest neighbor upsampling layer, a convolution block and an attention module connected in sequence;
the central layer is used for connecting the lowest layer level of the encoder and the uppermost layer level of the decoder;
and the output end of the central layer and the output end of each layer of the decoder are respectively provided with a deep supervision network module, and the deep supervision network module is used for predicting the output of each layer of the decoder and using the output result for deep supervision learning.
In a possible implementation manner, the training model module 830 is configured to adjust a mixing loss function of each level of the leakage water image detection model based on the first segmentation result and the leakage water label map containing the label; the first segmentation result is a prediction result obtained after the water leakage image sample is processed by an encoder, a decoder and a depth supervision network module of each level, and the mixed loss function of each level comprises a binary cross entropy loss function and a Dice loss function;
training a pre-constructed leakage water image detection model based on an overall loss function of the leakage water image detection model, each leakage water image sample and a leakage water label graph thereof; wherein the overall loss function is a linear combination of the mixed loss functions of each layer.
In one possible implementation, the mixed loss function l_seg of each layer is:

l_seg = α·l_bce + l_dice
The overall loss function L is:

L = Σ_{m=1}^{M} l_seg^(m)

where l_bce is the binary cross-entropy loss function, l_dice is the Dice loss function, l_seg^(m) is the mixed loss on the m-th output of the deep supervision network module, M is the total number of layers of the deep supervision network module, M is a positive integer, and α is a scale coefficient with 0 < α < 1.
In one possible implementation, the hole rate of the attention module of each level of the encoder and decoder is the same, and the hole rate of the attention module of the first level in the encoder and decoder is different from the hole rates of the attention modules of the other levels.
In one possible implementation, the encoder and the decoder each comprise 4 levels; each level of the encoder comprises two convolution blocks, an attention module and a downsampling layer connected in sequence; each level of the decoder comprises a nearest neighbor upsampling layer, two convolution blocks and an attention module connected in sequence; and the central layer comprises two convolution blocks, wherein a convolution block comprises a 3×3 convolution layer, a batch normalization layer and a rectified linear unit. The attention module comprises 4 parallel one-dimensional hole convolutions; the attention modules of the first level of the encoder and decoder use hole rates of (1, 2, 4, 8), and the attention modules of the other levels use hole rates of (1, 4, 8, 16);
the deep supervision Network module is a Side Network and is used for predicting the output of each level of the decoder and using the output of each level of the decoder for deep supervision learning; wherein, the Side Network comprises a convolution layer of 1 multiplied by 1, a nearest neighbor up-sampling layer and a Sigmoid function layer;
and the output of the central layer and a characteristic diagram generated by a nearest upper sampling layer of each level of the decoder are used as the input of the Side Network, and the output of the last level of the decoder is processed by the Side Network to obtain a result which is the output result of the leakage water image detection model.
Fig. 9 is a schematic diagram of an electronic device provided in an embodiment of the present invention. As shown in fig. 9, the electronic apparatus 9 of this embodiment includes: a processor 90, a memory 91 and a computer program 92 stored in said memory 91 and executable on said processor 90. The processor 90 executes the computer program 92 to implement the steps in the above-mentioned method for intelligently detecting images of tunnel leakage water, such as steps 110 to 140 shown in fig. 1. Alternatively, the processor 90, when executing the computer program 92, implements the functions of the modules in the device embodiments, such as the functions of the modules 810 to 840 shown in fig. 8.
Illustratively, the computer program 92 may be partitioned into one or more modules that are stored in the memory 91 and executed by the processor 90 to implement the present invention. The one or more modules may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 92 in the electronic device 9. For example, the computer program 92 may be divided into modules 810 to 840 as shown in fig. 8.
The electronic device 9 may include, but is not limited to, a processor 90, a memory 91. Those skilled in the art will appreciate that fig. 9 is merely an example of the electronic device 9, and does not constitute a limitation of the electronic device 9, and may include more or fewer components than those shown, or some of the components may be combined, or different components, e.g., the electronic device may also include an input-output device, a network access device, a bus, etc.
The Processor 90 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 91 may be an internal storage unit of the electronic device 9, such as a hard disk or a memory of the electronic device 9. The memory 91 may also be an external storage device of the electronic device 9, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card provided on the electronic device 9. Further, the memory 91 may include both an internal storage unit and an external storage device of the electronic device 9. The memory 91 is used to store the computer program and other programs and data required by the electronic device, and may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above division of functional units and modules is illustrated. In practical applications, the above functions may be allocated to different functional units and modules as needed; that is, the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for convenience of distinguishing them from each other and are not intended to limit the protection scope of the present application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the above embodiments, each embodiment is described with its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein may be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functions in different ways for each particular application, but such implementations should not be considered beyond the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/electronic device and method may be implemented in other ways. For example, the above-described apparatus/electronic device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications and substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present invention, and shall all be included within that scope.

Claims (10)

1. An intelligent detection method for tunnel leakage water images, characterized by comprising the following steps:
acquiring a plurality of leakage water image samples from a preset sample library;
labeling each leakage water image sample to obtain a leakage water label map containing the labels;
training a pre-constructed leakage water image detection model based on each leakage water image sample and its leakage water label map to obtain a trained leakage water image detection model; wherein the leakage water image detection model comprises at least an encoder, a decoder skip-connected to the encoder, a central layer connecting the encoder and the decoder, and a deep supervision network module connected respectively to the outputs of the decoder and the central layer; attention modules are arranged in both the encoder and the decoder, and each attention module comprises a plurality of parallel convolutions with different dilation rates; and
detecting a leakage water image to be detected based on the trained leakage water image detection model.
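The four claimed steps can be sketched as a minimal workflow. This is a structural illustration only: every function name is hypothetical, and a simple global intensity threshold stands in for the claimed encoder-decoder detection model.

```python
import numpy as np

def acquire_samples(n, size=32, rng=None):
    """Step 1: obtain leakage water image samples (stand-in: random grayscale)."""
    rng = rng or np.random.default_rng(0)
    return [rng.random((size, size)) for _ in range(n)]

def label_sample(img):
    """Step 2: produce a binary leakage label map (stand-in: intensity threshold)."""
    return (img > 0.5).astype(np.float64)

def train_model(samples, labels):
    """Step 3: fit a detector from samples and label maps (stand-in: learn a
    global threshold, in place of the claimed encoder-decoder network)."""
    positives = [img[lbl == 1.0].min() for img, lbl in zip(samples, labels) if lbl.any()]
    return float(np.mean(positives))

def detect(threshold, img):
    """Step 4: segment a new leakage water image with the trained detector."""
    return (img > threshold).astype(np.float64)
```

In the claimed method, `train_model` would instead optimize the attention-augmented encoder-decoder network, but the acquire/label/train/detect control flow is the same.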
2. The intelligent detection method according to claim 1, wherein the attention module is obtained by replacing the one-dimensional convolution block of size k in the ECA module with a plurality of parallel convolution blocks having different dilation rates; and the outputs of the attention modules in the decoder and the output of the central layer serve as inputs of the deep supervision network.
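A minimal NumPy sketch of this modified channel attention: the single one-dimensional convolution of the ECA module is replaced by several parallel one-dimensional convolutions with different dilation rates, whose responses are summed and passed through a Sigmoid gate. The function names and the random toy weights are assumptions for illustration, not the patented implementation.

```python
import numpy as np

def dilated_conv1d(v, kernel, dilation):
    """'Same'-padded one-dimensional convolution along the channel vector."""
    k = len(kernel)
    pad = dilation * (k - 1) // 2
    vp = np.pad(v, pad)                              # zero padding on both sides
    out = np.zeros_like(v, dtype=float)
    for i in range(len(v)):
        for j in range(k):
            out[i] += kernel[j] * vp[i + j * dilation]
    return out

def dilated_eca(x, dilations=(1, 2, 4, 8), k=3, rng=None):
    """Channel attention: global average pool -> parallel dilated 1-D convs
    (summed) -> Sigmoid gate -> channel reweighting of the input (C, H, W)."""
    rng = rng or np.random.default_rng(0)
    gap = x.mean(axis=(1, 2))                        # squeeze: (C,)
    kernels = [rng.normal(scale=0.1, size=k) for _ in dilations]  # toy weights
    agg = sum(dilated_conv1d(gap, w, d) for w, d in zip(kernels, dilations))
    gate = 1.0 / (1.0 + np.exp(-agg))                # excitation in (0, 1)
    return x * gate[:, None, None]                   # reweight channels
```

Because the convolutions run over the pooled channel descriptor (as in ECA), the different dilation rates let the gate mix channel statistics at several ranges without adding 2-D convolution cost.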
3. The intelligent detection method according to claim 1 or 2, wherein the encoder and the decoder each comprise a plurality of levels; each level of the encoder comprises a convolution block, an attention module and a downsampling layer connected in sequence, and each level of the decoder comprises a nearest-neighbor upsampling layer, a convolution block and an attention module connected in sequence;
the central layer connects the lowest level of the encoder with the uppermost level of the decoder; and
a deep supervision network module is arranged at the output of the central layer and at the output of each level of the decoder, and is used to predict the output of each level of the decoder and to use the prediction results for deep supervised learning.
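The sampling layers named in this claim can be illustrated in a few lines of NumPy. A 2 × 2 max pooling stands in for the encoder's downsampling layer (the claim does not fix the pooling type, so this is an assumption), while nearest-neighbor repetition implements the decoder's upsampling layer.

```python
import numpy as np

def max_pool_2x(x):
    """2x2 max pooling as the encoder downsampling layer (pooling type assumed)."""
    c, h, w = x.shape
    x = x[:, :h - h % 2, :w - w % 2]                  # crop to even height/width
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def nearest_upsample_2x(x):
    """Nearest-neighbor 2x upsampling as the decoder upsampling layer."""
    return x.repeat(2, axis=1).repeat(2, axis=2)
```

Each encoder level would apply convolution, attention, then `max_pool_2x`; each decoder level would apply `nearest_upsample_2x`, convolution, then attention, mirroring the claimed ordering.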
4. The intelligent detection method according to claim 3, wherein training the pre-constructed leakage water image detection model based on each leakage water image sample and its leakage water label map comprises:
adjusting the mixed loss function of each level of the leakage water image detection model based on a first segmentation result and the leakage water label map containing the labels; wherein the first segmentation result is the prediction obtained after a leakage water image sample is processed by the encoder, the decoder and the deep supervision network module of each level, and the mixed loss function of each level comprises a binary cross-entropy loss function and a Dice loss function; and
training the pre-constructed leakage water image detection model based on the overall loss function of the leakage water image detection model, each leakage water image sample and its leakage water label map; wherein the overall loss function is a linear combination of the mixed loss functions of all levels.
5. The intelligent detection method according to claim 4, wherein the mixed loss function l_seg of each level is:
l_seg = α·l_bce + l_dice
and the overall loss function L is the linear combination of the per-level mixed losses over the M levels:
L = Σ_{m=1}^{M} l_seg^(m)
wherein l_bce is the binary cross-entropy loss function, l_dice is the Dice loss function, M is the total number of layers of the deep supervision network module, M is a positive integer, α is a proportionality coefficient, and 0 < α < 1.
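A minimal NumPy sketch of the claimed losses, assuming the overall loss is a plain sum of the per-level mixed losses (the claim only requires a linear combination) and an illustrative α = 0.5 (the claim only requires 0 < α < 1):

```python
import numpy as np

def bce_loss(pred, target, eps=1e-7):
    """Binary cross-entropy; pred holds probabilities in (0, 1)."""
    p = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(p) + (1.0 - target) * np.log(1.0 - p)))

def dice_loss(pred, target, eps=1e-7):
    """Dice loss: 1 - 2|P.G| / (|P| + |G|), smoothed by eps."""
    inter = np.sum(pred * target)
    return float(1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps))

def mixed_loss(pred, target, alpha=0.5):
    """Per-level mixed loss of claim 5: l_seg = alpha * l_bce + l_dice."""
    return alpha * bce_loss(pred, target) + dice_loss(pred, target)

def overall_loss(level_preds, target, alpha=0.5):
    """Overall loss over the M supervised levels (plain sum assumed)."""
    return sum(mixed_loss(p, target, alpha) for p in level_preds)
```

Combining the two terms is a common remedy for class imbalance: the Dice term rewards overlap with the (typically small) leakage region, while the cross-entropy term keeps per-pixel gradients well behaved.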
6. The intelligent detection method according to claim 3, wherein, at each level, the attention modules of the encoder and of the decoder have the same dilation rates, and the dilation rates of the attention modules at the first level of the encoder and the decoder differ from those of the attention modules at the other levels.
7. The intelligent detection method according to claim 6, wherein the encoder and the decoder each comprise 4 levels; each level of the encoder comprises two convolution blocks, an attention module and a downsampling layer connected in sequence; each level of the decoder comprises one nearest-neighbor upsampling layer, two convolution blocks and one attention module connected in sequence; and the central layer comprises two convolution blocks; wherein each convolution block comprises a 3 × 3 convolution layer, a batch normalization layer and a rectified linear unit; each attention module comprises 4 parallel one-dimensional dilated convolutions, the attention modules at the first level of the encoder and the decoder have dilation rates (1, 2, 4, 8), and the attention modules at the other levels have dilation rates (1, 4, 8, 16);
the deep supervision network module is a Side Network, which is used to predict the output of each level of the decoder and to use these outputs for deep supervised learning; wherein the Side Network comprises a 1 × 1 convolution layer, a nearest-neighbor upsampling layer and a Sigmoid function layer; and
the output of the central layer and the feature map generated by the nearest-neighbor upsampling layer of each level of the decoder serve as inputs of the Side Network, and the result obtained after the output of the last level of the decoder is processed by the Side Network is the output of the leakage water image detection model.
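A minimal NumPy sketch of the Side Network head of claim 7: a 1 × 1 convolution (a per-pixel linear map over channels), nearest-neighbor upsampling back to input resolution, and a Sigmoid layer. The single-output-channel weights and the upsampling factor are illustrative assumptions.

```python
import numpy as np

def side_network(feat, weights, up_factor):
    """Side Network head: 1x1 conv -> nearest-neighbor upsample -> Sigmoid."""
    # A 1x1 convolution with one output channel is a per-pixel dot product
    # across the C input channels: (C,) x (C, H, W) -> (H, W).
    fused = np.tensordot(weights, feat, axes=([0], [0]))
    up = fused.repeat(up_factor, axis=0).repeat(up_factor, axis=1)
    return 1.0 / (1.0 + np.exp(-up))                 # probability map in (0, 1)
```

Applied at the central layer and at each decoder level, such a head yields one full-resolution leakage probability map per level for the deep supervision losses, with the last level's map serving as the model output.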
8. An intelligent detection device for tunnel leakage water images, characterized by comprising:
a sample acquisition module, configured to acquire a plurality of leakage water image samples from a preset sample library;
a sample labeling module, configured to label each leakage water image sample to obtain a leakage water label map containing the labels;
a model training module, configured to train a pre-constructed leakage water image detection model based on each leakage water image sample and its leakage water label map to obtain a trained leakage water image detection model; wherein the leakage water image detection model comprises at least an encoder, a decoder skip-connected to the encoder, a central layer connecting the encoder and the decoder, and a deep supervision network module connected respectively to the outputs of the decoder and the central layer; attention modules are arranged in both the encoder and the decoder, and each attention module comprises a plurality of parallel convolutions with different dilation rates; and
a detection module, configured to detect a leakage water image to be detected based on the trained leakage water image detection model.
9. An electronic device, comprising a memory and a processor, wherein the memory is configured to store a computer program, and the processor is configured to invoke and run the computer program stored in the memory to perform the method of any one of claims 1 to 7.
10. A computer-readable storage medium, storing a computer program which, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.
CN202210998270.0A 2022-08-19 2022-08-19 Intelligent detection method, device and equipment for tunnel leakage water image and storage medium Pending CN115345859A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210998270.0A CN115345859A (en) 2022-08-19 2022-08-19 Intelligent detection method, device and equipment for tunnel leakage water image and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210998270.0A CN115345859A (en) 2022-08-19 2022-08-19 Intelligent detection method, device and equipment for tunnel leakage water image and storage medium

Publications (1)

Publication Number Publication Date
CN115345859A true CN115345859A (en) 2022-11-15

Family

ID=83953544

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210998270.0A Pending CN115345859A (en) 2022-08-19 2022-08-19 Intelligent detection method, device and equipment for tunnel leakage water image and storage medium

Country Status (1)

Country Link
CN (1) CN115345859A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116012382A (en) * 2023-03-28 2023-04-25 中国电力工程顾问集团有限公司 Method and device for detecting leakage oil of power equipment, electronic equipment and storage medium


Similar Documents

Publication Publication Date Title
CN110136170B (en) Remote sensing image building change detection method based on convolutional neural network
CN111626190B (en) Water level monitoring method for scale recognition based on clustering partition
CN106875381B (en) Mobile phone shell defect detection method based on deep learning
CN111681240B (en) Bridge surface crack detection method based on YOLO v3 and attention mechanism
CN108564085B (en) Method for automatically reading of pointer type instrument
CN113205051B (en) Oil storage tank extraction method based on high spatial resolution remote sensing image
CN111899249A (en) Remote sensing image change detection method based on convolution neural network of ResNet50 and DeeplabV3+
CN109726649B (en) Remote sensing image cloud detection method and system and electronic equipment
CN106780727B (en) Vehicle head detection model reconstruction method and device
CN113065578A (en) Image visual semantic segmentation method based on double-path region attention coding and decoding
CN114022770A (en) Mountain crack detection method based on improved self-attention mechanism and transfer learning
CN111767874B (en) Pavement disease detection method based on deep learning
CN110992366A (en) Image semantic segmentation method and device and storage medium
CN115205672A (en) Remote sensing building semantic segmentation method and system based on multi-scale regional attention
CN114894804A (en) Method for detecting surface cracks of precision standard part
CN114758222A (en) Concrete pipeline damage identification and volume quantification method based on PointNet ++ neural network
CN115345859A (en) Intelligent detection method, device and equipment for tunnel leakage water image and storage medium
CN101937562A (en) Construction method for gray-level information content histogram
CN115424237A (en) Forward vehicle identification and distance detection method based on deep learning
CN115147418A (en) Compression training method and device for defect detection model
CN114639064A (en) Water level identification method and device
CN114005120A (en) License plate character cutting method, license plate recognition method, device, equipment and storage medium
CN113628180A (en) Semantic segmentation network-based remote sensing building detection method and system
CN115953371A (en) Insulator defect detection method, device, equipment and storage medium
CN114419078B (en) Surface defect region segmentation method and device based on convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination