CN115496982A - Coal mine well wall crack identification method based on deep neural network


Info

Publication number
CN115496982A
CN115496982A (application number CN202211237746.5A)
Authority
CN
China
Prior art keywords
well wall
module
wall crack
data
crack
Prior art date
Legal status
Pending
Application number
CN202211237746.5A
Other languages
Chinese (zh)
Inventor
付文俊
张亮
Current Assignee
Beijing China Coal Mine Engineering Co ltd
Original Assignee
Beijing China Coal Mine Engineering Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing China Coal Mine Engineering Co ltd
Priority application: CN202211237746.5A
Publication: CN115496982A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level, of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a coal mine well wall crack identification method based on a deep neural network. A data collection module collects data from a mine underground robot and gathers image data; an image enhancement algorithm enhances the image information acquired by the underground robot; coal mine well wall crack target images are acquired, low-quality crack images are removed, and the crack targets in the images are labeled to construct a coal mine well wall crack data set. The well wall crack data set established by the data set construction module is sent into an improved Efficientdet model training module for training, and the crack image to be detected is input into the improved Efficientdet model for detection to generate the final detection result. The invention improves the Efficientdet model by adding a multi-space view fusion module and a ternary coordinate attention module to the existing model, so that the model can overcome the complex interference of the underground coal mine environment, solve the problem of missed detection of fine cracks, and improve detection precision.

Description

Coal mine well wall crack identification method based on deep neural network
Technical Field
The invention relates to the technical field of image recognition and detection, and in particular to a coal mine well wall crack identification method based on a deep neural network, suitable for investigating potential safety hazards in coal mine well walls.
Background
Coal mine well wall cracks seriously affect coal mine production; they can cause accidents such as changes in underground airflow direction, groundwater seepage, gas accumulation, spontaneous combustion of coal seams through oxygen entering the gaps, and collapse. From the perspective of safety and maintenance, timely and rapid identification and repair of cracks increases the safety factor of a coal mine, lowers maintenance cost, and reduces losses. However, because lighting conditions underground are poor and the cracks are small, well wall cracks are difficult to discover, so efficient detection of well wall cracks has drawn growing attention in the coal mining industry. Manual detection of coal mine well wall cracks generally requires professional personnel to go down the well for exploration; this approach is inefficient, costly, prone to omissions, and carries safety risks. Using a downhole robot instead of a human is an efficient and safe alternative.
Conventional crack detection algorithms are generally based on classical digital image processing methods, for example edge detection operators such as Sobel and Roberts. These classical methods have limited applicability and struggle to produce accurate results when identifying coal mine borehole wall cracks against complex backgrounds. With the rise of deep learning, many deep neural networks, such as SegNet, have been used to detect highway cracks. Compared with highway crack detection, coal mine well wall crack detection must overcome complex environmental interference and poor illumination, problems that often cause missed and false detections of small cracks on the well wall.
Disclosure of Invention
Therefore, the technical problem to be solved by the invention is to provide a coal mine well wall crack identification method based on a deep neural network. Building on the Efficientdet detection network, a multi-space view fusion module and a ternary coordinate attention module are carefully designed to overcome complex environmental interference and poor illumination conditions and to improve identification and detection precision.
In order to solve the technical problems, the invention provides the following technical scheme:
a coal mine well wall crack identification method based on a deep neural network comprises the following steps:
step 100: the data collection module collects, in time order, original coal mine well wall crack video data and original coal mine well wall crack image data from the data acquired by the underground mine robot, splits the original well wall crack video data into frame-by-frame well wall crack images, and stores both the split images and the original coal mine well wall crack image data in the data collection module;
step 200: the data processing and enhancing module acquires well wall crack image data collected by the data collecting module, and manually eliminates well wall crack image data with poor quality and image data without cracks; performing data enhancement on the screened well wall crack image data;
step 300: inputting high-quality well wall crack image data obtained after the data processing and enhancing module is used for processing into a data set construction module; manually marking the crack area on the high-quality well wall image data; dividing the image data after the labeling into a training set and a testing set according to a proportion, and completing the construction of a well wall crack data set;
step 400: sending the well wall crack data set established by the data set establishing module into an improved Efficientdet model training module for training and learning, and storing a weight model with the highest test set crack detection precision in the training process for detecting a well wall crack image to be detected;
step 500: and (3) sending the borehole wall crack image to be detected into a result detection module, detecting the borehole wall crack image by using the weight model with the highest crack detection precision obtained in the step 400, and displaying and storing the detection result.
In the method for identifying the coal mine well wall crack based on the deep neural network, the data enhancement processing module constructed in the step 200 comprises the following substeps:
step 210: first, the collected well wall crack image data are screened manually, eliminating well wall crack images blurred by the movement of the underground robot as well as normal, crack-free images; at least 800 well wall crack images are retained after screening;
step 220: contrast enhancement is applied to the images using the CLAHE (contrast-limited adaptive histogram equalization) algorithm provided by an image-processing library, so that the contrast-enhanced borehole wall crack images are better suited to detection by the deep neural network;
step 230: in order to expand the limited database and train a robust detection model, data augmentation is applied to the manually screened high-quality well wall crack images: horizontal/vertical flipping, rotation, scaling, cropping, shearing and translation.
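The augmentation operations of step 230 can be sketched with plain NumPy. This is an illustrative stand-in, not the patent's implementation: only flips, a rotation and a crude translation are shown, and CLAHE itself (step 220) would normally come from an image-processing library rather than be re-implemented.

```python
import numpy as np

def augment(img):
    """Illustrative geometric augmentations in the spirit of step 230.

    Sketch only: scaling, cropping and shearing would need an image
    library and are omitted here.
    """
    return [
        img,
        np.fliplr(img),                 # horizontal flip
        np.flipud(img),                 # vertical flip
        np.rot90(img),                  # rotation
        np.roll(img, shift=4, axis=1),  # crude translation (wrap-around)
    ]
```

Each screened image thus yields several training samples, expanding the limited data set.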
In the coal mine well wall crack identification method based on the deep neural network, the data set construction module of step 300 comprises the following steps:
step 310: the augmented borehole wall crack image data are divided in proportion into a training set and a test set; the training set is used to train the model and the test set to evaluate its detection precision;
step 320: the crack regions in the divided well wall crack images are labeled with LabelImg software; care must be taken that the labeling boxes contain no interfering objects, and, where necessary, longer well wall cracks are divided into several segments and labeled separately.
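The proportional split of step 310 can be sketched as follows. The split ratio is left as a parameter (the 0.8 default is only a placeholder), and the function and file names are illustrative assumptions.

```python
import random

def split_dataset(image_paths, train_frac=0.8, seed=42):
    """Shuffle the labeled crack images reproducibly and divide them
    into a training set and a test set, as outlined in step 310."""
    rng = random.Random(seed)   # fixed seed so the split is repeatable
    paths = list(image_paths)
    rng.shuffle(paths)
    cut = int(len(paths) * train_frac)
    return paths[:cut], paths[cut:]
```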
In the coal mine well wall crack identification method based on the deep neural network, the improved Efficientdet model training module of step 400 comprises the following steps:
step 410: construct the improved backbone feature extraction network: a multi-space view fusion module and a ternary coordinate attention module are added to the improved backbone feature extraction network; the first three levels of the multi-level features still consist of EfficientNet feature layers of different depths, named P3, P4, P5, respectively; the last two levels are: P6, formed from P5 through the multi-space view fusion module followed by upsampling, and P7, formed from P6 through the multi-space view fusion module followed by upsampling;
step 420: the multi-level backbone features P3, P4, P5, P6, P7 constructed in step 410 are sent into the ternary coordinate attention module to form the improved multi-level backbone features P3′, P4′, P5′, P6′, P7′;
step 430: the improved multi-level backbone features P3′, P4′, P5′, P6′, P7′ are fed into the enhanced feature extraction network BiFPN, where the improved multi-level features are fused with one another to obtain the more comprehensive multi-level enhanced features P3″, P4″, P5″, P6″, P7″;
step 440: the multi-level enhanced features P3″, P4″, P5″, P6″, P7″ are sent to the decoders Class Prediction Net and Box Prediction Net to obtain the predicted target class and the position of the prediction box;
step 450: the prediction box positions obtained in step 440 and the ground-truth boxes of the labeled borehole wall crack data set are input into formula (1) to calculate the loss value;
L = α·L_ce + β·L_focal   (1)
where α and β are hyper-parameters, both set to 0.5; L_ce is the binary cross-entropy loss function and L_focal the focal loss function, computed as follows:
L_ce = -y·log(p) - (1-y)·log(1-p)   (2)
L_focal = -y·(1-p)^γ·log(p) - (1-y)·p^γ·log(1-p)   (3)
where, in formulas (2) and (3), y denotes the ground-truth label value, p the model's predicted label value, and γ a weighting parameter, typically set to 2;
step 460: the loss value calculated in step 450 is back-propagated, the network parameters are updated, and steps 410-460 are repeated until the model converges; based on practical experience, the number of training epochs is set to 30.
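Formulas (1)-(3) can be written out directly; a minimal NumPy sketch, using the hyper-parameters α = β = 0.5 and γ = 2 from steps 450-460 (the numerical clamp `eps` is an implementation detail added here to avoid log(0)).

```python
import numpy as np

def bce_loss(y, p, eps=1e-7):
    """Binary cross entropy, formula (2)."""
    p = np.clip(p, eps, 1 - eps)            # guard against log(0)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def focal_loss(y, p, gamma=2.0, eps=1e-7):
    """Focal loss, formula (3); gamma down-weights easy examples."""
    p = np.clip(p, eps, 1 - eps)
    return -(y * (1 - p) ** gamma * np.log(p)
             + (1 - y) * p ** gamma * np.log(1 - p))

def total_loss(y, p, alpha=0.5, beta=0.5):
    """Combined loss, formula (1), with alpha = beta = 0.5."""
    return alpha * bce_loss(y, p) + beta * focal_loss(y, p)
```

For a well-classified positive example (y = 1, p close to 1) the focal term shrinks by the factor (1-p)^γ, which is what lets training concentrate on the hard, subtle cracks.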
In step 410, the multi-space view fusion module is established through the following substeps:
step 411: the feature P5 extracted by the EfficientNet backbone network is sent into the multi-space view fusion module;
step 412: P5 ∈ R^(N×C×H×W) is cut along the channel dimension into two parts, X1 ∈ R^(N×αC×H×W) and X2 ∈ R^(N×(1-α)C×H×W), where N, C, H, W and α respectively denote the training batch size, number of channels, image height, image width and channel split ratio; X1 keeps the original features and does not take part in the multi-space view fusion operation, so the original feature map remains accessible; X2 takes part in the multi-view modeling operation;
step 413: X2 is first fed into multi-view branch 1, where it passes through a 1×1 convolution, a 3×1 convolution and a 1×3 convolution to obtain X2^(1);
step 414: X2 is then fed into multi-view branch 2, where it passes through a 1×1 convolution, a 5×1 convolution and a 1×5 convolution to obtain X2^(2);
step 415: X2 is finally fed into multi-view branch 3, where it passes through a 1×1 convolution, a 7×1 convolution and a 1×7 convolution to obtain X2^(3);
step 416: the multi-view spatial feature X2′ is computed with formula (4):
X2′ = δ(β1·X2^(1) + β2·X2^(2) + β3·X2^(3))   (4)
where δ is the activation function and β1, β2, β3 are parameters that can be updated by learning through gradient back-propagation;
step 417: the overall fused feature X is computed with formula (5):
X = Concat(X1, X2′)   (5)
where Concat denotes splicing features along the channel dimension.
In the method for identifying the coal mine well wall crack based on the deep neural network, the ternary coordinate attention module of step 420 comprises the following substeps:
step 421: the multi-level features P3, P4, P5, P6, P7 constructed in step 410 are each sent into the ternary coordinate attention module to obtain the improved multi-level features P3′, P4′, P5′, P6′, P7′;
step 422: taking feature layer P3 as an example, P3 ∈ R^(C×H×W) is first input into branch 1 of the ternary coordinate attention module, where adaptive global average pooling along the H direction aggregates the information in the vertical direction to obtain X1 ∈ R^(C×H×1); the pooled features are then sent into a 3×1 convolution to explore well wall crack region features in the horizontal direction; next, batch normalization followed by Sigmoid activation generates the spatial weight feature map M1; finally, the generated M1 is multiplied element-wise with P3 to obtain the enhanced feature map of branch 1, P3^(1); the formula for the spatial weight feature map M1 of branch 1 is given in formula (6);
step 423: next, P3 ∈ R^(C×H×W) is input into branch 2 of the ternary coordinate attention module, where adaptive global average pooling over the entire spatial direction aggregates the information along the channel dimension to obtain X2 ∈ R^(1×H×W); the pooled features are then sent into a 3×3 convolution to explore crack region features in the spatial direction; next, batch normalization followed by Sigmoid activation generates the spatial weight feature map M2; finally, the generated M2 is multiplied element-wise with P3 to obtain the enhanced feature map of branch 2, P3^(2); the formula for the spatial weight feature map M2 of branch 2 is given in formula (6);
step 424: finally, P3 ∈ R^(C×H×W) is input into branch 3 of the ternary coordinate attention module, where adaptive global average pooling along the W direction aggregates the information in the horizontal direction to obtain X3 ∈ R^(C×1×W); the pooled features are then sent into a 1×3 convolution to explore well wall crack region features in the vertical direction; next, batch normalization followed by Sigmoid activation generates the spatial weight feature map M3; finally, the generated M3 is multiplied element-wise with P3 to obtain the enhanced feature map of branch 3, P3^(3); the formula for the spatial weight feature map M3 of branch 3 is given in formula (6);
step 425: the enhanced feature maps P3^(1), P3^(2), P3^(3) obtained by the three branches of the ternary coordinate attention module are spliced along the channel dimension, and the channel number is recovered through a 1×1 convolution to obtain the final improved multi-level feature P3′; the calculation is given in formula (7);
M_i = δ(Conv_i(Avgpool_i(P3)))   (6)
P3′ = Conv_{1×1}(Concat(P3^(1), P3^(2), P3^(3)))   (7)
step 426: the other hierarchical features P4′, P5′, P6′, P7′ are constructed in the same way as P3′, finally yielding the improved multi-level backbone features P3′, P4′, P5′, P6′, P7′.
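Steps 421-426 can be sketched as a PyTorch module. Kernel sizes and pooled shapes follow the text, while details the patent leaves open (padding, exact normalization placement) are assumptions.

```python
import torch
import torch.nn as nn

class TernaryCoordinateAttention(nn.Module):
    """Sketch of the ternary coordinate attention module of steps 421-426:
    three pooled views of the input each yield a sigmoid weight map M_i
    (formula (6)) that re-weights the input; the three enhanced maps are
    concatenated and reduced back by a 1x1 convolution (formula (7))."""

    def __init__(self, channels):
        super().__init__()
        # branch 1: pooled to C x H x 1, then 3x1 conv + BN + sigmoid
        self.branch1 = nn.Sequential(
            nn.Conv2d(channels, channels, (3, 1), padding=(1, 0)),
            nn.BatchNorm2d(channels), nn.Sigmoid())
        # branch 2: pooled over channels to 1 x H x W, then 3x3 conv
        self.branch2 = nn.Sequential(
            nn.Conv2d(1, 1, 3, padding=1),
            nn.BatchNorm2d(1), nn.Sigmoid())
        # branch 3: pooled to C x 1 x W, then 1x3 conv
        self.branch3 = nn.Sequential(
            nn.Conv2d(channels, channels, (1, 3), padding=(0, 1)),
            nn.BatchNorm2d(channels), nn.Sigmoid())
        self.fuse = nn.Conv2d(3 * channels, channels, 1)  # formula (7)

    def forward(self, p):
        m1 = self.branch1(p.mean(dim=3, keepdim=True))  # weight map M1
        m2 = self.branch2(p.mean(dim=1, keepdim=True))  # weight map M2
        m3 = self.branch3(p.mean(dim=2, keepdim=True))  # weight map M3
        enhanced = torch.cat([p * m1, p * m2, p * m3], dim=1)
        return self.fuse(enhanced)
```

The weight maps broadcast over the dimensions they collapsed, so each branch re-weights the full feature map from its own spatial view.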
In the coal mine well wall crack identification method based on the deep neural network, step 500 comprises the following substeps:
step 510: deploying the optimal network weight obtained in the step 400 to a robot server of a result detection module;
step 520: and (4) transmitting the borehole wall crack image to be detected to a robot server side by the client side, and acquiring and storing the identification detection result.
The technical scheme of the invention achieves the following beneficial technical effects:
the method can efficiently identify and detect the coal mine well wall cracks by utilizing the improved Efficientdet network. The network can extract well wall crack characteristics and crack detail information with different sizes and shapes by adding a multi-space visual angle fusion module, so that the small-scale crack omission ratio is reduced from 25% to 17%. (the difficulty in reducing the undetected rate and detecting the accuracy is that the crack image collected from the coal mine well wall is different from the detection tasks of other obvious cracks (such as highway crack detection), the crack of the coal mine well wall is more subtle and less obvious, and the crack can be positioned into a finer-grained detection task. Meanwhile, the network realizes the weight redistribution and fusion of the spatial features in different spatial directions by adding the ternary coordinate attention module, so that the noise features can be effectively inhibited, the crack features are emphasized, and the occurrence of the false detection condition of the well wall cracks is avoided as much as possible.
The invention improves the Efficientdet model by adding a multi-space view fusion module and a ternary coordinate attention module to the existing Efficientdet model, so that the model can overcome the complex interference of the underground coal mine environment and solve the problem of missed detection of small cracks; the detection precision is improved from 75% to 83%.
Drawings
FIG. 1 is a flow chart diagram of a coal mine well wall crack identification method based on a deep neural network;
FIG. 2 is a visual illustration of the overall process of the invention for coal mine borehole wall crack detection;
FIG. 3 is an overall block diagram of the improved Efficientdet model of the present invention;
FIG. 4 is a block diagram of an embodiment of a multi-spatial-view fusion module according to the present invention;
FIG. 5 is a block diagram of an implementation of the ternary coordinate attention module of the present invention.
The reference numbers in the figures denote: 101-underground robot data acquisition, 102-data collection module, 103-data processing enhancement module, 104-data set construction module, 105-improved Efficientdet model training module and 106-result detection module.
Detailed Description
Embodiment 1: a coal mine well wall crack identification system based on a deep neural network.
The coal mine well wall crack identification system comprises underground robot data acquisition 101, a data collection module 102, a data processing enhancement module 103, a data set construction module 104, an improved Efficientdet model training module 105 and a result detection module 106.
The data collection module 102, the data processing enhancement module 103, the data set construction module 104 and the improved Efficientdet model training module 105 are all stored in a memory of a computer system; the result detection module 106 is deployed in the downhole robot server. The downhole robot collecting data 101 includes: original coal mine well wall crack video data and original coal mine well wall crack image data.
Embodiment 2: coal mine well wall crack detection for a coal mine underground robot, shown in fig. 1.
Step 100: the data collection module 102 collects visual data from the data 101 collected by the underground robot according to a time sequence, the data 101 collected by the underground robot comprises original coal mine well wall crack video data and original coal mine well wall crack image data, the original coal mine well wall crack video data is divided into frame-by-frame divided well wall crack image data, and the original coal mine well wall crack image data and the divided well wall crack image data are stored in the data collection module 102.
Step 200: the data processing enhancement module 103 acquires the well wall crack image data collected by the data collection module 102, manually eliminates poor-quality well wall crack images and images without cracks, and performs data enhancement on the screened well wall crack image data; at least 800 well wall crack images remain after screening;
step 210: firstly, manually screening collected well wall crack image data, and eliminating fuzzy well wall crack images and collected normal crack-free images caused by the movement of an underground robot;
step 220: because illumination intensity in the mine is low, the acquired image data have low clarity and contrast; contrast enhancement is applied using the CLAHE algorithm provided by an image-processing library, so that the contrast-enhanced borehole wall crack images are better suited to detection by the deep neural network;
step 230: in order to expand a limited number of databases and train a detection model with strong robustness, data amplification is carried out on the manually screened high-quality well wall fracture image data, and the high-quality well wall fracture image data is subjected to horizontal/vertical turning, rotation, scaling, cutting, shearing and translation processing.
Step 300: inputting the high-quality well wall crack image data obtained after the processing by the data processing enhancement module 103 into the data set construction module 104; manually marking a crack area on the high-quality well wall image data; dividing the image data after the labeling into a training set and a testing set according to a proportion, and completing the construction of a well wall crack data set;
step 310: the augmented borehole wall crack image data are divided in proportion into a training set and a test set; the training set is used to train the model and the test set to evaluate its detection precision;
step 320: the crack regions in the divided well wall crack images are labeled with LabelImg software; care must be taken that the labeling boxes contain no interfering objects, and, where necessary, longer well wall cracks are divided into several segments and labeled separately.
Step 400: sending the well wall crack data set established by the data set establishing module 104 into an improved Efficientdet model training module 105 for training, and storing a weight model with the highest test set crack detection precision in the training process for detecting a well wall crack image to be detected;
step 500: and (3) sending the borehole wall crack image to be detected into a result detection module 106, detecting the borehole wall crack image by using the weight model with the highest crack detection precision obtained in the step 400, and displaying and storing the detection result.
Step 510: deploying the optimal network weight obtained in the step 400 to a robot server of the result detection module 106;
step 520: and (4) transmitting the borehole wall crack image to be detected to a robot server side by the client side, and acquiring and storing the identification detection result.
Referring to fig. 2, a visual description of the overall process of the coal mine well wall crack detection method is provided. And constructing the processed and enhanced crack image data into a data set, dividing the data set, sending the divided data set into an improved Efficientdet model for training, and then carrying out crack detection and reporting the result.
The improved Efficientdet model training module and the multi-space view fusion module are explained as follows.
Fig. 3 is a structure diagram of the improved Efficientdet network; the specific process is as follows:
The improved Efficientdet model mainly comprises three parts: an improved backbone feature extraction network, an enhanced feature extraction network (BiFPN) and a prediction decoder network (Class Prediction Net and Box Prediction Net); the improvement over the classical Efficientdet model mainly lies in the backbone feature extraction network.
The improved backbone feature extraction network is first constructed. The typical backbone network of Efficientdet is EfficientNet; the first three levels of the multi-level backbone features consist of EfficientNet feature layers of different depths and sizes, named P3, P4, P5, respectively; the last two levels are formed from P5 by two successive upsamplings and are named P6 and P7.
In the improved backbone feature extraction network, a multi-space view fusion module and a ternary coordinate attention module are added; the first three levels of the multi-level features still consist of EfficientNet feature layers of different depths, named P3, P4, P5; the last two levels are: P6, formed from P5 through the multi-space view fusion module followed by upsampling, and P7, formed from P6 through the multi-space view fusion module followed by upsampling.
Second, the constructed multi-level backbone features P3, P4, P5, P6, P7 are sent into the ternary coordinate attention module to form the improved multi-level backbone features P3′, P4′, P5′, P6′, P7′.
The improved multi-level backbone features P3′, P4′, P5′, P6′, P7′ are then fed into the enhanced feature extraction network (BiFPN), where the improved multi-level features are fused with one another to obtain the more comprehensive multi-level enhanced features P3″, P4″, P5″, P6″, P7″.
Finally, the multi-level enhanced features P3″, P4″, P5″, P6″, P7″ are fed into the decoders Class Prediction Net and Box Prediction Net to obtain the predicted target class and the prediction box position. The obtained prediction box positions and the ground-truth boxes of the labeled data set are input into formula (1) to calculate the loss value.
L = α·L_ce + β·L_focal   (1)
where α and β are hyper-parameters, both set to 0.5 in the present application; L_ce is the binary cross-entropy loss function and L_focal the focal loss function, computed as follows:
L_ce = -y·log(p) - (1-y)·log(1-p)   (2)
L_focal = -y·(1-p)^γ·log(p) - (1-y)·p^γ·log(1-p)   (3)
where y denotes the ground-truth label value, p the model's predicted label value, and γ a weighting parameter, typically set to 2. Back-propagation is performed with the calculated loss value, the network parameters are updated, and the above steps are repeated until the model converges. Based on practical experience, training is set to 30 epochs.
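The back-propagation loop just described can be sketched as follows. The optimizer choice (Adam) and learning rate are assumptions, since the text only fixes the number of epochs at 30, and the loss-returning model signature is a simplification of the real decoder heads.

```python
import torch

def train_model(model, loader, epochs=30, lr=1e-3):
    """Minimal training loop: compute the formula (1) loss, back-propagate,
    update parameters, repeat for the epoch budget.

    Assumes `model(images, targets)` returns the scalar loss of formula (1);
    the real network would first produce class and box predictions.
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # optimizer is an assumption
    for _ in range(epochs):
        for images, targets in loader:
            loss = model(images, targets)
            optimizer.zero_grad()
            loss.backward()       # back-propagate the loss value
            optimizer.step()      # update network parameters
    return model
```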
The multi-spatial view fusion Module (MSF) is implemented by convolution of different sizes, see fig. 4:
the first step is as follows: extracting characteristic P of Efficient backbone network 5 Sending the data into a multi-space visual angle fusion module;
the second step: p is 5 ∈R N×C××H Cut into two parts along the channel dimension: x 1 ∈R N×αC×H×W ,X 2 ∈R N×(1-α)C×H×W Wherein N, C, H, W and alpha respectively represent the training data batch size, the channel number, the image height, the image width and the channel number slicing rate.
Wherein X 1 The original features do not participate in the multi-space view fusion operation, so that the original feature graph can be accessed; x 2 Participating in multi-view modeling operation;
First, X₂ is fed into multi-view branch 1, where it passes through a 1×1 convolution, a 3×1 convolution and a 1×3 convolution to obtain X₂⁽¹⁾.

Then X₂ is fed into multi-view branch 2, where it passes through a 1×1 convolution, a 5×1 convolution and a 1×5 convolution to obtain X₂⁽²⁾.

Finally, X₂ is fed into multi-view branch 3, where it passes through a 1×1 convolution, a 7×1 convolution and a 1×7 convolution to obtain X₂⁽³⁾.
The third step: the multi-view spatial feature X₂′ is computed with equation (4):

X₂′ = δ(β₁·X₂⁽¹⁾ + β₂·X₂⁽²⁾ + β₃·X₂⁽³⁾)  (4)
where δ is the activation function and β₁, β₂, β₃ are learnable parameters. These three parameters are defined within the network and are therefore updated autonomously through gradient back-propagation.
The fourth step: the overall fused feature X is computed with equation (5):

X = Concat(X₁, X₂′)  (5)

where Concat denotes splicing features along the channel dimension.
The X obtained from this final computation is the output of the multi-space view fusion module; up-sampling X once yields P₆, and passing P₆ through the multi-space view fusion module followed by up-sampling yields P₇.
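The four steps above can be sketched as a PyTorch module. This is a sketch under stated assumptions: the sequential ordering of the 1×1/k×1/1×k convolutions within a branch, ReLU as the activation δ, the default α = 0.5 and all names are illustrative guesses, not the patent's actual code:

```python
import torch
import torch.nn as nn

class MultiSpaceViewFusion(nn.Module):
    """Sketch of the MSF module: split channels by ratio alpha, fuse the
    X2 part through three asymmetric-convolution branches (k = 3, 5, 7)
    weighted by learnable beta_i, then concatenate with the untouched X1."""
    def __init__(self, channels, alpha=0.5):
        super().__init__()
        self.c1 = int(channels * alpha)          # X1: kept as-is
        c2 = channels - self.c1                  # X2: fused across views
        def branch(k):
            return nn.Sequential(
                nn.Conv2d(c2, c2, 1),
                nn.Conv2d(c2, c2, (k, 1), padding=(k // 2, 0)),
                nn.Conv2d(c2, c2, (1, k), padding=(0, k // 2)))
        self.b1, self.b2, self.b3 = branch(3), branch(5), branch(7)
        self.beta = nn.Parameter(torch.ones(3))  # beta_1..beta_3, learned
        self.act = nn.ReLU()                     # assumed choice for delta

    def forward(self, p5):
        x1, x2 = p5.split([self.c1, p5.size(1) - self.c1], dim=1)
        fused = self.act(self.beta[0] * self.b1(x2)
                         + self.beta[1] * self.b2(x2)
                         + self.beta[2] * self.b3(x2))      # eq. (4)
        return torch.cat([x1, fused], dim=1)                # eq. (5)
```

The asymmetric k×1 and 1×k kernels are what give each branch its distinct horizontal/vertical receptive field, which suits elongated crack shapes better than a single square kernel.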
Referring to Fig. 5, which shows the structure diagram of the ternary coordinate attention module (TCA) designed for the present invention.
First, the multi-level features P₃, P₄, P₅, P₆, P₇ are each fed into the ternary coordinate attention module to obtain the improved multi-level features P₃′, P₄′, P₅′, P₆′, P₇′.
Taking P₃ as an example, P₃ ∈ R^(C×H×W) is first input into branch 1 of the ternary coordinate attention module, where adaptive global average pooling along the H direction aggregates information in the vertical direction to obtain X₁ ∈ R^(C×H×1). The pooled features are then fed into a 3×1 convolution to explore well wall crack region features in the horizontal direction, after which batch normalization and Sigmoid activation produce the spatial weight feature map M₁. Finally, the generated M₁ is multiplied element-wise with P₃ to obtain the enhanced feature map of branch 1, P₃⁽¹⁾.

The spatial weight feature map M₁ of branch 1 of the ternary coordinate attention module is computed as shown in equation (6).
Second, P₃ ∈ R^(C×H×W) is input into branch 2 of the ternary coordinate attention module, where adaptive global average pooling over the whole spatial extent aggregates information along the channel dimension to obtain X₂ ∈ R^(1×H×W). The pooled features are then fed into a 3×3 convolution to explore crack region features in the spatial direction, after which batch normalization and Sigmoid activation produce the spatial weight feature map M₂. Finally, the generated M₂ is multiplied element-wise with P₃ to obtain the enhanced feature map of branch 2, P₃⁽²⁾.

The spatial weight feature map M₂ of branch 2 of the ternary coordinate attention module is computed as shown in equation (6).
Finally, P₃ ∈ R^(C×H×W) is input into branch 3 of the ternary coordinate attention module, where adaptive global average pooling along the W direction aggregates information in the horizontal direction to obtain X₃ ∈ R^(C×1×W). The pooled features are then fed into a 1×3 convolution to explore well wall crack region features in the vertical direction, after which batch normalization and Sigmoid activation produce the spatial weight feature map M₃. Finally, the generated M₃ is multiplied element-wise with P₃ to obtain the enhanced feature map of branch 3, P₃⁽³⁾.

The spatial weight feature map M₃ of branch 3 of the ternary coordinate attention module is computed as shown in equation (6).
The enhanced feature maps P₃⁽¹⁾, P₃⁽²⁾, P₃⁽³⁾ obtained from the branches of the ternary coordinate attention module are spliced along the channel dimension, and the channel count is restored with a 1×1 convolution to obtain the final improved multi-level feature P₃′, computed as shown in equation (7).

Mᵢ = δ(Convᵢ(Avgpoolᵢ(P₃)))  (6)

P₃′ = Conv₁ₓ₁(Concat(P₃⁽¹⁾, P₃⁽²⁾, P₃⁽³⁾))  (7)
Note that P₄′, P₅′, P₆′, P₇′ are constructed in the same way as P₃′, finally yielding the improved multi-level trunk features P₃′, P₄′, P₅′, P₆′, P₇′.
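The three branches and the fusion of equation (7) can be sketched as follows. The pooling axes and kernel shapes follow the description above (branch 1 pools to C×H×1, branch 2 to 1×H×W, branch 3 to C×1×W); the module and variable names and the exact placement of batch normalization are assumptions for illustration:

```python
import torch
import torch.nn as nn

class TernaryCoordinateAttention(nn.Module):
    """Sketch of the TCA module: three directional pooling branches, each
    producing a Sigmoid spatial weight map M_i (eq. 6) that gates P_3;
    the gated maps are concatenated and a 1x1 conv restores C (eq. 7)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Sequential(  # branch 1: 3x1 conv on C x H x 1
            nn.Conv2d(channels, channels, (3, 1), padding=(1, 0)),
            nn.BatchNorm2d(channels), nn.Sigmoid())
        self.conv2 = nn.Sequential(  # branch 2: 3x3 conv on 1 x H x W
            nn.Conv2d(1, 1, 3, padding=1),
            nn.BatchNorm2d(1), nn.Sigmoid())
        self.conv3 = nn.Sequential(  # branch 3: 1x3 conv on C x 1 x W
            nn.Conv2d(channels, channels, (1, 3), padding=(0, 1)),
            nn.BatchNorm2d(channels), nn.Sigmoid())
        self.fuse = nn.Conv2d(3 * channels, channels, 1)  # restore channels

    def forward(self, p):
        m1 = self.conv1(p.mean(dim=3, keepdim=True))  # X1: C x H x 1
        m2 = self.conv2(p.mean(dim=1, keepdim=True))  # X2: 1 x H x W
        m3 = self.conv3(p.mean(dim=2, keepdim=True))  # X3: C x 1 x W
        out = torch.cat([p * m1, p * m2, p * m3], dim=1)  # splice branches
        return self.fuse(out)                             # eq. (7)
```

Broadcasting takes care of the element-wise products: each weight map M_i expands over the axis that was pooled away, so every branch re-weights the full C×H×W feature from a different coordinate view.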
It should be understood that the above examples are given only for clarity of illustration and are not intended to limit the embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; it is neither necessary nor possible to exhaust all embodiments here, and obvious variations or modifications derived therefrom remain within the protection scope of the claims of this patent.

Claims (7)

1. A coal mine well wall crack identification method based on a deep neural network is characterized by comprising the following steps:
step 100: the data collection module (102) collects, in chronological order, original coal mine well wall crack video data and original coal mine well wall crack image data from the data (101) gathered by the underground mine robot, splits the original video data into frame-by-frame well wall crack images, and stores the split images together with the original crack image data in the data collection module (102);
step 200: the data processing and enhancement module (103) retrieves the well wall crack image data collected by the data collection module (102), manually removes poor-quality crack images and crack-free images, and performs data enhancement on the screened well wall crack image data;
step 300: the high-quality well wall crack image data obtained after processing by the data processing and enhancement module (103) is input into the data set construction module (104); crack regions are manually annotated on the high-quality well wall images; the annotated image data is divided proportionally into a training set and a test set, completing construction of the well wall crack data set;
step 400: the well wall crack data set built by the data set construction module (104) is fed into the improved EfficientDet model training module (105) for training and learning, and the weight model achieving the highest crack detection accuracy on the test set during training is saved for detecting well wall crack images to be examined;
step 500: the well wall crack image to be examined is sent to the result detection module (106) and detected by the weight model with the highest crack detection accuracy obtained in step 400, and the detection result is displayed and stored.
2. The coal mine well wall crack identification method based on the deep neural network as claimed in claim 1, characterized in that the data processing and enhancement module constructed in step 200 comprises the following substeps:
step 210: first, the collected well wall crack image data is screened manually, removing well wall crack images blurred by the underground robot's movement as well as normal, crack-free images; at least 800 well wall crack images are retained after screening;
step 220: contrast enhancement is applied to the images with the CLAHE algorithm provided by the image-processing library, so that the contrast-enhanced well wall crack image data is better suited to detection by the deep neural network;
step 230: to expand the limited database and train a detection model with strong robustness, data augmentation is applied to the manually screened high-quality well wall crack images, subjecting them to horizontal/vertical flipping, rotation, scaling, cropping, shearing and translation.
3. The coal mine well wall crack identification method based on the deep neural network as claimed in claim 1, characterized in that the step 300 data set construction module (104) comprises the following steps:
step 310: the augmented well wall crack image data is divided proportionally into a training set and a test set; the training set is used to train the model, and the test set is used to evaluate the model's detection accuracy;
step 320: the crack regions present in the divided well wall crack images are annotated with the LabelImg software; care must be taken that the annotation boxes exclude interfering objects, and long well wall cracks are split into several segments for annotation where necessary.
4. The coal mine well wall crack identification method based on the deep neural network as claimed in claim 1, characterized in that the step 400 improved EfficientDet model training module comprises the following steps:
step 410: construct the improved backbone feature extraction network: a multi-space view fusion module and a ternary coordinate attention module are added to the improved backbone feature extraction network; the first three multi-level features still consist of feature layers at different depths of the EfficientNet backbone, named P₃, P₄, P₅ respectively; for the latter two layers, P₆ is formed from P₅ via the multi-space view fusion module and up-sampling, and P₇ is formed from P₆ via the multi-space view fusion module and up-sampling;
step 420: the multi-level trunk features P₃, P₄, P₅, P₆, P₇ constructed in step 410 are fed into the ternary coordinate attention module to form the improved multi-level trunk features P₃′, P₄′, P₅′, P₆′, P₇′;
step 430: the improved multi-level trunk features P₃′, P₄′, P₅′, P₆′, P₇′ are fed into the enhanced feature extraction network BiFPN, where the improved multi-level features are fused with one another to obtain the more comprehensive multi-level enhanced features P₃″, P₄″, P₅″, P₆″, P₇″;
step 440: the multi-level enhanced features P₃″, P₄″, P₅″, P₆″, P₇″ are fed into the decoders Class Prediction Net and Box Prediction Net to obtain the predicted target class and the predicted box position;
step 450: the prediction-box positions obtained in step 440 and the ground-truth boxes carried by the labeled well wall crack data set are input into equation (1) to compute the loss value;
L = α·L_ce + β·L_focal  (1)

where α and β are hyper-parameters, both set to 0.5; L_ce is the binary cross-entropy loss and L_focal is the focal loss, computed as follows:
L_ce = −y·log(p) − (1−y)·log(1−p)  (2)

L_focal = −y·(1−p)^γ·log(p) − (1−y)·p^γ·log(1−p)  (3)

where, in equations (2) and (3), y is the ground-truth label value, p is the model's predicted label value, and γ is a weighting parameter, typically set to 2;
step 460: the loss value computed in step 450 is back-propagated to update the network parameters, and steps 410-460 are repeated until the model converges; based on practical experience, the number of model training rounds is set to 30.
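Steps 440-460 amount to a standard supervised training loop. A minimal sketch under stated assumptions: Adam as the optimizer, a `criterion` callable implementing equation (1), and hypothetical names throughout, none of which are specified by the patent:

```python
import torch

def train(model, loader, criterion, epochs=30, lr=1e-3):
    """Back-propagate the loss and update parameters on each batch,
    for the 30 training rounds set in step 460."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for images, targets in loader:
            loss = criterion(model(images), targets)
            opt.zero_grad()
            loss.backward()      # step 460: back-propagation
            opt.step()           # update network parameters
    return model
```

In practice one would also evaluate on the test set each epoch and keep the checkpoint with the highest detection accuracy, as step 400 of claim 1 requires.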
5. The coal mine well wall crack identification method based on the deep neural network as claimed in claim 4, characterized in that in step 410 a multi-space view fusion module is established, comprising the following substeps:
step 411: the feature P₅ extracted by the EfficientNet backbone is fed into the multi-space view fusion module;
step 412: P₅ ∈ R^(N×C×H×W) is split along the channel dimension into two parts: X₁ ∈ R^(N×αC×H×W) and X₂ ∈ R^(N×(1−α)C×H×W); X₁ keeps the original features and does not take part in the multi-space view fusion operation, so the original feature map remains accessible, while X₂ takes part in the multi-view modeling operation; N, C, H, W and α denote the training batch size, number of channels, image height, image width and channel slicing ratio, respectively;
step 413: first, X₂ is fed into multi-view branch 1, where it passes through a 1×1 convolution, a 3×1 convolution and a 1×3 convolution to obtain X₂⁽¹⁾;

step 414: then X₂ is fed into multi-view branch 2, where it passes through a 1×1 convolution, a 5×1 convolution and a 1×5 convolution to obtain X₂⁽²⁾;

step 415: finally, X₂ is fed into multi-view branch 3, where it passes through a 1×1 convolution, a 7×1 convolution and a 1×7 convolution to obtain X₂⁽³⁾;
Step 416: calculating multi-view spatial feature X using equation (4) 2 ′;
Figure FDA0003881849270000042
where δ is the activation function and β₁, β₂, β₃ are learnable parameters updated through gradient back-propagation;
step 417: the overall fused feature X is computed with equation (5);

X = Concat(X₁, X₂′)  (5)

where Concat denotes splicing features along the channel dimension.
6. The coal mine well wall crack identification method based on the deep neural network as claimed in claim 4, characterized in that the ternary coordinate attention module in step 420 comprises the following substeps:
step 421: the multi-level features P₃, P₄, P₅, P₆, P₇ constructed in step 410 are each fed into the ternary coordinate attention module to obtain the improved multi-level features P₃′, P₄′, P₅′, P₆′, P₇′;
step 422: taking the feature layer P₃ as an example, P₃ ∈ R^(C×H×W) is first input into branch 1 of the ternary coordinate attention module, where adaptive global average pooling along the H direction aggregates information in the vertical direction to obtain X₁ ∈ R^(C×H×1); the pooled features are then fed into a 3×1 convolution to explore well wall crack region features in the horizontal direction, after which batch normalization and Sigmoid activation produce the spatial weight feature map M₁; finally, the generated M₁ is multiplied element-wise with P₃ to obtain the enhanced feature map of branch 1, P₃⁽¹⁾;

the spatial weight feature map M₁ of branch 1 of the ternary coordinate attention module is computed as shown in equation (6);
step 423: second, P₃ ∈ R^(C×H×W) is input into branch 2 of the ternary coordinate attention module, where adaptive global average pooling over the whole spatial extent aggregates information along the channel dimension to obtain X₂ ∈ R^(1×H×W); the pooled features are then fed into a 3×3 convolution to explore crack region features in the spatial direction, after which batch normalization and Sigmoid activation produce the spatial weight feature map M₂; finally, the generated M₂ is multiplied element-wise with P₃ to obtain the enhanced feature map of branch 2, P₃⁽²⁾;

the spatial weight feature map M₂ of branch 2 of the ternary coordinate attention module is computed as shown in equation (6);
step 424: finally, P₃ ∈ R^(C×H×W) is input into branch 3 of the ternary coordinate attention module, where adaptive global average pooling along the W direction aggregates information in the horizontal direction to obtain X₃ ∈ R^(C×1×W); the pooled features are then fed into a 1×3 convolution to explore well wall crack region features in the vertical direction, after which batch normalization and Sigmoid activation produce the spatial weight feature map M₃; finally, the generated M₃ is multiplied element-wise with P₃ to obtain the enhanced feature map of branch 3, P₃⁽³⁾;

the spatial weight feature map M₃ of branch 3 of the ternary coordinate attention module is computed as shown in equation (6);
step 425: the enhanced feature maps P₃⁽¹⁾, P₃⁽²⁾, P₃⁽³⁾ obtained from the branches of the ternary coordinate attention module are spliced along the channel dimension, and the channel count is restored with a 1×1 convolution to obtain the final improved multi-level feature P₃′, computed as shown in equation (7);

Mᵢ = δ(Convᵢ(Avgpoolᵢ(P₃)))  (6)

P₃′ = Conv₁ₓ₁(Concat(P₃⁽¹⁾, P₃⁽²⁾, P₃⁽³⁾))  (7)
step 426: the other level features P₄′, P₅′, P₆′, P₇′ are constructed in the same way as P₃′, finally yielding the improved multi-level trunk features P₃′, P₄′, P₅′, P₆′, P₇′.
7. The coal mine well wall crack identification method based on the deep neural network as claimed in claim 1, wherein the step 500 comprises the following sub-steps:
step 510: the optimal network weights obtained in step 400 are deployed on the robot server side of the result detection module (106);
step 520: the client transmits the well wall crack image to be detected to the robot server side, and obtains and stores the identification and detection result.
CN202211237746.5A 2022-10-10 2022-10-10 Coal mine well wall crack identification method based on deep neural network Pending CN115496982A (en)

Publications (1)

Publication Number: CN115496982A (true), Publication Date: 2022-12-20

Family ID: 84473695

Cited By (5)

* Cited by examiner, † Cited by third party

Publication number Priority date Publication date Assignee Title
CN116524293A * 2023-04-10 2023-08-01 Gate regulator pull rod head missing fault image recognition method and system based on deep learning
CN116524293B * 2023-04-10 2024-01-30 Brake adjuster pull rod head loss fault identification method and system based on deep learning
CN116152117A * 2023-04-18 2023-05-23 Underground low-light image enhancement method based on Transformer
CN116152117B * 2023-04-18 2023-07-21 Underground low-light image enhancement method based on Transformer
CN117710296A * 2023-12-04 2024-03-15 Method for identifying structural defects of well wall of coal mine air shaft


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination