CN113420811B

CN113420811B - Coal rock identification method using deep learning

Info

Publication number: CN113420811B
Application number: CN202110695256.9A
Authority: CN
Inventors: 伍云霞; 邹正阳
Original assignee: China University of Mining and Technology Beijing CUMTB
Current assignee: China University of Mining and Technology Beijing CUMTB
Priority date: 2021-06-23
Filing date: 2021-06-23
Publication date: 2023-04-07
Anticipated expiration: 2041-06-23
Also published as: CN113420811A

Abstract

The invention aims to provide a coal rock identification method using deep learning, which is directly oriented to a coal rock identification task, directly learns a network model from coal rock data set without using ImageNet pre-training by monitoring the coal rock identification accuracy and the identification frame accuracy, has generalization capability on different coal rock interface images, has higher accuracy and robustness, and provides reliable coal rock identification information and auxiliary action for the production processes of automatic excavation, automatic coal discharge, automatic gangue separation and the like.

Description

Coal rock identification method using deep learning

Technical Field

The invention belongs to the field of coal rock identification methods, and particularly relates to a coal rock identification method using deep learning.

Background

Coal rock identification is a method for automatically identifying coal rock objects as coal or rock. In the coal production process, the coal rock identification technology can be widely applied to production links such as roller coal mining, tunneling, top coal caving mining, gangue separation of raw coal and the like, and has important significance for reducing the number of working personnel on the mining working face, reducing the labor intensity of workers, improving the working environment and realizing safe and efficient production of coal mines.

Various coal rock identification methods such as a natural gamma ray detection method, a radar detection method, a stress cutting tooth method, an infrared detection method, an active power monitoring method, a vibration detection method, a sound detection method, a dust detection method, a memory cutting method and the like exist, but the methods have the following problems: (1) various sensors need to be additionally arranged on the existing equipment to acquire information, so that the device is complex in structure and high in cost. (2) The equipment such as a coal mining machine roller, a heading machine and the like is complex in stress, violent in vibration, serious in abrasion, large in dust, difficult in sensor deployment, easy to cause damage to mechanical components, sensors and electric circuits, and poor in reliability. (3) For different types of mechanical equipment, the optimal type of the sensor and the selection of the signal pickup point are greatly different, personalized customization is needed, and the adaptability of the system is poor.

In order to solve the problems, a convolutional neural network technology in image recognition is more and more emphasized and some coal rock recognition methods based on target detection are developed, however, some existing target detection algorithms need to pre-train on data sets such as ImageNet and then perform migration learning to train a final frame of target detection, so that very many pre-training resources are often consumed, and the recognition stability and accuracy are insufficient when target detection is applied in a special occasion similar to a coal rock interface due to the fact that the pre-training is different from the final training data set; in addition, in the conventional convolutional neural network, the parameter quantity is huge and the requirement on hardware is high, which seriously influences the practical deployment of the convolutional neural network.

Accordingly, there is a need for a coal-rock identification method that addresses or at least ameliorates one or more problems inherent in the prior art.

Disclosure of Invention

The invention aims to provide a coal rock recognition method for deep learning, which is directly oriented to a coal rock recognition task, directly learns a network model from coal rock data set without using ImageNet pre-training by monitoring the coal rock recognition accuracy and the recognition frame accuracy, has generalization capability on different coal rock interface images, has higher accuracy and robustness, and provides reliable coal rock recognition information and auxiliary action for the production processes of automatic excavation, automatic coal discharge, automatic gangue selection and the like.

According to one embodiment form, there is provided a coal rock recognition method using deep learning, characterized by including the steps of:

s10, marking the coal serving as the target in the training set, normalizing each picture to be 300x300, and establishing a coal rock recognition target detection data set;

s20, building a compression type coiling layer;

s30, constructing a feature extraction module by using the compression type convolution layer in the S20;

s40, constructing a detection frame generation module;

and S50, training a model and storing the model.

In a further specific but non-limiting form, the specific solving process at said step S20 is as follows:

s22, in one convolution operation in the convolutional neural network, only one convolution kernel F1 with the size of H x W x C is used for carrying out convolution operation on the input IP of H x W x C, wherein the first layer of the depth of the convolution kernel and the first layer of the input depth are subjected to dot product operation in a traversal mode to obtain a local output, wherein H and W respectively represent the length and the width of the convolution kernel, H and W respectively represent the length and the width of the input, and C and C are equal in numerical value and respectively represent the depth of the convolution kernel and the input;

s23, performing dot product traversal on all the C and the C by using the mode in the S22 to obtain a complete output layer OP1;

s25, performing convolution operation on the OP1 by using N convolution kernels with the length and the width of 1x1 and the depth equal to the output dimension of the OP1 to obtain the output of the convolution layer.

In a further specific but non-limiting form, the specific solving process at said step S30 is as follows:

s31, parallel characteristic modules: inputting pictures into two convolution layers and an average pooling layer which are parallel, and superposing the outputs of the three layers to carry out maximum pooling for one time to obtain the output of the module;

s32, a depth module: 8 sets of convolution kernels with step size of 1, length and width of 1x1 and 3x3 are used, wherein the input of one set is x _i(i ∈ _{(0,1,2,3,4,5,6,7))} Output is P _i(xi)(i ∈ _{(0,1,2,3,4,5,6,7))} During operation, the input of the previous convolutional layer is added to the input of the current convolutional layer to be used as the input of the current convolution, and the final parameter output x7= P7 ([ x0, x 1.,. X6,) is obtained through calculation in sequence])；

S33, a dimension-variable module: using a convolution kernel with step size 1 of 1x1 in combination with a 2x2 max pooling kernel with step size 2, wherein the input depth is consistent with the output depth;

and S34, taking the parallel feature module as a first module of the framework, and then connecting a plurality of depth modules in series and combining the depth modules with the dimension-variable module to obtain the construction of a final feature extraction module.

In a further specific but non-limiting form, the specific solving process at said step S40 is as follows:

s41, a detection frame module: in the first three outputs of the new six groups of the S4 characteristic abstraction module, the middle point of the output pixel point (the assumed length is 1) is taken as the center, and three frames are selected (one square has the size of

The length and width of the two rectangles are

1 and 1->

) In the last three outputs, nine frames are selected with the pixel as the center frame (based on the size of the previous three frames, enlarge 1 and make the frame greater than or equal to>

2 times).

In a further specific but non-limiting form, the specific solving process at said step S50 is as follows:

s51, initializing the built model;

s52, predicting the picture 1 by using the initialized model, wherein each target box predicts five values: x, y, w, h and the confidence coefficient of the coal, so as to obtain the position of the target frame in the picture and the probability of the coal existing in the target frame;

s53, reading the labeling data of the picture 1, sorting the predicted values of all the target frames, filtering most of the target frames according to the intersection ratio and the confidence coefficient of the target frames with the actual labeling frames, then performing inhibition filtering on the rest target frames by using a non-maximum value to obtain the frames which best meet the actual condition, and finally outputting the positions and the confidence coefficients of the frames;

s54, predicting regression loss of the position of a target frame in the picture by using Smooth L1 loss and predicting classification loss of the target frame by using SoftMax loss, and then weighing the weight of the regression loss and the classification loss by using alpha;

s55, utilizing a back propagation optimization model until training is finished;

and S56, after the training is finished, storing the model.

Drawings

The embodiments of the figures, which become apparent from the following description, are given by way of example only of at least one preferred but non-limiting embodiment described in connection with the figures.

FIG. 1 is a flow chart of a coal rock recognition framework according to the present invention without pre-training.

FIG. 2 is a flow chart of the construction of a compressive convolution according to the present invention.

Detailed Description

S10, acquiring images of the coal and rock samples, wherein the images comprise images of different shooting viewpoints, different illumination and different shooting distances, and the images contain possible imaging conditions. For each image, firstly updating the size to be 300x300, and then marking the position of the coal by using Labelme;

s20. Build up the compressed convolutional layer using pychar, see fig. 2:

s21, in one convolution operation in the convolutional neural network, only one convolution kernel F1 with the size of H x W x C is used for carrying out convolution operation on the input IP of H x W x C, wherein the first layer of the depth of the convolution kernel and the first layer of the input depth are subjected to dot product operation in a traversal mode to obtain a local output, wherein H and W respectively represent the length and the width of the convolution kernel, H and W respectively represent the length and the width of the input, and C and C are equal in numerical value and respectively represent the depth of the convolution kernel and the input;

s22, performing dot product traversal on all C and C by using the mode in the S22 to obtain a complete output layer OP1;

s23, performing convolution operation on the OP1 by using N convolution kernels with the length and the width of 1x1 and the depth equal to the output dimension of the OP1 to obtain the output of the convolution layer. C2. In the convolution operation of B1, C1 (the first layer of the convolution kernel depth) of F and C1 (the first layer of the input depth) of IP are subjected to dot product operation in sequence and ergodic using step length s (the pixel distance of each movement of F) to obtain a layer of output; traversing all C and C according to the forms of C2 and C2, and Ci and Ci to obtain the complete output layer OP1;

s30, using Pycharm to build a feature extraction module and a detection frame module, and referring to fig. 1:

s31, constructing a parallel characteristic module: inputting pictures into two convolution layers and an average pooling layer which are parallel, and superposing the outputs of the three layers to carry out maximum pooling for one time to obtain the output of the module;

s32, a depth module: 8 sets of convolution kernels with step size of 1, length and width of 1x1 and 3x3 are used, wherein the input of one set is x _i(i ∈ _{(0,1,2,3,4,5,6,7))} Output is P _i(xi)(i ∈ _{(0,1,2,3,4,5,6,7))} During operation, the input of the previous convolutional layer is added to the input of the current convolutional layer to be used as the input of the current convolution, and the final parameter output x7= P7 ([ x0, x 1.,. X6.) is obtained through calculation in sequence])；

S35, a detection frame module: in the first three outputs of the new six groups of the S4 characteristic abstraction module, the middle point of the output pixel point (the assumed length is 1) is taken as the center, and three frames are selected (one square has the size of

The length and width of the two rectangles are

1 and 1->

2 times).

S40, initializing the model, then starting training the model, and using a back propagation optimization model:

s41, loss function is used in training

And optimizing the model.

And S50, storing the model storage model after training is finished.

Claims

1. A coal rock identification method using deep learning is characterized by comprising the following steps:

s10, marking targets in the training set, normalizing each picture to be 300x300, and establishing a coal rock recognition target detection data set;

s20, building a compression type coiling layer;

s30, constructing a feature extraction module by using the compression type convolutional layer in the S20, and is characterized by comprising the following steps of:

s32, a depth module: using 8 groups of convolution kernels with the step length of 1, the length and the width of 1x1 and 3x3, wherein one group of convolution kernels has an input of xi (i belongs to (0, 1,2,3,4,5,6, 7)) and an output of Pi (xi) (i belongs to (0, 1,2,3,4,5,6, 7)), the input of the previous convolution layer is added to the input of the current convolution layer during operation to be used as the input of the current convolution, and the final parameter output x7= P7 ([ x0, x1, \ 8230;, x6 ]) is obtained through sequential calculation;

s34, taking the parallel feature module as a first module of the frame, and then connecting a plurality of depth modules in series and combining the depth modules with the dimension-variable module to obtain a final feature extraction module;

s40, constructing a target frame generating module;

s50, training and storing the model, which comprises the following steps:

s51, initializing the built model;

s52, predicting the picture by using the initialized model, wherein each target box predicts five values: x, y, w, h and the confidence coefficient of the coal, so as to obtain the position of the target frame in the picture and the probability of the coal existing in the target frame;

s53, reading the marking data of the picture, sorting the predicted values of all the target frames, filtering most of the target frames according to the intersection ratio and the confidence coefficient of the target frames with the actual marking frame, then performing inhibition filtering on the rest target frames by using a non-maximum value to obtain the frame which best meets the actual condition, and finally outputting the position and the confidence coefficient of the frame;

s54, predicting regression loss of the position of the target frame in the picture by using Smooth L1 loss, predicting classification loss of the target frame by using SoftMax loss, and then weighing the weight of the regression loss and the classification loss by using alpha;

and S55, optimizing the model by using back propagation until the training is finished, and storing the model.

2. The method according to claim 1, wherein the step S20 is specifically:

s21, in one convolution operation in the convolutional neural network, only one convolution kernel F1 with the size of H x W x C is used for carrying out convolution operation on the input IP of H x W x C, wherein dot product operation is carried out on the first layer of the depth of the convolution kernel and the first layer of the input depth in a traversal mode to obtain a local output, wherein H and W respectively represent the length and the width of the convolution kernel, H and W respectively represent the length and the width of the input, and C and C are equal in numerical value and respectively represent the depth of the convolution kernel and the input;

s22, performing dot product traversal on all C and C by using the mode in the S21 to obtain a complete output layer OP1;

s23, performing convolution operation on the OP1 by using N convolution kernels with length and width being 1x1 and depth being equal to the output dimension of the OP1 to obtain the output of the convolution layer.

3. The method according to claim 1, wherein the step S40 is specifically:

s41, a target frame generation module: in the first three outputs of the S30 feature extraction module, three frames are framed with the midpoint of the output pixel point as the center, and nine frames with three amplification ratios are framed with the pixel as the center in the last three outputs.