CN112633131B

CN112633131B - Underground automatic tracking method based on deep learning video identification

Info

Publication number: CN112633131B
Application number: CN202011509162.XA
Authority: CN
Inventors: 应永华; 杨立春; 林一铭; 余庭
Original assignee: Ningbo Long Wall Fluid Kinetic Sci Tech Co Ltd
Current assignee: Ningbo Long Wall Fluid Kinetic Sci Tech Co Ltd
Priority date: 2020-12-18
Filing date: 2020-12-18
Publication date: 2022-09-13
Anticipated expiration: 2040-12-18
Also published as: CN112633131A

Abstract

The invention discloses an underground automatic tracking method based on deep learning video identification. Before the coal mining machine starts to work, cameras collect videos according to the position of the coal mining machine; the collected videos are identified and classified by a deep learning method, and the video that completely contains the designated part is selected and transmitted to a monitoring display end. After the coal mining machine starts to work, the method judges whether the designated part has left the monitoring range of the camera; if so, the camera is switched and the newly collected video is transmitted to the monitoring display end; if not, monitoring continues until the designated part leaves the monitoring range. In this way the coal mining machine is tracked in real time. The advantages are that, while the coal mining machine works and moves, videos are switched automatically and seamlessly at the monitoring display end and the number of video switches is kept to a minimum, which solves the problem of the monitoring picture flickering, and the monitoring effect suffering, because of frequent video switching; remote monitoring of the coal mining machine is realized, and support is provided for unmanned working-face mining.

Description

Underground automatic tracking method based on deep learning video recognition

Technical Field

The invention relates to the field of fully mechanized mining monitoring, and in particular to an underground automatic tracking method based on deep learning video identification.

Background

At present, as the level of automation in fully mechanized mining continues to improve nationwide, increasing attention is paid to the safety of coal mine personnel and production equipment. Coal mining production equipment generally comprises a coal mining machine, hydraulic supports, end supports and the like. The coal mining machine is the most important production equipment on a fully mechanized mining face and must be monitored by the underground coal mine video monitoring system. However, because the underground environment is harsh, with low illuminance and heavy dust, identifying the coal mining machine with existing conventional machine learning methods yields a very high misjudgment rate. In addition, existing methods for following the coal mining machine with cameras need to switch cameras frequently, so the monitoring picture flickers and the monitoring effect suffers.

Disclosure of Invention

The technical problem to be solved by the invention is to provide an underground automatic tracking method based on deep learning video identification that not only improves the accuracy of identifying the coal mining machine but also reduces the frequency of picture switching, solving the problem of the monitoring picture flickering while the coal mining machine works.

The technical scheme adopted by the invention to solve the above technical problem is as follows: in an underground automatic tracking method based on deep learning video identification, an infrared receiving device is fixedly installed on each hydraulic support; a plurality of cameras facing the fully mechanized mining face are fixedly installed on the hydraulic supports, each camera being fixedly installed on one hydraulic support, with at least six and at most seven hydraulic supports between each pair of adjacent cameras; and an infrared emitting device corresponding to the infrared receiving devices is arranged on the coal mining machine;
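The camera spacing rule above can be illustrated with a short sketch. It assumes that supports are numbered from 0 along the face and that the first camera sits on support 0; the function names `camera_supports` and `supports_between`, and the choice to alternate between gaps of six and seven, are hypothetical and serve only to satisfy the stated constraint.

```python
def camera_supports(n_supports, gaps=(6, 7)):
    """Return the indices of the supports that carry a camera.

    Assumptions (not from the source): supports are numbered
    0..n_supports-1, the first camera is on support 0, and the number of
    camera-free supports between adjacent cameras cycles through `gaps`.
    """
    if any(g not in (6, 7) for g in gaps):
        raise ValueError("the text allows only 6 or 7 supports between cameras")
    positions, idx, k = [], 0, 0
    while idx < n_supports:
        positions.append(idx)
        idx += gaps[k % len(gaps)] + 1  # skip the gap, land on the next camera support
        k += 1
    return positions


def supports_between(positions):
    """Number of camera-free supports between each pair of adjacent cameras."""
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


layout = camera_supports(40)
print(layout)                    # camera-bearing support indices
print(supports_between(layout))  # every entry is 6 or 7
```

Any layout whose `supports_between` values are all 6 or 7 would satisfy the rule equally well; the alternating pattern is just one convenient instance.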

The specific tracking method comprises the following steps:

A video processing module is arranged in an upper computer. Before the coal mining machine starts to work, the infrared receiving devices on the hydraulic supports receive the infrared signal transmitted by the infrared emitting device on the coal mining machine, and the initial position of the coal mining machine is determined; according to this initial position, the video collected by the camera on the hydraulic support whose infrared receiving device can receive the infrared signal is transmitted to the video processing module;

In the video processing module, the video collected before the coal mining machine starts to work is identified and classified by a deep learning method to obtain the video whose picture completely contains the designated part, and this video is transmitted to the monitoring display end;

After the coal mining machine starts to work, the video processing module judges whether the designated part has left the monitoring range of the camera corresponding to the video currently displayed at the monitoring display end. If so, step [...] is executed; if not, the video collected by that camera continues to be transmitted to the monitoring display end until the designated part leaves its monitoring range, and step [...] is then executed;

According to the advancing direction of the working coal mining machine, the videos collected by the cameras from the one next to the camera corresponding to the currently displayed video through to the last camera in the advancing direction are transmitted in sequence to the video processing module, and step [...] is executed;

In the video processing module, the videos collected in step [...] are identified and classified by the deep learning method. When the video collected by one camera cannot completely contain the designated part while the video collected by the preceding camera adjacent to it can, identification and classification of the videos collected in step [...] stops, and the video collected by that preceding camera is transmitted to the monitoring display end. It is then judged whether the coal mining machine is still working: if so, step [...] is executed; if not, step [...] is executed;

It is judged whether the coal mining machine has reached the end of the working face in its advancing direction. If not, the method returns to step [...]; if so, the advancing direction of the coal mining machine is changed and the method returns to step [...];

After the coal mining machine finishes coal mining, monitoring of the coal mining machine ends, completing real-time underground tracking of the coal mining machine.
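The camera-switching logic in the steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: `contains` is a hypothetical callback standing in for the deep learning classifier, cameras are assumed to be indexed 0..n-1 along the face, and the behaviour when the scan runs off the end of the face is an assumption, since the text does not cover that case.

```python
def select_next_camera(current, direction, n_cameras, contains):
    """Scan the cameras ahead of `current` in the advancing direction.

    `contains(i)` (hypothetical) returns True if the video from camera i
    completely contains the designated part. Scanning stops at the first
    camera whose video does not contain the part while the preceding
    camera's video does; that preceding camera is selected.
    """
    step = 1 if direction > 0 else -1
    prev_idx, prev_ok = None, False
    i = current + step
    while 0 <= i < n_cameras:
        ok = contains(i)
        if prev_ok and not ok:
            return prev_idx  # stop scanning: the preceding camera is chosen
        prev_idx, prev_ok = i, ok
        i += step
    # Assumption: if the scan reaches the end of the face while the part is
    # still visible, keep the final camera; otherwise stay on the current one.
    return prev_idx if prev_ok else current


def next_direction(direction, at_end):
    """Reverse the advancing direction once the machine reaches the face end."""
    return -direction if at_end else direction


# Example: six cameras, the part is fully visible in cameras 2 and 3.
visible = {2, 3}
print(select_next_camera(1, +1, 6, lambda i: i in visible))
print(select_next_camera(4, -1, 6, lambda i: i in visible))
```

Choosing the farthest camera that still contains the part, rather than the first one, is what keeps the number of switches low: the machine has to travel through that camera's whole field of view before the next switch is needed.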

The specific method for identifying and classifying the collected video by using the deep learning method comprises the following steps:

a. A camera is used in advance to shoot videos of the movement of the designated part of the coal mining machine; the pre-shot videos are preprocessed to obtain a set of preprocessed video frame images, which is used as the training set, and the designated part is labeled in the training set;

b. A 3D convolutional neural network model to be trained is defined, comprising an input layer, a hardwired layer, a first convolutional layer, a first down-sampling layer, a second convolutional layer, a second down-sampling layer, a third convolutional layer and an output layer. The input layer receives the video frame image data of the training set; the hardwired layer extracts channel information; the first convolutional layer uses two convolution kernels of size [...]; the first down-sampling layer uses a down-sampling window of size [...]; the second convolutional layer uses three convolution kernels of size [...]; the second down-sampling layer uses a down-sampling window of size [...]; the third convolutional layer uses a convolution kernel of size [...]; and the output layer outputs the feature vector obtained by the third convolutional layer;

c. The training set is input into the 3D convolutional neural network model to be trained, yielding a trained 3D convolutional neural network model, and the trained model is used to extract the feature vectors corresponding to the designated part in the training set;

d. The feature vectors corresponding to the designated part in the training set are input into a linear classifier for training, yielding a trained linear classifier;

e. The collected video, after the preprocessing described in step a, is input into the trained 3D convolutional neural network model to extract a feature vector, and the feature vector is input into the trained linear classifier for identification and classification, yielding the video whose picture completely contains the designated part.
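The layer sequence in step b can be sanity-checked by propagating feature-map sizes through the network. The kernel and window sizes are not legible in this text, so the values in `LAYERS` below are purely illustrative assumptions chosen so that a 60 × 40 input reduces to a 1 × 1 map; they should not be read as the patent's values. Only the spatial dimensions are traced, and the helper names are hypothetical.

```python
def conv_out(size, kernel):
    """Output size of a 'valid' convolution (stride 1, no padding), per axis."""
    return tuple(s - k + 1 for s, k in zip(size, kernel))


def pool_out(size, window):
    """Output size of non-overlapping down-sampling (floor division), per axis."""
    return tuple(s // w for s, w in zip(size, window))


# Illustrative (height, width) sizes only -- assumptions, not from the source.
LAYERS = [
    ("conv1", conv_out, (7, 7)),
    ("pool1", pool_out, (2, 2)),
    ("conv2", conv_out, (7, 6)),
    ("pool2", pool_out, (3, 3)),
    ("conv3", conv_out, (7, 4)),
]


def trace_shapes(input_size=(60, 40)):
    """Return [(layer_name, output_size), ...] for the assumed layer stack."""
    shapes, size = [], input_size
    for name, fn, param in LAYERS:
        size = fn(size, param)
        shapes.append((name, size))
    return shapes


for name, size in trace_shapes():
    print(name, size)
```

With these assumed sizes the third convolutional layer collapses each feature map to a single value, which is consistent with the text's description of the output layer emitting a feature vector.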

The preprocessing consists of extracting one picture every 7 frames and cropping the picture to 60 × 40.
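A minimal sketch of this preprocessing, using plain Python lists as stand-ins for image frames. Several details are assumptions because the text does not specify them: that the first frame of each group of 7 is the one kept, that 60 × 40 means width × height, and that the crop is centered. The function names are hypothetical.

```python
def sample_every(frames, step=7):
    """Keep one frame out of every `step` frames (the first of each group)."""
    return frames[::step]


def center_crop(frame, out_w=60, out_h=40):
    """Crop a frame (a list of pixel rows) to out_h rows by out_w columns.

    Assumption: 60 x 40 is width x height and the crop is centered.
    """
    h, w = len(frame), len(frame[0])
    if h < out_h or w < out_w:
        raise ValueError("frame is smaller than the crop size")
    top, left = (h - out_h) // 2, (w - out_w) // 2
    return [row[left:left + out_w] for row in frame[top:top + out_h]]


def preprocess(frames, step=7):
    """Sample one frame per `step` frames, then crop each sampled frame."""
    return [center_crop(f) for f in sample_every(frames, step)]


# Example: 21 synthetic frames of 50 rows x 80 columns, each filled with its index.
frames = [[[i] * 80 for _ in range(50)] for i in range(21)]
out = preprocess(frames)
print(len(out), len(out[0]), len(out[0][0]))
```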

The specific method for judging whether the designated part has left the monitoring range of the camera corresponding to the video currently displayed at the monitoring display end is as follows: the video currently displayed at the monitoring display end is identified and classified by the deep learning method; if the video completely contains the designated part, the designated part has not left the monitoring range of that camera; if the video cannot completely contain the designated part, the designated part has left the monitoring range of that camera.
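The decision rule above reduces to a simple predicate over the classifier output. In the sketch below, `classify` is a hypothetical stand-in for the trained network plus linear classifier and is assumed to return the set of parts that a clip completely contains; the clip representation is likewise invented for the example.

```python
def has_left_range(classify, clip, part="designated part"):
    """True if the classifier does not report `part` as completely contained."""
    return part not in classify(clip)


def first_exit(clips, classify, part="designated part"):
    """Index of the first clip in which the part has left, or None if it never does."""
    for i, clip in enumerate(clips):
        if has_left_range(classify, clip, part):
            return i
    return None


# Example with a fake classifier that simply reads a precomputed field.
clips = [
    {"visible": {"designated part"}},
    {"visible": {"designated part"}},
    {"visible": set()},
]
fake_classify = lambda clip: clip["visible"]
print(first_exit(clips, fake_classify))
```

Until `first_exit` fires, the displayed camera stays the same, which matches the "continue monitoring until leaving the monitoring range" branch in the text.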

The designated part consists of four parts: the front drum of the coal mining machine, the rear drum of the coal mining machine, the front arm and front half body of the coal mining machine, and the rear arm and rear half body of the coal mining machine;

Step [...] is as follows: in the video processing module, the video collected before the coal mining machine starts to work is identified and classified by the deep learning method; the video whose picture completely contains the front drum of the coal mining machine is obtained and transmitted to the monitoring display end; the video whose picture completely contains the rear drum of the coal mining machine is obtained and transmitted to the monitoring display end; the video whose picture completely contains the front arm and front half body of the coal mining machine is obtained and transmitted to the monitoring display end; and the video whose picture completely contains the rear arm and rear half body of the coal mining machine is obtained and transmitted to the monitoring display end;

Step [...] is as follows: after the coal mining machine starts to work, the video processing module judges whether the front drum of the coal mining machine has left the monitoring range of the camera corresponding to the video currently displayed at the monitoring display end; if so, step [...] is executed; if not, the video collected by that camera continues to be transmitted to the monitoring display end until the front drum leaves its monitoring range, and step [...] is then executed. It judges whether the rear drum of the coal mining machine has left the monitoring range of the camera corresponding to the currently displayed video; if so, step [...] is executed; if not, the video collected by that camera continues to be transmitted to the monitoring display end until the rear drum leaves its monitoring range, and step [...] is then executed. It judges whether the front arm and front half body of the coal mining machine have left the monitoring range of the camera corresponding to the currently displayed video; if so, step [...] is executed; if not, the video collected by that camera continues to be transmitted to the monitoring display end until the front arm and front half body leave its monitoring range, and step [...] is then executed. It judges whether the rear arm and rear half body of the coal mining machine have left the monitoring range of the camera corresponding to the currently displayed video; if so, step [...] is executed; if not, the video collected by that camera continues to be transmitted to the monitoring display end until the rear arm and rear half body leave its monitoring range, and step [...] is then executed;

Step [...] is as follows: in the video processing module, the videos collected in step [...] are identified and classified by the deep learning method. When the video collected by one camera cannot completely contain the front drum of the coal mining machine while the video collected by the preceding camera adjacent to it can, identification and classification of the videos collected in step [...] stops, and the video collected by that preceding camera is transmitted to the monitoring display end. When the video collected by one camera cannot completely contain the rear drum of the coal mining machine while the video collected by the preceding camera adjacent to it can, identification and classification of the videos collected in step [...] stops, and the video collected by that preceding camera is transmitted to the monitoring display end. When the video collected by one camera cannot completely contain the front arm and front half body of the coal mining machine while the video collected by the preceding camera adjacent to it can, identification and classification of the videos collected in step [...] stops, and the video collected by that preceding camera is transmitted to the monitoring display end. When the video collected by one camera cannot completely contain the rear arm and rear half body of the coal mining machine while the video collected by the preceding camera adjacent to it can, identification and classification of the videos collected in step [...] stops, and the video collected by that preceding camera is transmitted to the monitoring display end. It is then judged whether the coal mining machine is still working: if so, step [...] is executed; if not, step [...] is executed.

a. A camera is used in advance to shoot videos of the movement of the front drum of the coal mining machine, the rear drum of the coal mining machine, the front arm and front half body of the coal mining machine, and the rear arm and rear half body of the coal mining machine; the pre-shot videos are preprocessed to obtain a set of preprocessed video frame images, which is used as the training set; the designated parts in the training set are labeled by category, with the front drum, the rear drum, the front arm and front half body, and the rear arm and rear half body of the coal mining machine represented by 0, 1, 2 and 3 respectively;

b. A 3D convolutional neural network model to be trained is defined, comprising an input layer, a hardwired layer, a first convolutional layer, a first down-sampling layer, a second convolutional layer, a second down-sampling layer, a third convolutional layer and an output layer. The input layer receives the video frame image data of the training set; the hardwired layer extracts channel information; the first convolutional layer uses two convolution kernels of size [...]; the first down-sampling layer uses a down-sampling window of size [...]; the second convolutional layer uses three convolution kernels of size [...]; the second down-sampling layer uses a down-sampling window of size [...]; and the third convolutional layer uses a convolution kernel of size [...];

c. The training set is input into the 3D convolutional neural network model to be trained, yielding a trained 3D convolutional neural network model, and the trained model is used to extract the feature vectors corresponding to the front drum of the coal mining machine, the rear drum of the coal mining machine, the front arm and front half body of the coal mining machine, and the rear arm and rear half body of the coal mining machine in the training set;

d. The feature vectors corresponding to the front drum, the rear drum, the front arm and front half body, and the rear arm and rear half body of the coal mining machine in the training set are input into a linear classifier for training, yielding a trained linear classifier;

e. The collected video, after preprocessing, is input into the trained 3D convolutional neural network model to extract a feature vector, and the feature vector is input into the trained linear classifier for classification, yielding the video whose picture completely contains the front drum of the coal mining machine, the video whose picture completely contains the rear drum of the coal mining machine, the video whose picture completely contains the front arm and front half body of the coal mining machine, and the video whose picture completely contains the rear arm and rear half body of the coal mining machine.
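Steps a, d and e can be illustrated with a toy four-class linear classifier over feature vectors. The text does not say which linear classifier is used, so the multi-class perceptron below is a swapped-in stand-in; the one-hot feature vectors are synthetic and the names `PART_LABELS`, `predict` and `train_perceptron` are hypothetical.

```python
# Category labels as given in step a.
PART_LABELS = {
    0: "front drum",
    1: "rear drum",
    2: "front arm and front half body",
    3: "rear arm and rear half body",
}


def predict(weights, x):
    """Return the label whose linear score w.x + b is highest."""
    scores = [sum(wi * xi for wi, xi in zip(w[:-1], x)) + w[-1] for w in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def train_perceptron(samples, n_classes=4, epochs=20):
    """Multi-class perceptron (an assumed stand-in for the linear classifier).

    `samples` is a list of (feature_vector, label) pairs; each weight row
    holds one weight per feature plus a trailing bias term.
    """
    dim = len(samples[0][0])
    weights = [[0.0] * (dim + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        for x, y in samples:
            guess = predict(weights, x)
            if guess != y:  # on a mistake, pull the true class up and the guess down
                for j, xj in enumerate(x):
                    weights[y][j] += xj
                    weights[guess][j] -= xj
                weights[y][-1] += 1.0
                weights[guess][-1] -= 1.0
    return weights


# Synthetic, linearly separable feature vectors (one per category).
samples = [([1, 0, 0, 0], 0), ([0, 1, 0, 0], 1), ([0, 0, 1, 0], 2), ([0, 0, 0, 1], 3)]
weights = train_perceptron(samples)
for x, _ in samples:
    print(PART_LABELS[predict(weights, x)])
```

In practice the feature vectors would come from the trained 3D convolutional network rather than being hand-written, and any linear classifier with a comparable interface could be substituted.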

The specific method for judging whether the front drum of the coal mining machine has left the monitoring range of the camera corresponding to the video currently displayed at the monitoring display end is as follows: the video currently displayed at the monitoring display end is identified and classified by the deep learning method; if the video completely contains the front drum of the coal mining machine, the front drum has not left the monitoring range of that camera; if the video cannot completely contain the front drum, the front drum has left the monitoring range of that camera.

The specific method for judging whether the rear drum of the coal mining machine has left the monitoring range of the camera corresponding to the video currently displayed at the monitoring display end is as follows: the video currently displayed at the monitoring display end is identified and classified by the deep learning method; if the video completely contains the rear drum of the coal mining machine, the rear drum has not left the monitoring range of that camera; if the video cannot completely contain the rear drum, the rear drum has left the monitoring range of that camera.

The specific method for judging whether the front arm and front half body of the coal mining machine have left the monitoring range of the camera corresponding to the video currently displayed at the monitoring display end is as follows: the video currently displayed at the monitoring display end is identified and classified by the deep learning method; if the video completely contains the front arm and front half body of the coal mining machine, they have not left the monitoring range of that camera; if the video cannot completely contain the front arm and front half body, they have left the monitoring range of that camera.

The specific method for judging whether the rear arm and rear half body of the coal mining machine have left the monitoring range of the camera corresponding to the video currently displayed at the monitoring display end is as follows: the video currently displayed at the monitoring display end is identified and classified by the deep learning method; if the video completely contains the rear arm and rear half body of the coal mining machine, they have not left the monitoring range of that camera; if the video cannot completely contain the rear arm and rear half body, they have left the monitoring range of that camera.
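When the designated part is split into four parts, each part can be thought of as having its own displayed camera that is updated independently. The sketch below illustrates that reading under stated assumptions: `contains(camera, part)` is a hypothetical classifier callback, cameras are indexed 0..n-1 along the face, and the end-of-face behaviour is an assumption.

```python
PARTS = (
    "front drum",
    "rear drum",
    "front arm and front half body",
    "rear arm and rear half body",
)


def update_views(views, direction, n_cameras, contains):
    """Return the camera to display for each part after one update.

    `views` maps each part to the camera currently displayed for it.
    A part that is still completely contained keeps its camera (no switch);
    otherwise cameras ahead are scanned and the last one that still
    contains the part before the first failure is selected.
    """
    step = 1 if direction > 0 else -1
    new_views = {}
    for part in PARTS:
        cam = views[part]
        if contains(cam, part):
            new_views[part] = cam
            continue
        prev_idx, prev_ok = None, False
        i = cam + step
        while 0 <= i < n_cameras:
            ok = contains(i, part)
            if prev_ok and not ok:
                break  # the preceding camera is the one to display
            prev_idx, prev_ok = i, ok
            i += step
        # Assumption: stay on the current camera if no camera ahead contains the part.
        new_views[part] = prev_idx if prev_ok else cam
    return new_views


# Example visibility table (invented): part -> cameras that fully contain it.
visible = {
    "front drum": {3, 4},
    "rear drum": {2, 3},
    "front arm and front half body": {3},
    "rear arm and rear half body": {2},
}
views = {part: 2 for part in PARTS}
print(update_views(views, +1, 6, lambda cam, part: cam in visible[part]))
```

Only the parts that have actually left their camera's range trigger a switch, so the pictures for the other parts stay on the same camera.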

The camera is a video acquisition camera that shoots video of the coal mining machine, whether stationary or moving, at 7 fps.

Compared with the prior art, the invention has the following advantages. Compared with video identification and classification by conventional machine learning methods, identification and classification by a 3D convolutional neural network can handle more complex object motion, so the mining condition of the coal mining machine on the working face is identified better. While the coal mining machine works and moves, the video collected by the preceding camera adjacent to the camera whose video cannot completely contain the designated part is selected and transmitted to the monitoring display end, so videos are switched automatically and seamlessly at the monitoring display end and the number of video switches is kept to a minimum. This solves the problem of the monitoring picture flickering, and the monitoring effect suffering, because of frequent video switching; it realizes remote monitoring of the coal mining machine and provides technical support for unmanned working-face mining.

Drawings

FIG. 1 is a schematic flow chart of the present invention.

Detailed Description

The invention is described in further detail below with reference to the accompanying drawing and the embodiments.

Embodiment one: as shown in FIG. 1, in an underground automatic tracking method based on deep learning video identification, an infrared receiving device is fixedly installed on each hydraulic support; a plurality of cameras facing the fully mechanized mining face are fixedly installed on the hydraulic supports, each camera being fixedly installed on one hydraulic support, with at least six and at most seven hydraulic supports between each pair of adjacent cameras; and an infrared emitting device corresponding to the infrared receiving devices is arranged on the coal mining machine;

The specific tracking method comprises the following steps:

A video processing module is arranged in an upper computer. Before the coal mining machine starts to work, the infrared receiving devices on the hydraulic supports receive the infrared signal transmitted by the infrared emitting device on the coal mining machine, and the initial position of the coal mining machine is determined; according to this initial position, the video collected by the camera on the hydraulic support whose infrared receiving device can receive the infrared signal is transmitted to the video processing module;

In the video processing module, the video collected before the coal mining machine starts to work is identified and classified by a deep learning method to obtain the video whose picture completely contains the designated part, and this video is transmitted to the monitoring display end;

The preprocessing consists of extracting one picture every 7 frames and cropping the picture to 60 × 40;

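the first down-sampling layer uses a down-sampling window of size [...]; the second convolutional layer uses three convolution kernels of size [...]; the second down-sampling layer uses a down-sampling window of size [...]; and the third convolutional layer uses a convolution kernel of size [...];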

e. The collected video, after the preprocessing described in step a, is input into the trained 3D convolutional neural network model to extract a feature vector, and the feature vector is input into the trained linear classifier for identification and classification, yielding the video whose picture completely contains the designated part;

The specific method for judging whether the designated part has left the monitoring range of the camera corresponding to the video currently displayed at the monitoring display end is as follows: the video currently displayed at the monitoring display end is identified and classified by the deep learning method; if the video completely contains the designated part, the designated part has not left the monitoring range of that camera; if the video cannot completely contain the designated part, the designated part has left the monitoring range of that camera;

In the video processing module, for the videos collected in step [...], [...]; if not, step [...] is executed;

It is judged whether the coal mining machine has reached the end of the working face in its advancing direction; if not, the method returns to step [...];

After the coal mining machine finishes coal mining, monitoring of the coal mining machine ends, completing real-time underground tracking of the coal mining machine.

Embodiment two: the remainder is the same as embodiment one, except that the designated part consists of four parts: the front drum of the coal mining machine, the rear drum of the coal mining machine, the front arm and front half body of the coal mining machine, and the rear arm and rear half body of the coal mining machine;

Step [...] is as follows: in the video processing module, the video collected before the coal mining machine starts to work is identified and classified by the deep learning method; the video whose picture completely contains the front drum of the coal mining machine is obtained and transmitted to the monitoring display end; the video whose picture completely contains the rear drum of the coal mining machine is obtained and transmitted to the monitoring display end; the video whose picture completely contains the front arm and front half body of the coal mining machine is obtained and transmitted to the monitoring display end; and the video whose picture completely contains the rear arm and rear half body of the coal mining machine is obtained and transmitted to the monitoring display end;

a. A camera is used in advance to shoot videos of the movement of the front drum of the coal mining machine, the rear drum of the coal mining machine, the front arm and front half body of the coal mining machine, and the rear arm and rear half body of the coal mining machine; the pre-shot videos are preprocessed to obtain a set of preprocessed video frame images, which is used as the training set; the designated parts in the training set are labeled by category, with the front drum, the rear drum, the front arm and front half body, and the rear arm and rear half body of the coal mining machine represented by 0, 1, 2 and 3 respectively;

b. A 3D convolutional neural network model to be trained is defined, comprising an input layer, a hardwired layer, a first convolutional layer, a first down-sampling layer, a second convolutional layer, a second down-sampling layer, a third convolutional layer and an output layer. The input layer receives the video frame image data of the training set; the hardwired layer extracts channel information; the first convolutional layer uses two convolution kernels of size [...]; the first down-sampling layer uses a down-sampling window of size [...]; the second convolutional layer uses three convolution kernels of size [...]; the second down-sampling layer uses a down-sampling window of size [...]; and the third convolutional layer uses a convolution kernel of size [...];

c. The training set is input into the 3D convolutional neural network model to be trained, yielding a trained 3D convolutional neural network model, and the trained model is used to extract the feature vectors corresponding to the front drum of the coal mining machine, the rear drum of the coal mining machine, the front arm and front half body of the coal mining machine, and the rear arm and rear half body of the coal mining machine in the training set;

e. The collected video, after preprocessing, is input into the trained 3D convolutional neural network model to extract a feature vector, and the feature vector is input into the trained linear classifier for classification, yielding the video whose picture completely contains the front drum of the coal mining machine, the video whose picture completely contains the rear drum of the coal mining machine, the video whose picture completely contains the front arm and front half body of the coal mining machine, and the video whose picture completely contains the rear arm and rear half body of the coal mining machine;

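in the third step, after the coal mining machine starts to work, judging in the video processing module whether the front roller of the coal mining machine has left the monitoring range of the camera corresponding to the video displayed by the current monitoring display end, and if so, executing the fourth step; if not, continuing to transmit the video collected by the camera corresponding to the video displayed by the current monitoring display end to the monitoring display end until the front roller of the coal mining machine leaves the monitoring range of that camera, and then executing the fourth step; judging whether the rear roller of the coal mining machine has left the monitoring range of the camera corresponding to the video displayed by the current monitoring display end, and if so, executing the fourth step; if not, continuing to transmit the video collected by that camera to the monitoring display end until the rear roller of the coal mining machine leaves the monitoring range of that camera, and then executing the fourth step; judging whether the front arm and front half body of the coal mining machine have left the monitoring range of the camera corresponding to the video displayed by the current monitoring display end, and if so, executing the fourth step; if not, continuing to transmit the video collected by that camera to the monitoring display end until the front arm and front half body of the coal mining machine leave the monitoring range of that camera, and then executing the fourth step; judging whether the rear arm and rear half body of the coal mining machine have left the monitoring range of the camera corresponding to the video displayed by the current monitoring display end, and if so, executing the fourth step; if not, continuing to transmit the video collected by that camera to the monitoring display end until the rear arm and rear half body of the coal mining machine leave the monitoring range of that camera, and then executing the fourth step;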

The specific method for judging whether the front roller of the coal mining machine leaves the monitoring range of the camera corresponding to the video displayed by the current monitoring display end comprises the following steps: aiming at a video displayed by a current monitoring display end, identifying and classifying the acquired video by using a deep learning method, wherein if the video can completely contain a front roller of a coal mining machine, the front roller of the coal mining machine does not leave a monitoring range of a camera corresponding to the video displayed by the current monitoring display end; if the video cannot completely contain the front roller of the coal mining machine, the front roller of the coal mining machine leaves the monitoring range of the camera corresponding to the video displayed by the current monitoring display end; the specific method for judging whether the rear roller of the coal mining machine leaves the monitoring range of the camera corresponding to the video displayed by the current monitoring display end comprises the following steps: aiming at the video displayed by the current monitoring display end, identifying and classifying the acquired video by using a deep learning method, wherein if the video can completely contain a rear roller of the coal mining machine, the rear roller of the coal mining machine does not leave the monitoring range of a camera corresponding to the video displayed by the current monitoring display end; if the rear roller of the coal mining machine cannot be completely contained in the video, the rear roller of the coal mining machine leaves the monitoring range of the camera corresponding to the video displayed by the current monitoring display end; the specific method for judging whether the front arm and the front half body of the coal mining machine leave the monitoring range of the camera corresponding to the video displayed by the current monitoring display end comprises the 
following steps: aiming at the video displayed by the current monitoring display end, identifying and classifying the acquired video by using a deep learning method, wherein if the video can completely contain the front arm and the front half body of the coal mining machine, the front arm and the front half body of the coal mining machine do not leave the monitoring range of a camera corresponding to the video displayed by the current monitoring display end; if the video cannot completely contain the front arm and the front half body of the coal mining machine, the front arm and the front half body of the coal mining machine leave the monitoring range of the camera corresponding to the video displayed by the current monitoring display end; the specific method for judging whether the rear arm and the rear half body of the coal mining machine leave the monitoring range of the camera corresponding to the video displayed by the current monitoring display end comprises the following steps: aiming at the video displayed by the current monitoring display end, the acquired video is identified and classified by using a deep learning method, and if the video can completely contain the rear arm and the rear half body of the coal mining machine, the rear arm and the rear half body of the coal mining machine do not leave the monitoring range of the camera corresponding to the video displayed by the current monitoring display end; if the video cannot completely contain the rear arm and the rear half body of the coal mining machine, the rear arm and the rear half body of the coal mining machine leave the monitoring range of the camera corresponding to the video displayed by the current monitoring display end;
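
The judgment described above reduces to watching the classification result for the currently displayed camera and reacting when the part is no longer completely contained in the picture. A minimal sketch follows, assuming the per-clip classification results are already available as a sequence of booleans; the function name `first_departure` is hypothetical.

```python
from typing import Iterable, Optional


def first_departure(fully_contained: Iterable[bool]) -> Optional[int]:
    """Return the index of the first clip in which the part is no longer
    completely contained in the picture, or None if it never leaves."""
    for index, contained in enumerate(fully_contained):
        if not contained:
            return index
    return None
```

While the function has not yet found a departure, the current camera's video would simply continue to be transmitted to the monitoring display end.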

in the fifth step, in the video processing module, for the videos collected in the fourth step, identifying and classifying the collected videos by using the deep learning method; when the video collected by one camera cannot completely contain the front roller of the coal mining machine and the video collected by the previous camera adjacent to that camera can completely contain the front roller of the coal mining machine, stopping the identification and classification of the videos collected in the fourth step, and transmitting to the monitoring display end the video collected by the previous camera adjacent to the camera corresponding to the video which cannot completely contain the front roller of the coal mining machine; when the video collected by one camera cannot completely contain the rear roller of the coal mining machine and the video collected by the previous camera adjacent to that camera can completely contain the rear roller of the coal mining machine, stopping the identification and classification of the videos collected in the fourth step, and transmitting to the monitoring display end the video collected by the previous camera adjacent to the camera corresponding to the video which cannot completely contain the rear roller of the coal mining machine; when the video collected by one camera cannot completely contain the front arm and front half body of the coal mining machine and the video collected by the previous camera adjacent to that camera can completely contain the front arm and front half body of the coal mining machine, stopping the identification and classification of the videos collected in the fourth step, and transmitting to the monitoring display end the video collected by the previous camera adjacent to the camera corresponding to the video which cannot completely contain the front arm and front half body of the coal mining machine; when the video collected by one camera cannot completely contain the rear arm and rear half body of the coal mining machine and the video collected by the previous camera adjacent to that camera can completely contain the rear arm and rear half body of the coal mining machine, stopping the identification and classification of the videos collected in the fourth step, and transmitting to the monitoring display end the video collected by the previous camera adjacent to the camera corresponding to the video which cannot completely contain the rear arm and rear half body of the coal mining machine; then judging whether the coal mining machine is still working, and if so, executing the sixth step; if not, executing the seventh step;
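
The camera selection described for this step can be sketched as a single scan over the cameras in the advancing direction, stopping at the first camera whose video cannot completely contain the part while the previous adjacent camera's video can. The function name `select_camera` and the `contains_part` callback are hypothetical, and the behavior when no such transition is found (falling back to the final camera if it still contains the part, otherwise returning nothing) is an assumption the text does not address.

```python
from typing import Callable, Optional, Sequence


def select_camera(candidates: Sequence[int],
                  contains_part: Callable[[int], bool]) -> Optional[int]:
    """Scan cameras in advancing order and return the camera to display.

    `candidates` lists camera identifiers from the next adjacent camera
    through to the final camera in the advancing direction.
    """
    previous: Optional[int] = None
    previous_contains = False
    for camera in candidates:
        contains = contains_part(camera)
        if previous_contains and not contains:
            # Transition found: stop classifying and show the previous camera.
            return previous
        previous, previous_contains = camera, contains
    # Assumption: no transition found; use the final camera only if it
    # still completely contains the part.
    return previous if previous_contains else None
```

Choosing the farthest camera that still contains the part is consistent with the stated aim of keeping the number of video switches low, since the part then stays in view for as long as possible before the next switch.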

The 3D convolutional neural network comprises, connected in sequence: an input layer, a hardwired layer, a first convolutional layer, a first down-sampling layer, a second convolutional layer, a second down-sampling layer, a third convolutional layer and an output layer;

for the input layer, the input end of the input layer receives 7 consecutive video frame images of size 60 × 40, and the output end of the input layer outputs these 7 consecutive video frame images of size 60 × 40 to the hardwired layer;

for the hardwired layer, the input end of the hardwired layer receives the 7 consecutive video frame images of size 60 × 40 output by the output end of the input layer, and extracts 5 channels of information from each video frame image, namely gray scale, horizontal gradient, vertical gradient, x optical flow and y optical flow; the output end of the hardwired layer outputs 33 feature maps, each of size 60 × 40;

for the first convolutional layer, the input end of the first convolutional layer receives all the feature maps output by the output end of the hardwired layer, and performs convolution with two 7 × 7 × 3 3D convolution kernels; the output end of the first convolutional layer outputs 23 × 2 feature maps, each of size 54 × 34, where 23 × 2 denotes two different sets of feature maps;

for the first down-sampling layer, the input end of the first down-sampling layer receives all the feature maps output by the output end of the first convolutional layer, and performs down-sampling with a 2 × 2 down-sampling window; the output end of the first down-sampling layer outputs 23 × 2 feature maps, each of size 27 × 17;

for the second convolutional layer, the input end of the second convolutional layer receives all the feature maps output by the output end of the first down-sampling layer, and performs convolution with three 7 × 6 × 3 3D convolution kernels; the output end of the second convolutional layer outputs 13 × 6 feature maps, each of size 21 × 12, where 13 × 6 denotes six different sets of feature maps;

for the second down-sampling layer, the input end of the second down-sampling layer receives all the feature maps output by the output end of the second convolutional layer, and performs down-sampling with a 3 × 3 down-sampling window; the output end of the second down-sampling layer outputs 13 × 6 feature maps, each of size 7 × 4;

for the third convolutional layer, the input end of the third convolutional layer receives all the feature maps output by the output end of the second down-sampling layer, and performs convolution with a 7 × 4 2D convolution kernel; the output end of the third convolutional layer outputs 128 feature maps, each of size 1 × 1;

For the output layer, the input end of the output layer receives all the feature maps output by the output end of the third convolutional layer, and the output end outputs all the feature vectors corresponding to all the feature maps.
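
The feature map counts and sizes through the network can be checked with simple arithmetic. The sketch below is a shape calculation only, not an implementation of the network. It assumes unpadded, stride-1 convolution applied to each of the five channels separately, non-overlapping down-sampling windows, and kernel sizes of 7 × 7 × 3 and 7 × 6 × 3 for the two 3D convolutional layers (the first of these is an assumption, since the printed size is incomplete). All function names are hypothetical.

```python
from typing import List, Tuple

# (first spatial size, second spatial size, temporal depth of each channel)
Shape = Tuple[int, int, List[int]]


def hardwired(frames: int, d1: int, d2: int) -> Shape:
    # Gray scale and the two gradients exist for every frame; each optical
    # flow channel needs a pair of consecutive frames, hence frames - 1.
    return d1, d2, [frames, frames, frames, frames - 1, frames - 1]


def conv3d(shape: Shape, k1: int, k2: int, kt: int) -> Shape:
    # Unpadded convolution shrinks every dimension by (kernel size - 1).
    d1, d2, depths = shape
    return d1 - k1 + 1, d2 - k2 + 1, [t - kt + 1 for t in depths]


def downsample(shape: Shape, window: int) -> Shape:
    # Spatial down-sampling only; the temporal depth is unchanged.
    d1, d2, depths = shape
    return d1 // window, d2 // window, depths


def map_count(shape: Shape, sets: int = 1) -> int:
    # Total feature maps = summed temporal depth times number of kernel sets.
    return sum(shape[2]) * sets


h1 = hardwired(7, 60, 40)
c1 = conv3d(h1, 7, 7, 3)   # two kernels, so 2 sets of maps
s1 = downsample(c1, 2)
c2 = conv3d(s1, 7, 6, 3)   # three kernels per set, so 6 sets of maps
s2 = downsample(c2, 3)
final_size = (s2[0] - 7 + 1, s2[1] - 4 + 1)  # 7 x 4 2D kernel
```

Under these assumptions the hardwired layer yields 33 feature maps, which agrees with the figure stated in the text, and the 7 × 4 kernel of the third convolutional layer exactly covers the 7 × 4 maps left by the second down-sampling layer.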

Claims

1. An underground automatic tracking method based on deep learning video identification is characterized in that an infrared receiving device is fixedly installed on each hydraulic support, a plurality of cameras facing a fully mechanized mining face are fixedly installed on the hydraulic supports, each camera is correspondingly and fixedly installed on one hydraulic support, at least six hydraulic supports and at most seven hydraulic supports are arranged between every two adjacent cameras in each group at intervals, and an infrared transmitting device corresponding to the infrared receiving device is arranged on a coal mining machine;

the specific machine following method comprises the following steps:

firstly, a video processing module is arranged in an upper computer, before a coal mining machine starts to work, an infrared receiving device on a hydraulic support receives an infrared signal transmitted by an infrared transmitting device on the coal mining machine, the initial position of the coal mining machine is determined, and a video collected by a camera on the hydraulic support corresponding to the infrared receiving device capable of receiving the infrared signal is transmitted to the video processing module according to the initial position of the coal mining machine;

secondly, in the video processing module, aiming at the video before the coal mining machine starts to work, the collected video is identified and classified by using a deep learning method to obtain the video which can completely contain the appointed part in the picture, and the video is transmitted to the monitoring display end,

the specific method for identifying and classifying the collected videos by using the deep learning method to obtain the videos completely containing the designated parts in the picture comprises the following steps:

b, defining a 3D convolutional neural network model to be trained, which comprises an input layer, a hardwired layer, a first convolutional layer, a first down-sampling layer, a second convolutional layer, a second down-sampling layer, a third convolutional layer and an output layer, wherein the input layer is used for inputting the video frame image data in the training set, the hardwired layer is used for extracting channel information, the first convolutional layer adopts two 7 × 7 × 3 3D convolution kernels, the first down-sampling layer adopts a 2 × 2 down-sampling window, the second convolutional layer adopts three 7 × 6 × 3 3D convolution kernels, the second down-sampling layer adopts a 3 × 3 down-sampling window, the third convolutional layer adopts a 7 × 4 2D convolution kernel, and the output layer is used for outputting the feature vectors obtained by the third convolutional layer;

inputting the feature vectors corresponding to the specified parts in the training set into a linear classifier for training to obtain a trained linear classifier;

thirdly, after the coal mining machine starts to work, judging in the video processing module whether the designated part has left the monitoring range of the camera corresponding to the video displayed by the current monitoring display end, and if so, executing the fourth step; if not, continuing to transmit the video collected by the camera corresponding to the video displayed by the current monitoring display end to the monitoring display end until the designated part leaves the monitoring range of that camera, and then executing the fourth step;

fourthly, according to the advancing direction of the coal mining machine, transmitting to the video processing module, in sequence, the videos collected by the cameras from the next camera adjacent to the camera corresponding to the video displayed by the current monitoring display end through to the final camera in the advancing direction, and executing the fifth step;

fifthly, in the video processing module, for the videos collected in the fourth step, identifying and classifying the collected videos by using the deep learning method; when the video collected by one camera cannot completely contain the designated part and the video collected by the previous camera adjacent to that camera can completely contain the designated part, stopping the identification and classification of the videos collected in the fourth step, and transmitting to the monitoring display end the video collected by the previous camera adjacent to the camera corresponding to the video which cannot completely contain the designated part; then judging whether the coal mining machine is still working, and if so, executing the sixth step; if not, executing the seventh step;

sixthly, judging whether the coal mining machine has reached the end of its working advancing direction; if not, returning to the third step; if so, changing the working advancing direction of the coal mining machine and returning to the third step;

and seventhly, after the coal mining machine finishes coal mining work, finishing monitoring on the coal mining machine, and finishing real-time tracking of the coal mining machine underground.

2. The method according to claim 1, wherein the preprocessing comprises cutting a picture from every 7 frames and cutting the picture to 60 × 40.

3. The underground automatic tracking method based on deep learning video identification as claimed in claim 1, wherein the specific method for judging whether the designated part has left the monitoring range of the camera corresponding to the video displayed by the current monitoring display end comprises: for the video displayed by the current monitoring display end, identifying and classifying the collected video by using the deep learning method; if the video can completely contain the designated part, the designated part has not left the monitoring range of the camera corresponding to the video displayed by the current monitoring display end; and if the video cannot completely contain the designated part, the designated part has left that monitoring range.

4. The underground automatic machine following method based on deep learning video identification as claimed in claim 1, wherein the designated part comprises four parts of a front roller of the coal mining machine, a rear roller of the coal mining machine, a front arm and a front half body of the coal mining machine and a rear arm and a rear half body of the coal mining machine;

in the second step, in the video processing module, aiming at the video before the coal mining machine starts to work, the collected video is identified and classified by using a deep learning method, the video which can completely contain the front roller of the coal mining machine in the picture is obtained, and the video is transmitted to the monitoring display end; obtaining a video which can completely contain a rear roller of the coal mining machine in a picture, and transmitting the video to a monitoring display end; obtaining a video which can completely contain the front arm and the front half body of the coal mining machine in a picture, and transmitting the video to a monitoring display end; obtaining a video which can completely contain a rear arm and a rear half body of the coal mining machine in a picture, and transmitting the video to a monitoring display end;

in the third step, after the coal mining machine starts to work, whether a front roller of the coal mining machine leaves a monitoring range of a camera corresponding to a video displayed by a current monitoring display end is judged in a video processing module, and if yes, the fourth step is executed; if not, continuously transmitting the video acquired by the camera corresponding to the video displayed by the current monitoring display end to the monitoring display end until the front roller of the coal mining machine leaves the monitoring range of the camera corresponding to the video displayed by the current monitoring display end and executing the step IV; judging whether a rear roller of the coal mining machine leaves a monitoring range of a camera corresponding to a video displayed by a current monitoring display end, if so, executing a fourth step; if not, continuously transmitting the video acquired by the camera corresponding to the video displayed by the current monitoring display end to the monitoring display end until the rear roller of the coal mining machine leaves the monitoring range of the camera corresponding to the video displayed by the current monitoring display end and executing the fourth step; judging whether the front arm and the front half body of the coal mining machine leave the monitoring range of a camera corresponding to the video displayed by the current monitoring display end, if so, executing a fourth step; if not, continuously transmitting the video acquired by the camera corresponding to the video displayed by the current monitoring display end to the monitoring display end until the front arm and the front half body of the coal mining machine leave the monitoring range of the camera corresponding to the video displayed by the current monitoring display end and executing the step IV; judging whether the rear arm and the rear half body of the coal mining machine leave the monitoring range of the camera corresponding to the video 
displayed by the current monitoring display end, if so, executing a fourth step; if not, continuously transmitting the video collected by the camera corresponding to the video displayed by the current monitoring display end to the monitoring display end until the rear arm and the rear half body of the coal mining machine leave the monitoring range of the camera corresponding to the video displayed by the current monitoring display end and executing the fourth step;

in the fifth step, in the video processing module, for the videos collected in the fourth step, the collected videos are identified and classified by using the deep learning method; when the video collected by one camera cannot completely contain the front roller of the coal mining machine and the video collected by the previous camera adjacent to that camera can completely contain the front roller of the coal mining machine, the identification and classification of the videos collected in the fourth step are stopped, and the video collected by the previous camera adjacent to the camera corresponding to the video which cannot completely contain the front roller of the coal mining machine is transmitted to the monitoring display end; when the video collected by one camera cannot completely contain the rear roller of the coal mining machine and the video collected by the previous camera adjacent to that camera can completely contain the rear roller of the coal mining machine, the identification and classification of the videos collected in the fourth step are stopped, and the video collected by the previous camera adjacent to the camera corresponding to the video which cannot completely contain the rear roller of the coal mining machine is transmitted to the monitoring display end; when the video collected by one camera cannot completely contain the front arm and front half body of the coal mining machine and the video collected by the previous camera adjacent to that camera can completely contain the front arm and front half body of the coal mining machine, the identification and classification of the videos collected in the fourth step are stopped, and the video collected by the previous camera adjacent to the camera corresponding to the video which cannot completely contain the front arm and front half body of the coal mining machine is transmitted to the monitoring display end; when the video collected by one camera cannot completely contain the rear arm and rear half body of the coal mining machine and the video collected by the previous camera adjacent to that camera can completely contain the rear arm and rear half body of the coal mining machine, the identification and classification of the videos collected in the fourth step are stopped, and the video collected by the previous camera adjacent to the camera corresponding to the video which cannot completely contain the rear arm and rear half body of the coal mining machine is transmitted to the monitoring display end; whether the coal mining machine is still working is then judged, and if so, the sixth step is executed; if not, the seventh step is executed.

5. The downhole automatic tracking method based on deep learning video recognition according to claim 4, characterized in that the specific method for recognizing and classifying the collected videos by using the deep learning method is as follows:

6. The method for automatically following the underground coal mining machine based on the deep learning video identification as claimed in claim 4, wherein the specific method for judging whether the front roller of the coal mining machine leaves the monitoring range of the camera corresponding to the video displayed by the current monitoring display end comprises the following steps: aiming at a video displayed by a current monitoring display end, identifying and classifying the acquired video by using a deep learning method, wherein if the video can completely contain a front roller of a coal mining machine, the front roller of the coal mining machine does not leave a monitoring range of a camera corresponding to the video displayed by the current monitoring display end; if the video cannot completely contain the front roller of the coal mining machine, the front roller of the coal mining machine leaves the monitoring range of the camera corresponding to the video displayed by the current monitoring display end; the specific method for judging whether the rear roller of the coal mining machine leaves the monitoring range of the camera corresponding to the video displayed by the current monitoring display end comprises the following steps: aiming at the video displayed by the current monitoring display end, identifying and classifying the acquired video by using a deep learning method, wherein if the video can completely contain a rear roller of the coal mining machine, the rear roller of the coal mining machine does not leave the monitoring range of a camera corresponding to the video displayed by the current monitoring display end; if the video cannot completely contain the rear roller of the coal mining machine, the rear roller of the coal mining machine leaves the monitoring range of the camera corresponding to the video displayed by the current monitoring display end; the specific method for judging whether the front arm and the front half body of the coal 
mining machine leave the monitoring range of the camera corresponding to the video displayed by the current monitoring display end comprises the following steps: aiming at the video displayed by the current monitoring display end, identifying and classifying the acquired video by using a deep learning method, wherein if the video can completely contain the front arm and the front half body of the coal mining machine, the front arm and the front half body of the coal mining machine do not leave the monitoring range of a camera corresponding to the video displayed by the current monitoring display end; if the video cannot completely contain the front arm and the front half body of the coal mining machine, the front arm and the front half body of the coal mining machine leave the monitoring range of the camera corresponding to the video displayed by the current monitoring display end; the specific method for judging whether the rear arm and the rear half body of the coal mining machine leave the monitoring range of the camera corresponding to the video displayed by the current monitoring display end comprises the following steps: aiming at the video displayed by the current monitoring display end, identifying and classifying the acquired video by using a deep learning method, wherein if the video can completely contain the rear arm and the rear half body of the coal mining machine, the rear arm and the rear half body of the coal mining machine do not leave the monitoring range of a camera corresponding to the video displayed by the current monitoring display end; and if the video cannot completely contain the rear arm and the rear half body of the coal mining machine, the rear arm and the rear half body of the coal mining machine leave the monitoring range of the camera corresponding to the video displayed by the current monitoring display end.

7. The method for automatically following the underground coal mining machine based on the deep learning video identification is characterized in that the camera is a video acquisition camera, and the camera shoots a video of the coal mining machine at a static state or a moving state at 7 fps.