CN111401139B - Method for obtaining mine underground equipment position based on character image intelligent recognition - Google Patents
- Publication number
- CN111401139B (application CN202010114364.8A)
- Authority
- CN
- China
- Prior art keywords
- image
- character
- convolution
- network
- values
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Abstract
The invention provides a method for obtaining the position of underground mine equipment based on intelligent recognition of character images. A plurality of character cards are installed at intervals beside the underground tracks, a plurality of characters are marked on each card, and the track position corresponding to each character number is recorded in the database of the production scheduling center. Image acquisition equipment is installed on an unmanned locomotive under the mine and, while the locomotive runs, collects the character images marked on each card it passes. The collected images are passed to a U-Net network for detection; after an image with 8 characters is detected, it is divided into 8 non-overlapping sub-images, each containing 1 character, which are classified and identified after convolution and down-sampling operations to obtain character values and confidence values. The character value at the position with the largest confidence value is transmitted in real time to the production scheduling center through a wireless network, so the unmanned locomotive can be accurately positioned in the pit, meeting the requirements of industrialized, automated mining and transportation production.
Description
Technical Field
The invention relates to a method for obtaining the position of underground mine equipment, in particular to a method for obtaining the position of underground mine equipment based on intelligent character image recognition, and belongs to the technical field of character image recognition.
Background
The mining industry is an upstream industry of metallurgy and provides its main raw materials; it is capital-intensive, resource-intensive, technology-intensive, and consumes large amounts of energy. As the production scale of enterprises expands, production safety during operation receives increasing emphasis: various measures are actively taken to innovate enterprise safety management modes and methods and to continuously improve the level of safety management. Unmanned mining operation is an important safety measure for mining production, and underground unmanned mining is completed entirely by automatic equipment. Because the underground tunnels extend for hundreds of kilometers with a very large number of forks, the automatic mining equipment must be positioned, so that the production scheduling center can track its underground location, monitor it conveniently, and schedule equipment for production work in real time and scientifically according to production conditions.
The unmanned locomotive is an electric locomotive for transporting ore under the mine. It runs on tunnel rails hundreds of kilometers long; once converted to unmanned operation it runs automatically on the rails, so its position on the rail must be known in real time. Various positioning approaches exist, such as active RFID signaling, but RFID equipment would have to be installed along hundreds of kilometers of tunnel, making the cost too high, and wireless signals suffer heavy interference underground, which is unfavorable for practical production. With the progress of artificial intelligence technology, solving the positioning problem with character recognition becomes possible: a plate 8 characters long, similar in size to a car license plate, is installed every 20 meters beside the rail. When the locomotive passes a plate, the image acquisition device on the unmanned locomotive detects the characters, which are then segmented, classified, and recognized; the recognition result is transmitted to the production scheduling center through a wireless network, and from the character number the scheduling center can determine the locomotive's specific underground position. For character image recognition under the mine, a general character recognition algorithm cannot be used, and traditional digital image processing cannot adapt well to the various complex environments: low light intensity underground degrades imaging, and heavy dust obscures the character surface features. Accordingly, there is a need for improvements in the art.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a method that intelligently identifies images acquired by image acquisition equipment, combined with deep learning technology, to finally obtain the specific position of a locomotive in the mine.
The invention is completed by the following technical scheme: the method for obtaining the position of the mine underground equipment based on the intelligent recognition of the character image is characterized by comprising the following steps:
1) A plurality of character cards are arranged beside a rail under the mine well at intervals, a plurality of characters are marked on each character card, and the rail position corresponding to each character number is recorded in a database of a production scheduling center;
2) The image acquisition equipment is arranged on an unmanned locomotive under the mine, and all character images marked on the corresponding character cards are acquired in the running process;
3) The collected character image data are read using the VideoCapture class of the open-source image processing library OpenCV; the frame rate is 15 frames per second, the pixel format is a three-channel RGB image, and the original resolution is 1024×900 pixels;
4) Conventional scaling and filtering preprocessing is performed on each frame of image data, wherein: the scaling processing shrinks or enlarges the image to reduce the data-processing load of the deep learning network model and to accelerate the segmentation and recognition of each frame; the image is compressed to 800×600 pixels by the Lanczos algorithm, and the pixel format is an RGB color image; the filtering processing smooths the burrs present in the compressed image, reducing burrs along the edge features so that the image is clearer and easier to recognize;
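The smoothing part of the filtering step can be sketched as follows. The patent does not name the exact filter used, so a simple 3×3 mean filter is assumed here purely as a stand-in:

```python
# Minimal sketch of the burr-smoothing step, assuming a 3x3 mean filter
# as a stand-in: the patent does not specify the exact filter used.
def mean_filter_3x3(img):
    """Smooth a 2-D grayscale image (list of lists) with a 3x3 mean filter.
    Border pixels are left unchanged for simplicity."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            s = sum(img[y + dy][x + dx]
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = s / 9.0
    return out

# A lone bright "burr" pixel is damped by averaging over its neighborhood.
noisy = [[0, 0, 0], [0, 90, 0], [0, 0, 0]]
smoothed = mean_filter_3x3(noisy)
```

In a real pipeline each RGB channel would be filtered this way (or a dedicated OpenCV filter used); the toy single-channel grid only illustrates the principle.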
5) The preprocessed 800×600 pixel image is passed to a U-Net network for detection; after an image containing 8 characters is detected, the image is divided into 8 non-overlapping sub-images, each containing 1 character;
the U-Net network is an Encoder-Decoder network structure, wherein: the Encoder network performs the convolution operations and the Decoder network performs the up-sampling operations; the Encoder is a five-layer convolutional structure in which each layer uses a 5×5 convolution kernel with padding of 0 and stride of 1; the Decoder is a five-layer convolutional structure in which each layer uses a 1×1 convolution kernel with stride of 1;
the convolution and sampling operations are performed using the following algorithms:
5-1) The convolution operations are as follows: the 800×600 pixel image is processed by five convolution layers into 780×580 pixels, then pooled with a 2×2 kernel and a stride of 2 into 390×290 pixels; this operation is repeated three times to obtain a 60×45 pixel image; the convolution operation formula is as follows:
s(i, j) = Σ_m Σ_n X(i + m, j + n) · W(m, n)
where X is the image data, i and j are the image dimensions (800 and 600 respectively), W is the convolution kernel, m and n are the convolution kernel dimensions (here 5 and 5), and s(i, j) is the new image data after the convolution operation;
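The size arithmetic of the Encoder stage can be sketched as follows: a 5×5 kernel with padding 0 and stride 1 shrinks each dimension by 4, and 2×2 pooling with stride 2 halves each dimension, which reproduces the 780×580 and 390×290 figures quoted above:

```python
# Output-size arithmetic for the Encoder stage described in the text:
# five 5x5 valid convolutions followed by a 2x2, stride-2 pooling.
def conv_out(size, kernel=5, padding=0, stride=1):
    # Standard convolution output-size formula.
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    return (size - kernel) // stride + 1

w, h = 800, 600
for _ in range(5):               # five 5x5 convolution layers
    w, h = conv_out(w), conv_out(h)
assert (w, h) == (780, 580)      # matches the 780x580 in the text
w, h = pool_out(w), pool_out(h)  # 2x2 pooling with stride 2
```

After the pooling step `(w, h)` is `(390, 290)`, matching the text.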
after each convolution operation, a nonlinear calculation is performed with an activation function; the whole network uses the Maxout activation function, whose formula is as follows:
h(x) = max_{j ∈ [1, k]} z_ij, where z_ij = x^T · W_ij + b_ij + c
where x^T is the value of the network neuron, W_ij is the convolution kernel value, i and j are the coordinate positions in the convolution kernel, k is the number of image channels (the image is an RGB color image, so k is 3), b_ij is the constant corresponding to each neuron, c is an empirical constant for the activation calculation with initial value 0, and the maximum z_ij over the subscript j is taken;
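A minimal Maxout sketch consistent with the formula above: each unit computes k affine candidates z_j = x·w_j + b_j + c and outputs their maximum. The toy weights and inputs are illustrative only:

```python
# Maxout activation sketch: the output is the maximum over k affine
# candidate pre-activations, as in the formula above.
def maxout(x, weights, biases, c=0.0):
    """x: input vector; weights: k weight vectors; biases: k scalars."""
    zs = [sum(xi * wi for xi, wi in zip(x, w)) + b + c
          for w, b in zip(weights, biases)]
    return max(zs)

x = [1.0, -2.0]
weights = [[0.5, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # k = 3 candidates
biases = [0.0, 0.0, 0.0]
y = maxout(x, weights, biases)   # candidates: 0.5, -2.0, 1.0
```

Unlike ReLU, Maxout learns the shape of its activation through the k weight sets rather than fixing it in advance.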
because the convolution operation is a linear operation, nonlinear processing must be performed with a loss function; the loss function uses pixel-wise softmax, applying softmax independently to the output corresponding to each pixel, with the formula:
E = −a · Σ_x w(x) · log(p_l(x)(x)) + c
where x is the pixel position on the two-dimensional plane, a is a learning coefficient with initial value 1, w(x) is the weight term in the cross entropy, p_l(x)(x) is the output probability of x on the channel of its true label, and c is a constant term with initial value 0;
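The pixel-wise softmax loss can be sketched as below, following the terms defined above (a, w(x), p_l(x), c); the per-pixel weight map w(x) is not specified in the patent, so uniform weights are assumed here:

```python
import math

# Sketch of a pixel-wise softmax with weighted cross-entropy loss.
# The weight map w(x) is an assumption (uniform weights here).
def softmax(logits):
    m = max(logits)                      # subtract max for stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pixelwise_loss(logits_per_pixel, labels, weights, a=1.0, c=0.0):
    """logits_per_pixel: per-pixel channel logits; labels: true channel
    index per pixel; weights: w(x) per pixel."""
    loss = 0.0
    for logits, l, w in zip(logits_per_pixel, labels, weights):
        p = softmax(logits)[l]           # probability on true-label channel
        loss += w * math.log(p)
    return -a * loss + c

logits = [[2.0, 0.0], [0.0, 2.0]]        # two pixels, two channels
loss = pixelwise_loss(logits, labels=[0, 1], weights=[1.0, 1.0])
```

Each pixel contributes independently, which is what lets U-Net produce a dense per-pixel segmentation map rather than a single image-level label.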
5-2) The up-sampling operations are as follows: the 60×45 pixel image obtained from the convolution operations is sent into the Decoder network, and an up-sampling operation doubles the length and width of the image to 120×90 pixels; this Decoder up-sampling operation is repeated twice more to obtain 480×360 pixels; a convolution operation with a 5×5 kernel yields 420×300 pixels, one further convolution operation with a 5×1 kernel yields a 400×300 image, and a final up-sampling operation restores an 800×600 image identical in size to the original;
5-3) A full-connection operation is performed on the restored 800×600 image to obtain the specific position of each single character in the original image, represented by the upper-left and lower-right corner coordinates of the sub-image;
the full-connection operation is as follows: the full connection mainly outputs the positions of the 8 character images; each image coordinate consists of the 4 values of the upper-left and lower-right corners, giving 32 output values for the 8 images; the image is an 800×600 two-dimensional array, which is converted row by row into a one-dimensional array of length 480000; then 32 groups of one-dimensional parameter arrays, each of length 480000, are multiplied element-wise with the pixel values of the image's one-dimensional array and summed, and an intercept parameter is added; the resulting 32 values are the coordinate positions of the 8 characters;
the full-connection calculation formula is: x_i = a_nm · w_i + c_i
where x_i is one of the 32 coordinate values and i takes values from 1 to 32, a_nm is the one-dimensional image array, w_i is a one-dimensional parameter array of length n·m, c_i is the intercept parameter, w_i and c_i are both learnable parameters, and n and m are the length and width of the source image respectively, i.e. n is 800 and m is 600;
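The full-connection readout above can be sketched on a toy scale: a flattened image is dotted with one learnable parameter vector per output value, plus an intercept. A 2×2 "image" with 2 outputs stands in for the 800×600 image with 32 outputs:

```python
# Sketch of the full-connection readout: flatten the image row by row,
# then compute one dot product (plus intercept) per output coordinate.
def full_connection(image_rows, weight_vectors, intercepts):
    flat = [px for row in image_rows for px in row]   # row-major flatten
    return [sum(p * w for p, w in zip(flat, wv)) + ci
            for wv, ci in zip(weight_vectors, intercepts)]

# Toy 2x2 "image" and 2 outputs instead of 800x600 and 32 outputs.
img = [[1, 2], [3, 4]]
wvs = [[1, 0, 0, 0], [0, 0, 0, 1]]  # pick out first and last pixel
outs = full_connection(img, wvs, intercepts=[0.5, -1.0])
```

At full scale each of the 32 weight vectors has 480000 entries, matching the flattened 800×600 image.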
6) Classification and identification: the sub-images separated in step 5), each containing a single character, are classified and identified in sequence by a convolutional neural network; each sub-image is identified once to obtain a character value and a confidence value, so 8 rounds of classification yield 8 character values and 8 corresponding confidence values, each confidence value being greater than 90%;
the convolutional neural network has 8 layers in total, wherein: layers 1-3 use 9 kinds of convolution kernels to extract 9 kinds of features, each kernel being 3×3; layers 4-6 use 12 kinds of convolution kernels, each 3×3; layer 7 uses 1024 kinds of convolution kernels, each 1×1; layer 8 is a fully connected layer that outputs 62 confidence values, and the character corresponding to the position with the largest confidence value is the recognized character value;
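The final decision step can be sketched as an argmax over the 62 confidence values. The 62-class alphabet (10 digits plus 52 upper/lower-case letters) is an assumption; the patent only states that 62 values are output:

```python
import string

# Sketch of the final classification step: take the argmax over the 62
# confidence values output by the fully connected layer. The 62-class
# alphabet below (digits + letters) is an assumption, not from the patent.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
assert len(ALPHABET) == 62

def recognize(confidences):
    """confidences: 62 values from the fully connected layer."""
    best = max(range(len(confidences)), key=lambda i: confidences[i])
    return ALPHABET[best], confidences[best]

scores = [0.001] * 62
scores[ALPHABET.index('7')] = 0.97   # network is most confident in '7'
char, conf = recognize(scores)
```

Running this once per sub-image yields the 8 character values and 8 confidence values described above.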
after each convolution operation, performing nonlinear operation by using an activation function, wherein the activation function uses an exponential linear unit ELU function;
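The ELU activation named above is, for reference, the following function; alpha = 1.0 is the common default and is an assumption here, since the patent does not state the value used:

```python
import math

# Exponential linear unit (ELU): identity for positive inputs,
# alpha * (exp(x) - 1) for negative inputs (alpha = 1.0 assumed).
def elu(x, alpha=1.0):
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

assert elu(2.0) == 2.0            # identity for positive inputs
assert -1.0 < elu(-3.0) < 0.0     # saturates toward -alpha for negatives
```

Unlike ReLU, ELU keeps a small negative output for negative inputs, which helps keep mean activations near zero.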
7) The 8 obtained character values are transmitted in real time to the production scheduling center through a wireless network; using the character number as the query condition, the specific position is looked up in the database system, determining the position of the unmanned locomotive under the mine and thereby positioning it. At the same time, image data whose recognition results have low probability values are automatically stored, and each month the stored images are used for targeted training, so that the updated production network model reaches an accuracy of 99.89%, achieving the goal of continuous learning.
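The dispatch-center lookup in step 7) amounts to a keyed query: the recognized 8-character number indexes the track-position table built in step 1). The table contents and field names below are illustrative placeholders, not from the patent:

```python
# Sketch of the dispatch-center position lookup: the recognized
# 8-character card number is the key into the track-position database.
# Card numbers and position fields here are illustrative placeholders.
track_db = {
    "A1B2C3D4": {"tunnel": 3, "chainage_m": 1240},
    "A1B2C3D5": {"tunnel": 3, "chainage_m": 1260},
}

def locate(card_number):
    pos = track_db.get(card_number)
    if pos is None:
        raise KeyError(f"unknown character card: {card_number}")
    return pos

pos = locate("A1B2C3D5")
```

In production this would be a query against the scheduling center's database system rather than an in-memory dictionary.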
The invention has the following beneficial effects:
the invention solves the technical problems of large error, high cost and the like caused by unclear images due to insufficient light and large dust quantity of underground automatic equipment, can greatly improve the recognition rate of underground character images to 99.89 percent, and can achieve 100 percent by combining with the recognition error correction technology, thereby accurately positioning an underground unmanned automatic locomotive, meeting the requirements of industrialized, automatic mining and transportation production, saving investment and being widely applied to underground mine operation, other industrial production and logistics transportation industries.
Detailed description of the preferred embodiments
The following detailed description of embodiments of the invention is exemplary and intended to be illustrative of the invention and not to be construed as limiting the invention.
Examples
The image acquisition device of this embodiment uses an Nvidia TX2 artificial-intelligence edge computing device. The device measures 50 mm × 87 mm and consumes only 7.5 watts, so when installed on a mobile unmanned locomotive it occupies little space and draws little power; no equivalent arrangement is specified in the prior art.
The method comprises the following specific steps:
1) A character plate is installed every 20 meters beside the underground rail; the plate is similar in size to an automobile license plate, 8 characters are printed on each plate, and a plate is also installed at every intersection. The rail position of each number is recorded in a database system set up in the production scheduling center;
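Building the database for step 1) can be sketched as generating one card per 20 meters of rail and mapping its number to a track position. The zero-padded serial numbering scheme is an illustrative assumption; the patent does not specify how card numbers are assigned:

```python
# Sketch of building the scheduling-center database: one character card
# every 20 meters along the rail, each 8-character number mapped to its
# track position. The zero-padded numbering scheme is an assumption.
def build_card_positions(rail_length_m, spacing_m=20):
    """Return {8-character card number: distance along the rail in meters}."""
    return {f"{i:08d}": i * spacing_m
            for i in range(rail_length_m // spacing_m + 1)}

cards = build_card_positions(rail_length_m=100)  # 6 cards for a 100 m rail
```

For the hundreds of kilometers of rail described earlier this yields tens of thousands of entries, one lookup key per plate.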
2) Image acquisition equipment is installed on an unmanned locomotive under the mine and image acquisition is carried out; 6 thousand underground character images were collected in total, of which 5 thousand were used to train the network model and 1 thousand to test it;
3) The VideoCapture class of the open-source image processing library OpenCV is used to read the camera's video image data; the frame rate is 15 frames per second, the pixel format is a three-channel RGB image, and the original resolution is 1024×900 pixels;
4) Conventional scaling and filtering preprocessing is performed on each frame of image data, wherein: the scaling processing shrinks or enlarges the image to reduce the data-processing load of the deep learning network model and to accelerate the segmentation and recognition of each frame; the image is compressed to 800×600 pixels by the Lanczos algorithm, and the pixel format is an RGB color image; the filtering processing smooths the burrs present in the compressed image, reducing burrs along the edge features so that the image is clearer and easier to recognize;
5) The preprocessed 800×600 pixel image is passed to a U-Net network for detection; after an image containing 8 characters is detected, the image is divided into 8 non-overlapping sub-images, each containing 1 character;
the U-Net network is an Encoder-Decoder network structure, wherein: the Encoder network performs the convolution operations and the Decoder network performs the up-sampling operations; the Encoder is a five-layer convolutional structure in which each layer uses a 5×5 convolution kernel with padding of 0 and stride of 1; the Decoder is a five-layer convolutional structure in which each layer uses a 1×1 convolution kernel with stride of 1;
the convolution and sampling operations are performed using the following algorithms:
5-1) The convolution operations are as follows: the 800×600 pixel image is processed by five convolution layers into 780×580 pixels, then pooled with a 2×2 kernel and a stride of 2 into 390×290 pixels; this operation is repeated three times to obtain a 60×45 pixel image; the convolution operation formula is as follows:
s(i, j) = Σ_m Σ_n X(i + m, j + n) · W(m, n)
where X is the image data, i and j are the image dimensions (800 and 600 respectively), W is the convolution kernel, m and n are the convolution kernel dimensions (here 5 and 5), and s(i, j) is the new image data after the convolution operation;
after each convolution operation, a nonlinear calculation is performed with an activation function; the whole network uses the Maxout activation function, whose formula is as follows:
h(x) = max_{j ∈ [1, k]} z_ij, where z_ij = x^T · W_ij + b_ij + c
where x^T is the value of the network neuron, W_ij is the convolution kernel value, i and j are the coordinate positions in the convolution kernel, k is the number of image channels (the image is an RGB color image, so k is 3), b_ij is the constant corresponding to each neuron, c is an empirical constant for the activation calculation with initial value 0, and the maximum z_ij over the subscript j is taken;
because the convolution operation is a linear operation, nonlinear processing must be performed with a loss function; the loss function uses pixel-wise softmax, applying softmax independently to the output corresponding to each pixel, with the formula:
E = −a · Σ_x w(x) · log(p_l(x)(x)) + c
where x is the pixel position on the two-dimensional plane, a is a learning coefficient with initial value 1, w(x) is the weight term in the cross entropy, p_l(x)(x) is the output probability of x on the channel of its true label, and c is a constant term with initial value 0;
5-2) The up-sampling operations are as follows: the 60×45 pixel image obtained from the convolution operations is sent into the Decoder network, and an up-sampling operation doubles the length and width of the image to 120×90 pixels; this Decoder up-sampling operation is repeated twice more to obtain 480×360 pixels; a convolution operation with a 5×5 kernel yields 420×300 pixels, one further convolution operation with a 5×1 kernel yields a 400×300 image, and a final up-sampling operation restores an 800×600 image identical in size to the original;
5-3) A full-connection operation is performed on the restored 800×600 image to obtain the specific position of each single character in the original image, represented by the upper-left and lower-right corner coordinates of the sub-image;
the full-connection operation is as follows: the full connection mainly outputs the positions of the 8 character images; each image coordinate consists of the 4 values of the upper-left and lower-right corners, giving 32 output values for the 8 images; the image is an 800×600 two-dimensional array, which is converted row by row into a one-dimensional array of length 480000; then 32 groups of one-dimensional parameter arrays, each of length 480000, are multiplied element-wise with the pixel values of the image's one-dimensional array and summed, and an intercept parameter is added; the resulting 32 values are the coordinate positions of the 8 characters;
the full-connection calculation formula is: x_i = a_nm · w_i + c_i
where x_i is one of the 32 coordinate values and i takes values from 1 to 32, a_nm is the one-dimensional image array, w_i is a one-dimensional parameter array of length n·m, c_i is the intercept parameter, w_i and c_i are both learnable parameters, and n and m are the length and width of the source image respectively, i.e. n is 800 and m is 600;
6) Classification and identification: the sub-images separated in step 5), each containing a single character, are classified and identified in sequence by a convolutional neural network; each sub-image is identified once to obtain a character value and a confidence value, so 8 rounds of classification yield 8 character values and 8 corresponding confidence values, each confidence value being greater than 90%;
the convolutional neural network has 8 layers in total, wherein: layers 1-3 use 9 kinds of convolution kernels to extract 9 kinds of features, each kernel being 3×3; layers 4-6 use 12 kinds of convolution kernels, each 3×3; layer 7 uses 1024 kinds of convolution kernels, each 1×1; layer 8 is a fully connected layer that outputs 62 confidence values, and the character corresponding to the position with the largest confidence value is the recognized character value;
after each convolution operation, performing nonlinear operation by using an activation function, wherein the activation function uses an exponential linear unit ELU function;
7) The 8 obtained character values are transmitted in real time to the production scheduling center through a wireless network; using the character number as the query condition, the specific position is looked up in the database system, determining the position of the unmanned locomotive under the mine and thereby positioning it. At the same time, image data whose recognition results have low probability values are automatically stored, and each month the stored images are used for targeted training, so that the updated production network model reaches an accuracy of 99.89%, achieving the goal of continuous learning.
Claims (1)
1. The method for obtaining the position of the mine underground equipment based on the intelligent recognition of the character image is characterized by comprising the following steps:
1) A plurality of character cards are arranged beside underground rails of the mine at intervals, a plurality of characters are marked on each character card, and rail positions corresponding to the serial numbers of the characters are recorded in a database of a production scheduling center;
2) Installing image acquisition equipment on an unmanned locomotive under a mine, and acquiring all character images marked on corresponding character cards in running;
3) The collected character image data are read using the VideoCapture class of the open-source image processing library OpenCV; the frame rate is 15 frames per second, the pixel format is a three-channel RGB image, and the original resolution is 1024×900 pixels;
4) Conventional scaling and filtering preprocessing is performed on each frame of image data, wherein: the scaling processing shrinks or enlarges the image to reduce the data-processing load of the deep learning network model and to accelerate the segmentation and recognition of each frame; the image is compressed to 800×600 pixels by the Lanczos algorithm, and the pixel format is an RGB color image; the filtering processing smooths the burrs present in the compressed image, reducing burrs along the edge features so that the image is clearer and easier to recognize;
5) The preprocessed 800×600 pixel image is passed to a U-Net network for detection; after an image containing 8 characters is detected, the image is divided into 8 non-overlapping sub-images, each containing 1 character;
the U-Net network is an Encoder-Decoder network structure, wherein: the Encoder network performs the convolution operations and the Decoder network performs the up-sampling operations; the Encoder is a five-layer convolutional structure in which each layer uses a 5×5 convolution kernel with padding of 0 and stride of 1; the Decoder is a five-layer convolutional structure in which each layer uses a 1×1 convolution kernel with stride of 1;
the convolution and sampling operations are performed using the following algorithms:
5-1) The convolution operations are as follows: the 800×600 pixel image is processed by five convolution layers into 780×580 pixels, then pooled with a 2×2 kernel and a stride of 2 into 390×290 pixels; this operation is repeated three times to obtain a 60×45 pixel image; the convolution operation formula is as follows:
s(i, j) = Σ_m Σ_n X(i + m, j + n) · W(m, n)
where X is the image data, i and j are the image dimensions (800 and 600 respectively), W is the convolution kernel, m and n are the convolution kernel dimensions (here 5 and 5), and s(i, j) is the new image data after the convolution operation;
after each convolution operation, a nonlinear calculation is performed with an activation function; the whole network uses the Maxout activation function, whose formula is as follows:
h(x) = max_{j ∈ [1, k]} z_ij, where z_ij = x^T · W_ij + b_ij + c
where x^T is the value of the network neuron, W_ij is the convolution kernel value, i and j are the coordinate positions in the convolution kernel, k is the number of image channels (the image is an RGB color image, so k is 3), b_ij is the constant corresponding to each neuron, c is an empirical constant for the activation calculation with initial value 0, and the maximum z_ij over the subscript j is taken;
Because the convolution operation is a linear operation, nonlinear processing needs to be performed using a loss function; the loss function uses pixel-wise softmax, applying softmax independently to the output of each pixel, with the following formula:
E = a · Σ_x w(x) · log(p_{l(x)}(x)) + c,
wherein x is the pixel position on the two-dimensional plane, a is a learning coefficient with initial value 1, w(x) is the weight term in the cross entropy, p_{l(x)}(x) is the output probability of x on the channel of its true label, and c is a constant term with initial value 0;
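A sketch of the per-pixel softmax and the weighted term above (shapes are assumptions; note the conventional cross-entropy loss is the negative of this sum):

```python
import numpy as np

def pixelwise_softmax_term(logits, labels, w, a=1.0, c=0.0):
    """Softmax applied independently at each pixel, then the weighted term
    a * sum_x w(x) * log(p_{l(x)}(x)) + c from the text.
    logits: (H, W, K) per-pixel class scores; labels: (H, W) true channel
    l(x); w: (H, W) weight map. The usual loss negates the sum."""
    z = logits - logits.max(axis=-1, keepdims=True)        # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)  # per-pixel softmax
    H, W_ = labels.shape
    p_true = p[np.arange(H)[:, None], np.arange(W_)[None, :], labels]
    return a * np.sum(w * np.log(p_true)) + c

logits = np.zeros((2, 2, 3))     # uniform scores -> p = 1/3 everywhere
labels = np.zeros((2, 2), dtype=int)
w = np.ones((2, 2))
print(pixelwise_softmax_term(logits, labels, w))   # 4 * log(1/3)
```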
5-2) The up-sampling operation is as follows: the 60×45-pixel image obtained from the convolution operations is sent into the Decoder network, where an up-sampling operation doubles the length and width of the image to 120×90 pixels; the Decoder's up-sampling operation is repeated twice more to obtain 480×360 pixels; a convolution with a 5×5 kernel then yields 420×300 pixels, a further convolution with a 5×1 kernel yields 400×300 pixels, and one more up-sampling operation restores the image to 800×600 pixels, identical in size to the original image;
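The doubling step can be sketched with nearest-neighbour up-sampling (the interpolation method is an assumption; the patent does not name one):

```python
import numpy as np

def upsample2(X):
    """Double both image dimensions by repeating each pixel 2x2
    (nearest-neighbour; the choice of interpolation is an assumption)."""
    return np.repeat(np.repeat(X, 2, axis=0), 2, axis=1)

X = np.random.rand(60, 45)       # image leaving the Encoder
print(upsample2(X).shape)        # (120, 90)
```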
5-3) A full-connection operation is performed on the restored 800×600-pixel image to obtain the specific position of each single character in the original image, represented by the upper-left and lower-right corner coordinates of the sub-image;
the full-connection operation is as follows: the full-connection layer outputs the positions of the 8 character images; each image's coordinates consist of 4 values, the upper-left and lower-right corners, giving 32 output values for the 8 images; the image is an 800×600 two-dimensional array, which is flattened row by row into a one-dimensional array of length 480,000; then each of 32 one-dimensional parameter arrays of length 480,000 is multiplied element-wise with the flattened image pixel values and summed, and an intercept parameter is added, so that the resulting 32 values are the coordinate positions of the 8 characters;
the full-connection calculation formula is:
x_i = Σ_{nm} a_nm · w_{i,nm} + c_i,
wherein x_i are the 32 coordinate values, i takes values from 1 to 32, a_nm is the flattened one-dimensional image array, w_i is a one-dimensional parameter array of length n·m, c_i is the intercept parameter, w_i and c_i are both learnable parameters, and n and m are the length and width of the source image, i.e. n is 800 and m is 600;
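The flatten-and-project step can be sketched as follows (random weights stand in for the learned parameters w_i and c_i):

```python
import numpy as np

n, m = 800, 600
rng = np.random.default_rng(0)
image = rng.random((n, m))           # restored 800x600 image
a = image.reshape(-1)                # row-major flatten, length 480000
W = rng.random((32, n * m))          # 32 learnable weight vectors w_i
c = rng.random(32)                   # 32 learnable intercepts c_i
x = W @ a + c                        # x_i = sum_nm a_nm * w_i + c_i
boxes = x.reshape(8, 4)              # 8 characters x (x1, y1, x2, y2)
print(boxes.shape)                   # (8, 4)
```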
6) Classification and recognition: the sub-images containing single characters separated in step 5) are classified and recognized one by one, in sequence, by a convolutional neural network; each sub-image is recognized once to obtain a character value and a confidence value, and 8 classifications yield 8 character values and 8 corresponding confidence values, each confidence value being greater than 90%;
the convolutional neural network has 8 layers in total, wherein: layers 1-3 use 9 kinds of convolution kernels to extract 9 kinds of features, each kernel being 3×3; layers 4-6 use 12 kinds of convolution kernels, each 3×3; layer 7 uses 1024 kinds of 1×1 convolution kernels; and layer 8 is a fully connected layer that outputs 62 confidence values, the character corresponding to the position with the largest confidence value being the recognized character value;
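Decoding the 62 confidence values into a character can be sketched as below (the class ordering, digits then upper-case then lower-case letters, is an assumption; the patent only fixes the count at 62):

```python
import numpy as np

# 62 classes: 10 digits + 26 upper-case + 26 lower-case (assumed ordering)
CLASSES = [str(d) for d in range(10)] \
        + [chr(ord('A') + i) for i in range(26)] \
        + [chr(ord('a') + i) for i in range(26)]

def decode(conf):
    """conf: (62,) confidence values from the final layer; returns the
    (character, confidence) pair for the highest-scoring class."""
    k = int(np.argmax(conf))
    return CLASSES[k], float(conf[k])

conf = np.zeros(62)
conf[12] = 0.97                 # pretend the net is 97% sure of class 12
print(decode(conf))             # ('C', 0.97)
```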
after each convolution operation, a nonlinear operation is performed with an activation function; here the activation function is the exponential linear unit (ELU);
7) The 8 obtained character values are transmitted in real time over a wireless network to the production scheduling center; using the character number as the query condition, the specific position is looked up in the database system, which determines the position of the unmanned locomotive underground and thereby localizes it. Meanwhile, image data whose recognition results have low probability values are stored automatically; each month the stored images are used for targeted training, updating the production network model to an accuracy of 99.89% and achieving the purpose of continuous learning.
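A minimal sketch of the lookup at the scheduling center (the table name, schema, and sample character number are all hypothetical; the patent only states that the character number is used as the query condition against a database):

```python
import sqlite3

# Hypothetical schema: character number -> fixed installation position.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE locomotive_pos (char_no TEXT PRIMARY KEY, position TEXT)")
conn.execute("INSERT INTO locomotive_pos VALUES ('ZK508001', 'Level -480, Tunnel 3')")

char_no = "ZK508001"   # the 8 recognized character values, joined
row = conn.execute(
    "SELECT position FROM locomotive_pos WHERE char_no = ?", (char_no,)
).fetchone()
print(row[0])          # Level -480, Tunnel 3
```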
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010114364.8A CN111401139B (en) | 2020-02-25 | 2020-02-25 | Method for obtaining mine underground equipment position based on character image intelligent recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111401139A CN111401139A (en) | 2020-07-10 |
CN111401139B true CN111401139B (en) | 2024-03-29 |
Family
ID=71430423
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112738767B (en) * | 2020-11-30 | 2021-12-17 | 中南大学 | Trust-based mobile edge user task scheduling method |
CN113112431B (en) * | 2021-05-10 | 2023-08-15 | 苏州大学 | Image processing method in embedded system |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106446895A (en) * | 2016-10-28 | 2017-02-22 | 安徽四创电子股份有限公司 | License plate recognition method based on deep convolutional neural network |
CN106650721A (en) * | 2016-12-28 | 2017-05-10 | 吴晓军 | Industrial character identification method based on convolution neural network |
CN107688784A (en) * | 2017-08-23 | 2018-02-13 | 福建六壬网安股份有限公司 | A kind of character identifying method and storage medium based on further feature and shallow-layer Fusion Features |
CN107967475A (en) * | 2017-11-16 | 2018-04-27 | 广州探迹科技有限公司 | A kind of method for recognizing verification code based on window sliding and convolutional neural networks |
CN108108746A (en) * | 2017-09-13 | 2018-06-01 | 湖南理工学院 | License plate character recognition method based on Caffe deep learning frames |
CN109344825A (en) * | 2018-09-14 | 2019-02-15 | 广州麦仑信息科技有限公司 | A kind of licence plate recognition method based on convolutional neural networks |
CN109740603A (en) * | 2019-01-21 | 2019-05-10 | 闽江学院 | Based on the vehicle character identifying method under CNN convolutional neural networks |
CN110414506A (en) * | 2019-07-04 | 2019-11-05 | 南京理工大学 | Bank card number automatic identifying method based on data augmentation and convolutional neural networks |
CN110619329A (en) * | 2019-09-03 | 2019-12-27 | 中国矿业大学 | Carriage number and loading state identification method of railway freight open wagon based on airborne vision |
CN110766002A (en) * | 2019-10-08 | 2020-02-07 | 浙江大学 | Ship name character region detection method based on deep learning |
Non-Patent Citations (2)
Title |
---|
"CNN-based license plate digit character recognition algorithm"; Ou Xianfeng et al.; Journal of Chengdu Technological University; 2016-12-31; Vol. 19, No. 4; full text * |
"License plate recognition algorithm in complex natural environments"; Zhao Chenglong; China Master's Theses Full-text Database, Information Science and Technology; 2018-01-15; full text * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||