CN111895931B - Coal mine operation area calibration method based on computer vision - Google Patents


Info

Publication number
CN111895931B
Authority
CN
China
Prior art keywords
excavator
transport vehicle
dimensional
dimensional detection
vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010694534.4A
Other languages
Chinese (zh)
Other versions
CN111895931A (en)
Inventor
陈咖宁
陈加忠
李综艺
舒琴
黄帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Hualian Technology Co Ltd
Jiaxing Boling Technology Co Ltd
Original Assignee
Wuhan Hualian Technology Co Ltd
Jiaxing Boling Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Hualian Technology Co Ltd, Jiaxing Boling Technology Co Ltd filed Critical Wuhan Hualian Technology Co Ltd
Priority to CN202010694534.4A priority Critical patent/CN111895931B/en
Publication of CN111895931A publication Critical patent/CN111895931A/en
Application granted granted Critical
Publication of CN111895931B publication Critical patent/CN111895931B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01B MEASURING LENGTH, THICKNESS OR SIMILAR LINEAR DIMENSIONS; MEASURING ANGLES; MEASURING AREAS; MEASURING IRREGULARITIES OF SURFACES OR CONTOURS
    • G01B11/00 Measuring arrangements characterised by the use of optical techniques
    • G01B11/24 Measuring arrangements characterised by the use of optical techniques for measuring contours or curvatures
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01B MEASURING LENGTH, THICKNESS OR SIMILAR LINEAR DIMENSIONS; MEASURING ANGLES; MEASURING AREAS; MEASURING IRREGULARITIES OF SURFACES OR CONTOURS
    • G01B21/00 Measuring arrangements or details thereof, where the measuring technique is not covered by the other groups of this subclass, unspecified or not relevant
    • G01B21/02 Measuring arrangements or details thereof, where the measuring technique is not covered by the other groups of this subclass, unspecified or not relevant for measuring length, width, or thickness
    • G01B21/04 Measuring arrangements or details thereof, where the measuring technique is not covered by the other groups of this subclass, unspecified or not relevant for measuring length, width, or thickness by measuring coordinates of points
    • G01B21/042 Calibration or calibration artifacts
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C25/00 Manufacturing, calibrating, cleaning, or repairing instruments or devices referred to in the other groups of this subclass
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/75 Determining position or orientation of objects or cameras using feature-based methods involving models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30204 Marker
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30248 Vehicle exterior or interior
    • G06T2207/30252 Vehicle exterior; Vicinity of vehicle

Abstract

The invention discloses a coal mine operation area calibration method based on computer vision, relating to the technical field of coal mine operation area calibration, which comprises the following steps: building and training a deep learning neural network model for two-dimensional detection and identification of objects; building and training a deep learning neural network model for three-dimensional detection of objects; collecting RGB images as input to the two-dimensional detection and identification model to obtain a two-dimensional detection frame and the corresponding model of the working vehicle; cropping the RGB image content inside the obtained two-dimensional detection frame as input to the three-dimensional detection model to obtain a three-dimensional detection frame of the working vehicle base; determining the pixel specification characteristics of the working vehicle in the image; and determining the actual specification characteristics of the working vehicle. The method obtains the three-dimensional working area of the working vehicle in the image, determines the area the working vehicle should reach, helps the working vehicle operate within the specified working area, and reduces potential safety hazards.

Description

Coal mine operation area calibration method based on computer vision
Technical Field
The invention relates to the technical field of coal mine operation area calibration, in particular to a coal mine operation area calibration method based on computer vision.
Background
In coal mine operation, whether a vehicle is driven by a person or is unmanned, it must work within a designated working area; once the vehicle goes out of bounds, serious production accidents can occur. Camera monitoring is already widely used in daily life, and cameras can be conveniently installed in various scenes without complicated configuration. At present, technologies such as automatic driving and factory automation are assisted by computer vision methods and have achieved good results. Deep learning methods based on convolutional neural networks can automatically extract object features and perform accurate localization and classification. However, traditional two-dimensional object detection cannot accurately delimit the working area around the excavator according to distance.
Therefore, the operation area needs to be calibrated, and the vehicle in operation needs to be tracked in real time so that it works only within the calibrated area. The current practice is to place UWB (ultra-wideband) positioning devices at fixed positions outside the working area and a mobile UWB positioning device in the vehicle, and to calculate the position of the vehicle, in order to detect whether it crosses the boundary of the working area, by measuring the time taken for a pulse signal to travel from a fixed UWB device to the mobile UWB device in the vehicle and back.
In practical application, position coordinates need to be calibrated between the fixed UWB devices and between the fixed UWB devices and the vehicle-mounted mobile UWB device. However, in a shallow coal mine operation area, the position and landform of the operation site change frequently: soil layers, coal layers and blasting layers are staggered, the layers are combined irregularly, and earthwork is repeatedly stripped. This makes the operation area difficult to calibrate with existing technology; a typical UWB positioning deployment must be relocated over a large range as the operation area changes, and each relocation requires a professional to visit the site and re-establish the coordinate system, making the UWB positioning system inconvenient to use.
An effective solution to the problems in the related art has not been proposed yet.
Disclosure of Invention
Aiming at the problems in the related art, the invention provides a coal mine operation area calibration method based on computer vision, so as to overcome the technical problems in the prior related art.
The technical scheme of the invention is realized as follows:
a coal mine operation area calibration method based on computer vision comprises the following steps:
step S1, building and training a deep learning neural network model for two-dimensional detection and identification of an object;
step S2, building and training a deep learning neural network model for three-dimensional detection of an object;
step S3, collecting RGB images as the input of the deep learning neural network model for the two-dimensional detection and identification of the object, and acquiring a two-dimensional detection frame and a corresponding model of the working vehicle;
step S4, cutting the content of the RGB image in the two-dimensional detection frame as the input of the deep learning neural network model of the three-dimensional detection, and obtaining the three-dimensional detection frame of the base of the working vehicle;
step S5, determining pixel specification characteristics of the work vehicle in the image;
step S6, determining the specification characteristics of the work vehicle in practice;
in step S7, a three-dimensional work area of the work vehicle in the image is determined.
Further, building and training the deep learning neural network model for two-dimensional detection and identification of the object comprises the following steps:
collecting RGB images of a mining operation area containing operation vehicles;
manually marking a two-dimensional marking frame of the operation vehicle in the collected RGB image, and marking the model of the operation vehicle to obtain a marked image;
and training to obtain a deep learning neural network model for two-dimensional detection and identification of the object by taking the collected RGB image as input and taking the two-dimensional marking frame and the marking model as labels.
Further, building and training the deep learning neural network model for three-dimensional detection of the object comprises the following steps:
cutting out the region of the working vehicle according to the obtained marked images, taking the cut working vehicle regions as a new data set, and manually marking a three-dimensional marking frame of the chassis of the working vehicle;
and taking the cut working vehicle region as input, taking a three-dimensional labeling frame of the working vehicle region as a label, and training to obtain a deep learning neural network for three-dimensional detection of the object.
Further, the image resolution input to the deep learning neural network model for two-dimensional detection and identification of the object is 512 × 512; the image resolution input to the deep learning neural network for three-dimensional detection of the object is 64 × 64.
Further, the captured RGB image includes work vehicle element characteristic information.
Further, the work vehicle elements include a transport vehicle and an excavator.
Further, the RGB images are collected by a high-definition camera, and the distance between the camera and the working area is less than 200 meters.
Further, the method comprises the following steps:
step S8, instructing the work vehicle to drive into the work area,
and step S9, determining whether the three-dimensional detection frame of the working vehicle is matched with the acquired three-dimensional working area, and if so, executing a working instruction.
The invention has the beneficial effects that:
the invention relates to a coal mine operation area calibration method based on computer vision, which comprises the steps of acquiring RGB images as the input of a deep learning neural network model for object two-dimensional detection and identification, acquiring a two-dimensional detection frame and a corresponding model of an operation vehicle, cutting the RGB image content in the acquired two-dimensional detection frame as the input of the deep learning neural network model for three-dimensional detection, acquiring a three-dimensional detection frame of an operation vehicle base, determining the pixel specification characteristics of the operation vehicle in the image and the specification characteristics of the operation vehicle in practice, thereby acquiring a three-dimensional working area of the operation vehicle in the image, determining the area which the operation vehicle should reach, helping the operation vehicle to operate in the specified working area and reducing potential safety hazards.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments are briefly described below. It is apparent that the drawings in the following description show only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic flow chart of a coal mine operation area calibration method based on computer vision according to an embodiment of the invention;
FIG. 2 is a schematic flow chart of three-dimensional operating area calibration of a coal mine operating area calibration method based on computer vision according to an embodiment of the invention;
FIG. 3 is a schematic diagram of deep convolutional neural network training for two-dimensional vehicle detection and identification based on a coal mine operation area calibration method based on computer vision according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of deep convolutional neural network training for vehicle three-dimensional detection based on a coal mine operation area calibration method based on computer vision according to an embodiment of the invention;
FIG. 5 is a schematic diagram of a deep convolutional neural network framework of a vehicle two-dimensional detection and recognition model based on a computer vision-based coal mine operation area calibration method according to an embodiment of the invention;
FIG. 6 is a schematic diagram of a deep convolutional neural network framework of a vehicle three-dimensional detection model of a coal mine operation area calibration method based on computer vision according to an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present invention.
According to the embodiment of the invention, a coal mine operation area calibration method based on computer vision is provided.
As shown in figs. 1-2, a method for calibrating a coal mine operation area based on computer vision according to an embodiment of the present invention includes the following steps:
step S1, building and training a deep learning neural network model for two-dimensional detection and identification of an object;
step S2, building and training a deep learning neural network model for three-dimensional detection of an object;
step S3, collecting RGB images as the input of the deep learning neural network model for the two-dimensional detection and identification of the object, and acquiring a two-dimensional detection frame and a corresponding model of the working vehicle;
step S4, cutting the content of the RGB image in the two-dimensional detection frame as the input of the deep learning neural network model of the three-dimensional detection, and obtaining the three-dimensional detection frame of the base of the working vehicle;
step S5, determining pixel specification characteristics of the work vehicle in the image;
step S6, determining the specification characteristics of the work vehicle in practice;
in step S7, a three-dimensional work area of the work vehicle in the image is determined.
Building and training the deep learning neural network model for two-dimensional detection and identification of the object comprises the following steps:
collecting RGB images of a mining operation area containing operation vehicles;
manually marking a two-dimensional marking frame of the operation vehicle in the collected RGB image, and marking the model of the operation vehicle to obtain a marked image;
and training to obtain a deep learning neural network model for two-dimensional detection and identification of the object by taking the collected RGB image as input and taking the two-dimensional marking frame and the marking model as labels.
Building and training the deep learning neural network model for three-dimensional detection of the object comprises the following steps:
cutting out the region of the working vehicle according to the obtained marked images, taking the cut working vehicle regions as a new data set, and manually marking a three-dimensional marking frame of the chassis of the working vehicle;
and taking the cut working vehicle region as input, taking a three-dimensional labeling frame of the working vehicle region as a label, and training to obtain a deep learning neural network for three-dimensional detection of the object.
The image resolution input to the deep learning neural network model for two-dimensional detection and identification of the object is 512 × 512; the image resolution input to the deep learning neural network for three-dimensional detection of the object is 64 × 64.
Wherein the captured RGB image includes work vehicle element feature information.
Wherein the work vehicle elements include a transport vehicle and an excavator.
The device for collecting the RGB images is a high-definition camera, and the distance between the camera and the working area is less than 200 meters.
The method comprises the following steps:
step S8, instructing the work vehicle to drive into the work area,
and step S9, determining whether the three-dimensional detection frame of the working vehicle is matched with the acquired three-dimensional working area, and if so, executing a working instruction.
By means of the above technical solution, the coal mine working area calibration method based on computer vision collects RGB images as input to the deep learning neural network model for two-dimensional detection and identification of objects, obtains the two-dimensional detection frame and corresponding model of the working vehicle, crops the RGB image content inside the obtained two-dimensional detection frame as input to the deep learning neural network model for three-dimensional detection, obtains a three-dimensional detection frame of the working vehicle base, and determines the pixel specification characteristics of the working vehicle in the image and its actual specification characteristics, thereby obtaining the three-dimensional working area of the working vehicle in the image, determining the area the working vehicle should reach, helping the working vehicle work within the specified working area, and reducing potential safety hazards.
In addition, specifically, in order to train the two-dimensional detection model and the three-dimensional detection model, 1000 videos of mine operation were collected, and twenty thousand pictures containing both the excavator and the transport vehicle, with a resolution of 512 × 512, were selected from the videos. The manually marked two-dimensional detection frames and model types of the excavators and transport vehicles serve as labels for the two-dimensional object detection model. After the two-dimensional detection frames are marked, the vehicle regions are cropped out and scaled to 64 × 64 as input data for the three-dimensional detection model; the three-dimensional marking frames of the excavator base and the transport vehicle are marked manually and used as labels for the three-dimensional detection model.
The working flow of the method is shown in fig. 2. First, video from the monitoring camera is converted into pictures as input; after a picture passes through the two-dimensional detection model, the two-dimensional detection frames and corresponding models of the transport vehicle and excavator in the picture are obtained. Then the regions of the transport vehicle and excavator are cropped out, scaled to 64 × 64, and input into the three-dimensional detection model to obtain three-dimensional detection frames of the transport vehicle and excavator chassis. The actual length and width of the excavator chassis are looked up according to the excavator model, the pixel lengths occupied by the chassis length and width are calculated from the obtained three-dimensional detection frame of the excavator chassis, the relation between the actual chassis dimensions and the pixel length is obtained, and the working interval the transport vehicle should reach is calculated in combination with the arm length of the excavator. An instruction is sent so that the transport vehicle reaches the specified interval, and when the three-dimensional detection frame of the transport vehicle basically coincides with the working interval, an arrival instruction is sent.
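By way of illustration, this two-stage inference flow can be sketched as follows. This is a minimal sketch, not the patented implementation: the functions detect_2d and detect_3d, their input/output formats, and the use of OpenCV for cropping and scaling are assumptions standing in for the trained models described in this embodiment.

```python
import cv2  # assumed dependency for image cropping and scaling


def locate_vehicles(frame, detect_2d, detect_3d):
    """Sketch: 2D detect -> crop -> scale to 64x64 -> 3D detect.

    detect_2d(frame) is assumed to return (x, y, w, h, model_type) tuples;
    detect_3d(crop) is assumed to return the 4 bottom-face vertices, their
    confidences, and the box height, as described in the text.
    """
    results = []
    for (x, y, w, h, model_type) in detect_2d(frame):
        crop = cv2.resize(frame[y:y + h, x:x + w], (64, 64))  # scale region to 64x64
        base_vertices, confidences, height = detect_3d(crop)
        results.append((model_type, base_vertices, confidences, height))
    return results
```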
Specifically, in practical application, as shown in fig. 3 to 6, the following links are included:
firstly, acquiring and processing picture data:
A suitable angle is selected in the mining area to place the camera, no more than 200 meters from the job site, so that the camera can see the complete work flow between the transport vehicle and the excavator. 1000 videos of different scenes are selected; to keep the data set from becoming too large, one picture is taken every 30 frames, and pictures that do not include both the excavator and the transport vehicle are discarded. After the picture data set is obtained, the two-dimensional marking frames and corresponding models of the transport vehicle and excavator in the RGB images are marked manually, and the center-point coordinates, length and height of the two-dimensional marking frames, together with the model category, are used as labels for the two-dimensional detection and identification model. To train the three-dimensional detection model, the marked two-dimensional frame regions are cropped out and scaled to 64 × 64, and three-dimensional frames are marked manually on the vehicle pictures as input for the three-dimensional frame detection model. Because of the viewing angle, generally only three vertices of the lower bottom surface of a vehicle's three-dimensional detection frame are visible in a picture; during manual labeling, the confidence of a visible bottom-surface vertex is set to 1 and that of an invisible one to 0. During training, the coordinates and confidences of the 4 bottom-surface vertices and the height of the three-dimensional detection frame are used as labels. During prediction, only the three vertices with the highest confidence and the height are used to reconstruct the three-dimensional detection frame.
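The reconstruction step mentioned above (three most confident bottom-face vertices plus the height) might look like the following sketch; completing the fourth vertex as a parallelogram and treating the height as a vertical pixel offset are assumptions, since the patent does not spell out the geometry.

```python
import numpy as np


def reconstruct_box(base_xy, conf, height):
    """Sketch: rebuild a 3D box base from its 3 most confident vertices plus height.

    base_xy: (4, 2) predicted bottom-face vertex coordinates
    conf:    (4,)   per-vertex confidences
    height:  scalar predicted box height in pixels
    """
    i = int(np.argsort(conf)[0])            # least confident vertex is discarded
    a = base_xy[(i + 1) % 4]                # its two neighbours and the opposite vertex
    b = base_xy[(i + 2) % 4]
    c = base_xy[(i + 3) % 4]
    base = base_xy.astype(float).copy()
    base[i] = a + c - b                     # parallelogram completion of the 4th vertex
    top = base.copy()
    top[:, 1] -= height                     # assumption: height displaces vertices vertically
    return base, top
```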
Secondly, constructing and training an object two-dimensional detection and recognition model:
the method comprises the steps of taking pictures containing a transport vehicle and an excavator as input, taking coordinates, length and width of a vehicle center point marked manually as labels of a two-dimensional marking frame, and taking the type of the vehicle as a classified label.
TABLE 1 types of anchor boxes
Anchor box type    Size
C1 66×54
C2 130×100
C3 177×243
C4 166×130
C5 260×200
The two-dimensional object detection model of the vehicle adopts a neural network based on two-dimensional convolution as the main detection network; for an input picture of size 512 × 512, the size of the final feature layer is 16 × 16. For each feature point of the last layer, five vehicle detection frames are predicted, corresponding to 5 anchor frames (an anchor frame is a predefined reference box, centered on a feature point, from which detection frames are predicted as offsets). The predicted value of each vehicle detection frame contains the coordinates of the vehicle center point, offset values for the length and width of the anchor frame, the probability of each model type, and the probability that an object is contained.
The model is trained with two loss functions simultaneously. The vehicle model is identified using the binary cross-entropy between the predicted category and the calibrated category as the loss function; the vehicle position is detected using the mean square error between the detected center-point coordinates and length and width offset values and the calibrated ones as the loss function. Based on actual vehicle sizes in the mine, 5 anchor frames were designed; their sizes are shown in table 1.
During training, the features of the picture are first extracted through the backbone network; the last layer outputs the five detection frames predicted by each feature point, and the detection frames whose IOU (the intersection-over-union of the predicted detection frame and the manual annotation frame) exceeds a given threshold are selected for calculating the loss function. The learning rate of the model decays from 0.01 to 0.001; the sum of mean square errors is used as the loss function for coordinate regression and cross-entropy for vehicle model classification, the learning rate is halved every 10 epochs, and 60 epochs are trained.
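A minimal sketch of the combined loss described above, assuming PyTorch-style tensors, is given below; the target encodings and the unit weighting of the three terms are assumptions, not part of the patent.

```python
import torch.nn.functional as F


def detection_loss(pred_offsets, true_offsets, pred_class_logits, true_class,
                   pred_objectness, true_objectness):
    """Sketch: sum of MSE on (cx, cy, w, h) offsets + BCE on class and objectness.

    All tensors are assumed to cover only the predicted frames whose IOU with a
    manual annotation exceeds the matching threshold, as described in the text;
    class and objectness targets are assumed to be float (0/1) tensors.
    """
    loc_loss = F.mse_loss(pred_offsets, true_offsets, reduction="sum")
    cls_loss = F.binary_cross_entropy_with_logits(pred_class_logits, true_class)
    obj_loss = F.binary_cross_entropy_with_logits(pred_objectness, true_objectness)
    return loc_loss + cls_loss + obj_loss  # assumed unit weights
```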
Thirdly, building and training a three-dimensional object detection model:
the vehicle and excavator area in the picture is cut off and then scaled to 64 × 64, and the size is input into the object three-dimensional detection model for training. The network for three-dimensional frame detection comprises four convolutional layers, a global pooling layer and two full-connection layers. The final output dimensionality of the full connection layer is 1 multiplied by 13, coordinates and corresponding confidence degrees of four vertexes of the lower bottom surface of the three-dimensional detection frame are sequentially represented, and the height of the three-dimensional detection frame is further represented. And during prediction, selecting three vertexes with the maximum confidence coefficient and a high-reconstruction three-dimensional labeling frame. The network structure of the three-dimensional inspection model is shown in table 2.
The mean square error is adopted as the loss function for the 13 values of the last layer: the coordinates and confidences of the four vertices of the bottom quadrilateral and the height of the three-dimensional labeling frame (the two-dimensional coordinates of the four vertices account for 8 values, the confidences of the 4 vertices for 4 values, and the height for one value, 13 values in total).
During training, an Adam or SGD optimizer is adopted, as with the two-dimensional detection model; the initial learning rate is set to 0.01, halved every 10 epochs down to a minimum of 0.001, and 60 epochs are trained in total (learning rate, Adam, SGD and epoch are common terms in the field of deep learning).
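For instance, this schedule could be expressed with a standard learning-rate lambda; the placeholder model and the choice of Adam here are illustrative assumptions.

```python
import torch

model = torch.nn.Linear(128, 13)  # placeholder for the 3D detection network
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)  # Adam or SGD per the text
# halve the learning rate every 10 epochs, with a floor of 0.001 (0.1 x the base rate)
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda epoch: max(0.5 ** (epoch // 10), 0.1))

for epoch in range(60):                  # 60 epochs in total
    # ... one pass over the training set, with optimizer.step() per batch ...
    scheduler.step()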
TABLE 2 three-dimensional vehicle detection network architecture
[Table 2 was provided as an image in the original publication; per the text, the network comprises four convolutional layers, a global pooling layer, and two fully connected layers whose final output dimension is 1 × 13.]
Fourthly, calibrating a working area:
in the actual operation process of the method, mining area operation pictures obtained by the camera in real time sequentially pass through the two-dimensional detection and identification model and the three-dimensional detection model, so that the model of the excavator and the chassis three-dimensional detection frame can be obtained.
In order to calculate the working area of the transport vehicle, the relationship between the surrounding pixel points of the excavator in the picture and the actual distance needs to be known. And finding out the actual length and width of the excavator chassis according to the model, obtaining the pixel length in the picture according to the three-dimensional detection frame of the chassis, and establishing the corresponding relation between the pixel and the actual length. And finally, calculating the working interval where the transport vehicle needs to be located according to the arm length of the excavator and the position of the chassis.
Assume the width of the excavator chassis occupies x pixels in the picture and the actual chassis width is y meters, giving a ratio k = x/y. The arm length of the excavator, h meters, is obtained from the excavator model, and the optimal distance between the transport vehicle and the excavator is taken as 2/3 of the arm length. The required distance between the transport vehicle and the excavator in the picture is therefore 2hk/3 pixels, so the three-dimensional working area of the transport vehicle is placed parallel to, and 2hk/3 pixels away from, the left or right side surface of the excavator's three-dimensional frame, i.e. at the position corresponding to 2h/3 meters from the left or right side of the excavator in the real world.
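As a worked example of this arithmetic (the function name and figures are illustrative only):

```python
def work_interval_offset(chassis_width_px, chassis_width_m, arm_length_m):
    """k = x / y pixels per meter; target stand-off = 2/3 of the arm length."""
    k = chassis_width_px / chassis_width_m
    return 2.0 * arm_length_m * k / 3.0  # 2hk/3 pixels in the image


# e.g. a chassis 60 pixels wide and 3 m wide (k = 20 px/m) with a 9 m arm
# gives a stand-off of 2*9*20/3 = 120 pixels, i.e. 6 m in the real world.
offset_px = work_interval_offset(60, 3.0, 9.0)
```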
Fifthly, positioning of the transport vehicle:
and after the operation interval of the transport vehicle in the image is determined, the three-dimensional detection of the vehicle is carried out in the video frame acquired in real time. When the transport vehicle in the image is 2hk/3 pixel points away from the excavator (namely, the transport vehicle reaches the position of 2/3h meters away from the left side surface or the right side surface of the excavator in the real world), and the three-dimensional detection frame of the transport vehicle is approximately matched with the three-dimensional detection frame of the working interval, sending a reached instruction; and if not, continuing to send instructions to enable the transport vehicle to enter the calibrated station.
In summary, with the aid of the above technical solution, the invention collects an RGB image as input to the deep learning neural network model for two-dimensional object detection and identification, obtains the two-dimensional detection frame and corresponding model of the working vehicle, crops the content of the obtained two-dimensional detection frame as input to the deep learning neural network model for three-dimensional detection, obtains a three-dimensional detection frame of the working vehicle base, and determines the pixel specification characteristics of the working vehicle in the image together with its actual specification characteristics, thereby obtaining the three-dimensional working area of the working vehicle in the image, determining the area the working vehicle should reach, helping the working vehicle work within the specified working area, and reducing potential safety hazards.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (3)

1. A coal mine operation area calibration method based on computer vision is characterized by comprising the following steps:
(1) acquiring RGB pictures of a transport vehicle and an excavator in a mining operation area;
(2) manually marking two-dimensional marking frames of the transport vehicle and the excavator in the RGB image, and meanwhile, marking the models of the transport vehicle and the excavator;
(3) cutting the areas of the transport vehicle and the excavator marked in the step (2), taking the image areas where the transport vehicle and the excavator are located as a new data set, and manually marking three-dimensional marking frames of chassis of the transport vehicle and the excavator;
(4) taking an RGB image containing a transport vehicle and an excavator as input, taking two-dimensional marking frames and models of the transport vehicle and the excavator as labels, and training a deep learning neural network model for two-dimensional object detection and identification;
(5) taking the cut excavator and transport vehicle pictures as input, taking the three-dimensional marking frames of the transport vehicle and the excavator base as labels, and training a deep learning neural network model for three-dimensional detection of an object;
(6) using RGB pictures acquired by a camera in real time as input of a trained object two-dimensional detection and recognition deep learning neural network model to detect two-dimensional detection frames and corresponding models of a transport vehicle and an excavator;
(7) cutting the vehicle area of the two-dimensional detection frame obtained in the step (6) as the input of a deep learning neural network model for three-dimensional detection of the object, and detecting the three-dimensional detection frames of the transport vehicle and the excavator base;
(8) calculating the length and width of pixels in a picture according to a three-dimensional detection frame of the chassis of the excavator;
(9) determining the actual length and width of the excavator chassis according to the identified excavator model;
(10) combining the pixel length and the actual length and width of the excavator chassis in the picture to obtain a ratio, and determining the working interval of the transport vehicle according to the ratio and the safe distance between the excavator and the transport vehicle in the picture, wherein the safe distance is a multiple of the arm length of the excavator, the arm length of the excavator being known;
(11) after the working interval of the transport vehicle is determined, instructing the transport vehicle to drive into the working interval, and when the distance between the transport vehicle and the excavator in the image equals the safe distance and the three-dimensional detection frame of the transport vehicle matches the working interval of the transport vehicle determined in step (10), issuing an instruction to the transport vehicle that the working interval has been reached.
2. The computer vision-based coal mine working area calibration method as claimed in claim 1, wherein the image resolution input to the deep learning neural network model for two-dimensional detection and identification of the object is 512 × 512, and the image resolution input to the deep learning neural network model for three-dimensional detection of the object is 64 × 64.
3. The computer vision-based coal mine working area calibration method as claimed in claim 1, wherein the equipment for collecting RGB images is a high-definition camera, and the distance from the high-definition camera to the working area is less than 200 m.
CN202010694534.4A 2020-07-17 2020-07-17 Coal mine operation area calibration method based on computer vision Active CN111895931B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010694534.4A CN111895931B (en) 2020-07-17 2020-07-17 Coal mine operation area calibration method based on computer vision

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010694534.4A CN111895931B (en) 2020-07-17 2020-07-17 Coal mine operation area calibration method based on computer vision

Publications (2)

Publication Number Publication Date
CN111895931A CN111895931A (en) 2020-11-06
CN111895931B (en) 2021-11-26

Family

ID=73189475

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010694534.4A Active CN111895931B (en) 2020-07-17 2020-07-17 Coal mine operation area calibration method based on computer vision

Country Status (1)

Country Link
CN (1) CN111895931B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113740098A (en) * 2021-08-25 2021-12-03 安阳屹星智能科技有限公司 Bulk coal vehicle sampling method

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6487303B1 (en) * 1996-11-06 2002-11-26 Komatsu Ltd. Object detector
JP2016024685A (en) * 2014-07-22 2016-02-08 日立建機株式会社 Work vehicle for mine
CN105654067A (en) * 2016-02-02 2016-06-08 北京格灵深瞳信息技术有限公司 Vehicle detection method and device
CN107025642A (en) * 2016-01-27 2017-08-08 百度在线网络技术(北京)有限公司 Vehicle's contour detection method and device based on cloud data
CN108257139A (en) * 2018-02-26 2018-07-06 中国科学院大学 RGB-D three-dimension object detection methods based on deep learning
CN108875902A (en) * 2017-12-04 2018-11-23 北京旷视科技有限公司 Neural network training method and device, vehicle detection estimation method and device, storage medium
AU2018102037A4 (en) * 2018-12-09 2019-01-17 Ge, Jiahao Mr A method of recognition of vehicle type based on deep learning
US10304191B1 (en) * 2016-10-11 2019-05-28 Zoox, Inc. Three dimensional bounding box estimation from two dimensional images
CN110019914A (en) * 2018-07-18 2019-07-16 王斌 A kind of three-dimensional modeling data storehouse search method for supporting three-dimensional scenic interaction
CN110222593A (en) * 2019-05-18 2019-09-10 四川弘和通讯有限公司 A kind of vehicle real-time detection method based on small-scale neural network
CN110569792A (en) * 2019-09-09 2019-12-13 吉林大学 Method for detecting front object of automatic driving automobile based on convolutional neural network
EP3608840A1 (en) * 2018-08-07 2020-02-12 Accenture Global Solutions Limited Image processing for automated object identification

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9487931B2 (en) * 2014-09-12 2016-11-08 Caterpillar Inc. Excavation system providing machine cycle training
CN105117766A (en) * 2015-09-24 2015-12-02 盐城工学院 Engineering earthwork transport vehicle working amount counting system and method based on ultra-wideband ranging
US11441294B2 (en) * 2016-10-31 2022-09-13 Komatsu Ltd. Measurement system, work machine, and measurement method
WO2018136889A1 (en) * 2017-01-23 2018-07-26 Built Robotics Inc. Excavating earth from a dig site using an excavation vehicle
US10438371B2 (en) * 2017-09-22 2019-10-08 Zoox, Inc. Three-dimensional bounding box from two-dimensional image and point cloud data
US10684137B2 (en) * 2017-11-29 2020-06-16 Deere & Company Work site monitoring system and method
WO2020010517A1 (en) * 2018-07-10 2020-01-16 深圳大学 Trajectory prediction method and apparatus
JP7166108B2 (en) * 2018-08-31 2022-11-07 株式会社小松製作所 Image processing system, display device, image processing method, trained model generation method, and training data set
US20200117201A1 (en) * 2018-10-15 2020-04-16 Caterpillar Paving Products Inc. Methods for defining work area of autonomous construction vehicle
CN110163836B (en) * 2018-11-14 2021-04-06 宁波大学 Excavator detection method used under high-altitude inspection based on deep learning
US11677930B2 (en) * 2018-12-20 2023-06-13 Here Global B.V. Method, apparatus, and system for aligning a vehicle-mounted device

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6487303B1 (en) * 1996-11-06 2002-11-26 Komatsu Ltd. Object detector
JP2016024685A (en) * 2014-07-22 2016-02-08 日立建機株式会社 Work vehicle for mine
CN107025642A (en) * 2016-01-27 2017-08-08 百度在线网络技术(北京)有限公司 Vehicle's contour detection method and device based on cloud data
CN105654067A (en) * 2016-02-02 2016-06-08 北京格灵深瞳信息技术有限公司 Vehicle detection method and device
US10304191B1 (en) * 2016-10-11 2019-05-28 Zoox, Inc. Three dimensional bounding box estimation from two dimensional images
CN108875902A (en) * 2017-12-04 2018-11-23 北京旷视科技有限公司 Neural network training method and device, vehicle detection estimation method and device, storage medium
CN108257139A (en) * 2018-02-26 2018-07-06 中国科学院大学 RGB-D three-dimension object detection methods based on deep learning
CN110019914A (en) * 2018-07-18 2019-07-16 王斌 A kind of three-dimensional modeling data storehouse search method for supporting three-dimensional scenic interaction
EP3608840A1 (en) * 2018-08-07 2020-02-12 Accenture Global Solutions Limited Image processing for automated object identification
AU2018102037A4 (en) * 2018-12-09 2019-01-17 Ge, Jiahao Mr A method of recognition of vehicle type based on deep learning
CN110222593A (en) * 2019-05-18 2019-09-10 四川弘和通讯有限公司 A kind of vehicle real-time detection method based on small-scale neural network
CN110569792A (en) * 2019-09-09 2019-12-13 吉林大学 Method for detecting front object of automatic driving automobile based on convolutional neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on machine-vision-based assisted driving of shunting locomotives; Zhang Chi; China Doctoral Dissertations Full-text Database (Electronic Journal); 2020-01-31; pp. 101-123 *

Also Published As

Publication number Publication date
CN111895931A (en) 2020-11-06

Similar Documents

Publication Publication Date Title
CN109934121B (en) Orchard pedestrian detection method based on YOLOv3 algorithm
CN109685066B (en) Mine target detection and identification method based on deep convolutional neural network
CN112793564B (en) Autonomous parking auxiliary system based on panoramic aerial view and deep learning
US10878288B2 (en) Database construction system for machine-learning
US10290219B2 (en) Machine vision-based method and system for aircraft docking guidance and aircraft type identification
CN111191485B (en) Parking space detection method and system and automobile
CN111462200A (en) Cross-video pedestrian positioning and tracking method, system and equipment
CN109791052A (en) For generate and using locating reference datum method and system
CN105069799A (en) Angular point positioning method and apparatus
CN110825101A (en) Unmanned aerial vehicle autonomous landing method based on deep convolutional neural network
CN113203409B (en) Method for constructing navigation map of mobile robot in complex indoor environment
CN109828267A (en) The Intelligent Mobile Robot detection of obstacles and distance measuring method of Case-based Reasoning segmentation and depth camera
CN110136186B (en) Detection target matching method for mobile robot target ranging
CN113033315A (en) Rare earth mining high-resolution image identification and positioning method
CN111895931B (en) Coal mine operation area calibration method based on computer vision
CN112580542A (en) Steel bar counting method based on target detection
AU2023202153A1 (en) Apparatus for analyzing a payload being transported in a load carrying container of a vehicle
CN114089330A (en) Indoor mobile robot glass detection and map updating method based on depth image restoration
CN113359692A (en) Obstacle avoidance method and movable robot
CN112800938B (en) Method and device for detecting occurrence of side rockfall of unmanned vehicle
CN112817006B (en) Vehicle-mounted intelligent road disease detection method and system
CN114445494A (en) Image acquisition and processing method, image acquisition device and robot
CN114022563A (en) Dynamic obstacle detection method for automatic driving
CN113963233A (en) Target detection method and system based on double-stage convolutional neural network
CN112598738A (en) Figure positioning method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant