CN113173502A - Anti-collision method and system based on laser visual fusion and deep learning - Google Patents

Anti-collision method and system based on laser visual fusion and deep learning Download PDF

Info

Publication number
CN113173502A
Authority
CN
China
Prior art keywords
image
target detection
rgbd
point cloud
trained
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110054308.4A
Other languages
Chinese (zh)
Other versions
CN113173502B (en)
Inventor
罗永祥
严志展
陈志辉
刘键涛
魏秋新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian E Port Co ltd
Original Assignee
Fujian E Port Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian E Port Co ltd filed Critical Fujian E Port Co ltd
Priority to CN202110054308.4A priority Critical patent/CN113173502B/en
Publication of CN113173502A publication Critical patent/CN113173502A/en
Application granted granted Critical
Publication of CN113173502B publication Critical patent/CN113173502B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • B PERFORMING OPERATIONS; TRANSPORTING
    • B66 HOISTING; LIFTING; HAULING
    • B66C CRANES; LOAD-ENGAGING ELEMENTS OR DEVICES FOR CRANES, CAPSTANS, WINCHES, OR TACKLES
    • B66C13/00 Other constructional features or details
    • B66C13/16 Applications of indicating, registering, or weighing devices
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B66 HOISTING; LIFTING; HAULING
    • B66C CRANES; LOAD-ENGAGING ELEMENTS OR DEVICES FOR CRANES, CAPSTANS, WINCHES, OR TACKLES
    • B66C13/00 Other constructional features or details
    • B66C13/18 Control systems or devices
    • B66C13/46 Position indicators for suspended loads or for crane elements
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B66 HOISTING; LIFTING; HAULING
    • B66C CRANES; LOAD-ENGAGING ELEMENTS OR DEVICES FOR CRANES, CAPSTANS, WINCHES, OR TACKLES
    • B66C15/00 Safety gear
    • B66C15/04 Safety gear for preventing collisions, e.g. between cranes or trolleys operating on the same track
    • B66C15/045 Safety gear for preventing collisions, e.g. between cranes or trolleys operating on the same track, electrical
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0004 Industrial image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30248 Vehicle exterior or interior
    • G06T2207/30252 Vehicle exterior; Vicinity of vehicle
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mechanical Engineering (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Automation & Control Theory (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention provides an anti-collision method based on laser visual fusion and deep learning, which comprises the following steps: acquiring image information and 3D point cloud data; registering the image information and the 3D point cloud data, and generating an RGBD image from the registered image data and 3D point cloud data, the RGBD images being divided into a training set and a test set; training a 3D target detection model with the RGBD images of the training set to obtain a trained 3D target detection model; inputting the RGBD images of the test set into the trained 3D target detection model, detecting the category of obstacles within the field of view by a target detection method, and calculating the distance between each object and the nearest end of the gantry crane; and determining whether to decelerate or stop according to the detected object category and the calculated distance. The method combines visual and laser sensing: the RGBD image is obtained by fusing and registering the image information with the laser point cloud information, and the depth information of the image enables more accurate anti-collision detection.

Description

Anti-collision method and system based on laser visual fusion and deep learning
Technical Field
The invention relates to the field of safe operation of gantry cranes, in particular to an anti-collision method and system based on laser vision fusion and deep learning.
Background
Rubber-tyred container gantry cranes (RTGs) and rail-mounted container gantry cranes (RMGs), collectively referred to as yard bridges, are key machines in container terminal operations, and their efficiency, safety and operational accuracy have a major influence on terminal operations. A yard bridge mainly transfers containers between the stacks in the container yard and the horizontal transport equipment (container trucks or automated guided vehicles, AGVs, travelling in the truck lanes). Its operating environment is complex, the risk factor is high, the driver's view is poor, and the operation depends heavily on the driver.
During yard bridge operation, the driver's cab is mounted high, the lighting below the cab is poor, and the spreader blocks part of the field of view, creating blind areas, so the driver cannot observe surrounding obstacles promptly and completely, and the cart collides with people, vehicles, equipment and other obstacles in the yard. Existing methods mainly take pictures with a camera and run target detection to identify obstacles; their detection accuracy is low and cannot meet the requirements of safe production.
Disclosure of Invention
The invention mainly aims to overcome the defects in the prior art and provides an anti-collision method based on laser visual fusion and deep learning.
The invention adopts the following technical scheme:
an anti-collision method based on laser visual fusion and deep learning comprises the following steps:
acquiring image information by using a camera arranged on equipment, and acquiring 3D point cloud data by using an area array laser sensor arranged on the equipment;
registering the image information and the 3D point cloud data to obtain registered image data and 3D point cloud data;
generating an RGBD image by using the registered image data and 3D point cloud data, and dividing the RGBD image into a training set and a testing set;
training the 3D target detection model by adopting the RGBD image of the training set to obtain a trained 3D target detection model;
inputting the test set into a trained 3D target detection model, performing class detection on obstacles in a visual field range by using a target detection method, and calculating the distance between an object and the nearest end of the gantry crane;
and determining whether to perform deceleration and stop operations according to the calculated object type and distance.
Specifically, registering image information and 3D point cloud data to obtain registered image data and 3D point cloud data; the registration operation specifically includes:
moving and mapping a reference radar;
carrying out iterative registration and calculation on the reconstructed map of the reference radar by using the rest radar data;
reducing the matching error according to the consistency hypothesis until the algorithm converges and the rigidity invariant characteristic of the calibration matrix is satisfied;
and obtaining a final calibration matrix according to a consistency algorithm.
Specifically, generating an RGBD image by using the registered image data and 3D point cloud data specifically includes:
acquiring depth information of each point in the RGB image by utilizing sampling interpolation;
$$f(x,y)=\frac{f(Q_{11})(x_2-x)(y_2-y)+f(Q_{21})(x-x_1)(y_2-y)+f(Q_{12})(x_2-x)(y-y_1)+f(Q_{22})(x-x_1)(y-y_1)}{(x_2-x_1)(y_2-y_1)}$$

wherein f(x, y) is the depth value of point Q(x, y), and Q11(x1, y1), Q12(x1, y2), Q21(x2, y1), Q22(x2, y2) are the four neighboring sample points of Q(x, y).
Specifically, the RGBD image of the training set is used to train the 3D target detection model, so as to obtain the trained 3D target detection model, which specifically includes:
adjusting the size of the RGBD image of the training set;
performing histogram equalization operation on the image to obtain an image after the histogram equalization;
randomly adding white noise z ~ N(0, σ²) to the image after histogram equalization;
and inputting the result into the E-Yolo model for gradient regression training to obtain the trained E-Yolo model.
Specifically, inputting a test set into a trained 3D target detection model, performing category detection on obstacles in a field of view by using a target detection method, and calculating the distance from an object to the nearest end of a gantry crane, specifically comprising:
adjusting the size of the RGBD image of the test set according to the adjusted size of the RGBD image of the training set;
performing histogram equalization operation on the image to obtain an image after the histogram equalization;
inputting the image into the trained E-Yolo model, fixing the batchnorm layer and the dropout layer of the model, and outputting the coordinate points of the target detection object;
and calculating the distance between the coordinate point of the target detection object and the gantry crane, and taking the minimum distance as an output distance.
In another aspect, an embodiment of the present invention provides an anti-collision system based on laser visual fusion and deep learning, including:
a data acquisition unit: acquiring image information by using a camera arranged on equipment, and acquiring 3D point cloud data by using an area array laser sensor arranged on the equipment;
a data registration unit: registering the image information and the 3D point cloud data to obtain registered image data and 3D point cloud data;
an RGBD image generation unit: generating an RGBD image by using the registered image data and 3D point cloud data, and dividing the RGBD image into a training set and a testing set;
a model training unit: training the 3D target detection model by adopting the RGBD image of the training set to obtain a trained 3D target detection model;
a detection and calculation unit: inputting the test set into a trained 3D target detection model, performing class detection on obstacles in a visual field range by using a target detection method, and calculating the distance between an object and the nearest end of the gantry crane;
a determination unit: and determining whether to perform deceleration and stop operations according to the calculated object type and distance.
Specifically, the data registration unit registers the image information and the 3D point cloud data to obtain registered image data and 3D point cloud data; the registration operation specifically includes:
moving and mapping a reference radar;
carrying out iterative registration and calculation on the reconstructed map of the reference radar by using the rest radar data;
reducing the matching error according to the consistency hypothesis until the algorithm converges and the rigidity invariant characteristic of the calibration matrix is satisfied;
and obtaining a final calibration matrix according to a consistency algorithm.
Specifically, the RGBD image generating unit generates an RGBD image by using the registered image data and 3D point cloud data, and specifically includes:
$$f(x,y)=\frac{f(Q_{11})(x_2-x)(y_2-y)+f(Q_{21})(x-x_1)(y_2-y)+f(Q_{12})(x_2-x)(y-y_1)+f(Q_{22})(x-x_1)(y-y_1)}{(x_2-x_1)(y_2-y_1)}$$

wherein f(x, y) is the depth value of point Q(x, y), and Q11(x1, y1), Q12(x1, y2), Q21(x2, y1), Q22(x2, y2) are the four neighboring sample points of Q(x, y).
Specifically, the RGBD image of the training set is used to train the 3D target detection model, so as to obtain the trained 3D target detection model, which specifically includes:
adjusting the size of the RGBD image of the training set;
performing histogram equalization operation on the image to obtain an image after the histogram equalization;
randomly adding white noise z ~ N(0, σ²) to the image after histogram equalization;
and inputting the result into the E-Yolo model for gradient regression training to obtain the trained E-Yolo model.
Specifically, inputting a test set into a trained 3D target detection model, performing category detection on obstacles in a field of view by using a target detection method, and calculating the distance from an object to the nearest end of a gantry crane, specifically comprising:
adjusting the size of the RGBD image of the test set according to the adjusted size of the RGBD image of the training set;
performing histogram equalization operation on the image to obtain an image after the histogram equalization;
inputting the image into the trained E-Yolo model, fixing the batchnorm layer and the dropout layer of the model, and outputting the coordinate points of the target detection object;
and calculating the distance between the coordinate point of the target detection object and the gantry crane, and taking the minimum distance as an output distance.
In another aspect, an embodiment of the present invention provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements the above-mentioned steps of the collision avoidance method based on laser visual fusion and deep learning.
As can be seen from the above description of the present invention, compared with the prior art, the present invention has the following advantages:
(1) The invention provides an anti-collision method based on laser visual fusion and deep learning: by combining visual and laser sensing, an RGBD image is obtained through the fusion and registration of image information and laser point cloud information, and the depth information of the image enables more accurate anti-collision detection.
(2) According to the method provided by the invention, the area array laser sensor is adopted to obtain the 3D point cloud data, so that the data information of surrounding obstacles can be comprehensively obtained, and accurate judgment is realized.
Drawings
Fig. 1 is a flowchart of an anti-collision method based on laser visual fusion and deep learning according to an embodiment of the present invention;
FIG. 2 is an exemplary diagram of an RGB image and a corresponding RGBD image provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of object coordinates output by the target detection model according to an embodiment of the present invention;
fig. 4 is a block diagram of a laser vision fusion and deep learning collision avoidance system according to an embodiment of the present invention.
Fig. 5 is a block diagram of a readable storage medium according to an embodiment of the present invention.
Detailed Description
The invention is further described below by means of specific embodiments.
The embodiment of the invention provides an anti-collision method and system based on laser visual fusion and deep learning.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.
Fig. 1 is a flowchart of the anti-collision method based on laser visual fusion and deep learning provided by an embodiment of the present invention; the method specifically comprises the following steps:
s101: acquiring image information by using a camera arranged on equipment, and acquiring 3D point cloud data by using an area array laser sensor arranged on the equipment;
In the embodiment of the invention, 6 cameras are arranged on the equipment to detect whether dangerous obstacles exist around the cart. The cameras are mounted on the front, rear and side faces of the cart and are connected to fixed points of the cart through customized, vertically mounted steel-structure platforms and brackets; this reduces shaking, keeps the cameras inside the tire outer frame, and keeps the mounting position as close as possible to the vertical center line between the two groups of tires.
In addition, the invention adopts the area array laser sensor to obtain the 3D point cloud data of surrounding obstacles, and can comprehensively obtain the data information of the obstacles, thereby realizing accurate judgment.
S102: registering the image information and the 3D point cloud data to obtain registered image data and 3D point cloud data;
In practical use, it is sometimes inconvenient to find an open environment or a reference object for calibration. For this reason, Livox proposed the automatic calibration algorithm TFAC-Livox (Target-Free Automatic Calibration) and released it as open source on GitHub. The technique mainly relies on the geometric-consistency assumption, i.e. that the local three-dimensional models scanned by the individual radars are consistent: moving mapping is performed with a reference radar (LiDAR0), the remaining radar data are then iteratively registered and computed against the reconstructed map of LiDAR0, and the matching error is continuously reduced under the consistency assumption until the algorithm converges and the rigidity-invariance property of the calibration matrix (six parallel lines) is satisfied; finally, the calibration matrix (the extrinsic parameters) is obtained with the consistency algorithm.
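Once the calibration matrix is known, the laser points can be projected into the camera image so that both data sources share one coordinate system. The following is a minimal sketch of that projection step under a pinhole camera model; the intrinsic matrix K, the extrinsic rotation R and translation t, and the function name are illustrative assumptions, not taken from the TFAC-Livox code.

```python
import numpy as np

def project_lidar_to_image(points_xyz, K, R, t, image_shape):
    """Project 3D laser points into the camera image plane.

    points_xyz  : (N, 3) points in the laser (LiDAR) frame
    K           : (3, 3) camera intrinsic matrix
    R, t        : (3, 3) rotation and (3,) translation from the laser frame to the camera frame
    image_shape : (height, width) of the RGB image
    Returns pixel coordinates (u, v) and camera-frame depth for points inside the image.
    """
    pts_cam = points_xyz @ R.T + t          # laser frame -> camera frame
    in_front = pts_cam[:, 2] > 0            # keep points in front of the camera
    pts_cam = pts_cam[in_front]
    uvw = pts_cam @ K.T                     # perspective projection
    u = uvw[:, 0] / uvw[:, 2]
    v = uvw[:, 1] / uvw[:, 2]
    depth = pts_cam[:, 2]
    h, w = image_shape
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    return u[inside], v[inside], depth[inside]
```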
S103: generating an RGBD image by using the registered image data and 3D point cloud data, and dividing the RGBD image into a training set and a testing set;
The coordinate systems of the registered image data and the laser point cloud data are consistent, so the depth information of each point in the RGB image is obtained by sampling: for every point in the RGB image, the linear interpolation of the sampled points in its neighborhood is taken as the depth value of that point, expressed mathematically as:
$$f(x,y)=\frac{f(Q_{11})(x_2-x)(y_2-y)+f(Q_{21})(x-x_1)(y_2-y)+f(Q_{12})(x_2-x)(y-y_1)+f(Q_{22})(x-x_1)(y-y_1)}{(x_2-x_1)(y_2-y_1)}$$

wherein f(x, y) is the depth value of point Q(x, y), and Q11(x1, y1), Q12(x1, y2), Q21(x2, y1), Q22(x2, y2) are the four neighboring sample points of Q(x, y).
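The following is a minimal sketch of this sampling interpolation, assuming the projected laser samples at the four neighboring positions are already available; the function names and example values are illustrative.

```python
import numpy as np

def bilinear_depth(f_q11, f_q12, f_q21, f_q22, x, y, x1, y1, x2, y2):
    """Depth f(x, y) at Q(x, y), interpolated from the four neighboring samples
    Q11(x1, y1), Q12(x1, y2), Q21(x2, y1), Q22(x2, y2)."""
    denom = (x2 - x1) * (y2 - y1)
    return (f_q11 * (x2 - x) * (y2 - y)
            + f_q21 * (x - x1) * (y2 - y)
            + f_q12 * (x2 - x) * (y - y1)
            + f_q22 * (x - x1) * (y - y1)) / denom

def append_depth_channel(rgb, dense_depth):
    """Stack the interpolated depth map onto the RGB image to form an RGBD image."""
    return np.dstack([rgb, dense_depth.astype(rgb.dtype)])

# Example: depth at pixel Q(10.4, 20.7) from laser samples at the surrounding
# grid corners (10, 20), (10, 21), (11, 20), (11, 21)
d = bilinear_depth(5.0, 5.2, 5.1, 5.3, 10.4, 20.7, 10, 20, 11, 21)
```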
Fig. 2 is an exemplary diagram of an RGB image and a corresponding RGBD image according to an embodiment of the present invention, where 2(a) is the RGB image, and 2(b) is the RGBD image, and the RGBD image includes depth information of each point in the RGB image.
S104: training the 3D target detection model by adopting the RGBD image of the training set to obtain a trained 3D target detection model;
a. Adjust the size of the input image, letting the length and width of the image be w and h respectively; in the embodiment of the present invention, the image length w is limited to at most 1000 and the image width h to at most 640.
When w > h: set h to 640 and scale w proportionally; if the scaled w exceeds 1000, set w to 1000 and scale h to the corresponding value instead.
When h > w: set w to 640 and scale h proportionally; if the scaled h exceeds 1000, set h to 1000 and scale w to the corresponding value instead.
b. Perform histogram equalization on the image to enhance its contrast;
c. randomly add white noise z ~ N(0, σ²) to the image to enhance the robustness of the model;
d. input the processed images into the E-Yolo model for gradient regression training, using an SGD optimizer with a learning rate of 0.001, to obtain the trained E-Yolo model (a preprocessing sketch for steps a–c follows).
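This sketch assumes the stated size limits (shorter side 640, longer side at most 1000) and Gaussian white noise; OpenCV is used for resizing and equalization, and all function names are illustrative.

```python
import cv2
import numpy as np

def resize_keep_ratio(img, short_side=640, long_side_cap=1000):
    """Resize so the shorter side becomes 640; if the longer side would then
    exceed 1000, cap the longer side at 1000 and rescale proportionally instead."""
    h, w = img.shape[:2]
    scale = short_side / min(h, w)
    if max(h, w) * scale > long_side_cap:
        scale = long_side_cap / max(h, w)
    new_size = (int(round(w * scale)), int(round(h * scale)))   # cv2 uses (width, height)
    return cv2.resize(img, new_size)

def preprocess_for_training(rgbd, sigma=5.0):
    """Resize, equalize the color channels, and add white noise z ~ N(0, sigma^2)."""
    img = resize_keep_ratio(rgbd).astype(np.float32)
    for c in range(3):                                           # equalize R, G, B channels only
        img[..., c] = cv2.equalizeHist(img[..., c].astype(np.uint8)).astype(np.float32)
    noise = np.random.normal(0.0, sigma, img.shape).astype(np.float32)
    return np.clip(img + noise, 0.0, 255.0)
```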
S105: inputting the test set into a trained 3D target detection model, performing class detection on obstacles in a visual field range by using a target detection method, and calculating the distance between an object and the nearest end of the gantry crane;
a. Adjust the size of the input image, letting the length and width be w and h respectively; as in training, the length w is limited to at most 1000 and the width h to at most 640.
When w > h: set h to 640 and scale w proportionally; if the scaled w exceeds 1000, set w to 1000 and scale h to the corresponding value instead.
When h > w: set w to 640 and scale h proportionally; if the scaled h exceeds 1000, set h to 1000 and scale w to the corresponding value instead.
b. Histogram equalization operation is carried out on the image, and the effect of enhancing the contrast of the image is achieved.
c. Input the picture into the trained E-Yolo model with its batchnorm layer and dropout layer fixed; the model outputs 8 (x, y, z) coordinate points for the target detection object. The eight points form a cuboid frame around the object, representing its position and size, as shown in FIG. 3. The distances between the 8 points and the gantry crane are then calculated, and the minimum of the eight distances is taken as the output distance.
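A minimal sketch of this distance step, assuming the nearest end of the gantry crane is represented by a single reference point expressed in the same coordinate frame as the detector output; function and variable names are illustrative.

```python
import numpy as np

def min_distance_to_crane(box_corners, crane_point):
    """box_corners : (8, 3) array of (x, y, z) corners output by the E-Yolo model.
    crane_point : (3,) reference point on the nearest end of the gantry crane.
    Returns the smallest of the eight corner-to-crane distances."""
    corners = np.asarray(box_corners, dtype=float)
    dists = np.linalg.norm(corners - np.asarray(crane_point, dtype=float), axis=1)
    return float(dists.min())
```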
S106: and determining whether to perform deceleration and stop operations according to the calculated object type and distance.
In the embodiment of the invention, the deceleration operation is performed when the minimum distance between the object and the gantry crane falls to 10 meters, and the stop operation is required when the minimum distance falls to 6 meters.
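A minimal sketch of that decision rule under the stated thresholds; the handling of the object class is illustrative, since the embodiment does not spell out per-class behavior.

```python
def collision_action(obj_class, min_dist, stop_dist=6.0, slow_dist=10.0):
    """Map the detected object class and its minimum distance to a cart action.
    Thresholds follow the embodiment: decelerate within 10 m, stop within 6 m."""
    if obj_class is None:          # nothing detected in the field of view
        return "normal"
    if min_dist <= stop_dist:
        return "stop"
    if min_dist <= slow_dist:
        return "decelerate"
    return "normal"
```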
Another aspect of the embodiment of the present invention provides an anti-collision system 40 based on laser visual fusion and deep learning, as shown in fig. 4, including:
the data acquisition unit 401: acquiring image information by using a camera arranged on equipment, and acquiring 3D point cloud data by using an area array laser sensor arranged on the equipment;
In the embodiment of the invention, 6 cameras are arranged on the equipment to detect whether dangerous obstacles exist around the cart. The cameras are mounted on the front, rear and side faces of the cart and are connected to fixed points of the cart through customized, vertically mounted steel-structure platforms and brackets; this reduces shaking, keeps the cameras inside the tire outer frame, and keeps the mounting position as close as possible to the vertical center line between the two groups of tires.
The data registration unit 402: registering the image information and the 3D point cloud data to obtain registered image data and 3D point cloud data;
In practical use, it is sometimes inconvenient to find an open environment or a reference object for calibration. For this reason, Livox proposed the automatic calibration algorithm TFAC-Livox (Target-Free Automatic Calibration) and released it as open source on GitHub. The technique mainly relies on the geometric-consistency assumption, i.e. that the local three-dimensional models scanned by the individual radars are consistent: moving mapping is performed with a reference radar (LiDAR0), the remaining radar data are then iteratively registered and computed against the reconstructed map of LiDAR0, and the matching error is continuously reduced under the consistency assumption until the algorithm converges and the rigidity-invariance property of the calibration matrix (six parallel lines) is satisfied; finally, the calibration matrix (the extrinsic parameters) is obtained with the consistency algorithm.
The RGBD image generating unit 403: generating an RGBD image by using the registered image data and 3D point cloud data, and dividing the RGBD image into a training set and a testing set;
The coordinate systems of the registered image data and the laser point cloud data are consistent, so the depth information of each point in the RGB image is obtained by sampling: for every point in the RGB image, the linear interpolation of the sampled points in its neighborhood is taken as the depth value of that point, expressed mathematically as:
$$f(x,y)=\frac{f(Q_{11})(x_2-x)(y_2-y)+f(Q_{21})(x-x_1)(y_2-y)+f(Q_{12})(x_2-x)(y-y_1)+f(Q_{22})(x-x_1)(y-y_1)}{(x_2-x_1)(y_2-y_1)}$$

wherein f(x, y) is the depth value of point Q(x, y), and Q11(x1, y1), Q12(x1, y2), Q21(x2, y1), Q22(x2, y2) are the four neighboring sample points of Q(x, y).
Fig. 2 is an exemplary diagram of an RGB image and a corresponding RGBD image according to an embodiment of the present invention, where 2(a) is the RGB image, and 2(b) is the RGBD image, and the RGBD image includes depth information of each point in the RGB image.
Model training unit 404: training the 3D target detection model by adopting the RGBD image of the training set to obtain a trained 3D target detection model;
e. Adjust the size of the input image, letting the length and width of the image be w and h respectively; in the embodiment of the present invention, the image length w is limited to at most 1000 and the image width h to at most 640.
When w > h: set h to 640 and scale w proportionally; if the scaled w exceeds 1000, set w to 1000 and scale h to the corresponding value instead.
When h > w: set w to 640 and scale h proportionally; if the scaled h exceeds 1000, set h to 1000 and scale w to the corresponding value instead.
f. Perform histogram equalization on the image to enhance its contrast;
g. randomly add white noise z ~ N(0, σ²) to the image to enhance the robustness of the model;
h. input the processed images into the E-Yolo model for gradient regression training, using an SGD optimizer with a learning rate of 0.001, to obtain the trained E-Yolo model.
The detection and calculation unit 405: inputting the test set into a trained 3D target detection model, performing class detection on obstacles in a visual field range by using a target detection method, and calculating the distance between an object and the nearest end of the gantry crane;
a. Adjust the size of the input image, letting the length and width be w and h respectively; as in training, the length w is limited to at most 1000 and the width h to at most 640.
When w > h: set h to 640 and scale w proportionally; if the scaled w exceeds 1000, set w to 1000 and scale h to the corresponding value instead.
When h > w: set w to 640 and scale h proportionally; if the scaled h exceeds 1000, set h to 1000 and scale w to the corresponding value instead.
b. Histogram equalization operation is carried out on the image, and the effect of enhancing the contrast of the image is achieved.
c. Input the picture into the trained E-Yolo model with its batchnorm layer and dropout layer fixed; the model outputs 8 (x, y, z) coordinate points for the target detection object. The eight points form a cuboid frame around the object, representing its position and size, as shown in FIG. 3. The distances between the 8 points and the gantry crane are then calculated, and the minimum of the eight distances is taken as the output distance.
Determination unit 406: and determining whether to perform deceleration and stop operations according to the calculated object type and distance.
In the embodiment of the invention, the deceleration operation is performed when the minimum distance between the object and the gantry crane falls to 10 meters, and the stop operation is required when the minimum distance falls to 6 meters.
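The units described above can be read as one processing cycle. The following is a minimal sketch of how system 40 could be composed, under the assumption that the model training unit runs offline; all class and method names are illustrative and not the patented implementation.

```python
class CollisionAvoidanceSystem:
    """Wires the units of system 40 into one processing cycle (training is offline)."""

    def __init__(self, acquire, register, build_rgbd, detect, decide):
        self.acquire = acquire        # data acquisition unit 401
        self.register = register      # data registration unit 402
        self.build_rgbd = build_rgbd  # RGBD image generation unit 403
        self.detect = detect          # detection and calculation unit 405 (trained E-Yolo)
        self.decide = decide          # determination unit 406

    def step(self):
        rgb, cloud = self.acquire()                      # camera image + area-array laser cloud
        rgb_reg, cloud_reg = self.register(rgb, cloud)   # bring both into one coordinate system
        rgbd = self.build_rgbd(rgb_reg, cloud_reg)       # RGB + interpolated depth channel
        obj_class, min_dist = self.detect(rgbd)          # object class and minimum crane distance
        return self.decide(obj_class, min_dist)          # "normal" / "decelerate" / "stop"
```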
The invention provides an anti-collision method based on laser visual fusion and deep learning: by combining visual and laser sensing, an RGBD image is obtained through the fusion and registration of image information and laser point cloud information, and the depth information of the image enables more accurate anti-collision detection.
In another aspect, an embodiment of the present invention provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements the above-mentioned steps of the collision avoidance method based on laser visual fusion and deep learning.
As shown in fig. 5, the present embodiment provides a computer-readable storage medium 50, on which a computer program 501 is stored, the computer program 501, when executed by a processor, implementing the steps of:
acquiring image information by using a camera arranged on equipment, and acquiring 3D point cloud data by using an area array laser sensor arranged on the equipment;
registering the image information and the 3D point cloud data to obtain registered image data and 3D point cloud data;
generating an RGBD image by using the registered image data and 3D point cloud data, and dividing the RGBD image into a training set and a testing set;
training the 3D target detection model by adopting the RGBD image of the training set to obtain a trained 3D target detection model;
inputting RGBD images of the test set into a trained 3D target detection model, performing category detection on obstacles in a visual field range by using a target detection method, and calculating the distance between an object and the nearest end of a gantry crane;
and determining whether to perform deceleration and stop operations according to the calculated object type and distance.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium provided herein and used in the embodiments may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The above description is only an embodiment of the present invention, but the design concept of the present invention is not limited thereto; any insubstantial modification made to the present invention by using this design concept shall fall within the protection scope of the present invention.

Claims (10)

1. An anti-collision method based on laser visual fusion and deep learning is characterized by comprising the following steps:
acquiring image information by using a camera arranged on equipment, and acquiring 3D point cloud data by using an area array laser sensor arranged on the equipment;
registering the image information and the 3D point cloud data to obtain registered image data and 3D point cloud data;
generating an RGBD image by using the registered image data and 3D point cloud data, and dividing the RGBD image into a training set and a testing set;
training the 3D target detection model by adopting the RGBD image of the training set to obtain a trained 3D target detection model;
inputting RGBD images of the test set into a trained 3D target detection model, performing category detection on obstacles in a visual field range by using a target detection method, and calculating the distance between an object and the nearest end of a gantry crane;
and determining whether to perform deceleration and stop operations according to the calculated object type and distance.
2. The anti-collision method based on laser visual fusion and deep learning of claim 1, wherein the image information and the 3D point cloud data are registered to obtain the registered image data and the registered 3D point cloud data; the registration operation specifically includes:
moving and mapping a reference radar;
carrying out iterative registration and calculation on the reconstructed map of the reference radar by using the rest radar data;
reducing the matching error according to the consistency hypothesis until the algorithm converges and the rigidity invariant characteristic of the calibration matrix is satisfied;
and obtaining a final calibration matrix according to a consistency algorithm.
3. The collision avoidance method based on laser visual fusion and deep learning according to claim 1, wherein the RGBD image is generated by using the registered image data and 3D point cloud data, and specifically comprises:
acquiring depth information of each point in the RGB image by utilizing sampling interpolation;
$$f(x,y)=\frac{f(Q_{11})(x_2-x)(y_2-y)+f(Q_{21})(x-x_1)(y_2-y)+f(Q_{12})(x_2-x)(y-y_1)+f(Q_{22})(x-x_1)(y-y_1)}{(x_2-x_1)(y_2-y_1)}$$

wherein f(x, y) is the depth value of point Q(x, y), and Q11(x1, y1), Q12(x1, y2), Q21(x2, y1), Q22(x2, y2) are the four neighboring sample points of Q(x, y).
4. The anti-collision method based on laser visual fusion and deep learning according to claim 1, wherein the 3D target detection model is trained by using RGBD images of a training set to obtain the trained 3D target detection model, and specifically comprises:
adjusting the size of the RGBD image of the training set;
performing histogram equalization operation on the image to obtain an image after the histogram equalization;
randomly adding white noise z ~ N(0, σ²) to the image after histogram equalization;
and inputting the result into the E-Yolo model for gradient regression training to obtain the trained E-Yolo model.
5. The anti-collision method based on laser vision fusion and deep learning of claim 4, wherein the test set is input into a trained 3D target detection model, and the target detection method is used to detect the type of the obstacle in the field of view and calculate the distance between the object and the nearest end of the gantry crane, specifically comprising:
adjusting the size of the RGBD image of the test set according to the adjusted size of the RGBD image of the training set;
performing histogram equalization operation on the image to obtain an image after the histogram equalization;
inputting the image into the trained E-Yolo model, fixing the batchnorm layer and the dropout layer of the model, and outputting the coordinate points of the target detection object;
and calculating the distance between the coordinate point of the target detection object and the gantry crane, and taking the minimum distance as an output distance.
6. A collision avoidance system based on laser visual fusion and deep learning, comprising:
a data acquisition unit: acquiring image information by using a camera arranged on equipment, and acquiring 3D point cloud data by using an area array laser sensor arranged on the equipment;
a data registration unit: registering the image information and the 3D point cloud data to obtain registered image data and 3D point cloud data;
an RGBD image generation unit: generating an RGBD image by using the registered image data and 3D point cloud data, and dividing the RGBD image into a training set and a testing set;
a model training unit: training the 3D target detection model by adopting the RGBD image of the training set to obtain a trained 3D target detection model;
a detection and calculation unit: inputting the test set into a trained 3D target detection model, performing class detection on obstacles in a visual field range by using a target detection method, and calculating the distance between an object and the nearest end of the gantry crane;
a determination unit: and determining whether to perform deceleration and stop operations according to the calculated object type and distance.
7. The collision avoidance system based on laser visual fusion and deep learning of claim 6, wherein the data registration unit registers the image information and the 3D point cloud data to obtain registered image data and 3D point cloud data; the registration operation specifically includes:
moving and mapping a reference radar;
carrying out iterative registration and calculation on the reconstructed map of the reference radar by using the rest radar data;
reducing the matching error according to the consistency hypothesis until the algorithm converges and the rigidity invariant characteristic of the calibration matrix is satisfied;
and obtaining a final calibration matrix according to a consistency algorithm.
8. The collision avoidance system based on laser visual fusion and deep learning according to claim 6, wherein the RGBD image generation unit generates an RGBD image by using the registered image data and 3D point cloud data, and specifically comprises:
$$f(x,y)=\frac{f(Q_{11})(x_2-x)(y_2-y)+f(Q_{21})(x-x_1)(y_2-y)+f(Q_{12})(x_2-x)(y-y_1)+f(Q_{22})(x-x_1)(y-y_1)}{(x_2-x_1)(y_2-y_1)}$$

wherein f(x, y) is the depth value of point Q(x, y), and Q11(x1, y1), Q12(x1, y2), Q21(x2, y1), Q22(x2, y2) are the four neighboring sample points of Q(x, y).
9. The collision avoidance system based on laser visual fusion and deep learning according to claim 6, wherein the 3D object detection model is trained by using the RGBD image of the training set to obtain the trained 3D object detection model, specifically comprising:
adjusting the size of the RGBD image of the training set;
performing histogram equalization operation on the image to obtain an image after the histogram equalization;
randomly adding white noise z ~ N(0, σ²) to the image after histogram equalization;
and inputting the result into the E-Yolo model for gradient regression training to obtain the trained E-Yolo model.
10. The collision avoidance system based on laser vision fusion and deep learning of claim 9, wherein the test set is input into a trained 3D target detection model, and a target detection method is used to detect the type of the obstacle in the field of view and calculate the distance of the object from the nearest end of the gantry crane, specifically comprising:
adjusting the size of the RGBD image of the test set according to the adjusted size of the RGBD image of the training set;
performing histogram equalization operation on the image to obtain an image after the histogram equalization;
inputting the image into the trained E-Yolo model, fixing the batchnorm layer and the dropout layer of the model, and outputting the coordinate points of the target detection object;
and calculating the distance between the coordinate point of the target detection object and the gantry crane, and taking the minimum distance as an output distance.
CN202110054308.4A 2021-01-15 2021-01-15 Anticollision method and system based on laser vision fusion and deep learning Active CN113173502B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110054308.4A CN113173502B (en) 2021-01-15 2021-01-15 Anticollision method and system based on laser vision fusion and deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110054308.4A CN113173502B (en) 2021-01-15 2021-01-15 Anticollision method and system based on laser vision fusion and deep learning

Publications (2)

Publication Number Publication Date
CN113173502A true CN113173502A (en) 2021-07-27
CN113173502B CN113173502B (en) 2023-06-06

Family

ID=76921656

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110054308.4A Active CN113173502B (en) 2021-01-15 2021-01-15 Anticollision method and system based on laser vision fusion and deep learning

Country Status (1)

Country Link
CN (1) CN113173502B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113860178A (en) * 2021-09-18 2021-12-31 中建三局集团有限公司 System and method for identifying hoisting object of tower crane and measuring collision information
CN114972541A (en) * 2022-06-17 2022-08-30 北京国泰星云科技有限公司 Tire crane three-dimensional anti-collision method based on three-dimensional laser radar and binocular camera fusion
CN116243716A (en) * 2023-05-08 2023-06-09 中铁第四勘察设计院集团有限公司 Intelligent lifting control method and system for container integrating machine vision

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160167932A1 (en) * 2013-08-12 2016-06-16 Abb Technology Ltd Method and system for automatically landing containers on a landing target using a container crane
CN108229366A (en) * 2017-12-28 2018-06-29 北京航空航天大学 Deep learning vehicle-installed obstacle detection method based on radar and fusing image data
CN109879174A (en) * 2019-04-13 2019-06-14 湖南中铁五新重工有限公司 Gantry crane anti-collision system and method
CN109977812A (en) * 2019-03-12 2019-07-05 南京邮电大学 A kind of Vehicular video object detection method based on deep learning
CN110097109A (en) * 2019-04-25 2019-08-06 湖北工业大学 A kind of road environment obstacle detection system and method based on deep learning
CN110246159A (en) * 2019-06-14 2019-09-17 湖南大学 The 3D target motion analysis method of view-based access control model and radar information fusion
CN110363820A (en) * 2019-06-28 2019-10-22 东南大学 It is a kind of based on the object detection method merged before laser radar, image
US20190333222A1 (en) * 2018-04-26 2019-10-31 NeuralSeg Ltd. Systems and methods for segmenting an image
CN110598637A (en) * 2019-09-12 2019-12-20 齐鲁工业大学 Unmanned driving system and method based on vision and deep learning

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160167932A1 (en) * 2013-08-12 2016-06-16 Abb Technology Ltd Method and system for automatically landing containers on a landing target using a container crane
CN108229366A (en) * 2017-12-28 2018-06-29 北京航空航天大学 Deep learning vehicle-installed obstacle detection method based on radar and fusing image data
US20190333222A1 (en) * 2018-04-26 2019-10-31 NeuralSeg Ltd. Systems and methods for segmenting an image
CN109977812A (en) * 2019-03-12 2019-07-05 南京邮电大学 A kind of Vehicular video object detection method based on deep learning
CN109879174A (en) * 2019-04-13 2019-06-14 湖南中铁五新重工有限公司 Gantry crane anti-collision system and method
CN110097109A (en) * 2019-04-25 2019-08-06 湖北工业大学 A kind of road environment obstacle detection system and method based on deep learning
CN110246159A (en) * 2019-06-14 2019-09-17 湖南大学 The 3D target motion analysis method of view-based access control model and radar information fusion
CN110363820A (en) * 2019-06-28 2019-10-22 东南大学 It is a kind of based on the object detection method merged before laser radar, image
CN110598637A (en) * 2019-09-12 2019-12-20 齐鲁工业大学 Unmanned driving system and method based on vision and deep learning

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113860178A (en) * 2021-09-18 2021-12-31 中建三局集团有限公司 System and method for identifying hoisting object of tower crane and measuring collision information
CN113860178B (en) * 2021-09-18 2023-05-23 中建三局集团有限公司 System and method for identifying and measuring collision information of hoisted object of tower crane
CN114972541A (en) * 2022-06-17 2022-08-30 北京国泰星云科技有限公司 Tire crane three-dimensional anti-collision method based on three-dimensional laser radar and binocular camera fusion
CN114972541B (en) * 2022-06-17 2024-01-26 北京国泰星云科技有限公司 Tire crane stereoscopic anti-collision method based on fusion of three-dimensional laser radar and binocular camera
CN116243716A (en) * 2023-05-08 2023-06-09 中铁第四勘察设计院集团有限公司 Intelligent lifting control method and system for container integrating machine vision

Also Published As

Publication number Publication date
CN113173502B (en) 2023-06-06

Similar Documents

Publication Publication Date Title
CN113173502A (en) Anti-collision method and system based on laser visual fusion and deep learning
DE102020112314A1 (en) VERIFICATION OF VEHICLE IMAGES
WO2022099511A1 (en) Method and apparatus for ground segmentation based on point cloud data, and computer device
CN105300403B (en) A kind of vehicle mileage calculating method based on binocular vision
JP2018092501A (en) On-vehicle image processing apparatus
CN112578406B (en) Vehicle environment information sensing method and device
CN104204726A (en) Moving-object position/attitude estimation apparatus and method for estimating position/attitude of moving object
CN110197173B (en) Road edge detection method based on binocular vision
CN113009456A (en) Vehicle-mounted laser radar data calibration method, device and system
CN104574393A (en) Three-dimensional pavement crack image generation system and method
CN110764110B (en) Path navigation method, device and computer readable storage medium
CN106225723A (en) A kind of many trains splice angle measuring method based on backsight binocular camera
DE102020117529A1 (en) DETERMINATION OF THE VEHICLE POSITION
CN107609483A (en) Risk object detection method, device towards drive assist system
DE102021101270A1 (en) TRAINING A NEURAL NETWORK OF A VEHICLE
DE102020116964A1 (en) VISUAL ODOMETRY FOR VEHICLE
Kellner et al. Multi-cue, model-based detection and mapping of road curb features using stereo vision
JP6929207B2 (en) Road image processing device, road image processing method, road image processing program, and recording medium
CN113643355B (en) Target vehicle position and orientation detection method, system and storage medium
DE102018217268A1 (en) Device and method for determining height information of an object in the surroundings of a vehicle
JP2023182722A (en) Profile creation method, profile creation system, profile, and profile creation program
CN111103578B (en) Laser radar online calibration method based on deep convolutional neural network
CN111551122A (en) Train wagon number and length measuring system and method based on laser radar
CN113155143A (en) Method, device and vehicle for evaluating a map for automatic driving
JP6959032B2 (en) Position estimation device, moving device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant