CN113643232A - Pavement pit automatic detection method based on binocular camera and convolutional neural network - Google Patents


Info

Publication number
CN113643232A
CN113643232A (application CN202110743188.9A)
Authority
CN
China
Prior art keywords
pit
plane
coordinate system
fitting
neural network
Prior art date
Legal status
Pending
Application number
CN202110743188.9A
Other languages
Chinese (zh)
Inventor
冯永慧
翟帅
罗宏煜
肖思航
田丽玲
吴少伟
Current Assignee
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology
Priority: CN202110743188.9A
Publication: CN113643232A
Legal status: Pending


Classifications

    • G06T 7/0004 Industrial image inspection (under G06T 7/00 Image analysis)
    • G06N 3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G06T 7/593 Depth or shape recovery from stereo images
    • G06T 7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G06T 2207/10004 Still image; Photographic image
    • G06T 2207/10012 Stereo images
    • G06T 2207/10028 Range image; Depth image; 3D point clouds
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/30108 Industrial image inspection
    • G06T 2207/30132 Masonry; Concrete

Abstract

The invention discloses a road surface pit detection method based on a neural network and binocular stereoscopic vision. The method comprises the following steps: acquiring left and right views of the road scene with a binocular camera mounted at the front of a vehicle; detecting pits in the left view with a target detection neural network; when a pit is detected, stereo-matching the left and right views to obtain a depth map; cropping out the pit and the surrounding road surface from the depth map according to the pit position predicted by the neural network; fitting the road plane around the pit with a method combining RANSAC and a genetic algorithm; computing the distances from the pit point cloud to the fitted plane and averaging them to obtain the pit depth; and projecting the pit point cloud onto the fitted road plane, finding the minimum circumscribed rectangle of the projected points, and taking the area of that rectangle as the pit area. The invention can effectively detect and measure the depth and area of pits and can provide reference information for road maintenance.

Description

Pavement pit automatic detection method based on binocular camera and convolutional neural network
Technical Field
The invention relates to an automatic detection method of a road surface pit, in particular to a road surface pit detection method based on a neural network and binocular stereoscopic vision.
Background
The number of automobiles and the road mileage in China are increasing rapidly year by year, making road maintenance a huge undertaking; in particular, pavement defects such as pits greatly affect driving comfort and safety. Autonomous driving also requires sensing of the road environment, and detecting road pits can provide better route planning information for autonomous driving.
At present, automatic detection methods for pavement pits are mainly based on either image processing or 3D laser scanning. The former detects pits in images acquired by a monocular camera with traditional computer vision methods such as edge detection; it is easily disturbed by shadows, stains, debris, water marks and the like on the road surface, and cannot acquire depth information. The latter exploits the strong anti-interference capability and robustness of lasers to obtain road surface depth information, but the equipment is relatively expensive, requires high installation precision, and must work together with an encoder, which limits its practicality.
Therefore, it is necessary to provide a method for detecting and measuring a road surface pit, which can not only quickly and accurately detect the road surface pit, but also obtain the road surface depth information.
Disclosure of Invention
In order to solve the problems in the prior art, the invention aims to overcome the defects in the prior art and provide the automatic detection method of the pavement pit slot based on the binocular camera and the convolutional neural network, which can quickly and accurately detect the pavement pit slot and can acquire the pavement depth information.
In order to achieve the purpose of the invention, the invention adopts the following technical scheme:
a road surface pit slot automatic detection method based on a binocular camera and a convolutional neural network comprises the following steps:
(1) using a binocular stereo camera installed on a vehicle to acquire left and right views of a road ahead and sending the left and right views to a computing unit;
(2) preprocessing the obtained left view or right view;
(3) leading the preprocessed image into a pre-trained target recognition neural network for detection;
(4) if no pit is detected, returning to the step (1); if a pit is detected, recording the number of pits and their coordinates in the left view;
(5) obtaining a depth map relative to the left camera or the right camera according to the left and right views obtained in the step (1);
(6) cutting the depth map to obtain a depth map of the pit and the road surface around the pit according to the coordinates of the pit obtained in the step (4);
(7) converting the cut depth image from a pixel coordinate system to an image coordinate system, and further projecting the depth image to a camera coordinate system;
(8) carrying out plane fitting on the point clouds around the pit slot projected on the camera coordinate system, and distinguishing the point clouds belonging to the pit slot according to the fitting plane;
(9) calculating the distance from the pit point cloud to the fitting plane, and summing and averaging to obtain the depth of the pit;
(10) projecting the pit point cloud onto the fitting plane, calculating the minimum circumscribed rectangle of the projection points on the fitting plane, and taking the area of the minimum circumscribed rectangle as the area of the pit;
(11) and storing the coordinates, the depths and the area information in the left and right views, the depth map, the GPS data and the pit slot image coordinate system for subsequent research and repair.
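The depth map of step (5) follows from the disparity produced by stereo matching of the left and right views. A minimal sketch of the disparity-to-depth conversion for a rectified camera pair; the parameter names (`f_px`, `baseline_m`) and the convention that zero disparity marks an invalid pixel are illustrative assumptions, not specified by the patent:

```python
import numpy as np

def disparity_to_depth(disparity, f_px, baseline_m, eps=1e-6):
    """Convert a disparity map (pixels) to a depth map (same unit as baseline).

    Uses the standard rectified-stereo relation z = f * B / d; pixels with
    (near-)zero disparity are marked invalid with depth 0.
    """
    d = np.asarray(disparity, dtype=np.float64)
    return np.where(d > eps, f_px * baseline_m / np.maximum(d, eps), 0.0)
```

For example, a disparity of 10 px with a 700 px focal length and a 0.12 m baseline corresponds to a depth of 700 * 0.12 / 10 = 8.4 m.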
Preferably, the step of obtaining the depth of the road pit from the step (2) to the step (9) comprises:
a. preprocessing a left view in left and right views of a road acquired by the binocular camera, and then predicting the preprocessed left view by using a trained target detection neural network;
b. if the prediction result shows that one or more pit slots exist in the road ahead, the coordinates of the pit slots in the original left view are calculated according to the scaling during preprocessing and the predicted pixel coordinates of the pit slots, wherein the coordinates of the pit slots are the coordinates of the upper left corner and the lower right corner of a rectangular frame which just surrounds the pit slots in the image;
c. carrying out stereo matching on the obtained left and right views to obtain a road depth map;
d. expanding the rectangular frame outward by a certain proportion according to the calculated pit coordinates, so that the frame also contains the road surface around the pit, which can be regarded as a plane;
e. cutting out pavement pit slots and a pavement depth map around the pit slots in the depth map according to the expanded coordinate information of the rectangular frame;
f. converting the cut depth image from a pixel coordinate system to an image coordinate system, and further converting the cut depth image to a camera coordinate system;
g. carrying out plane fitting on the road surface around the pit slot in the camera coordinate system, and distinguishing point clouds belonging to the pit slot in the camera coordinate system according to the fitted plane;
h. and calculating the distance from the pit point cloud to the fitting plane, and summing and averaging all the distance data to obtain the depth of the pit.
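Step h reduces to averaging point-to-plane distances. A minimal NumPy sketch, assuming the fitted road plane is given as coefficients (a, b, c, d) of a*x + b*y + c*z + d = 0 (the function name is ours):

```python
import numpy as np

def pit_depth(pit_points, plane):
    """Average distance from the pit point cloud to the fitted road plane.

    pit_points: (N, 3) array in the camera coordinate system.
    plane:      coefficients (a, b, c, d) of a*x + b*y + c*z + d = 0.
    """
    a, b, c, d = plane
    normal = np.array([a, b, c], dtype=np.float64)
    # point-to-plane distance: |a*x + b*y + c*z + d| / ||(a, b, c)||
    distances = np.abs(pit_points @ normal + d) / np.linalg.norm(normal)
    return float(distances.mean())
```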
Preferably, in the step (2), the preprocessing of the obtained left view includes filtering for noise reduction and resizing to the input size required by the neural network model.
Preferably, in the step (3), the target recognition neural network is similar to DenseNet: a large number of densely connected modules are used in the backbone for feature extraction; as in YOLOv3-tiny, the input is predicted at two scales, with three anchors defined at each scale, and the anchors are generated by K-means clustering.
Preferably, in the step (4), the pit coordinates predicted by the neural network are expressed as (u1, v1, u2, v2), where (u1, v1) is the predicted position of the upper left corner of the rectangular box in the image coordinate system of the left view and (u2, v2) is the predicted position of the lower right corner.
Preferably, the equation for converting the cropped depth map from the pixel coordinate system (u, v) to the image coordinate system (x, y) is:

x = (u - u0) * dx,  y = (v - v0) * dy    (1)

In formula (1), u0 and v0 denote the horizontal and vertical coordinates of the origin of the image coordinate system expressed in the pixel coordinate system, and dx and dy denote the physical width and height of one pixel. The cropped depth map is converted from the image coordinate system (x, y) to the camera coordinate system (xc, yc, zc) by:

xc = x * zc / f,  yc = y * zc / f    (2)

In formula (2), zc denotes the depth value at image coordinates (x, y), i.e. the distance from the road surface point to the imaging plane of the left camera, and f denotes the focal length of the left camera. Combining (1) and (2), the conversion from the pixel coordinate system (u, v) to the camera coordinate system (xc, yc, zc) is expressed as:

xc = (u - u0) * zc / fx,  yc = (v - v0) * zc / fy    (3)

In formula (3), fx = f/dx and fy = f/dy are the equivalent focal lengths along the x-axis and y-axis; like u0 and v0, they are intrinsic parameters of the camera whose specific values are obtained during calibration.
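Applied to a whole cropped depth map, the conversion of formula (3) can be vectorised as follows; the function name and the convention that zero depth marks an invalid pixel are our assumptions:

```python
import numpy as np

def depth_map_to_point_cloud(depth, fx, fy, u0, v0):
    """Back-project a (cropped) depth map into the camera coordinate system.

    depth[v, u] holds z_c for pixel (u, v); applies formula (3):
    x_c = (u - u0) * z_c / fx,  y_c = (v - v0) * z_c / fy.
    Invalid (zero-depth) pixels are dropped.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64)
    x = (u - u0) * z / fx
    y = (v - v0) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]
```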
Preferably, the plane fitting of the point cloud around the pit in the camera coordinate system uses a method combining RANSAC with a genetic algorithm. Specifically: RANSAC is iterated a certain number of times, and the inlier count and plane parameters of each iteration are computed and recorded; the results are sorted by inlier count in descending order, and the value range of each parameter over the top 20% of results is computed; these ranges serve as the coding ranges of the genetic algorithm's parameters; finally, the population is evolved for a certain number of generations to obtain refined plane parameters.
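A compact sketch of this RANSAC-plus-genetic-algorithm scheme; the iteration counts, inlier threshold, and the particular selection/crossover/mutation operators are illustrative choices, not values fixed by the patent:

```python
import numpy as np

def fit_plane_ransac_ga(pts, n_iter=200, thresh=0.01, pop_size=40, gens=50, seed=0):
    """Fit a plane a*x + b*y + c*z + d = 0 (unit normal) to an (N, 3) point
    cloud: RANSAC proposes candidates, then a simple real-coded genetic
    algorithm searches inside the per-parameter ranges spanned by the top 20%
    of candidates (sorted by inlier count)."""
    rng = np.random.default_rng(seed)

    def inliers(params):
        n, d = params[:3], params[3]
        return int(np.sum(np.abs(pts @ n + d) / (np.linalg.norm(n) + 1e-12) < thresh))

    # RANSAC: sample 3 points, record (inlier count, plane params)
    cands = []
    for _ in range(n_iter):
        p0, p1, p2 = pts[rng.choice(len(pts), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        if np.linalg.norm(n) < 1e-9:
            continue
        n = n / np.linalg.norm(n)
        params = np.append(n, -n @ p0)
        if params[np.argmax(np.abs(n))] < 0:  # canonical sign: n and -n are the same plane
            params = -params
        cands.append((inliers(params), params))
    cands.sort(key=lambda c: c[0], reverse=True)
    top = np.array([p for _, p in cands[: max(1, len(cands) // 5)]])
    lo, hi = top.min(axis=0), top.max(axis=0)

    # GA: population constrained to the RANSAC parameter ranges
    popu = rng.uniform(lo, hi, size=(pop_size, 4))
    for _ in range(gens):
        fit = np.array([inliers(ind) for ind in popu])
        parents = popu[np.argsort(fit)[::-1][: pop_size // 2]]        # selection
        mix = rng.random((pop_size // 2, 4))
        children = mix * parents + (1 - mix) * parents[::-1]          # crossover
        children += rng.normal(0.0, 0.01, children.shape) * (hi - lo) # mutation
        popu = np.vstack([parents, children])
    best = max(popu, key=inliers)
    best = best / np.linalg.norm(best[:3])  # normalise so |(a, b, c)| = 1
    return tuple(float(v) for v in best)
```

The range-constrained encoding is what couples the two stages: RANSAC's best candidates bound the GA's search space, so the GA refines rather than re-derives the plane.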
Preferably, the plane fitting in the step (9) includes:
(9-1) the object of plane fitting is point cloud data of the road surface around the pit slot in the camera coordinate system, and the point cloud data of the pit slot is not included;
(9-2) performing plane fitting with the RANSAC algorithm for a certain number of iterations, recording the plane coefficients and the inlier count obtained from each random sample; sorting the results by inlier count in descending order, taking the top-ranked results, and computing the variation range of each parameter over the selected results;
(9-3) optimizing the fitted plane with a genetic algorithm: coding the parameters according to the computed variation ranges and obtaining the final plane fitting result through iteration.
Preferably, the step of acquiring the area of the road pit in the step (10) includes:
(10-1) distinguishing point cloud data belonging to the pit slot from a camera coordinate system according to the fitting plane;
(10-2) projecting the point cloud of the pit slot onto the fitted plane;
(10-3) solving the minimum circumscribed rectangle of the projection point on the fitting plane;
and (10-4) calculating the length and the width of the rectangle according to the coordinate information of the minimum circumscribed rectangle, thereby calculating the area of the minimum circumscribed rectangle, wherein the area of the minimum circumscribed rectangle is the area of the pit slot.
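Steps (10-2) to (10-4) can be sketched as follows. The in-plane basis construction and the convex-hull search are our illustrative choices (a library routine such as OpenCV's minAreaRect would serve equally well); the search relies on the fact that a minimum-area bounding rectangle shares an edge direction with the convex hull of the points:

```python
import numpy as np

def project_to_plane_2d(points, plane):
    """Project camera-frame points onto the plane a*x + b*y + c*z + d = 0 and
    express them in an orthonormal 2-D basis lying in that plane."""
    a, b, c, d = plane
    n = np.array([a, b, c], dtype=np.float64)
    n /= np.linalg.norm(n)
    proj = points - (points @ n + d)[:, None] * n
    # any vector not parallel to n yields an in-plane basis (e1, e2)
    t = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    e1 = np.cross(n, t); e1 /= np.linalg.norm(e1)
    e2 = np.cross(n, e1)
    return np.stack([proj @ e1, proj @ e2], axis=1)

def min_rect_area(pts2d):
    """Area of the minimum circumscribed rectangle of a 2-D point set."""
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    pts = np.unique(pts2d, axis=0)
    pts = pts[np.lexsort((pts[:, 1], pts[:, 0]))]
    def chain(seq):  # one half of Andrew's monotone-chain convex hull
        h = []
        for p in seq:
            while len(h) >= 2 and cross(h[-2], h[-1], p) <= 0:
                h.pop()
            h.append(p)
        return h
    hull = np.array(chain(pts)[:-1] + chain(pts[::-1])[:-1])
    best = np.inf
    for i in range(len(hull)):
        e = hull[(i + 1) % len(hull)] - hull[i]
        e = e / np.linalg.norm(e)
        # rotate the hull into the frame of this edge and take the bounding box
        r = hull @ np.stack([e, np.array([-e[1], e[0]])], axis=1)
        best = min(best, np.ptp(r[:, 0]) * np.ptp(r[:, 1]))
    return float(best)
```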
Compared with the prior art, the invention has the following obvious and prominent substantive characteristics and remarkable advantages:
1. the method is a complete binocular-based pavement pit detection and measurement scheme; using a binocular camera, road depth information of higher resolution than a lidar and a ToF camera can be obtained; the convolutional neural network adopts a large number of dense connection modules, so that the model becomes deeper, the shallow layer characteristics of the neural network are repeatedly utilized, and the parameters of the model and the occupied memory are greatly reduced; the pit slot is detected through the neural network, the robustness is stronger, the detection accuracy is higher, meanwhile, binocular stereo matching is carried out only when the pit slot is detected, and the calculation is reduced to a great extent; the target detection neural network directly outputs the position information of the pit, and only the pavement in a certain range around the pit is operated when coordinate conversion and plane fitting are carried out, so that the calculated amount is greatly reduced, and meanwhile, the interference of point cloud of the pit on plane fitting is reduced; when the converted point cloud is subjected to plane fitting, a method combining RANSAC and a genetic algorithm is adopted, so that a road plane can be better fitted;
2. the invention can effectively detect and measure the depth and the area of the pit slot, and can provide reference information for road maintenance;
3. the invention can quickly and accurately detect the pits on the road surface;
4. the invention has low cost, high cost performance and strong market competitiveness.
Drawings
Fig. 1 is a flowchart of a road surface pit detection method based on a neural network and binocular stereo vision provided by the invention.
Fig. 2 is a relationship diagram of a pixel coordinate system and an image coordinate system provided by the present invention.
FIG. 3 is a diagram of the relationship between the image coordinate system and the camera coordinate system provided by the present invention.
Detailed Description
The above-described scheme is further illustrated below with reference to specific embodiments, which are detailed below:
the first embodiment is as follows:
in this embodiment, referring to fig. 1, an automatic detection method for a pavement pit based on a binocular camera and a convolutional neural network includes the following steps:
(1) using a binocular stereo camera installed on a vehicle to acquire left and right views of a road ahead and sending the left and right views to a computing unit;
(2) preprocessing the obtained left view or right view;
(3) leading the preprocessed image into a pre-trained target recognition neural network for detection;
(4) if no pit is detected, returning to the step (1); if a pit is detected, recording the number of pits and their coordinates in the left view;
(5) obtaining a depth map relative to the left camera or the right camera according to the left and right views obtained in the step (1);
(6) cutting the depth map to obtain a depth map of the pit and the road surface around the pit according to the coordinates of the pit obtained in the step (4);
(7) converting the cut depth image from a pixel coordinate system to an image coordinate system, and further projecting the depth image to a camera coordinate system;
(8) carrying out plane fitting on the point clouds around the pit slot projected on the camera coordinate system, and distinguishing the point clouds belonging to the pit slot according to the fitting plane;
(9) calculating the distance from the pit point cloud to the fitting plane, and summing and averaging to obtain the depth of the pit;
(10) projecting the pit point cloud onto the fitting plane, calculating the minimum circumscribed rectangle of the projection points on the fitting plane, and taking the area of the minimum circumscribed rectangle as the area of the pit;
(11) and storing the coordinates, the depths and the area information in the left and right views, the depth map, the GPS data and the pit slot image coordinate system for subsequent research and repair.
The method for automatically detecting the pavement pit based on the binocular camera and the convolutional neural network can quickly and accurately detect the pavement pit and can acquire pavement depth information.
Example two:
this embodiment is substantially the same as the first embodiment, and is characterized in that:
in this embodiment, the step of obtaining the depth of the road pit from step (2) to step (9) includes:
a. preprocessing a left view in left and right views of a road acquired by the binocular camera, and then predicting the preprocessed left view by using a trained target detection neural network;
b. if the prediction result shows that one or more pit slots exist in the road ahead, the coordinates of the pit slots in the original left view are calculated according to the scaling during preprocessing and the predicted pixel coordinates of the pit slots, wherein the coordinates of the pit slots are the coordinates of the upper left corner and the lower right corner of a rectangular frame which just surrounds the pit slots in the image;
c. carrying out stereo matching on the obtained left and right views to obtain a road depth map;
d. expanding the rectangular frame outward by a certain proportion according to the calculated pit coordinates, so that the frame also contains the road surface around the pit, which can be regarded as a plane;
e. cutting out pavement pit slots and a pavement depth map around the pit slots in the depth map according to the expanded coordinate information of the rectangular frame;
f. converting the cut depth image from a pixel coordinate system to an image coordinate system, and further converting the cut depth image to a camera coordinate system;
g. carrying out plane fitting on the road surface around the pit slot in the camera coordinate system, and distinguishing point clouds belonging to the pit slot in the camera coordinate system according to the fitted plane;
h. and calculating the distance from the pit point cloud to the fitting plane, and summing and averaging all the distance data to obtain the depth of the pit.
In this embodiment, in the step (2), the preprocessing of the obtained left view includes filtering for noise reduction and resizing to the input size required by the neural network model.
In this embodiment, the plane fitting in the step (9) includes:
(9-1) the object of plane fitting is point cloud data of the road surface around the pit slot in the camera coordinate system, and the point cloud data of the pit slot is not included;
(9-2) performing plane fitting with the RANSAC algorithm for a certain number of iterations, recording the plane coefficients and the inlier count obtained from each random sample; sorting the results by inlier count in descending order, taking the top-ranked results, and computing the variation range of each parameter over the selected results;
(9-3) optimizing the fitted plane with a genetic algorithm: coding the parameters according to the computed variation ranges and obtaining the final plane fitting result through iteration.
In this embodiment, the step of acquiring the area of the road pit in the step (10) includes:
(10-1) distinguishing point cloud data belonging to the pit slot from a camera coordinate system according to the fitting plane;
(10-2) projecting the point cloud of the pit slot onto the fitted plane;
(10-3) solving the minimum circumscribed rectangle of the projection point on the fitting plane;
and (10-4) calculating the length and the width of the rectangle according to the coordinate information of the minimum circumscribed rectangle, thereby calculating the area of the minimum circumscribed rectangle, wherein the area of the minimum circumscribed rectangle is the area of the pit slot.
The method is a complete binocular-based pavement pit detection and measurement scheme; using a binocular camera, road depth information of higher resolution than lidar or a ToF camera can be obtained; the convolutional neural network adopts a large number of densely connected modules, so the model becomes deeper, the shallow features of the network are reused, and the parameters of the model and the memory occupied are greatly reduced; detecting pits with a neural network gives stronger robustness and higher detection accuracy, and binocular stereo matching is performed only when a pit is detected, which greatly reduces computation; the target detection neural network directly outputs the position of the pit, and coordinate conversion and plane fitting operate only on the road surface within a certain range around the pit, which further reduces the amount of computation and also reduces the interference of the pit point cloud with plane fitting; when the converted point cloud is plane-fitted, a method combining RANSAC with a genetic algorithm is adopted, so the road plane can be fitted better.
Example three:
this embodiment is substantially the same as the above embodiment, and is characterized in that:
in this embodiment, fig. 1 is a schematic diagram of a road surface pit detection flow based on a neural network and binocular stereo vision. The method comprises two parts, wherein S101-S104 are preparation parts of an embodiment and S105-S115 are specific working parts of the embodiment:
step S101, calibrating a binocular camera, wherein a Zhangyingyou chessboard calibration method can be used;
step S102, installing binocular cameras in front of a vehicle in parallel, and acquiring left and right views of a road scene in front of the vehicle;
step S103, manually selecting a road surface image containing a pit, labeling, and randomly dividing the collected pit image into a training set and a testing set;
step S104, firstly establishing a target detection neural network model, then using the training set established in the step S103 to train the model, and storing the model parameters which are best represented on the test set;
step S105, the specific working part of the embodiment starts, and left and right views of the road are acquired using a binocular camera installed in parallel in front of the vehicle;
step S106, filtering and denoising the obtained left view image or right view image, and adjusting the size to the size required by the target detection neural network;
the right view image may correspond to a camera for obtaining a depth map in the following step, and in this embodiment, either the left view image or the right view image is used;
step S107, inputting the preprocessed left view or right view into a target detection neural network, wherein the result of the target detection neural network is represented by a rectangular frame, and outputting pixel coordinates of the upper left corner and the lower right corner of the rectangular frame of the pit slot in the preprocessed left view or right view, wherein the pixel coordinates of the left view or right view in the original size are obtained by processing the result;
step S108, if the output coordinates are empty, indicating that no pit exists in the road ahead, directly returning to the step S105, otherwise, continuing the processing;
step S109, performing stereo matching on the left and right views obtained in step S105 to obtain a depth map relative to the left camera or the right camera;
step S110, according to the pixel coordinates obtained in the step S107, expanding the four sides of the rectangular frame to a certain range of 20% to the periphery to obtain a new rectangular frame, wherein the new rectangular frame not only comprises the pit slot, but also comprises a road plane in a certain range of the periphery of the pit slot; cutting the depth map by using the new rectangular frame to obtain the depth map of the pit and the peripheral road surface;
step S111, converting the coordinates of each point on the cropped depth map obtained in step S110 from a pixel coordinate system to an image coordinate system, as shown in fig. 2;
step S112, converting the points on the image coordinate system obtained in step S111 to a camera coordinate system, as shown in fig. 3;
step S113, performing plane fitting on the road surface point cloud around the pit slot converted into the camera coordinate system; randomly selecting 3 point clouds each time by using RANSAC during plane fitting to obtain each parameter of the plane, calculating the number of inner points, and repeating for 200 times; sorting the obtained plane parameters from large to small according to the number of interior points, selecting the first 40 planes, counting the variation range of each parameter of the selected 40 planes, then coding the parameters according to the variation range of each parameter by using a genetic algorithm, repeatedly selecting, crossing and varying the number of individuals in a group for 100 times, and obtaining the plane with the maximum value on a fitness function as a final fitting result;
step S114, distinguishing point clouds input into the pit slot by using the fitting plane, calculating the distance between each point and the fitting plane, summing and averaging to obtain the depth of the pit slot;
and S115, projecting the pit point cloud onto a fitting plane, calculating a minimum circumscribed rectangle, and calculating the length and the width according to the coordinates of each vertex on the rectangle on a camera coordinate system after the minimum circumscribed rectangle is obtained, so as to obtain the pit area.
The road surface pit detection method based on the neural network and binocular stereo vision comprises the following steps: acquiring left and right views of the road scene with a binocular camera mounted at the front of the vehicle; detecting pits in the left view with a target detection neural network; when a pit is detected, stereo-matching the left and right views to obtain a depth map; cropping out the pit and the surrounding road surface from the depth map according to the pit position predicted by the neural network; fitting the road plane around the pit with a method combining RANSAC and a genetic algorithm; computing the distances from the pit point cloud to the fitted plane and averaging them to obtain the pit depth; and projecting the pit point cloud onto the fitted road plane, finding the minimum circumscribed rectangle of the projected points, and taking the area of that rectangle as the pit area. The method can effectively detect and measure the depth and area of pits and can provide reference information for road maintenance.
Example four:
this embodiment is substantially the same as the above embodiment, and is characterized in that:
in this embodiment, in the step (3), the target recognition neural network is characterized by being similar to DenseNet, a large number of dense link modules are used in a backbone to perform feature extraction, and inputs are predicted on two scales by using YOLOv3-tiny, three anchors are defined on each scale, and the anchors are generated by K-means.
In this embodiment, in step (4), the neural network predicts the pit coordinates as (u1, v1, u2, v2), where (u1, v1) is the position of the upper-left corner of the predicted rectangular box in the image coordinate system of the left view and (u2, v2) is the position of its lower-right corner.
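A minimal sketch of the cropping these predicted coordinates drive (step (6) of the method): the box is expanded outward by a certain proportion so that the crop also contains road surface around the pit. The 20% default and the function interface are assumed details:

```python
import numpy as np

def crop_depth_around_pit(depth_map, box, expand=0.2):
    """Crop the pit plus surrounding road surface from the depth map.

    depth_map: (H, W) depth image aligned with the left view
    box      : predicted (u1, v1, u2, v2) -- upper-left and lower-right
               corners of the pit's bounding rectangle
    expand   : proportion of the box size to expand on each side
               (assumed default), clamped to the image borders
    """
    u1, v1, u2, v2 = box
    h, w = depth_map.shape[:2]
    du, dv = int((u2 - u1) * expand), int((v2 - v1) * expand)
    u1e, v1e = max(0, u1 - du), max(0, v1 - dv)
    u2e, v2e = min(w, u2 + du), min(h, v2 + dv)
    return depth_map[v1e:v2e, u1e:u2e], (u1e, v1e, u2e, v2e)
```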
In this embodiment, the equation for converting the cropped depth map from the pixel coordinate system (u, v) to the image coordinate system (x, y) is:

x = (u - u0)·dx,  y = (v - v0)·dy  (1)

In formula (1), u0 and v0 are the horizontal and vertical coordinates of the origin of the image coordinate system expressed in the pixel coordinate system, and dx and dy are the width and height of each pixel. The cropped depth map is converted from the image coordinate system (x, y) to the camera coordinate system (xc, yc, zc) by:

xc = x·zc/f,  yc = y·zc/f  (2)

In formula (2), zc is the depth value at image coordinates (x, y), i.e. the distance from the road surface to the imaging plane of the left camera, and f is the focal length of the left camera. Combining (1) and (2), the conversion from the pixel coordinate system (u, v) to the camera coordinate system (xc, yc, zc) is expressed as:

xc = (u - u0)·zc/fx,  yc = (v - v0)·zc/fy  (3)

In formula (3), fx = f/dx and fy = f/dy are the equivalent focal lengths along the x- and y-axes; like u0 and v0, they are intrinsic parameters of the camera whose specific values are obtained during calibration.
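Formula (3) can be applied directly to every pixel of the cropped depth map to obtain the point cloud; a minimal NumPy sketch of the back-projection (the function name and vectorized interface are illustrative):

```python
import numpy as np

def pixels_to_camera(us, vs, depths, fx, fy, u0, v0):
    """Back-project pixels (u, v) with depth z_c into camera coordinates,
    following formula (3):
        x_c = (u - u0) * z_c / fx
        y_c = (v - v0) * z_c / fy
        z_c = z_c
    fx, fy, u0, v0 are the camera intrinsics obtained from calibration.
    """
    us, vs, depths = map(np.asarray, (us, vs, depths))
    xc = (us - u0) * depths / fx
    yc = (vs - v0) * depths / fy
    return np.stack([xc, yc, depths], axis=-1)
```

The inputs may be scalars or whole arrays (e.g. the `u`, `v` meshgrid of the cropped depth map), in which case the result is an (…, 3) array of camera-coordinate points.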
In this embodiment, the point-cloud plane around the pit is fitted in the camera coordinate system using a method combining RANSAC and a genetic algorithm. Specifically: RANSAC is iterated a set number of times, and the inlier count and parameter values of every iteration are recorded; the results are then sorted by inlier count from high to low, the value range of each parameter over the top 20% of results is counted, and these ranges are used as the coding ranges of the genetic algorithm's parameters; finally, the population is iterated a set number of times to obtain refined plane parameters. Compared with lidar and ToF cameras, this embodiment obtains road-surface depth information at higher resolution; and because the convolutional neural network uses a large number of densely connected modules, the model becomes deeper, shallow features are reused, and the number of parameters and the memory footprint are greatly reduced.
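The RANSAC stage of this plane fitting — record every iteration's plane coefficients and inlier count, sort by inlier count, and take the coefficient ranges of the top 20% as the genetic algorithm's coding ranges — can be sketched as follows. The inlier tolerance and the sign normalization of the plane normal are assumed details:

```python
import numpy as np

def ransac_plane_with_ranges(points, n_iters=200, inlier_tol=0.01,
                             top_frac=0.2, rng=None):
    """RANSAC plane fitting that also reports per-coefficient value ranges
    over the top `top_frac` of iterations (the GA coding ranges).
    Planes are (a, b, c, d) with unit normal and a*x + b*y + c*z + d = 0."""
    rng = np.random.default_rng(rng)
    records = []
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-12:          # degenerate (collinear) sample
            continue
        normal /= norm
        if normal[2] < 0:         # fix sign so coefficients are comparable
            normal = -normal
        d = -normal @ sample[0]
        n_inliers = int((np.abs(points @ normal + d) < inlier_tol).sum())
        records.append((n_inliers, *normal, d))
    records.sort(key=lambda r: -r[0])                 # by inlier count, descending
    k = max(1, int(len(records) * top_frac))
    top = np.array([r[1:] for r in records[:k]])
    ranges = np.stack([top.min(axis=0), top.max(axis=0)], axis=1)  # (4, 2)
    return np.array(records[0][1:]), records[0][0], ranges
```

The returned `ranges` array is exactly what the genetic-algorithm stage consumes as its per-parameter coding bounds.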
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to these embodiments; various changes and modifications can be made according to the purpose of the invention. Any changes, modifications, substitutions, combinations or simplifications made according to the spirit and principle of the technical solution of the present invention shall be regarded as equivalent substitutions and shall fall within the protection scope of the present invention, provided they meet the purpose of the invention and do not depart from its technical principle and inventive concept.

Claims (5)

1. A road surface pit slot automatic detection method based on a binocular camera and a convolutional neural network is characterized by comprising the following steps:
(1) using a binocular stereo camera installed on a vehicle to acquire left and right views of a road ahead and sending the left and right views to a computing unit;
(2) preprocessing the obtained left view or right view;
(3) leading the preprocessed image into a pre-trained target recognition neural network for detection;
(4) if no pit is detected, repeating step (1); if a pit is detected, recording the number of pits and their coordinates in the left view;
(5) obtaining a depth map relative to the left camera or the right camera according to the left and right views obtained in the step (1);
(6) cutting the depth map to obtain a depth map of the pit and the road surface around the pit according to the coordinates of the pit obtained in the step (4);
(7) converting the cut depth image from a pixel coordinate system to an image coordinate system, and further projecting the depth image to a camera coordinate system;
(8) carrying out plane fitting on the point clouds around the pit slot projected on the camera coordinate system, and distinguishing the point clouds belonging to the pit slot according to the fitting plane;
(9) calculating the distance from the pit point cloud to the fitting plane, and summing and averaging to obtain the depth of the pit;
(10) projecting the pit point cloud onto the fitting plane, computing the minimum circumscribed rectangle of the projected points on the fitting plane, and taking the area of the minimum circumscribed rectangle as the area of the pit;
(11) and storing the left and right views, the depth map, the GPS data, and the coordinates of the pit in the image coordinate system together with its depth and area information, for subsequent research and repair.
2. The binocular camera and convolutional neural network-based automatic pavement pit detection method according to claim 1, wherein the step of obtaining the depth of the pavement pit from step (2) to step (9) comprises:
a. preprocessing a left view in left and right views of a road acquired by the binocular camera, and then predicting the preprocessed left view by using a trained target detection neural network;
b. if the prediction result shows that one or more pits exist in the road ahead, calculating their coordinates in the original left view from the scaling applied during preprocessing and the predicted pixel coordinates, where the coordinates of a pit are those of the upper-left and lower-right corners of the rectangular box that tightly encloses the pit in the image;
c. carrying out stereo matching on the obtained left and right views to obtain a road depth map;
d. expanding the rectangular box outward by a certain proportion according to the calculated pit coordinates, so that the box includes the road surface around the pit, which can be regarded as a plane;
e. cutting out pavement pit slots and a pavement depth map around the pit slots in the depth map according to the expanded coordinate information of the rectangular frame;
f. converting the cut depth image from a pixel coordinate system to an image coordinate system, and further converting the cut depth image to a camera coordinate system;
g. carrying out plane fitting on the road surface around the pit slot in the camera coordinate system, and distinguishing point clouds belonging to the pit slot in the camera coordinate system according to the fitted plane;
h. and calculating the distance from the pit point cloud to the fitting plane, and summing and averaging all the distance data to obtain the depth of the pit.
3. The binocular camera and convolutional neural network-based automatic pavement pit detection method according to claim 1, wherein in step (2), preprocessing the obtained left view comprises filtering for noise reduction and resizing to the input size required by the neural network model.
4. The binocular camera and convolutional neural network based pavement pit automatic detection and measurement method according to claim 1, wherein the plane fitting in the step (9) comprises:
(9-1) the object of plane fitting is point cloud data of the road surface around the pit slot in the camera coordinate system, and the point cloud data of the pit slot is not included;
(9-2) performing plane fitting with the RANSAC algorithm for a set number of iterations, recording the plane coefficients and the inlier count obtained from each random sample; sorting the results by inlier count from high to low, taking the top-ranked plane-parameter results, and counting the variation range of each parameter among the selected results;
and (9-3) optimizing the fitted plane with a genetic algorithm: encoding the parameters according to the counted variation range of each parameter, and obtaining the final plane-fitting result through iteration.
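Step (9-3) can be sketched as a small real-coded genetic algorithm: the per-parameter bounds come from step (9-2), and the fitness is the inlier count. The operator choices (elitist selection, uniform crossover, per-gene uniform mutation) are illustrative assumptions rather than details prescribed by the claim:

```python
import numpy as np

def ga_refine_plane(points, ranges, pop_size=40, n_gens=30,
                    inlier_tol=0.01, mut_rate=0.1, rng=None):
    """Refine plane coefficients (a, b, c, d) by a simple genetic algorithm.
    `ranges` is a (4, 2) array of [min, max] bounds per coefficient, taken
    from the top RANSAC results; fitness is the inlier count."""
    rng = np.random.default_rng(rng)
    lo, hi = ranges[:, 0], ranges[:, 1]
    pop = rng.uniform(lo, hi, size=(pop_size, 4))

    def fitness(p):
        n = np.linalg.norm(p[:3])
        if n < 1e-12:
            return -1
        dists = np.abs(points @ (p[:3] / n) + p[3] / n)
        return int((dists < inlier_tol).sum())

    for _ in range(n_gens):
        scores = np.array([fitness(p) for p in pop])
        elite = pop[np.argsort(-scores)[:pop_size // 2]]   # keep the better half
        n_child = pop_size - len(elite)
        parents = elite[rng.integers(0, len(elite), (2, n_child))]
        mask = rng.random((n_child, 4)) < 0.5
        children = np.where(mask, parents[0], parents[1])  # uniform crossover
        mutate = rng.random(children.shape) < mut_rate
        children = np.where(mutate, rng.uniform(lo, hi, children.shape), children)
        pop = np.vstack([elite, children])
    scores = np.array([fitness(p) for p in pop])
    return pop[scores.argmax()]
```

Because every individual is generated inside the coding bounds, the refined plane always stays within the ranges counted from the RANSAC stage.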
5. The binocular camera and convolutional neural network-based automatic pavement pit detection method according to claim 1, wherein the step of acquiring a pavement pit area in the step (10) comprises:
(10-1) distinguishing point cloud data belonging to the pit slot from a camera coordinate system according to the fitting plane;
(10-2) projecting the point cloud of the pit slot onto the fitted plane;
(10-3) solving the minimum circumscribed rectangle of the projection point on the fitting plane;
and (10-4) calculating the length and width of the rectangle from the coordinate information of the minimum circumscribed rectangle, and thereby its area; the area of the minimum circumscribed rectangle is the area of the pit.
CN202110743188.9A 2021-07-01 2021-07-01 Pavement pit automatic detection method based on binocular camera and convolutional neural network Pending CN113643232A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110743188.9A CN113643232A (en) 2021-07-01 2021-07-01 Pavement pit automatic detection method based on binocular camera and convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110743188.9A CN113643232A (en) 2021-07-01 2021-07-01 Pavement pit automatic detection method based on binocular camera and convolutional neural network

Publications (1)

Publication Number Publication Date
CN113643232A true CN113643232A (en) 2021-11-12

Family

ID=78416663

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110743188.9A Pending CN113643232A (en) 2021-07-01 2021-07-01 Pavement pit automatic detection method based on binocular camera and convolutional neural network

Country Status (1)

Country Link
CN (1) CN113643232A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2610881A (en) * 2021-09-17 2023-03-22 Acad Of Robotics A method, vehicle and system for measuring a dimension of a road defect


Similar Documents

Publication Publication Date Title
CN107463918B (en) Lane line extraction method based on fusion of laser point cloud and image data
Hinz et al. Automatic car detection in high resolution urban scenes based on an adaptive 3D-model
US10521694B2 (en) 3D building extraction apparatus, method and system
CN109871776B (en) All-weather lane line deviation early warning method
CN110163871B (en) Ground segmentation method and device for multi-line laser radar
CN114359181B (en) Intelligent traffic target fusion detection method and system based on image and point cloud
CN107392929B (en) Intelligent target detection and size measurement method based on human eye vision model
CN111007531A (en) Road edge detection method based on laser point cloud data
EP2637126B1 (en) Method and apparatus for detecting vehicle
EP3023912A1 (en) Crack data collection apparatus and server apparatus to collect crack data
Börcs et al. Fast 3-D urban object detection on streaming point clouds
Chen et al. A mathematical morphology-based multi-level filter of LiDAR data for generating DTMs
CN113205604A (en) Feasible region detection method based on camera and laser radar
CN114266960A (en) Point cloud information and deep learning combined obstacle detection method
KR20180098945A (en) Method and apparatus for measuring speed of vehicle by using fixed single camera
CN112070756A (en) Three-dimensional road surface disease measuring method based on unmanned aerial vehicle oblique photography
CN113034569A (en) Point cloud data-based ship overrun early warning method and system
CN106709432B (en) Human head detection counting method based on binocular stereo vision
CN115128628A (en) Road grid map construction method based on laser SLAM and monocular vision
CN115641553A (en) Online detection device and method for invaders in heading machine working environment
JP6322564B2 (en) Point cloud analysis processing apparatus, method, and program
CN113643232A (en) Pavement pit automatic detection method based on binocular camera and convolutional neural network
CN112699748B (en) Human-vehicle distance estimation method based on YOLO and RGB image
CN111080536A (en) Self-adaptive filtering method for airborne laser radar point cloud
CN107808160B (en) Three-dimensional building extraction method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination