CN111160231A - Automatic driving environment road extraction method based on Mask R-CNN - Google Patents

Automatic driving environment road extraction method based on Mask R-CNN

Info

Publication number
CN111160231A
Authority
CN
China
Prior art keywords
cnn
lane
mask
lane line
driving environment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911374543.9A
Other languages
Chinese (zh)
Inventor
刘宏哲 (Liu Hongzhe)
田锦 (Tian Jin)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Union University
Original Assignee
Beijing Union University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Union University filed Critical Beijing Union University
Priority to CN201911374543.9A priority Critical patent/CN111160231A/en
Publication of CN111160231A publication Critical patent/CN111160231A/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588 Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

An automatic driving environment road extraction method based on Mask R-CNN. The invention discloses a deep learning based lane departure detection method for the automatic driving environment. The method belongs to the field of image processing, can be applied to lane departure detection in automatic driving, and aims to overcome the inability of traditional detection methods to adapt to different complex environments.

Description

Automatic driving environment road extraction method based on Mask R-CNN
Technical Field
The invention belongs to the field of image processing, and particularly relates to a road extraction method based on deep learning, which can be applied to lane departure detection of an automatic driving vehicle.
Background
With the rapid development of artificial intelligence, the combination of the traditional automobile industry with information technology has greatly advanced research on intelligent driving. Intelligent driving technology is a main focus of computer vision research today, at both the academic and the industrial level. Computer vision aims primarily to simulate human visual function by computer: it extracts, processes, and understands image information and uses it for detection, measurement, and control.
One of the most challenging tasks of autonomous driving is understanding traffic scenes, which involves computer vision tasks such as lane detection and lane departure warning. Lane detection helps guide the vehicle and can be used in driver assistance systems; camera-based lane detection is an important step toward this environmental understanding because it allows the car to position itself correctly within the road lanes, which is crucial for any subsequent lane departure decision. Accurate camera-based lane detection is therefore critical to autonomous driving. In practice, however, these tasks can be very challenging given the many harsh scenes involving inclement weather, darkness, glare, and the like. Traditional lane line detection and warning methods depend heavily on manually selected lane features, require substantial manual work, and perform poorly when the traffic scene changes significantly.
In the problem of lane line detection, using a deep neural network to learn lane line features improves the accuracy of lane line feature extraction and adapts to complex road environments. However, the most popular detection and segmentation methods, Faster R-CNN and YOLO, are not ideal for lane instance segmentation, because bounding-box detection suits compact objects, which lanes are not.
The University of Sydney used a convolutional neural network (CNN) together with a recurrent neural network (RNN) to detect lane lines: the CNN provides geometric information about the lane line structure, which the RNN then uses to detect the lane lines. Experiments show that this method effectively extracts lane-line-like targets.
A Korean robotics and computer vision laboratory proposed a method that extracts multiple regions of interest, merges regions that may belong to the same class, and finally classifies the candidate regions using Principal Component Analysis Networks (PCANet) and neural networks; in 2017 the same group proposed the Vanishing Point Guided Network (VPGNet), a multi-task network, to solve lane marking detection, recognition, and classification under complex weather conditions.
The Ford Research and Innovation Center adopted DeepLanes, a deep lane detection network, to extract lane line features from cameras mounted on the two sides of the vehicle.
The SCNN proposed by the Chinese University of Hong Kong passes information along the rows and columns of a picture and exploits the correlation and propagation of information between adjacent slices of the picture to obtain complete detection information; it performs well on targets with strong prior structure, such as lane lines, and is suitable for lane line detection in occluded scenes.
Mask R-CNN is a deep learning instance segmentation method widely applied to image segmentation tasks. It can not only segment the target to be detected completely and accurately, but also distinguish different individuals belonging to the same category. Traditional segmentation methods can only segment lane line pixels in an image; they cannot distinguish individual lane lines, such as the left versus the right lane line of the current vehicle, and they require heavy manual work. Adopting the Mask R-CNN model avoids further tedious manual operations.
In recent years researchers have continued to innovate, and many image-based lane line information extraction and segmentation methods have been improved; nevertheless, existing road extraction methods are still incomplete, high-resolution road images have a certain complexity, and a single method cannot extract roads effectively.
Disclosure of Invention
The invention aims to provide a lane line segmentation method using Mask R-CNN that achieves highly accurate lane segmentation and complete lane detection on automatic driving images, and then combines a lane line fitting equation to realize lane departure warning and monitoring.
The invention is realized by adopting the following technical means:
1. A Mask R-CNN-based automatic driving environment road extraction method is characterized by comprising the following steps:
Step 1: input an image, reading in a 1024x1024 road-scene image captured in a real driving environment;
Step 1.1: feed the read image into the backbone network layer of the Mask R-CNN network model to extract features of the lane line information.
Step 1.2: feed the feature maps extracted by the backbone network layer into the region proposal network (RPN) of the model. For each position, the RPN generates 9 candidate targets with preset aspect ratios and areas by sliding a window over the shared feature map. These 9 initial anchors cover three areas (128 × 128, 256 × 256, 512 × 512), each of which in turn covers three aspect ratios (1:1, 1:2, 2:1), so that the extracted feature information maps back onto the original image more accurately.
Step 1.3: finally, detect, classify, and segment the extracted information in the classification layer of the Mask R-CNN network, and mark the lane line information points on the original image (a minimal inference sketch of this step follows step 4 below).
Step 2: extract the lane line information points from the image and cluster them. The lane line features learned by the deep network are only discrete coordinates of the lane lines and cannot be used directly. The method performs a preliminary linear clustering of the discrete lane line coordinates by lane, which effectively eliminates interference among multiple lane lines, further yields the information of each lane line, and provides more accurate and comprehensive input for the subsequent lane line fitting.
Step 3: establish the fitting model. After the feature point clustering is complete, fit the feature points: first use the Hough transform to determine the approximate region where each straight line lies, then use cubic spline equations to determine the equation parameters for the clustered feature points in each straight-line region. The equations are established at the origin of the vehicle coordinate system, which is significant for the subsequent lane departure detection.
Step 4: finally, perform lane departure detection with the fitted lane line equation. The cubic spline equation obtained in step 3 is an equation in the vehicle coordinate system; d denotes the distance between the current vehicle and the left or right lane line, and a lane departure warning is issued when d is detected to be less than 0.3 m.
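As a concrete illustration of step 1 (the inference sketch referenced above), the following minimal Python sketch runs a Mask R-CNN over a 1024x1024 road image and collects the mask pixels of each confident detection as discrete lane line information points. It is a sketch under stated assumptions, not the patent's implementation: the patent trains its own Mask R-CNN on lane line data, whereas the COCO-pretrained torchvision weights, the file name road_scene.jpg, and the 0.5 score and mask thresholds below are illustrative stand-ins.

import numpy as np
import torch
import torchvision
from PIL import Image

# Stand-in model: a fine-tuned lane-line checkpoint would replace these weights.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = Image.open("road_scene.jpg").convert("RGB").resize((1024, 1024))
tensor = torchvision.transforms.functional.to_tensor(image)  # 3x1024x1024 in [0, 1]

with torch.no_grad():
    output = model([tensor])[0]  # dict with boxes, labels, scores, masks

lane_points = []
for mask, score in zip(output["masks"], output["scores"]):
    if score < 0.5:                      # keep confident detections only
        continue
    binary = mask[0].numpy() > 0.5       # 1024x1024 boolean instance mask
    ys, xs = np.nonzero(binary)          # discrete mask coordinates
    lane_points.append(np.stack([xs, ys], axis=1))  # (n, 2) points per instance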
Advantageous effects
Aiming at the problems and shortcomings of existing lane detection and lane departure warning technology, the invention provides a detection and warning method using Mask R-CNN. The method extracts lane line information from roads in real driving environments with good detection performance, and by applying the accurately detected lane line information to lane departure detection in combination with the fitting method, it realizes lane departure detection and warning in the automatic driving environment.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a Mask R-CNN training flow;
FIG. 3 shows a fitting algorithm flow;
FIG. 4 is a final result graph;
FIG. 5 is a schematic diagram of feature extraction of lane line information.
Detailed Description
The invention is further described below with reference to the accompanying drawings. As shown in fig. 1, steps 1 to 4 constitute the complete road extraction process.
Step 1: input an image, reading in a 1024x1024 road-scene image captured in a real driving environment;
Step 1.1: feed the read image into the backbone network layer of the Mask R-CNN network model to extract features of the lane line information.
Step 1.2: feed the feature maps extracted by the backbone network layer into the region proposal network (RPN) of the model. For each position, the RPN generates 9 candidate targets with preset aspect ratios and areas by sliding a window over the shared feature map. These 9 initial anchors cover three areas (128 × 128, 256 × 256, 512 × 512), each of which in turn covers three aspect ratios (1:1, 1:2, 2:1), so that the extracted feature information maps back onto the original image more accurately; a sketch enumerating these anchors follows step 1.3.
Step 1.3: finally, detect, classify, and segment the extracted information in the classification layer of the Mask R-CNN network, and mark the lane line information points on the original image.
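As referenced in step 1.2, the sketch below enumerates the 9 initial anchors at a single sliding-window position: three areas (128 × 128, 256 × 256, 512 × 512), each combined with three aspect ratios (1:1, 1:2, 2:1). It mirrors the standard RPN anchor scheme; the exact rounding and strides of the patent's implementation are not specified, so this is an illustration only.

import numpy as np

def anchors_at(cx, cy, sides=(128, 256, 512), ratios=(1.0, 0.5, 2.0)):
    """Return the 9 anchors as (x1, y1, x2, y2) boxes centred on (cx, cy)."""
    boxes = []
    for s in sides:
        area = float(s * s)              # three preset areas
        for r in ratios:                 # r = width / height
            w = np.sqrt(area * r)        # keep the area fixed for each ratio
            h = np.sqrt(area / r)
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return np.array(boxes)

print(anchors_at(512, 512).round(1))     # the 9 anchors at the image centre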
Step 2: after extracting the lane line feature points from the image, cluster them; this effectively eliminates the mutual interference of multiple lane lines, further yields the information of each individual lane line, and provides more accurate and comprehensive input for the subsequent lane line fitting. We denote the extracted feature point coordinates by F(x, y), and a feature segment B(x_s, y_s, x_e, y_e, k, b, n) represents lane marking segment information composed of a series of lane coordinates, where (x_s, y_s) and (x_e, y_e) are the start and end coordinates of a lane line segment, k and b are the linear parameters of the clustered feature coordinates, and n is the number of coordinates in the current lane line segment.
The clustering equation specifically adopted is x = B(k)·y + B(b), where the expressions for B(k) and B(b) appear only as equation images in the original publication and are not reproduced here; a least-squares stand-in is sketched below.
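To make the feature-segment construction concrete, here is a small sketch under a stated assumption: because the exact expressions for B(k) and B(b) survive only as equation images, an ordinary least-squares fit of x = k·y + b (x as a function of y, since lane lines are close to vertical in the image) stands in for them. The function name feature_segment and the sample points are illustrative.

import numpy as np

def feature_segment(points):
    """points: (n, 2) array of (x, y) lane coordinates belonging to one cluster."""
    x, y = points[:, 0].astype(float), points[:, 1].astype(float)
    k, b = np.polyfit(y, x, 1)           # assumed least squares: x = k*y + b
    order = np.argsort(y)                # order the points along the lane
    (xs, ys), (xe, ye) = points[order[0]], points[order[-1]]
    return (xs, ys, xe, ye, k, b, len(points))  # B(x_s, y_s, x_e, y_e, k, b, n)

pts = np.array([[100, 700], [120, 650], [141, 600], [160, 550]])
print(feature_segment(pts))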
Step 3: after the feature point clustering is complete, fit the feature points. First use the Hough transform to determine the approximate region where each straight line lies, then use a cubic spline equation to determine the equation parameters for the clustered feature point information in each straight-line region. The overall steps are as follows (a runnable sketch of this loop follows the fitting equation below):
1. apply the Hough transform to the discrete lane line feature points to obtain straight line information;
2. find the feature points in the feature point set S whose distance to the straight line is not greater than d, forming a set E;
3. determine the linear parameters k and b and the mean square error e on the set E using the least squares method;
4. for any feature point (x_i, y_i) in the set E, divide: feature points satisfying k·x_i + b > y_i form the subset E_pos, and feature points satisfying k·x_i + b < y_i form E_neg;
5. in the two sets E_pos and E_neg, find and remove the point with the largest error, then update E_pos and E_neg and repeat step 3 until the error e is less than ε.
Here the error is computed from d(P), the distance of a point P from the regression line; the exact expression appears only as an equation image in the original publication.
Finally, the lane line fitting equation is determined as f(x) = ax³ + bx² + cx + d.
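The following runnable sketch implements the loop of steps 1 to 5 under stated assumptions: the Hough stage is replaced by an initial least-squares line (in a full implementation a Hough transform, e.g. OpenCV's HoughLines, would supply the initial line for each lane region), the signed error e_i = k·x_i + b - y_i from claim 1 is used, a cubic polynomial fit stands in for the cubic spline, and the thresholds d = 5 and eps = 2 (in pixels) are hypothetical.

import numpy as np

def refine_and_fit(points, d=5.0, eps=2.0):
    """Iteratively prune outliers from one lane cluster, then fit f(x) = ax^3 + bx^2 + cx + d."""
    x, y = points[:, 0].astype(float), points[:, 1].astype(float)
    k, b = np.polyfit(x, y, 1)                        # step 1 stand-in for Hough
    dist = np.abs(k * x - y + b) / np.hypot(k, 1.0)   # point-to-line distance
    x, y = x[dist <= d], y[dist <= d]                 # step 2: build the set E
    while True:
        k, b = np.polyfit(x, y, 1)                    # step 3: least squares on E
        e = k * x + b - y                             # signed error per point
        if np.abs(e).max() < eps or len(x) <= 6:      # stop once e < eps
            break
        pos, neg = np.where(e > 0)[0], np.where(e < 0)[0]  # step 4: E_pos / E_neg
        drop = []                                     # step 5: remove worst point of each
        if len(pos): drop.append(pos[np.argmax(e[pos])])
        if len(neg): drop.append(neg[np.argmin(e[neg])])
        x, y = np.delete(x, drop), np.delete(y, drop)
    return np.polyfit(x, y, 3)                        # coefficients a, b, c, d of f(x)

xs = np.arange(20, dtype=float)
pts = np.stack([xs, 0.002 * xs**3 + 2 * xs + 5], axis=1)
print(refine_and_fit(pts))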
Step 4: perform lane departure detection on the fitted lane line equation. The cubic spline equation obtained in step 3 is an equation in the vehicle coordinate system; d denotes the distance (in cm) between the current vehicle and the left or right lane line, and a lane departure warning is issued when the threshold test detects d < 30 cm.
(The deviation-distance expression appears only as an equation image in the original publication; a sketch of the threshold test follows.)
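A minimal sketch of this threshold test, assuming the fitted equation is expressed in a vehicle coordinate system whose origin is the vehicle itself, so that |f(0)| approximates the lateral distance in centimetres; the 30 cm threshold comes from the text, while the function name and the sample coefficients are illustrative.

def lane_departure_warning(left_fit, right_fit, threshold_cm=30.0):
    """Each fit is (a, b, c, d) for f(x) = a*x**3 + b*x**2 + c*x + d."""
    def lateral_cm(fit, x0=0.0):
        a, b, c, d = fit
        return abs(a * x0**3 + b * x0**2 + c * x0 + d)  # distance at the vehicle
    dists = {"left": lateral_cm(left_fit), "right": lateral_cm(right_fit)}
    return {side: dist < threshold_cm for side, dist in dists.items()}, dists

warn, dists = lane_departure_warning((0, 0, 0.1, 25.0), (0, 0, -0.1, 180.0))
print(warn, dists)  # the left distance (25 cm) triggers the warning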
We define the judgment criteria for a lane line as follows: given a detected lane line and its corresponding annotated ground truth, rasterize both into point sets. The number of points in the ground-truth point set whose distance to the detected point set is less than 20 cm is defined as the number of matching points from the annotated lane line to the detected lane line. Accordingly, input discrete points falling within the ground-truth region are regarded as true detections, with total length TP; input discrete points falling outside the ground-truth region are regarded as false detections, with total length FP; and the difference between the total length of all annotated ground-truth point sets and TP is the undetected length FN. The precision S is then expressed as:
(The expression for the precision S appears only as an equation image in the original publication; a sketch under assumed definitions follows.)
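A sketch of this evaluation under two stated assumptions: detected and ground-truth points are matched at the 20 cm threshold given above, and the precision is taken as S = TP / (TP + FP), since the exact formula survives only as an equation image in the original.

import numpy as np

def precision(truth_pts, det_pts, match_cm=20.0):
    """truth_pts, det_pts: (n, 2) arrays of rasterized lane-line points, in cm."""
    def nearest_dist(a, b):
        # distance from each point of a to its nearest neighbour in b
        diff = a[:, None, :] - b[None, :, :]
        return np.sqrt((diff ** 2).sum(-1)).min(axis=1)
    tp = int((nearest_dist(det_pts, truth_pts) < match_cm).sum())   # matched detections
    fp = len(det_pts) - tp                                          # unmatched detections
    fn = len(truth_pts) - int((nearest_dist(truth_pts, det_pts) < match_cm).sum())
    return (tp / (tp + fp) if tp + fp else 0.0), (tp, fp, fn)

truth = np.array([[0.0, 0.0], [0.0, 100.0]])
det = np.array([[10.0, 0.0], [50.0, 100.0]])
print(precision(truth, det))  # -> (0.5, (1, 1, 1))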
In conclusion, combining the geometric length characteristics of lane line detection, two general evaluation criteria are used: accuracy and redundancy error. The redundant linear-object length in the redundancy error refers to ground objects in the background being mistaken for road objects in the foreground. The linear target lengths in the examples were all measured manually from the original images. The accuracy reaches 96.05%, so the method can be applied to real driving environments.
Finally, it should be noted that the above examples are only intended to illustrate the present invention and do not limit the technical solutions described herein; thus, although the present invention has been described in detail with reference to the foregoing examples, those skilled in the art will understand that various changes may be made and equivalents substituted, and all such modifications and variations are intended to fall within the scope of this disclosure and the appended claims.

Claims (3)

1. A Mask R-CNN-based automatic driving environment road extraction method, wherein the Mask R-CNN comprises a backbone network layer, a region proposal network RPN, and a classification layer, characterized by comprising the following steps:
step 1: inputting the read road scene image under the real driving environment into Mask R-CNN, and extracting the feature points of the lane lines in the image;
step 2: performing a preliminary linear clustering of the lane line feature points by lane line to obtain a feature point set S;
step 3: performing feature detection on the feature point set S using the Hough transform, and dividing the points corresponding to different lanes in S into different subsets S_i, i = 1, …, N, where N denotes the number of lanes;
performing a straight-line fit on each subset to obtain a preliminary fitting equation for each lane line, where s_i denotes the i-th subset and l_i denotes the straight-line fit corresponding to the i-th subset, expressed as y = kx + b;
continuing to optimize each subset, specifically:
computing the distance from each feature point in subset s_i to l_i and removing the feature points whose distance is greater than 5, forming a new set E_i, i = 1, …, N; continuing to divide each new set E_i: feature points satisfying k·x_i + b > y_i form the subset E_pos, and feature points satisfying k·x_i + b < y_i form E_neg, where (x_i, y_i) denotes the feature point coordinates;
in the two sets E_pos and E_neg, finding and removing the point with the largest error e, then updating the sets E_pos and E_neg, and repeating the third step until the error e is less than ε, where the error is e = k·x_i + b - y_i;
performing cubic spline interpolation according to the iteratively updated sets E_pos and E_neg, and fitting the final fitting equation of each lane line, f(x) = ax³ + bx² + cx + d, which is an equation in the vehicle coordinate system, with d representing the distance between the current vehicle and the left and right lane lines;
step 4: performing lane departure detection, specifically:
if f(x_0) is less than 30, issuing a lane departure warning, where x_0 denotes the current position of the car.
2. The Mask R-CNN-based automatic driving environment road extraction method as claimed in claim 1, wherein step 1 further comprises:
step 1.1: putting the read image information into the backbone network layer of the Mask R-CNN network model for feature extraction of the lane line information;
step 1.2: putting the feature map information extracted by the backbone network layer into the region proposal network RPN to generate accurate detection regions, wherein the RPN generates 9 targets with preset aspect ratios and areas for each position by sliding a window over the shared feature map;
step 1.3: finally, detecting, classifying, and segmenting the extracted information in the classification layer of the Mask R-CNN network, and marking the lane line information points on the original image.
3. The Mask R-CNN-based automatic driving environment road extraction method as claimed in claim 2, wherein step 1 further comprises: the read image is a 1024x1024 road scene image in a real driving environment; the initial anchors contain three areas, 128 × 128, 256 × 256, and 512 × 512, each of which in turn contains three aspect ratios, specifically 1:1, 1:2, and 2:1.
CN201911374543.9A 2019-12-27 2019-12-27 Automatic driving environment road extraction method based on Mask R-CNN Pending CN111160231A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911374543.9A CN111160231A (en) 2019-12-27 2019-12-27 Automatic driving environment road extraction method based on Mask R-CNN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911374543.9A CN111160231A (en) 2019-12-27 2019-12-27 Automatic driving environment road extraction method based on Mask R-CNN

Publications (1)

Publication Number Publication Date
CN111160231A (en) 2020-05-15

Family

ID=70558434

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911374543.9A Pending CN111160231A (en) 2019-12-27 2019-12-27 Automatic driving environment road extraction method based on Mask R-CNN

Country Status (1)

Country Link
CN (1) CN111160231A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100002911A1 (en) * 2008-07-06 2010-01-07 Jui-Hung Wu Method for detecting lane departure and apparatus thereof
WO2015043510A1 (en) * 2013-09-27 2015-04-02 比亚迪股份有限公司 Lane line detection method and system, and method and system for lane deviation prewarning
CN109829403A (en) * 2019-01-22 2019-05-31 淮阴工学院 A kind of vehicle collision avoidance method for early warning and system based on deep learning
CN110544260A (en) * 2019-08-22 2019-12-06 河海大学 remote sensing image target extraction method integrating self-learning semantic features and design features

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
田锦 (Tian Jin): "基于Mask R-CNN的地面标识检测" [Ground marking detection based on Mask R-CNN] *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112520643A (en) * 2020-11-30 2021-03-19 华南理工大学 Controller gear detection method and system for practical operation examination and coaching of forklift driver
CN114199267A (en) * 2021-11-25 2022-03-18 交通运输部公路科学研究所 Lane departure early warning evaluation method, device and system for vehicle
CN114199267B (en) * 2021-11-25 2022-11-29 交通运输部公路科学研究所 Lane departure early warning evaluation method, device and system for vehicle

Similar Documents

Publication Title
CN111563442B (en) Slam method and system for fusing point cloud and camera image data based on laser radar
CN111340797B (en) Laser radar and binocular camera data fusion detection method and system
CN109101924B (en) Machine learning-based road traffic sign identification method
CN110599537A (en) Mask R-CNN-based unmanned aerial vehicle image building area calculation method and system
CN108830171B (en) Intelligent logistics warehouse guide line visual detection method based on deep learning
CN111801711A (en) Image annotation
Haque et al. A computer vision based lane detection approach
CN110866430B (en) License plate recognition method and device
CN111563469A (en) Method and device for identifying irregular parking behaviors
Gupta et al. A framework for camera-based real-time lane and road surface marking detection and recognition
CN112766136B (en) Space parking space detection method based on deep learning
Huang et al. Lane detection based on inverse perspective transformation and Kalman filter
CN105989334B (en) Road detection method based on monocular vision
Panev et al. Road curb detection and localization with monocular forward-view vehicle camera
CN104915642A (en) Method and apparatus for measurement of distance to vehicle ahead
Behrendt et al. Deep learning lane marker segmentation from automatically generated labels
Zelener et al. Cnn-based object segmentation in urban lidar with missing points
CN112906583A (en) Lane line detection method and device
CN111340881A (en) Direct method visual positioning method based on semantic segmentation in dynamic scene
CN107886541B (en) Real-time monocular moving target pose measuring method based on back projection method
CN111160231A (en) Automatic driving environment road extraction method based on Mask R-CNN
Qing et al. A novel particle filter implementation for a multiple-vehicle detection and tracking system using tail light segmentation
Zhang et al. Quality-guided lane detection by deeply modeling sophisticated traffic context
CN106650814A (en) Vehicle-mounted monocular vision-based outdoor road adaptive classifier generation method
CN116643291A (en) SLAM method for removing dynamic targets by combining vision and laser radar

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned (effective date of abandoning: 20240419)