CN110210324B

CN110210324B - Road target rapid detection early warning method and system

Info

Publication number: CN110210324B
Application number: CN201910383340.XA
Authority: CN
Inventors: 陶文兵; 石京磊
Original assignee: Huazhong University of Science and Technology
Current assignee: Huazhong University of Science and Technology
Priority date: 2019-05-08
Filing date: 2019-05-08
Publication date: 2021-01-19
Anticipated expiration: 2039-05-08
Also published as: CN110210324A

Abstract

The invention discloses a method and system for rapid detection and early warning of road targets, belonging to the field of computer vision. The method comprises: acquiring a road scene image with an image acquisition device, and extracting N levels of regions of interest from the road scene image in order of target distance from near to far; performing target detection on the N levels of regions of interest sequentially in that order, issuing an early warning as soon as a nearer-level region of interest detects a target, and skipping detection on the subsequent levels; wherein N is greater than or equal to 2, and the target distance is the distance between the target and the image acquisition device. Compared with existing detection and early-warning algorithms for targets such as pedestrians and vehicles, the method detects and warns of targets appearing on the driving route more quickly and accurately, and has better practical value.

Description

Road target rapid detection early warning method and system

Technical Field

The invention belongs to the field of computer vision, and particularly relates to a road target rapid detection early warning method and system.

Background

Pedestrian and vehicle detection technology is widely used in driver assistance systems. In urban traffic scenes, how to avoid traffic accidents has always been a topic of intense interest. Current driver assistance systems mainly use pedestrian and vehicle detection to analyze images of the scene in front of a moving motor vehicle and actively raise an alarm for pedestrians and vehicles, helping the driver take evasive action in advance and prevent accidents. Current pedestrian and vehicle detection mainly relies on deep-learning-based algorithms to detect and identify objects such as pedestrians and vehicles in an input image. One important requirement of a driver assistance system is real-time performance: a vehicle moves very fast relative to a pedestrian, so if the pedestrian cannot be detected and the alarm raised quickly, the danger cannot be avoided in time. In addition, the system needs high accuracy, so that missed alarms and false alarms are reduced as much as possible.

At present there are many excellent deep learning algorithms for pedestrian and vehicle detection. Compared with traditional algorithms, they are considerably more accurate and robust, and they run quickly on a high-performance server. However, if these networks are ported directly to platforms such as embedded and mobile devices, their running speed drops sharply because of the difference in computing performance between hardware platforms, and the speed requirement cannot be met. Networks running on embedded and mobile devices must therefore be lightweight neural networks. But if an accurate pedestrian and vehicle detection network is merely simplified and used directly for detection and early warning, a clear speed-up is obtained at the cost of a substantial loss in detection accuracy, so the practical requirements of a vehicle-mounted portable pedestrian detection and early-warning system cannot be met.

Therefore, the prior art has the technical problem that targets on the driving route cannot be detected and warned of both quickly and accurately.

Disclosure of Invention

Aiming at the defects or improvement requirements of the prior art, the invention provides a road target rapid detection and early warning method and a system, so that the technical problem that the prior art cannot rapidly and accurately detect and early warn the target on a driving route is solved.

In order to achieve the above object, according to an aspect of the present invention, there is provided a method for quickly detecting and warning a road target, comprising the steps of:

(1) acquiring a road scene image by using image acquisition equipment, and extracting N-level interested areas from the road scene image according to the sequence of the target distance from near to far;

(2) sequentially carrying out target detection on the N-level interested areas according to the sequence of the target distance from near to far, carrying out early warning after a certain-level interested area with a near target distance detects a target, and not detecting the subsequent-level interested areas;

wherein N is more than or equal to 2, and the target distance is the distance between the target and the image acquisition equipment.

Further, the step (1) comprises the following steps:

collecting road scene images, and setting a target height ratio threshold Thr and a nearest target distance L₀;

obtaining the pixel height H₁ of the target from the relation f between the pixel height of the target and the target distance and from the nearest target distance L₀, taking H₁ as the length of the first-level region of interest, and calculating the width of the first-level region of interest from its length and the aspect ratio of the road scene image;

for the Nth-level region of interest, taking the product of the target height ratio threshold Thr and the length H_{N-1} of the (N-1)th-level region of interest as the length H_N of the Nth-level region of interest, and calculating the width of the Nth-level region of interest from its length and the aspect ratio of the road scene image;

obtaining the farthest distance L_{N-1} of the detection target of the (N-1)th-level region of interest from the length and width of the Nth-level region of interest and the relation f between the pixel height of the target and the target distance, L_{N-1} also being the nearest distance of the detection target of the Nth-level region of interest;

obtaining the farthest distance L_N of the detection target of the Nth-level region of interest from the relation f and the product of the target height ratio threshold Thr and the length H_N of the Nth-level region of interest; the target distance detection range of the Nth-level region of interest is (L_{N-1}, L_N).

Further, the step (1) further comprises:

collecting a set of road scene images at different target distances, obtaining from the image set discrete data on the pixel height of the target and the target distance, and deriving from the discrete data the relation f between the pixel height of the target and the target distance.
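As an illustration of how a relation f might be derived from such discrete data, the following is a minimal sketch that assumes an inverse-proportional (pinhole-camera-like) model h = k / d. The text does not specify the functional form of f, so this model, the function names and the least-squares fit are assumptions; the sample values are those given in Example 1.

```python
def fit_inverse_model(samples):
    """Least-squares fit of h = k / d to (distance_m, pixel_height) pairs.

    Minimizing sum((h_i - k / d_i)^2) over k gives
    k = sum(h_i / d_i) / sum(1 / d_i^2).
    """
    numerator = sum(h / d for d, h in samples)
    denominator = sum(1.0 / (d * d) for d, _ in samples)
    return numerator / denominator


def make_f(k):
    """Return (height_at, distance_at), the two directions of h = k / d."""
    return (lambda d: k / d), (lambda h: k / h)


# (distance in meters, pixel height) pairs taken from Example 1.
samples = [(5.0, 480.0), (10.0, 240.0), (20.0, 120.0), (40.0, 60.0)]
k = fit_inverse_model(samples)
height_at, distance_at = make_f(k)
```

With real measurements the data would be noisy, and a different model (for example a polynomial or a lookup table with interpolation) may fit better than the inverse-proportional form assumed here.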

Further, the value range of the target height ratio threshold Thr is (0, 1).

Further, the centers of the road scene image and the N-level region of interest are both lane line intersection points.
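A minimal sketch of how regions of interest sharing a common center could be laid out is given below. It assumes that the "length" of a region is its vertical extent in pixels and that the aspect ratio is width divided by height; the function name and the box convention (left, top, right, bottom) are illustrative choices, not taken from the text.

```python
def roi_boxes(center, lengths, aspect_ratio=1.0):
    """Return one (left, top, right, bottom) box per level, all centered
    on `center` (assumed to be the lane line intersection point).

    lengths: region-of-interest lengths in pixels, ordered near to far.
    aspect_ratio: assumed width / height of the road scene image.
    """
    cx, cy = center
    boxes = []
    for length in lengths:
        width = length * aspect_ratio
        boxes.append((cx - width / 2, cy - length / 2,
                      cx + width / 2, cy + length / 2))
    return boxes


# Example 1 values: 480 x 480 image, three levels, aspect ratio 1.
boxes = roi_boxes((240, 240), [480, 240, 120])
```

Because every level is centered on the same point, each farther-level region is nested inside the nearer-level one.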

Further, the specific implementation manner of the target detection in the step (2) is as follows:

carrying out target detection on the N-level region of interest by using a target detection model, wherein the training of the target detection model comprises the following steps:

extracting N-level interested area samples from the sample road scene image, and marking the target with the target height ratio larger than the target height ratio threshold value in each interested area sample to obtain a training sample set;

constructing a target detection convolutional neural network comprising a feature extraction layer, and extracting features of N-level region of interest samples in a training sample set in the feature extraction layer to obtain a feature map;

setting a plurality of preselection frames in the feature map, calculating the overlap rate between each marked target in the region-of-interest samples of the training sample set and each preselection frame, training the target detection convolutional neural network with the training sample set and the preselection frames whose overlap rate is greater than a preset value to obtain a training result, calculating a loss value of the training result, and performing back propagation with the loss value to update the parameters of the target detection convolutional neural network, thereby obtaining the target detection model.

Further, the height ratio of the preselected box is greater than the target height ratio threshold.

According to another aspect of the present invention, there is provided a road target rapid detection and early warning system, comprising:

the interesting region extracting module is used for acquiring road scene images by utilizing image acquisition equipment and extracting N-level interesting regions from the road scene images according to the sequence of the target distance from near to far;

the target detection module is used for sequentially carrying out target detection on the N-level interested areas by using a target detection model according to the sequence of the target distance from near to far, carrying out early warning after a certain-level interested area with a near target distance detects a target, and not detecting the subsequent-level interested areas;

Further, the region of interest extracting module includes the following sub-modules:

an initialization submodule for collecting road scene images and setting a target height ratio threshold Thr and a nearest target distance L₀;

a first-level region of interest length and width calculation submodule for obtaining the pixel height H₁ of the target from the relation f between the pixel height of the target and the target distance and from the nearest target distance L₀, taking H₁ as the length of the first-level region of interest, and calculating the width of the first-level region of interest from its length and the aspect ratio of the road scene image;

an Nth-level region of interest length and width calculation submodule for taking, for the Nth-level region of interest, the product of the target height ratio threshold Thr and the length H_{N-1} of the (N-1)th-level region of interest as the length H_N of the Nth-level region of interest, and calculating the width of the Nth-level region of interest from its length and the aspect ratio of the road scene image;

a target distance detection range calculation submodule of the Nth-level region of interest for obtaining the farthest distance L_{N-1} of the detection target of the (N-1)th-level region of interest from the length and width of the Nth-level region of interest and the relation f between the pixel height of the target and the target distance, L_{N-1} also being the nearest distance of the detection target of the Nth-level region of interest; and for obtaining the farthest distance L_N of the detection target of the Nth-level region of interest from the relation f and the product of the target height ratio threshold Thr and the length H_N of the Nth-level region of interest; the target distance detection range of the Nth-level region of interest is (L_{N-1}, L_N).

Further, the training of the target detection model comprises:

the training sample set acquisition sub-module is used for extracting N-level interested area samples from the sample road scene image, and marking the target with the target height ratio larger than the target height ratio threshold value in each interested area sample to obtain a training sample set;

the characteristic diagram acquisition sub-module is used for constructing a target detection convolutional neural network comprising a characteristic extraction layer, and extracting the characteristics of N-level region of interest samples in the training sample set in the characteristic extraction layer to obtain a characteristic diagram;

the training submodule is used for setting a plurality of preselection frames in the characteristic diagram, calculating the overlapping rate of a marked target in an interested area sample in the training sample set and each preselection frame, training a target detection convolutional neural network by using the preselection frames with the overlapping rate larger than a preset value and the training sample set to obtain a training result, calculating the loss value of the training result, and performing back propagation by using the loss value to update the parameters of the target detection convolutional neural network to obtain a target detection model.

In general, compared with the prior art, the above technical solution contemplated by the present invention can achieve the following beneficial effects:

(1) The invention adopts a hierarchical target detection and early-warning strategy, so that near targets are detected at the fastest speed and the warning time is shortened as much as possible for the most dangerous situations, while more distant targets can still be detected. Therefore, compared with existing detection and early-warning algorithms for targets such as pedestrians and vehicles, the invention speeds up detection and early warning of targets close to the vehicle as much as possible, can still detect and warn of targets far from the vehicle, detects and warns of targets appearing on the driving route more quickly and accurately, and has better practical value.

(2) Existing target detection algorithms generally detect small targets less accurately than large targets. With the hierarchical detection strategy, each level only needs to detect relatively large targets, so the invention can simplify the structural design of the target detection network and thereby increase its running speed.

(3) The invention trains on regions of interest of all levels in which the targets whose height ratio is greater than the target height ratio threshold are marked, which improves the detection accuracy of the target detection network on targets in actual pictures and avoids detecting smaller targets.

Drawings

Fig. 1 is a schematic flow chart of a method for rapid detection and early warning of road targets according to an embodiment of the present invention;

Fig. 2 is a flow chart of a preferred embodiment provided by an embodiment of the present invention;

Fig. 3 is a schematic diagram of the three-level region of interest extraction method provided by an embodiment of the present invention;

Fig. 4 is a schematic flow chart of the labeling of the training data set for the target detection model according to an embodiment of the present invention;

Fig. 5 is a schematic diagram of the feature map and preselection frame design provided by an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of the target detection convolutional neural network according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.

As shown in fig. 1, a rapid detection and early warning method for road targets includes the following steps:

Further, the step (1) comprises the following steps:

collecting road scene images, and setting a target height ratio threshold Thr and a nearest target distance L₀; collecting a set of road scene images at different target distances, obtaining from the image set discrete data on the pixel height of the target and the target distance, and deriving from the discrete data the relation f between the pixel height of the target and the target distance;

obtaining the farthest distance L_N of the detection target of the Nth-level region of interest from the relation f and the product of the target height ratio threshold Thr and the length H_N of the Nth-level region of interest; the target distance detection range of the Nth-level region of interest is (L_{N-1}, L_N).

Further, the value range of the target height ratio threshold Thr is (0, 1); preferably, Thr is 0.5.

setting a plurality of preselection frames in the feature map, calculating the overlap rate between each marked target in the region-of-interest samples of the training sample set and each preselection frame, training the target detection convolutional neural network with the training sample set and the preselection frames whose overlap rate is greater than a preset value to obtain a training result, calculating a loss value of the training result, and performing back propagation with the loss value to update the parameters of the target detection convolutional neural network, thereby obtaining the target detection model. The height ratio of each preselection frame is greater than the target height ratio threshold, and the preset value is 0.5.
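The selection of preselection frames by overlap rate can be sketched as follows. The text does not define how the overlap rate is computed; this sketch assumes it is the intersection-over-union (IoU) commonly used in detection networks, and the function names and box convention (x1, y1, x2, y2) are illustrative.

```python
def overlap_rate(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2).

    Treating the overlap rate as IoU is an assumption of this sketch.
    """
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    intersection = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def matched_frames(marked_target, preselection_frames, preset_value=0.5):
    """Keep the preselection frames whose overlap rate with the marked
    target is greater than the preset value (0.5 in the embodiment)."""
    return [frame for frame in preselection_frames
            if overlap_rate(marked_target, frame) > preset_value]
```

Only the frames returned by `matched_frames` would take part in training as positive matches for that target; how the remaining frames are treated is not stated in the text.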

As shown in fig. 2, for the case where N is 3, a road scene image is acquired and three levels of regions of interest are extracted from it in order of target distance from near to far. Target detection is then performed on the three levels in that order: first-level detection is performed on the first-level region of interest, and if a target is detected a first-level alarm is raised; otherwise second-level detection is performed on the second-level region of interest, and if a target is detected a second-level alarm is raised; otherwise third-level detection is performed on the third-level region of interest, and if a target is detected a third-level alarm is raised; otherwise detection stops.
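The near-to-far detection flow with early exit can be sketched as below. The `detect` callable stands in for the target detection model and is a hypothetical placeholder; the function name and return convention are illustrative only.

```python
def cascade_detect(rois, detect):
    """Run detection level by level, from near to far, and stop early.

    rois: regions of interest ordered from near to far.
    detect: callable returning a (possibly empty) list of detections for
        one region; a placeholder for the target detection model.
    Returns (alarm_level, detections) for the first level in which a
    target is found, or (None, []) if no level detects a target.
    """
    for level, roi in enumerate(rois, start=1):
        detections = detect(roi)
        if detections:
            # Alarm at this level; farther levels are not examined.
            return level, detections
    return None, []


# Usage with a stub detector that only finds a target in the middle region.
alarm_level, found = cascade_detect(
    ["near", "mid", "far"],
    lambda roi: ["pedestrian"] if roi == "mid" else [],
)
```

In this usage the stub is never called on the far region, which is the point of the strategy: the nearest (most urgent) targets cost the fewest detector invocations.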

The area along the driving track is divided into three levels of regions of interest, used to detect near, middle-distance and far targets respectively. This speeds up detection and early warning of near targets as much as possible while still allowing far targets to be detected and warned of.

As shown in fig. 3, for the case that N is 3, the setting of the three-level region of interest in step (1) specifically includes the following steps:

(21) analyzing the collected road scene image, and extracting the track and the intersection point of the lane line in the image; setting a target height ratio threshold Thr;

(22) analyzing the relation f between the pixel height of the target in the acquired image and the target distance;

(23) setting the distance a of the nearest detection target, obtaining from the relation f the pixel height H₁ of the target at that distance, taking H₁ as the length of the first-level region of interest, and calculating the width of the first-level region of interest from its length and the aspect ratio of the road scene image.

(24) Obtaining from the threshold Thr the minimum target pixel height detected in the first-level region of interest, H₂ = H₁*Thr. H₂ is taken as the length of the second-level region of interest, and the width of the second-level region of interest is calculated from its length and the aspect ratio of the road scene image.

(25) From H₂ and the relation f, obtaining the farthest distance b of the first-level region of interest detection target, which is also the nearest distance of the second-level region of interest detection target.

(26) Obtaining from the threshold Thr the minimum target pixel height detected in the second-level region of interest, H₃ = H₂*Thr. H₃ is taken as the length of the third-level region of interest, and the width of the third-level region of interest is calculated from its length and the aspect ratio of the road scene image.

(27) From H₃ and the relation f, obtaining the farthest distance c of the second-level region of interest detection target, which is also the nearest distance of the third-level region of interest detection target.

(28) Obtaining from the threshold Thr the minimum target pixel height detected in the third-level region of interest, H₄ = H₃*Thr.

(29) From H₄ and the relation f, obtaining the farthest distance d of the third-level region of interest detection target.

(210) Taking the lane line junction point as the center of each region of interest, regions of length H₁, H₂ and H₃ are cropped in turn as the first-, second- and third-level regions of interest.

The distances a, b, c and d in steps (23), (25), (27) and (29) form the target detection distance ranges (a, b), (b, c) and (c, d) of the first-, second- and third-level regions of interest.
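The steps above can be sketched as a short loop. The relation f is passed in as two callables (pixel height from distance, and distance from pixel height); the concrete form used in the usage lines, h = 2400 / d, is an assumption chosen only because it reproduces the numbers of Example 1, and all names are illustrative.

```python
def plan_rois(nearest_distance, thr, n_levels, height_at, distance_at):
    """Compute each level's region length and detection distance range.

    height_at(d): pixel height of a target at distance d (relation f).
    distance_at(h): distance of a target with pixel height h (inverse of f).
    Returns one dict per level with keys "length", "near" and "far".
    """
    levels = []
    length = height_at(nearest_distance)  # H1: target height at distance a
    near = nearest_distance
    for _ in range(n_levels):
        min_height = length * thr         # smallest target this level detects
        far = distance_at(min_height)     # farthest distance of this level
        levels.append({"length": length, "near": near, "far": far})
        # The next level starts where this one ends.
        length, near = min_height, far
    return levels


# Assumed relation f: h = 2400 / d (480 px at 5 m, as in Example 1).
levels = plan_rois(5.0, 0.5, 3,
                   height_at=lambda d: 2400.0 / d,
                   distance_at=lambda h: 2400.0 / h)
```

With these inputs the three levels have lengths 480, 240 and 120 pixels and cover the ranges (5, 10), (10, 20) and (20, 40) meters, corresponding to (a, b), (b, c) and (c, d).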

As shown in fig. 4, for the case where N is 3, the training method of the target detection model is as follows:

extracting three levels of regions of interest from the sample road scene images to obtain input images, and marking every complete target whose height ratio is greater than Thr in each input image to obtain a training data set T; the marking method is to record, in an XML-format file, the coordinates of the upper-left and lower-right corners of the rectangular box framing the target.
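A minimal sketch of the marking rule follows. It assumes that a "complete" target is one whose box lies entirely inside the input image and that the height ratio is the box height divided by the input image height; the function name and box convention (x1, y1, x2, y2) are illustrative, and writing the result to an XML file is not shown.

```python
def boxes_to_mark(candidate_boxes, image_width, image_height, thr=0.5):
    """Select the target boxes to be marked in one input image.

    candidate_boxes: (x1, y1, x2, y2) boxes in input-image pixel
        coordinates, upper-left and lower-right corners.
    A box is kept if it lies fully inside the image (assumed meaning of
    "complete") and its height ratio is greater than thr.
    """
    marked = []
    for x1, y1, x2, y2 in candidate_boxes:
        complete = (x1 >= 0 and y1 >= 0
                    and x2 <= image_width and y2 <= image_height)
        tall_enough = (y2 - y1) / image_height > thr
        if complete and tall_enough:
            marked.append((x1, y1, x2, y2))
    return marked


candidates = [(10, 10, 60, 300),     # height ratio 290/480, kept
              (100, 100, 120, 150),  # height ratio 50/480, too small
              (-5, 0, 50, 400)]      # extends past the left edge
marked = boxes_to_mark(candidates, 480, 480)
```

Targets that are too small for one level are left unmarked there; they become large enough to be marked in a farther-level region of interest, which has a smaller side length.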

Constructing a target detection convolutional neural network, and extracting the characteristics of a training data set T on a characteristic layer of the target detection convolutional neural network;

as shown in fig. 5, designing preselection frames on a feature layer of the target detection convolutional neural network, with the size of the preselection frames set slightly higher than Thr, to preliminarily obtain the positions of target objects whose height ratio in the input image is higher than Thr; the preselection frame aspect ratio a_r is 2, 3, 4; the preselection frame height is

Width is

And training the target detection convolutional neural network step by adopting a training data set T and a preselected frame arranged on the characteristic layer to obtain a target detection model.

As shown in fig. 6, during detection, an input image is input into a target feature extraction network to obtain a feature map, detection is performed to obtain a detection result, and subsequent processing is performed by using the detection result, that is, whether to perform early warning or not is determined.

Example 1

A road target rapid detection early warning method comprises the following steps when N is 3 and a target is a pedestrian:

Step 1: collecting road scene images, analyzing them, and extracting the tracks and junction point of the lane lines in the image; the center of the image is set to the lane line junction point. The relation f between the pixel height of a pedestrian in the collected images and the pedestrian distance is analyzed; the pedestrian height ratio threshold Thr is set to 0.5.

The nearest detected pedestrian distance is set to 5 meters. According to the relation f, the pixel height of a pedestrian at 5 meters is 480, so the length and width of the first-level region of interest are both 480 (the aspect ratio of the road scene image here is 1).

From the first-level region of interest sub-picture length and width of 480 and the pedestrian height ratio threshold of 0.5, the minimum pedestrian pixel height to be detected at the first level is 240. The corresponding pedestrian distance is 10 meters, which is taken as the nearest distance for second-level pedestrian detection, and the length and width of the second-level region of interest are set to 240.

From the second-level region of interest sub-picture length and width of 240 and the pedestrian height ratio threshold of 0.5, the minimum pedestrian pixel height to be detected at the second level is 120. The corresponding pedestrian distance is 20 meters, which is taken as the nearest distance for third-level pedestrian detection, and the length and width of the third-level region of interest are set to 120.

From the third-level region of interest sub-picture length and width of 120 and the pedestrian height ratio threshold of 0.5, the minimum pedestrian pixel height to be detected at the third level is 60. The corresponding pedestrian distance is 40 meters, which is the farthest distance of the third-level detection target.

Step 2: cropping the regions of interest of all levels that contain pedestrians. The three levels of regions of interest are the regions obtained by the three-level region of interest extraction method. Whether to mark a pedestrian is determined by whether the ratio of the pedestrian's pixel height in the region of interest to the resolution height of the sub-picture is greater than Thr; pedestrians whose ratio is greater than Thr are marked. The labeling method is to record the coordinates of the upper-left and lower-right corners of the rectangular frame framing the pedestrian.

Designing a preselection frame on the characteristic layer L of the single-scale pedestrian detection convolutional neural network to preliminarily obtain the position of a pedestrian target;

selecting a feature layer in the later part of the network with a resolution of 3 x 3, the scale of the detected pedestrian being 0.7; the input image is divided into a 3 x 3 grid. Each preselection frame is designed to be concentric with the center point of its grid cell; the preselection frame aspect ratio a_r is 2, 3, 4; the preselection frame height is

Width is

And training the single-scale pedestrian detection convolutional neural network by adopting a training data set T and a preselection frame to obtain a pedestrian detection model.

Step 3: performing pedestrian detection on the three levels of regions of interest in turn with the pedestrian detection model.

The invention relates to a rapid road target detection and early-warning method suitable for driver assistance systems. Regions of interest at several levels are used to detect near, middle-distance and far targets respectively. This speeds up detection and early warning of near targets as much as possible while still allowing far targets to be detected and warned of. A fast target detection network is designed according to the hierarchical detection strategy: target detection is performed with preselection frames that match the target characteristics and with a single-scale feature map. The method thereby achieves rapid and accurate detection and early warning of targets.

It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims

1. A road target rapid detection early warning method is characterized by comprising the following steps:

wherein N is more than or equal to 2, and the target distance is the distance between the target and the image acquisition equipment;

the step (1) comprises the following steps:

obtaining the farthest distance L_{N-1} of the detection target of the (N-1)th-level region of interest from the length and width of the Nth-level region of interest and the relation f between the pixel height of the target and the target distance, L_{N-1} also being the nearest distance of the detection target of the Nth-level region of interest;

obtaining the farthest distance L_N of the detection target of the Nth-level region of interest from the relation f and the product of the target height ratio threshold Thr and the length H_N of the Nth-level region of interest; the target distance detection range of the Nth-level region of interest is (L_{N-1}, L_N).

2. The rapid detection and early warning method for road targets as claimed in claim 1, wherein the step (1) further comprises:

3. The method as claimed in claim 1, wherein the value range of the target height ratio threshold Thr is (0, 1).

4. The method as claimed in any one of claims 1 to 3, wherein the centers of the road scene image and the N-level region of interest are both intersection points of lane lines.

5. The rapid detection and early warning method for road targets as claimed in claim 1 or 3, wherein the specific implementation manner of the target detection in the step (2) is as follows:

6. The method as claimed in claim 5, wherein the height ratio of the pre-selection frame is greater than the target height ratio threshold.

7. A road target rapid detection and early warning system, characterized by comprising:

the region of interest extraction module comprises the following sub-modules:

a first-level region of interest length and width calculation submodule for obtaining the pixel height H₁ of the target from the relation f between the pixel height of the target and the target distance and from the nearest target distance L₀, taking H₁ as the length of the first-level region of interest, and calculating the width of the first-level region of interest from its length and the aspect ratio of the road scene image;

a target distance detection range calculation submodule of the Nth-level region of interest for obtaining the farthest distance L_{N-1} of the detection target of the (N-1)th-level region of interest from the length and width of the Nth-level region of interest and the relation f between the pixel height of the target and the target distance, L_{N-1} also being the nearest distance of the detection target of the Nth-level region of interest; and for obtaining the farthest distance L_N of the detection target of the Nth-level region of interest from the relation f and the product of the target height ratio threshold Thr and the length H_N of the Nth-level region of interest; the target distance detection range of the Nth-level region of interest is (L_{N-1}, L_N).

8. The system of claim 7, wherein the training of the target detection model comprises: