CN111832463A

CN111832463A - Deep learning-based traffic sign detection method

Info

Publication number: CN111832463A
Application number: CN202010646070.XA
Authority: CN
Inventors: 李东洁; 孙富文
Original assignee: Harbin University of Science and Technology
Current assignee: Harbin University of Science and Technology
Priority date: 2020-07-07
Filing date: 2020-07-07
Publication date: 2020-10-27

Abstract

The invention discloses a traffic sign detection method based on deep learning, which comprises the following steps: (1) Outdoor traffic sign images are acquired with a digital camera and denoised using median filtering and Gaussian filtering. (2) A traffic sign data set is made and expanded by rotation, translation and stretching transformations, then divided into a training set and a validation set at a ratio of 9:1. (3) Leader clustering and k-means clustering are used together to determine the initial positions of the anchor boxes of the targets. (4) Consecutive 1 × 1 convolutional layers are used in YOLOv3 to extract shallow feature information from the image, and a prediction layer for the shallow features is added. (5) The training set is input into the improved YOLOv3 network for training. (6) The trained model is tested on the validation set to verify the detection and recognition performance of the improved model on traffic signs.

Description

Deep learning-based traffic sign detection method

Technical Field

The invention belongs to the field of artificial intelligence, and particularly relates to a deep learning-based traffic sign detection method.

Background

Traffic sign recognition is an important subsystem of intelligent transportation systems. It processes images and video of the vehicle's surroundings captured by a vehicle-mounted camera, detects and recognizes traffic signs, and passes the recognition results to the driver or to other intelligent transportation systems, thereby assisting the driver or providing decision information for those systems.

With the development of deep learning, many target detection algorithms have been widely applied in image recognition, typified by Faster R-CNN, SSD and YOLO. SSD and YOLO are both one-stage target detection algorithms and perform well in both detection speed and classification accuracy for targets in images. The YOLOv3 algorithm of the YOLO series in particular achieves real-time recognition of vehicles and pedestrians in the field of automatic driving. Although YOLOv3 performs well at target detection, its recognition accuracy remains low for small targets such as traffic signs. To solve this problem, a traffic sign detection method based on deep learning is provided. YOLOv3 is selected as the base model structure for traffic sign detection, and its network structure is improved so that the feature information of traffic signs can be fully extracted and features at multiple scales are fused to accurately predict the traffic sign category.

Disclosure of Invention

The invention aims to provide a traffic sign detection method based on deep learning so as to improve the detection speed and accuracy of traffic signs. The traffic sign detection method comprises the following steps.

A. Obtaining traffic sign images

B. Making a traffic sign data set and dividing it into a training set and a validation set

C. Improving the k-means clustering method used by the YOLOv3 model

D. Improving the network structure of the YOLOv3 target detection network

E. Training the improved YOLOv3 target detection network

F. Testing the improved YOLOv3 model with the validation set

Preferably, step A includes acquiring outdoor traffic sign images with a digital camera. Traffic signs mainly comprise indication signs, prohibition signs and warning signs, so the acquired images should include all three types. The acquired traffic sign images are then filtered and denoised.

Preferably, the traffic sign data set is made in step B and divided into positive samples and negative samples. A positive sample is one whose label is consistent with the true category of the sample; otherwise the sample is a negative sample. In addition, the positive samples should be augmented to enhance the classification capability of the model.

Preferably, step C comprises the following steps:

A. Preprocessing the original data set by adopting a leader clustering algorithm to generate a plurality of sample subsets

B. Sampling the generated sample subset

C. Clustering sampled sample subsets using a K-means clustering algorithm

D. Integrating the clustering result to determine the initial position of the anchor box
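The leader-clustering preprocessing in sub-step A can be sketched as follows. This is a minimal illustration and not the patent's implementation: the Euclidean distance on (width, height) pairs, the `threshold` value and the sample boxes are all assumptions made for the example. Each box joins the first existing leader that is close enough, and otherwise starts a new subset.

```python
import math


def leader_clustering(boxes, threshold):
    """Group (width, height) boxes into subsets with the leader algorithm.

    A box joins the first leader within `threshold` (Euclidean distance);
    otherwise it becomes the leader of a new subset.
    """
    leaders, subsets = [], []
    for box in boxes:
        for idx, leader in enumerate(leaders):
            if math.dist(box, leader) <= threshold:
                subsets[idx].append(box)
                break
        else:
            # No existing leader is close enough: start a new subset.
            leaders.append(box)
            subsets.append([box])
    return leaders, subsets


# Hypothetical bounding-box sizes in pixels (not taken from the source).
boxes = [(10, 12), (11, 13), (30, 32), (31, 29), (60, 58), (12, 11)]
leaders, subsets = leader_clustering(boxes, threshold=5.0)
for leader, subset in zip(leaders, subsets):
    print(leader, subset)
```

The resulting subsets would then be sampled and clustered with k-means as described in sub-steps B to D. Leader clustering needs only a single pass over the data, which is presumably why it is used as the preprocessing stage.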

Preferably, improving the network structure of YOLOv3 in step D includes the following two aspects:

A. In the first few layers of the YOLOv3 network, consecutive 1 × 1 convolutional layers (one 1 × 1 convolutional layer immediately followed by another 1 × 1 convolutional layer) are used to extract the shallow information of the input image.

B. Adding a scale prediction layer
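To give a feel for what an extra scale prediction layer means in terms of output size, the sketch below computes the prediction grid for each scale. The 416-pixel input and the strides 32, 16 and 8 are those of standard YOLOv3. The fourth stride of 4 is purely a hypothetical choice for the added shallow-feature layer, since the text does not state at which resolution the new layer operates.

```python
def grid_sizes(input_size, strides):
    """Return the side length of the prediction grid for each stride."""
    return [input_size // s for s in strides]


def total_predictions(input_size, strides, anchors_per_cell=3):
    """Count the predicted boxes summed over all scales."""
    return sum(anchors_per_cell * g * g for g in grid_sizes(input_size, strides))


INPUT_SIZE = 416                  # common YOLOv3 input size (assumption here)
BASE_STRIDES = [32, 16, 8]        # the three scales of standard YOLOv3
EXTRA_STRIDES = [32, 16, 8, 4]    # hypothetical: one added shallower scale

print(grid_sizes(INPUT_SIZE, BASE_STRIDES))
print(grid_sizes(INPUT_SIZE, EXTRA_STRIDES))
print(total_predictions(INPUT_SIZE, BASE_STRIDES))
print(total_predictions(INPUT_SIZE, EXTRA_STRIDES))
```

Under these assumptions the finer grid has many more cells, so each cell covers fewer pixels, which is the usual argument for why a shallower prediction scale helps with small objects.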

The beneficial effects of the invention are as follows:

(1) Because consecutive 1 × 1 convolutional layers are adopted, the ability of the YOLOv3 model to extract shallow features of the input image is enhanced, and the model's ability to classify small-target traffic signs is improved.

(2) Because a scale prediction layer is added to the model, predictions can be made from the extracted shallow features, which improves the accuracy with which the model predicts the traffic sign category.

(3) Because the initial positions of the anchor boxes are determined by using leader clustering together with the k-means clustering algorithm, the model can locate traffic signs more accurately.

Drawings

Fig. 1 is a diagram of the overall network architecture of the present invention.

Fig. 2 shows the consecutive 1 × 1 convolutional stack structure used in the present invention.

Detailed Description

Step one: outdoor traffic sign images are acquired with a digital camera, covering three types of signs: indication signs, prohibition signs and warning signs. Salt-and-pepper noise in the traffic sign images is removed with the medianBlur median filtering function of the OpenCV computer vision library, remaining noise is further reduced with Gaussian filtering, and finally the features of the traffic sign images are enhanced with a histogram matching algorithm.
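The reason a median filter suits salt-and-pepper noise can be shown with a small dependency-free sketch. It is only an illustration of the mechanism: the text itself uses OpenCV's medianBlur, which handles image borders differently, and the 5 × 5 test patch below is invented for the example.

```python
from statistics import median


def median_filter_3x3(image):
    """Apply a 3x3 median filter to a grayscale image given as a list of rows.

    Border pixels use only the neighbours that fall inside the image.
    """
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
            ]
            # Isolated extreme values never reach the middle of the sorted window.
            result[r][c] = int(median(window))
    return result


# Hypothetical 5x5 patch of uniform grey with one "salt" and one "pepper" pixel.
patch = [[100] * 5 for _ in range(5)]
patch[1][1] = 255
patch[3][3] = 0
cleaned = median_filter_3x3(patch)
for row in cleaned:
    print(row)
```

Because an isolated outlier is always sorted to the end of its window, it is replaced by a typical neighbouring value instead of being smeared into its surroundings, as an averaging filter would do.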

Step two: a data set is made. The traffic sign images filtered and denoised in step one are subjected to rotation, translation and stretching transformations to expand the data set, which is then divided into a training set and a validation set at a ratio of 9:1.
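The 9:1 division can be sketched as a reproducible shuffle followed by a cut. The file names, the fixed seed and the function name `split_dataset` are hypothetical; the text specifies only the ratio.

```python
import random


def split_dataset(samples, train_ratio=0.9, seed=0):
    """Shuffle samples reproducibly and split them into training and validation lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for the augmented images.
images = [f"sign_{i:04d}.jpg" for i in range(1000)]
train_set, val_set = split_dataset(images)
print(len(train_set), len(val_set))
```

Shuffling before the cut keeps augmented variants from being grouped by acquisition order. If variants of the same original image must not appear in both sets, the split would have to be made on the original images before augmentation.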

Step three: the k-means clustering algorithm is improved, and the execution method is as follows:

(1) For the data set X = (x₁, x₂, ..., x_n), a nonlinear mapping θ is used to map each sample x_i into a high-dimensional space.

(2) K-means clustering is carried out in the high-dimensional kernel space, and an objective function is optimized:

In the formula, m_k is the mean of the samples.

The kernel distance between two feature points in the kernel space is defined as:

where K(·) is a kernel function.
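The formulas referred to in (1) and (2) are not reproduced in this text. For orientation, the standard kernel k-means forms are given below, on the assumption that they match what the original intended. The cluster sets C_k, the cluster sizes n_k and the inner-product definition of K are notation introduced here and do not come from the source.

```latex
% Assumed standard kernel k-means objective over K clusters C_1, ..., C_K
J = \sum_{k=1}^{K} \sum_{x_i \in C_k} \left\| \theta(x_i) - m_k \right\|^2,
\qquad
m_k = \frac{1}{n_k} \sum_{x_i \in C_k} \theta(x_i)

% Kernel distance between two points, with K(x, y) = \langle \theta(x), \theta(y) \rangle
\left\| \theta(x_i) - \theta(x_j) \right\|^2
  = K(x_i, x_i) - 2\,K(x_i, x_j) + K(x_j, x_j)
```

The second identity is what makes the approach practical: distances in the high-dimensional space can be evaluated through the kernel function alone, without computing θ explicitly.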

(3) After each sample subset has been clustered, all samples are merged. The merged sample set contains K target classes, and the mean of each class is calculated:

where n_i is the number of samples in the i-th class and x_i is the mean of the i-th class.

(4) The distance between every pair of class means is calculated.

The cluster-center threshold is set to M, and the distance between two class means is given by:

L = |x_i - x_j|²

When the distance between two class means is smaller than M, the two classes are merged into one, and the class-mean distances are calculated again according to the formula. The final clustering result is obtained once the sample subsets have been merged and fused.
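The threshold-based merging of class means can be sketched as below. Two details are assumptions rather than statements from the text: the closest pair is merged first, and the merged mean is the count-weighted average of the two class means. The sample means, counts and threshold are invented for the example.

```python
def squared_distance(a, b):
    """L = |a - b|^2 for two points given as equal-length tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def merge_close_means(means, counts, threshold):
    """Merge class means whose squared distance is below `threshold` (M).

    The closest pair is merged first, the merged mean is the count-weighted
    average, and the search repeats until no pair is closer than the threshold.
    """
    means = [tuple(m) for m in means]
    counts = list(counts)
    while len(means) > 1:
        best = None
        for i in range(len(means)):
            for j in range(i + 1, len(means)):
                d = squared_distance(means[i], means[j])
                if d < threshold and (best is None or d < best[0]):
                    best = (d, i, j)
        if best is None:
            break  # every remaining pair is at least M apart
        _, i, j = best
        total = counts[i] + counts[j]
        merged = tuple(
            (counts[i] * p + counts[j] * q) / total
            for p, q in zip(means[i], means[j])
        )
        means[i], counts[i] = merged, total
        del means[j], counts[j]
    return means, counts


# Hypothetical class means (width, height) and sample counts from separate subsets.
means = [(10.0, 12.0), (11.0, 12.0), (40.0, 42.0)]
counts = [30, 10, 20]
merged_means, merged_counts = merge_close_means(means, counts, threshold=4.0)
print(merged_means, merged_counts)
```

The class means that survive the merging would then serve as the initial anchor box sizes.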

Step four: the network structure of YOLOv3 is improved. The improved YOLOv3 network is described with reference to Fig. 1. As shown in Fig. 1, at the third convolutional layer of the network, the feature map produced by the previous convolutional layer is processed with consecutive 1 × 1 convolutional layers and then processed again with one 3 × 3 convolutional layer.

The convolution structure shown in Fig. 2 is used in the 7th, 8th and 9th convolutional layers of the network. The 10th, 11th and 12th convolutional layers then continue to extract shallow image features using convolution and the structure shown in Fig. 2. Finally, the feature map at this scale is sent to the category prediction network of YOLOv3 to predict the category of the target.
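What two consecutive 1 × 1 convolutions compute can be illustrated without a deep learning framework. The sketch below is not the network of Fig. 2: the channel counts and weights are invented, batch normalization is omitted, and the leaky ReLU placed between the two layers is an assumption based on common Darknet-style layers. A 1 × 1 convolution mixes channels at each pixel independently and leaves the spatial size unchanged.

```python
def conv1x1(feature_map, weights):
    """Apply a 1x1 convolution.

    feature_map: nested list indexed [channel][row][col].
    weights: nested list indexed [out_channel][in_channel].
    Each output pixel is a weighted sum of the input channels at that pixel.
    """
    height, width = len(feature_map[0]), len(feature_map[0][0])
    return [
        [
            [
                sum(w * feature_map[c][r][col] for c, w in enumerate(row_w))
                for col in range(width)
            ]
            for r in range(height)
        ]
        for row_w in weights
    ]


def leaky_relu(feature_map, slope=0.1):
    """Element-wise leaky ReLU (assumed activation between the two layers)."""
    return [
        [[v if v > 0 else slope * v for v in row] for row in channel]
        for channel in feature_map
    ]


# Hypothetical 2-channel 2x2 input and weights (illustrative values only).
x = [[[1.0, 2.0], [3.0, 4.0]],
     [[0.5, -1.0], [2.0, 0.0]]]
w1 = [[1.0, -1.0], [0.5, 0.5], [0.0, 2.0]]   # first 1x1 layer: 2 -> 3 channels
w2 = [[1.0, 1.0, 1.0]]                       # second 1x1 layer: 3 -> 1 channel

hidden = leaky_relu(conv1x1(x, w1))
out = conv1x1(hidden, w2)
print(out)
```

Without a nonlinearity between them, two 1 × 1 layers would collapse into a single linear channel mix, so the activation is what lets the stack add representational capacity.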

Step five: the training set obtained in step two is input into the improved YOLOv3 network of step four for training. The number of training epochs is 10000, batch gradient descent is used in each epoch with 256 samples per batch, and during training the learning rate is adjusted dynamically by learning rate decay to accelerate model training.
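One common form of learning rate decay is exponential decay, sketched below. The text does not say which schedule is used, so the schedule itself, the initial rate, the decay factor and the decay interval are all hypothetical. Only the 10000 epochs and the batch size of 256 come from the description.

```python
def decayed_learning_rate(initial_lr, decay_rate, decay_steps, epoch):
    """Exponential decay: the rate is multiplied by `decay_rate` every `decay_steps` epochs."""
    return initial_lr * decay_rate ** (epoch / decay_steps)


# Hypothetical hyperparameters; the source gives only 10000 epochs and batch size 256.
INITIAL_LR = 0.001
DECAY_RATE = 0.5
DECAY_STEPS = 2000
TOTAL_EPOCHS = 10000

for epoch in range(0, TOTAL_EPOCHS + 1, DECAY_STEPS):
    lr = decayed_learning_rate(INITIAL_LR, DECAY_RATE, DECAY_STEPS, epoch)
    print(epoch, lr)
```

A larger rate early on lets the model move quickly toward a good region, and the smaller rate later reduces oscillation around the minimum.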

Step six: the model trained in step five is tested with the validation set from step two to verify the detection and recognition performance of the improved model on traffic signs.

Claims

1. A traffic sign detection method based on deep learning, characterized by comprising the following steps:

A. Obtaining traffic sign images

B. Making a traffic sign data set and dividing it into a training set and a validation set

C. Improving the k-means clustering method used by the YOLOv3 model

D. Improving the network structure of the YOLOv3 target detection network

E. Training the improved YOLOv3 target detection network

F. Testing the improved YOLOv3 model with the validation set.

2. The deep learning-based traffic sign detection method according to claim 1, wherein step A includes acquiring outdoor traffic sign images with a digital camera; traffic signs mainly comprise indication signs, prohibition signs and warning signs, so the acquired images should include all three types; and the acquired traffic sign images are then filtered and denoised.

3. The deep learning-based traffic sign detection method according to claim 1, wherein the traffic sign data set is made in step B and divided into positive samples and negative samples; a positive sample is one whose label is consistent with the true category of the sample, and otherwise the sample is a negative sample; and in addition, the positive samples should be augmented to enhance the classification capability of the model.

4. The deep learning-based traffic sign detection method according to claim 1, wherein step C comprises the following steps:

A. Preprocessing the original data set with a leader clustering algorithm to generate a plurality of sample subsets

B. Sampling the generated sample subsets

C. Clustering the sampled sample subsets using a k-means clustering algorithm

D. Integrating the clustering results to determine the initial positions of the anchor boxes.

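5. The deep learning-based traffic sign detection method according to claim 1, wherein improving the network structure of YOLOv3 in step D includes the following two aspects:

A. In the first few layers of the YOLOv3 network, consecutive 1 × 1 convolutional layers (one 1 × 1 convolutional layer immediately followed by another 1 × 1 convolutional layer) are used to extract the shallow information of the input image.

B. A scale prediction layer is added.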