CN111985476A - Traffic sign target detection method based on SSD algorithm - Google Patents

Traffic sign target detection method based on SSD algorithm

Info

Publication number
CN111985476A
Authority
CN
China
Prior art keywords
feature
feature map
traffic sign
algorithm
ssd
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010877065.XA
Other languages
Chinese (zh)
Inventor
陈炳才
刘顺民
马致明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xinjiang Normal University
Original Assignee
Xinjiang Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xinjiang Normal University
Priority to CN202010877065.XA
Publication of CN111985476A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/464 Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/09 Recognition of logos

Abstract

Existing target detection algorithms recognize traffic signs with low accuracy, generalize poorly, struggle with small targets, and therefore cannot be applied in practice. To address these problems, the invention builds on the SSD algorithm and detects targets with a feature pyramid in place of the multi-scale feature layers, providing an SSD-based traffic sign target detection method. First, the picture data is preprocessed. Second, the processed data is divided into a training set and a test set. Third, the algorithm is improved. Fourth, the data is used to train the model. Finally, the trained model is tested. Compared with the original SSD algorithm, the average precision is improved by 5.4%, which ensures the precision of traffic sign detection and improves the detection performance of the algorithm.

Description

Traffic sign target detection method based on SSD algorithm
Technical Field
The invention belongs to the technical field of target detection and unmanned driving, and particularly relates to a traffic sign target detection method based on the SSD algorithm.
Background Art
With the rapid development of unmanned driving technology, unmanned driving is gradually entering the commercial stage. Traffic signs play an important role in the safe driving of vehicles, so the requirements on traffic sign detection performance keep increasing. Traditional methods mainly use the histogram of oriented gradients (HOG), color histograms and edge features to construct different feature spaces, and extract the color and shape features of the traffic signs in a picture. However, these methods have low detection accuracy, poor robustness and weak generalization capability, have difficulty detecting small targets, and can hardly meet the requirements of practical production and application.
With the construction of large-scale training data sets, the continuous growth of hardware computing power and the continuous development of deep learning, deep network structures have achieved great success in different visual tasks. Detection algorithms have developed rapidly, producing a series of fast, high-precision detectors based on deep convolutional networks, such as R-CNN, Fast R-CNN, Faster R-CNN, the YOLO series and SSD. Unlike traditional algorithms, deep convolutional networks have a certain invariance to geometric changes, deformation and illumination, which overcomes the problems of the traditional methods and gives them a certain generalization capability. These mainstream algorithms fall into two classes: two-stage detection algorithms and single-stage detection algorithms. A two-stage algorithm, such as R-CNN, Fast R-CNN or Faster R-CNN, first generates candidate regions that may contain an object, then further classifies and calibrates the candidate regions to obtain the final detection result. A single-stage algorithm, such as the YOLO series or SSD, omits the candidate region step and directly classifies and calibrates the object to give the final result. Compared with two-stage algorithms, single-stage algorithms are fast, avoid false positives caused by background errors, and can quickly learn the generalized features of objects.
Disclosure of Invention
In view of the problems described in the background art, the invention aims to provide a traffic sign target detection method based on the SSD algorithm, so that traffic signs that are far from the vehicle, and therefore appear small, can still be detected well and the detection capability is improved.
A traffic sign target detection method based on an SSD algorithm comprises the following specific steps.
First, the traffic sign data set is processed before training.
Second, the processed data set is divided into a training set and a test set.
Third, the SSD algorithm is improved: SSD detects on feature maps of multiple scales; on this basis, a feature pyramid is used in place of the multi-scale feature layers for detection, classification and regression are performed to obtain the target position, and the final result is obtained through non-maximum suppression.
Fourth, the model is trained with the improved algorithm.
Fifth, traffic signs are detected with the trained model.
Further, the data labels stored in the txt files of the original data set are converted into the PASCAL VOC format.
Further, 2500 pictures in the data set are divided into a training set and a testing set according to the proportion of 7:3, wherein the training set is 1750 pictures, and the testing set is 750 pictures.
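The 7:3 split described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the file names, the fixed seed and the helper name `split_dataset` are assumptions introduced here, and the text does not say whether the split was random.

```python
import random


def split_dataset(items, train_ratio=0.7, seed=0):
    """Shuffle a copy of `items` and split it into (train, test) lists."""
    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_ratio)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical file names standing in for the 2500 pictures of the data set.
image_names = [f"{i:06d}.jpg" for i in range(2500)]
train_set, test_set = split_dataset(image_names)
print(len(train_set), len(test_set))
```

With 2500 pictures and a ratio of 0.7 this gives the 1750 and 750 pictures stated in the text.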
Further, the 1x1 feature map generated by Conv11_2 in the SSD algorithm is upsampled by bilinear interpolation to generate a 3x3 feature map. This feature map is convolved with a 3x3 kernel, with padding and stride both equal to 1, and the resulting 3x3 feature map is fused with Conv10_2 to serve as the 3x3 feature map in the prediction layer. The 3x3 feature map from the previous step is then upsampled again by bilinear interpolation to generate a 5x5 feature map, convolved with the same parameters, and the 5x5 output is fused with Conv9_2 to obtain the 5x5 feature map in the prediction layer. The 10x10, 19x19 and 38x38 feature maps in the SSD prediction layer are generated in turn by the same upsampling and feature fusion procedure. This yields 6 feature maps, which are used to construct a feature pyramid for target detection.
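A quick check of why the stated convolution settings leave each feature map size unchanged, using the standard output-size relation out = floor((in + 2 * padding - kernel) / stride) + 1. The helper name `conv_output_size` is an assumption made for this sketch.

```python
def conv_output_size(in_size, kernel=3, padding=1, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (in_size + 2 * padding - kernel) // stride + 1


# Feature map sizes named in the text, from the top of the pyramid down.
pyramid_sizes = [1, 3, 5, 10, 19, 38]
after_conv = [conv_output_size(s) for s in pyramid_sizes]
print(after_conv)
```

Because a 3x3 kernel with padding 1 and stride 1 preserves the spatial size, each upsampled map can be fused with the backbone map of the same size.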
Further, during model training the ground-truth values must be matched with the predicted values. A prediction box is set as a positive sample when its intersection over union (IoU) with a ground-truth box is greater than 0.5, which is intended to keep the positive and negative samples from being extremely unbalanced. However, because there are too few ground-truth boxes, this alone cannot solve the problem well. To address it, the SSD algorithm adopts hard example mining and trains with the positive and negative samples kept at a ratio of 3:1. This yields a better trained model.
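The matching rule and hard example mining can be sketched as below. Several things here are assumptions: the box format `(xmin, ymin, xmax, ymax)`, the function names, the per-box `background_losses` used for ranking, and the reading of the 3:1 ratio as at most three negatives per positive, which is how the original SSD formulation applies it.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_and_mine(pred_boxes, gt_boxes, background_losses,
                   iou_threshold=0.5, neg_pos_ratio=3):
    """Return (positive indices, kept hard-negative indices)."""
    # A prediction is positive if it overlaps any ground-truth box enough.
    positives = [i for i, p in enumerate(pred_boxes)
                 if any(iou(p, g) > iou_threshold for g in gt_boxes)]
    positive_set = set(positives)
    negatives = [i for i in range(len(pred_boxes)) if i not in positive_set]
    # Hard negatives are the negatives with the highest loss.
    negatives.sort(key=lambda i: background_losses[i], reverse=True)
    return positives, negatives[:neg_pos_ratio * len(positives)]


gt_boxes = [(10, 10, 50, 50)]
pred_boxes = [(12, 12, 48, 48), (100, 100, 140, 140), (0, 0, 20, 20),
              (200, 200, 220, 220), (60, 60, 90, 90), (300, 300, 320, 320)]
background_losses = [0.1, 2.0, 0.5, 3.0, 0.2, 1.0]
positives, hard_negatives = match_and_mine(pred_boxes, gt_boxes,
                                           background_losses)
print(positives, hard_negatives)
```

Only the three highest-loss negatives are kept for the single positive, so the easy background boxes no longer dominate the loss.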
Furthermore, the information in each layer can be fully utilized. Low-level features are effective for locating targets, while high-level features are effective for classifying them. Fusing the low-level and high-level features combines the localization and classification information of the targets, and predicting on the different feature layers greatly improves recognition accuracy.
The invention has a stronger detection capability for small targets and a better overall detection effect, and is therefore better suited to detecting traffic signs.
Drawings
FIG. 1 is a flow chart of the steps of the present invention.
Fig. 2 is a diagram of the algorithm structure of the present invention.
Fig. 3 is a comparison graph after the traffic sign is detected by the invention and the original algorithm.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
Fig. 1 is a flow chart of the steps of the traffic sign target detection method based on the SSD algorithm, which specifically includes the following steps:
Step 1, processing the traffic sign data set before training;
Step 2, dividing the processed data set into a training set and a test set;
Step 3, improving the SSD algorithm: SSD detects on feature maps of multiple scales; on this basis, a feature pyramid is used in place of the multi-scale feature layers for detection, classification and regression are performed to obtain the target position, and the final result is obtained through non-maximum suppression;
Step 4, training the model with the improved algorithm;
Step 5, detecting traffic signs with the trained model.
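The non-maximum suppression that produces the final result in step 3 can be sketched as a greedy filter. The IoU threshold of 0.45, the box format `(xmin, ymin, xmax, ymax)` and the function names are assumptions for illustration; the text does not give these details.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.45):
    """Keep the highest-scoring boxes, dropping heavily overlapping ones."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # A box survives only if it does not overlap a box already kept.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
print(kept)
```

In this example the second box overlaps the first with an IoU of about 0.82 and is suppressed, while the distant third box is kept.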
The data set in step 1 is the traffic panel data set (TPD), selected from a Chinese traffic sign data set. The label information is stored in txt files, and the data labels need to be converted into the PASCAL VOC format. The data set is divided into a training set and a test set at a ratio of 7:3, with 1750 pictures used as the training set and 750 as the test set.
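A label conversion of this kind might look like the sketch below. The layout of the TPD txt files is not given in the text, so the line format `file_name xmin ymin xmax ymax label`, the image dimensions and the function name `txt_line_to_voc` are all hypothetical; only the PASCAL VOC element names (`annotation`, `size`, `object`, `bndbox`) follow the usual VOC layout.

```python
import xml.etree.ElementTree as ET


def txt_line_to_voc(line, width, height, depth=3):
    """Build a PASCAL VOC style annotation tree from one txt label line.

    Assumed (hypothetical) line format: file_name xmin ymin xmax ymax label
    """
    name, xmin, ymin, xmax, ymax, label = line.split()
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = name
    size = ET.SubElement(root, "size")
    for tag, value in (("width", width), ("height", height), ("depth", depth)):
        ET.SubElement(size, tag).text = str(value)
    obj = ET.SubElement(root, "object")
    ET.SubElement(obj, "name").text = label
    ET.SubElement(obj, "difficult").text = "0"
    bndbox = ET.SubElement(obj, "bndbox")
    for tag, value in (("xmin", xmin), ("ymin", ymin),
                       ("xmax", xmax), ("ymax", ymax)):
        # VOC stores integer pixel coordinates.
        ET.SubElement(bndbox, tag).text = str(int(float(value)))
    return root


annotation = txt_line_to_voc("000001.jpg 120 45 180 105 traffic_sign",
                             width=1280, height=720)
xml_text = ET.tostring(annotation, encoding="unicode")
print(xml_text)
```

An image with several signs would need one `object` element per label line, which this single-line helper does not handle.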
Fig. 2 is a structural diagram of the algorithm of the invention and shows the improvement made to the algorithm in step 3, which specifically comprises the following steps.
Step 3.1, the picture is first input into the network.
Step 3.2, the 1x1 feature map generated by Conv11_2 is used as the 1x1 feature map in the prediction layer.
Step 3.3, the 1x1 feature map generated by Conv11_2 is upsampled by bilinear interpolation to generate a 3x3 feature map, which is convolved with a 3x3 kernel, with padding and stride both equal to 1; the resulting 3x3 feature map is fused with Conv10_2 to obtain the 3x3 feature map in the prediction layer.
Step 3.4, the 3x3 feature map from the previous step is upsampled again by bilinear interpolation to generate a 5x5 feature map, which is convolved with the same parameters as in the previous step; the 5x5 output is fused with Conv9_2 to obtain the 5x5 feature map in the prediction layer.
Step 3.5, the 5x5 feature map from the previous step is upsampled again by bilinear interpolation to generate a 10x10 feature map, which is convolved with the same parameters as in the previous step; the 10x10 output is fused with Conv8_2 to obtain the 10x10 feature map in the prediction layer.
Step 3.6, the 10x10 feature map from the previous step is upsampled again by bilinear interpolation to generate a 19x19 feature map, which is convolved with the same parameters as in the previous step; the 19x19 output is fused with Conv7_2 to obtain the 19x19 feature map in the prediction layer.
Step 3.7, the 19x19 feature map from the previous step is upsampled again by bilinear interpolation to generate a 38x38 feature map, which is convolved with the same parameters as in the previous step; the 38x38 output is fused with Conv7_2 to obtain the 38x38 feature map in the prediction layer. The 10x10, 19x19 and 38x38 feature maps in the SSD prediction layer are thus generated in sequence. This yields 6 feature maps, which are used to construct a feature pyramid for target detection.
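Steps 3.2 to 3.7 can be sketched as a top-down loop over the six map sizes. This is an illustrative pure-Python toy on single-channel maps, not the patented network: the fusion operator is assumed to be element-wise addition (the text only says "feature fusion"), the interpolation uses an align-corners convention, the identity kernel stands in for learned weights, and all function names are hypothetical.

```python
def bilinear_resize(fmap, out_size):
    """Resize a square single-channel map by bilinear interpolation."""
    in_size = len(fmap)
    scale = (in_size - 1) / (out_size - 1) if out_size > 1 else 0.0
    out = []
    for i in range(out_size):
        y = i * scale
        y0 = int(y)
        y1 = min(y0 + 1, in_size - 1)
        wy = y - y0
        row = []
        for j in range(out_size):
            x = j * scale
            x0 = int(x)
            x1 = min(x0 + 1, in_size - 1)
            wx = x - x0
            top = fmap[y0][x0] * (1 - wx) + fmap[y0][x1] * wx
            bottom = fmap[y1][x0] * (1 - wx) + fmap[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def conv3x3_same(fmap, kernel):
    """3x3 convolution with zero padding 1 and stride 1 (size preserved)."""
    n = len(fmap)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < n and 0 <= x < n:
                        acc += kernel[di + 1][dj + 1] * fmap[y][x]
            out[i][j] = acc
    return out


def fuse(a, b):
    """Assumed fusion operator: element-wise addition of equal-size maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def build_pyramid(backbone_maps, kernel):
    """backbone_maps is ordered from the smallest (1x1) to the largest map."""
    pyramid = [backbone_maps[0]]
    for lateral in backbone_maps[1:]:
        upsampled = bilinear_resize(pyramid[-1], len(lateral))
        pyramid.append(fuse(conv3x3_same(upsampled, kernel), lateral))
    return pyramid


sizes = [1, 3, 5, 10, 19, 38]
# Constant maps stand in for the backbone outputs at each scale.
backbone_maps = [[[1.0] * s for _ in range(s)] for s in sizes]
identity_kernel = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
pyramid = build_pyramid(backbone_maps, identity_kernel)
print([len(level) for level in pyramid])
```

Each level receives the upsampled, convolved map from the level above plus the backbone map of its own size, so high-level information propagates down to the 38x38 map used for small targets.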
When the model is trained in step 4, a VGG16 convolutional neural network pre-trained on ImageNet classification is used to initialize the weights of the convolutional layers of the feature extraction network. During model training, the ground-truth values must be matched with the predicted values. A prediction box is set as a positive sample when its intersection over union (IoU) with a ground-truth box is greater than 0.5, which is intended to keep the positive and negative samples from being extremely unbalanced. However, because there are too few ground-truth boxes, this alone cannot solve the problem well. To address it, the SSD algorithm adopts hard example mining and trains with the positive and negative samples kept at a ratio of 3:1. The parameters are set as follows: weight decay (weight_decay) 0.0005, learning rate (learning_rate) 0.0001, momentum 0.9 and batch size (batch_size) 4, with 50,000 training iterations. The experimental platform is a Windows 10 64-bit operating system with the PyTorch framework, on which the SSD algorithm is improved and verified on the TPD data set.
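To show how the listed hyperparameters interact, the sketch below applies one update to a single scalar weight. The text does not name the optimizer; stochastic gradient descent with momentum is assumed here because a momentum value is given, and the update follows the common formulation in which weight decay is added to the gradient. The dictionary keys and the function name are illustrative.

```python
config = {
    "weight_decay": 0.0005,
    "learning_rate": 0.0001,
    "momentum": 0.9,
    "batch_size": 4,
    "iterations": 50000,
}


def sgd_momentum_step(weight, grad, velocity, cfg):
    """One SGD-with-momentum update for a scalar weight (assumed optimizer)."""
    # Weight decay acts as an L2 penalty added to the gradient.
    grad = grad + cfg["weight_decay"] * weight
    velocity = cfg["momentum"] * velocity + grad
    weight = weight - cfg["learning_rate"] * velocity
    return weight, velocity


weight, velocity = sgd_momentum_step(1.0, 0.5, 0.0, config)
images_seen = config["batch_size"] * config["iterations"]
epochs = images_seen / 1750  # passes over the 1750 training pictures
print(weight, velocity, images_seen, round(epochs, 1))
```

With a batch size of 4 and 50,000 iterations the model sees 200,000 training samples, roughly 114 passes over the 1750-picture training set.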
The average precision of the final experimental result of the invention is 5.4% higher than that of the original SSD algorithm. In Fig. 3, a is the result of detecting the picture with the original SSD algorithm and b is the result of detecting it with the invention. The invention has a stronger detection capability and higher detection accuracy for traffic signs that are far from the vehicle and therefore appear small. The original SSD algorithm has difficulty detecting such smaller targets and is less accurate than the invention.

Claims (6)

1. A traffic sign target detection method based on the SSD algorithm, characterized by comprising the following steps:
first, processing a traffic sign data set before training;
second, dividing the processed data set into a training set and a test set;
third, improving the SSD algorithm: SSD detects on feature maps of multiple scales; on this basis, a feature pyramid is used in place of the multi-scale feature layers for detection, classification and regression are performed to obtain the target position, and the final result is obtained through non-maximum suppression;
fourth, training a model with the improved algorithm;
fifth, detecting traffic signs with the trained model.
2. The method of claim 1, wherein the data labels stored in the txt files of the original data set are converted into the PASCAL VOC format.
3. The method for detecting the traffic sign target based on the SSD algorithm is characterized in that 2500 pictures in a data set are divided into a training set and a testing set according to the proportion of 7:3, wherein the training set is 1750 pictures, and the testing set is 750 pictures.
4. The method as claimed in claim 1, wherein the 1x1 feature map generated by Conv11_2 in the SSD algorithm is upsampled by bilinear interpolation to generate a 3x3 feature map, which is convolved with a 3x3 kernel, with padding and stride both equal to 1; the 3x3 feature map generated after convolution is fused with Conv10_2 to obtain the 3x3 feature map in the prediction layer; the 3x3 feature map from the previous step is upsampled again by bilinear interpolation to generate a 5x5 feature map, which is convolved with the same parameters as in the previous step, and the 5x5 output is fused with Conv9_2 to obtain the 5x5 feature map in the prediction layer; the 10x10, 19x19 and 38x38 feature maps in the SSD prediction layer are generated in turn by the same upsampling and feature fusion procedure; 6 feature maps are thus obtained and used to construct a feature pyramid for target detection.
5. The traffic sign target detection method based on the SSD algorithm, characterized in that during model training the ground-truth values are matched with the predicted values; a prediction box is a positive sample when its intersection over union with a ground-truth box is greater than 0.5, which is intended to keep the positive and negative samples from being extremely unbalanced; because there are too few ground-truth boxes, this alone cannot solve the problem well, so the SSD algorithm adopts hard example mining and trains with the positive and negative samples kept at a ratio of 3:1, which yields a better trained model.
6. The traffic sign target detection method based on the SSD algorithm, characterized in that the information in each layer can be fully utilized: low-level features are effective for locating targets and high-level features are effective for classifying them; fusing the low-level and high-level features combines the localization and classification information of the targets, and predicting on the different feature layers greatly improves recognition accuracy.
CN202010877065.XA 2020-08-27 2020-08-27 Traffic sign target detection method based on SSD algorithm Pending CN111985476A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010877065.XA CN111985476A (en) 2020-08-27 2020-08-27 Traffic sign target detection method based on SSD algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010877065.XA CN111985476A (en) 2020-08-27 2020-08-27 Traffic sign target detection method based on SSD algorithm

Publications (1)

Publication Number Publication Date
CN111985476A true CN111985476A (en) 2020-11-24

Family

ID=73441429

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010877065.XA Pending CN111985476A (en) 2020-08-27 2020-08-27 Traffic sign target detection method based on SSD algorithm

Country Status (1)

Country Link
CN (1) CN111985476A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112668644A (en) * 2020-12-28 2021-04-16 燕山大学 Unmanned aerial vehicle aerial photography target detection method based on SSD improved algorithm
CN112668644B (en) * 2020-12-28 2023-03-24 燕山大学 Unmanned aerial vehicle aerial photography target detection method based on SSD improved algorithm

Similar Documents

Publication Publication Date Title
CN108647585B (en) Traffic identifier detection method based on multi-scale circulation attention network
CN109800628B (en) Network structure for enhancing detection performance of SSD small-target pedestrians and detection method
CN110929577A (en) Improved target identification method based on YOLOv3 lightweight framework
CN110782420A (en) Small target feature representation enhancement method based on deep learning
CN114495029B (en) Traffic target detection method and system based on improved YOLOv4
CN113723377B (en) Traffic sign detection method based on LD-SSD network
CN109886147A (en) A kind of more attribute detection methods of vehicle based on the study of single network multiple-task
CN111178451A (en) License plate detection method based on YOLOv3 network
CN112784756B (en) Human body identification tracking method
CN114898327B (en) Vehicle detection method based on lightweight deep learning network
CN112528934A (en) Improved YOLOv3 traffic sign detection method based on multi-scale feature layer
CN111340034A (en) Text detection and identification method and system for natural scene
CN112434618A (en) Video target detection method based on sparse foreground prior, storage medium and equipment
CN113239753A (en) Improved traffic sign detection and identification method based on YOLOv4
CN110852317A (en) Small-scale target detection method based on weak edge
CN112733942A (en) Variable-scale target detection method based on multi-stage feature adaptive fusion
Qin et al. A specially optimized one-stage network for object detection in remote sensing images
CN113963333B (en) Traffic sign board detection method based on improved YOLOF model
CN110659601A (en) Depth full convolution network remote sensing image dense vehicle detection method based on central point
CN113298817A (en) High-accuracy semantic segmentation method for remote sensing image
CN115861756A (en) Earth background small target identification method based on cascade combination network
CN116597326A (en) Unmanned aerial vehicle aerial photography small target detection method based on improved YOLOv7 algorithm
CN111985476A (en) Traffic sign target detection method based on SSD algorithm
CN112597996A (en) Task-driven natural scene-based traffic sign significance detection method
CN111832463A (en) Deep learning-based traffic sign detection method

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20201124