CN110163187A - Remote road traffic sign detection recognition methods based on F-RCNN - Google Patents


Info

Publication number
CN110163187A
CN110163187A (application CN201910474058.2A)
Authority
CN
China
Prior art keywords
traffic sign
training
rcnn
output
elm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910474058.2A
Other languages
Chinese (zh)
Other versions
CN110163187B (en)
Inventor
杜娟
刘志刚
刘贤梅
王辉
刘苗苗
王梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Jiuzhou Longteng Scientific And Technological Achievement Transformation Co ltd
Original Assignee
Northeast Petroleum University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeast Petroleum University
Priority to CN201910474058.2A
Publication of CN110163187A
Application granted
Publication of CN110163187B
Active legal status
Anticipated expiration legal status

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/582Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of traffic signs

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a long-distance road traffic sign detection and recognition method based on F-RCNN, comprising: 1. preprocessing a traffic sign image sample set; 2. pre-training the VGG-16 in F-RCNN; 3. inputting the traffic sign training dataset into VGG-16 to complete feature extraction; 4. constructing a fusion feature map; 5. the region proposal network RPN in F-RCNN performing region generation from the fusion feature map to obtain traffic sign candidate regions; 6. inputting all candidate regions into the RoI-Pooling layer in F-RCNN to generate fixed-size feature vectors; 7. feeding the feature vectors into an extreme learning machine network, which outputs the class and position of each traffic sign; 8. training the F-RCNN model with a contribution-adaptive loss function; 9. completing traffic sign detection and recognition in actual scenes. The present invention achieves long-distance traffic sign detection and recognition with high recognition accuracy.

Description

Remote road traffic sign detection recognition methods based on F-RCNN
One, technical field:
The present invention relates to the field of intelligent transportation for driverless and assisted driving, and to solving the problem of long-distance detection and recognition of road traffic signs; in particular, it relates to a long-distance road traffic sign detection and recognition method based on F-RCNN.
Two, background technique:
In the field of intelligent transportation, traffic sign detection and recognition is an important research problem for systems such as driverless and assisted driving. Much research has been carried out at home and abroad, but large deficiencies remain and the methods cannot yet be applied in practice, for the following reasons: (1) some detection and recognition methods use the public German traffic sign datasets GTSRB and GTSDB, in which the traffic sign occupies a very large proportion of the image; such detection works only at short range and is difficult to adapt to high-speed driving, and GTSDB covers few detectable sign types, so it cannot satisfy actual needs; (2) some detection methods train their models on self-collected datasets whose sign variation is severely insufficient and whose quantity is too small; compared with models trained on GTSRB and GTSDB, they adapt even worse to complex traffic conditions and changing environments; (3) some detection methods are based on simple features such as color and shape; under sign deformation, motion blur, and similar conditions in actual traffic scenes, these methods are not robust and are difficult to apply in practice; (4) in addition, almost all existing inventions perform short-range detection and recognition, and their accuracy in long-distance traffic sign detection and recognition is too low for practical application.
Long-distance traffic sign detection and recognition is of great significance for the timely response of intelligent transportation systems and for driving safety. Since the distance between the sign and the camera of the intelligent system determines the proportion the traffic sign occupies in the actual scene, the farther the detection distance, the smaller the sign appears in the scene. In computer vision, long-distance traffic sign detection and recognition therefore belongs to the problem of small object detection and recognition, which is a recognized difficulty of the field; existing methods struggle to achieve high detection and recognition accuracy.
Three, summary of the invention:
The object of the present invention is to provide a long-distance road traffic sign detection and recognition method based on F-RCNN, to solve the low accuracy of existing short-range detection and recognition methods when applied to long-distance traffic sign detection and recognition.
The technical solution adopted by the present invention to solve the technical problem is as follows. This F-RCNN-based long-distance traffic sign detection and recognition method comprises:
Step 1. Preprocess the traffic sign image sample set;
Step 2. Pre-train the feature extraction network VGG-16 in F-RCNN using an image classification benchmark dataset;
The feature extraction network VGG-16 is pre-trained on the public ImageNet image dataset, and the trained parameters serve as the initial state of the network. The VGG-16 network contains 5 convolutional stages and 5 pooling layers, and uses ReLU as the activation function;
Step 3. Input the traffic sign training dataset into the pre-trained feature extraction network VGG-16, and perform convolution and pooling operations on the images to complete feature extraction;
The traffic sign training dataset is split into several batches in mini-batch fashion;
Step 4. Construct the fusion feature map using max pooling, dilated convolution, regularization, and aggregation operations;
(1) The outputs of the convolutional stages of the feature extraction network VGG-16 are denoted, from front to back, conv1, conv2, conv3, conv4, and conv5. Each traffic scene with 2048 × 2048 resolution is first converted into a 2048 × 2048 × 3 numerical matrix and then input into VGG-16; the image size shrinks stage by stage while the number of channels grows. The output dimensions of the stages are, in order: conv1: 2048 × 2048 × 64; conv2: 1024 × 1024 × 128; conv3: 512 × 512 × 256; conv4: 256 × 256 × 512; conv5: 128 × 128 × 512;
(2) conv5 is expanded using dilated convolution with dilation rate 3, giving a size of 512 × 512 × 512 after expansion; convolution is then applied with kernel parameters: size 3 × 3, stride 1, padding 1, and 256 kernels. The output is denoted dilated-conv5, with matrix dimension 512 × 512 × 256;
(3) conv1 is downsampled using max pooling and its channels are replicated 4 times; the downsampled output is denoted pooling-conv1, with matrix dimension 512 × 512 × 256;
(4) L2 regularization is applied to pooling-conv1, conv3, and dilated-conv5 respectively to eliminate scale differences between the layers, where the L2 regularization of a pixel x = (x1, x2, ..., xd) across its channels is computed as:

x̂ = x / ‖x‖2, with ‖x‖2 = (x1² + x2² + ... + xd²)^(1/2)

where x̂ denotes the pixel after regularization and d denotes the number of channels of the pixel;
(5) The three feature matrices pooling-conv1, conv3, and dilated-conv5 are aggregated element-wise in the spatial domain, producing a fusion feature map that contains both high-resolution and abstract semantic information, with matrix dimension 512 × 512 × 256;
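The construction of the fusion feature map in steps (1)-(5) can be sketched in NumPy at toy scale. All shapes below are scaled down 64× from the patent's 2048 × 2048 input, and the dilated convolution on conv5 is replaced by a random stand-in tensor of the correct shape (an assumption made only to keep the sketch self-contained):

```python
import numpy as np

def max_pool(x, k):
    """Non-overlapping k x k max pooling over an (H, W, C) feature map."""
    H, W, C = x.shape
    return x.reshape(H // k, k, W // k, k, C).max(axis=(1, 3))

def l2_normalize(x, eps=1e-12):
    """Per-pixel L2 normalization across the channel axis (step (4))."""
    norm = np.sqrt((x ** 2).sum(axis=-1, keepdims=True)) + eps
    return x / norm

rng = np.random.default_rng(0)
conv1 = rng.random((32, 32, 4))         # shallow stage: high resolution
conv3 = rng.random((8, 8, 16))          # middle stage: the target shape
dilated_conv5 = rng.random((8, 8, 16))  # stand-in for the dilated-conv5 output

# step (3): downsample conv1 4x spatially and replicate its channels 4x
pooling_conv1 = np.tile(max_pool(conv1, 4), (1, 1, 4))
assert pooling_conv1.shape == conv3.shape == dilated_conv5.shape

# steps (4)-(5): remove scale differences, then aggregate element-wise
fused = sum(l2_normalize(f) for f in (pooling_conv1, conv3, dilated_conv5))
print(fused.shape)  # (8, 8, 16)
```

At full scale the same operations map conv1 (2048 × 2048 × 64) and dilated-conv5 (512 × 512 × 256) onto conv3's 512 × 512 × 256 grid before the element-wise aggregation.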
Step 5. The region proposal network RPN in F-RCNN performs region generation from the fusion feature map to obtain traffic sign candidate regions;
(1) A 3 × 3 convolution kernel slides over the fusion feature map; for each pixel of the feature map, taking that point as center and using the aspect ratios 1:1, 1:2, and 2:1 together with the 4 areas 16, 32, 64, and 128, 12 anchor boxes are generated on the original input image;
(2) After sliding, the number of generated anchor boxes is 512 × 512 × 12;
(3) Anchor boxes extending beyond the boundary of the original input image are removed;
(4) Non-maximum suppression with a threshold of 0.7 is used to remove heavily overlapping duplicate anchor boxes;
(5) Positive and negative samples are determined according to the intersection over union (IoU) between each anchor box and the ground-truth targets in the sample: anchors with IoU > 0.7 are positive samples, anchors with IoU < 0.3 are negative samples, and anchors with IoU between 0.3 and 0.7 are removed, where the IoU of an anchor box A and a ground-truth box B is computed as:

IoU = area(A ∩ B) / area(A ∪ B)
(6) By translation invariance, each anchor box corresponds to one region proposal box on the fusion feature map;
(7) All region proposal boxes pass through the fully connected layer of the region proposal network RPN to yield the object candidate regions;
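The anchor generation and IoU labeling rules of step 5 can be sketched as follows. Treating the patent's four "areas" 16/32/64/128 as the side length of the square (1:1) anchor is an assumption made for illustration:

```python
import itertools
import math

def make_anchors(cx, cy, scales=(16, 32, 64, 128), ratios=(1.0, 0.5, 2.0)):
    """12 anchor boxes (x1, y1, x2, y2) centered at (cx, cy):
    4 scales x 3 aspect ratios, each anchor keeping area scale**2."""
    anchors = []
    for s, r in itertools.product(scales, ratios):
        w, h = s * math.sqrt(r), s / math.sqrt(r)
        anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    inter = max(0.0, iw) * max(0.0, ih)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

def label_anchor(v):
    """Patent rule: IoU > 0.7 positive, IoU < 0.3 negative, else removed."""
    return "positive" if v > 0.7 else "negative" if v < 0.3 else "removed"

print(len(make_anchors(1024, 1024)))    # 12 anchors per feature-map pixel
print(512 * 512 * 12)                   # 3145728 anchors before filtering
print(label_anchor(iou((0, 0, 10, 10), (0, 0, 10, 9))))  # positive
```

The count 512 × 512 × 12 matches sub-step (2): one set of 12 anchors for every pixel of the 512 × 512 fusion feature map.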
Step 6. All candidate regions are input into the RoI-Pooling layer in F-RCNN to generate fixed-size feature vectors;
(1) The RoI-Pooling layer divides each object candidate region into 8 parts along the horizontal and vertical directions respectively, and applies max-pooling downsampling to each part;
(2) In this way, even though the candidate regions differ in size, the sampled results are identical, generating feature vectors of fixed dimension 8 × 8 × 256;
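A minimal sketch of the RoI max pooling of step 6: an (H, W, C) region of any size is divided into an 8 × 8 grid of cells and the maximum of each cell is kept, so every candidate region yields the same fixed output shape:

```python
import numpy as np

def roi_max_pool(feature, out_size=8):
    """Divide an (H, W, C) candidate-region feature into out_size x out_size
    cells and take the channel-wise max of each cell."""
    H, W, C = feature.shape
    out = np.empty((out_size, out_size, C))
    ys = np.linspace(0, H, out_size + 1).astype(int)
    xs = np.linspace(0, W, out_size + 1).astype(int)
    for i in range(out_size):
        for j in range(out_size):
            cell = feature[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            out[i, j] = cell.max(axis=(0, 1))
    return out

# candidate regions of different sizes all map to the same 8 x 8 x C shape
for h, w in [(23, 41), (64, 64), (9, 100)]:
    print(roi_max_pool(np.random.rand(h, w, 256)).shape)  # (8, 8, 256)
```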
Step 7. The feature vectors are fed into the extreme learning machine networks used for classification and regression, which output the class and position of each traffic sign;
The extreme learning machine (ELM) structures used are: (1) the ELM for traffic sign classification: 4096 hidden nodes and 44 output nodes, each output node representing one traffic sign class with values in (0, 1); during classification the class of the maximum output node is taken as the traffic sign class; (2) the ELM for traffic sign position regression: 4096 hidden nodes and 4 output nodes, representing the center coordinates, width, and height of the traffic sign;
The learning algorithm of the extreme learning machine is as follows:
(1) The input/output relation of the extreme learning machine ELM can be expressed as:

Oj = Σ(i=1..4096) βi g(wi · xj + θi), j = 1, 2, ..., N

where X = (x1, x2, ..., xN) are the feature vectors output by the RoI-Pooling layer; for the j-th feature vector the desired output is Tj = (tj1, tj2, ..., tjk)ᵀ and the actual ELM output is Oj = [oj1, oj2, ..., ojk]ᵀ. For the classification ELM, k = 44; for the regression ELM, k = 4. wi = [wi1, wi2, ..., win]ᵀ is the weight vector between the i-th hidden neuron and the input neurons, βi = [βi1, βi2, ..., βik]ᵀ is the weight vector between the i-th hidden neuron and the k output neurons, θi is the threshold of the i-th hidden node, i = 1, 2, ..., 4096, and g(·) is the activation function;
(2) The learning objective of the ELM is to minimize the error function E, the squared error between the actual and the desired outputs:

E = Σ(j=1..N) ‖Oj − Tj‖²

that is, there exist βi, wi, and θi such that:

Σ(i=1..4096) βi g(wi · xj + θi) = Tj, j = 1, 2, ..., N
The hidden layer output matrix H of the ELM is the N × 4096 matrix with entries H(j, i) = g(wi · xj + θi), so the ELM output can be written as:

Hβ = T
(3) By the least squares principle, the hidden layer output weights β are computed as:

β = Hᵀ (I/C + H Hᵀ)⁻¹ T

where H is the hidden layer output matrix of the ELM, Hᵀ is its transpose, I is the identity matrix, C is a constant, and T is the desired output matrix of the ELM.
The computed output weights are substituted into Hβ to compute the value of each ELM output node. For the classification ELM, the node number of the maximum value among the 44 output nodes gives the traffic sign class; for the position-regression ELM, the 4 outputs give the 4 positional parameters of the sign, namely its center coordinates, width, and height;
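The closed-form ELM training of step 7 can be sketched at toy scale (64 hidden nodes instead of 4096; the sigmoid is used as the activation g, which is an assumption, since the patent leaves g unspecified):

```python
import numpy as np

rng = np.random.default_rng(0)
n_hidden, n_in, n_out, C = 64, 16, 4, 100.0  # scaled down from 4096 hidden nodes

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_fit(X, T):
    """Random hidden layer, then the regularized least-squares output weights
    beta = H^T (I/C + H H^T)^(-1) T from the patent's formula."""
    W = rng.standard_normal((n_in, n_hidden))      # random input weights w_i
    theta = rng.standard_normal(n_hidden)          # random thresholds theta_i
    H = sigmoid(X @ W + theta)                     # hidden layer output matrix
    beta = H.T @ np.linalg.solve(np.eye(len(H)) / C + H @ H.T, T)
    return W, theta, beta

def elm_predict(X, W, theta, beta):
    return sigmoid(X @ W + theta) @ beta           # O = H beta

X = rng.standard_normal((200, n_in))               # RoI feature vectors
T = rng.standard_normal((200, n_out))              # regression targets (k = 4)
W, theta, beta = elm_fit(X, T)
print(elm_predict(X, W, theta, beta).shape)  # (200, 4)
```

Only β is learned; W and θ stay at their random values, which is what makes ELM training a single linear solve rather than iterative gradient descent.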
Step 8. Train the F-RCNN model using the contribution-adaptive loss function (Contribution Adaptive Loss Function, CA);
(1) The training objective of the region proposal network RPN in the F-RCNN model is to minimize the classification and localization loss; its loss function can be formalized as:

L({pi}, {ti}) = (1/Ncls) Σi L_RPN-CA(pi, pi*) + λ (1/Nreg) Σi pi* Lreg(ti, ti*)

where pi is the predicted probability that the i-th anchor box is a target object and pi* is the ground-truth label of the target object; ti is the coordinate information of the predicted box, comprising the center coordinates (xi, yi), width wi, and height hi, and ti* is the coordinate information of the ground-truth box, likewise comprising center coordinates (xi*, yi*), width wi*, and height hi*. L_RPN-CA denotes the contribution-adaptive classification loss of the RPN network, Ncls denotes the total number of anchor boxes, Nreg denotes the size of the feature map, and λ is a balancing coefficient. Lreg denotes the regression loss over all bounding boxes, using the smooth L1 loss:

Lreg(ti, ti*) = smoothL1(ti − ti*)

where

smoothL1(x) = 0.5x², if |x| < 1; |x| − 0.5, otherwise
Therefore the contribution-adaptive loss function of the RPN may be defined as:

L_RPN-CA = −(1 − pt)³ log(pt)

where (1 − pt)³ is the contribution-adaptive adjustment factor and pt is the predicted probability of the true class. For hard, easily misclassified negative samples the class probability pt → 0, so the adjustment factor tends to 1 and the contribution of such samples to the total loss is unaffected; for easy positive samples pt → 1, so the adjustment factor tends to 0 and the contribution of easy positive samples to the total loss drops to 0. By adaptively and dynamically regulating the contribution of hard and easy samples to the total loss, F-RCNN training is made to focus on hard negative samples, effectively improving training efficiency;
(2) The contribution-adaptive loss function of the fully connected layer network is defined as:

L_FC-CA = −Σk tk (1 − qk)³ log(qk)

where L_FC-CA denotes the classification loss, k indexes the k-th target class, qk denotes the predicted probability that the sample belongs to class k, and tk is 1 for the true class and 0 otherwise;
Step 9. Start the color camera and photograph the actual traffic scene. Before the scene is input into the model it is preprocessed and its resolution set to 2048 × 2048; it is then input into F-RCNN and steps 3 to 7 are repeated, completing traffic sign detection and recognition for the actual scene.
Step 1 in the above scheme is specifically:
(1) Using the Tsinghua-Tencent 100K dataset jointly released by Tsinghua University and Tencent, 44 classes of common traffic signs are selected as the long-distance detection and recognition objects;
(2) The Tsinghua-Tencent 100K dataset is divided into a training set and a test set at a ratio of 1:2;
(3) To guarantee sample balance during model training, every traffic sign class has at least 100 scene instances in the training set; if a class has fewer than 100 scene instances, it is padded using repeated sampling.
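The padding rule of sub-step (3) amounts to oversampling rare classes up to the 100-instance floor; a stdlib sketch (class names and counts are illustrative only):

```python
import random
from collections import Counter

def balance(samples, labels, min_count=100, seed=0):
    """Pad any class with fewer than min_count instances by repeatedly
    sampling its existing instances; classes at or above the floor are
    left unchanged. Returns (sample, label) pairs."""
    rng = random.Random(seed)
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    out = []
    for y, items in by_class.items():
        padded = list(items)
        while len(padded) < min_count:
            padded.append(rng.choice(items))
        out.extend((s, y) for s in padded)
    return out

data = balance([f"img{i}" for i in range(130)],
               ["stop"] * 120 + ["yield"] * 10)
print(Counter(y for _, y in data))  # stop stays at 120, yield padded to 100
```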
The beneficial effects are:
1. The present invention requires no hand-designed features, achieves long-distance detection of 44 kinds of common traffic signs, and has high recognition accuracy, helping intelligent systems respond in time, improving driving safety, and preventing traffic accidents.
2. The training sample set used by the present invention, and its differences from and advantages over previous inventions.
Cause analysis: when training models, existing inventions mainly use two kinds of sample sets: the first is a sample set collected by hand-held photography, and the second is the German traffic sign datasets GTSRB and GTSDB. Because traffic signs are of many kinds and influenced by diverse factors, a hand-collected sample set can hardly cover all their variations, such as illumination, motion blur, weather, and viewing angle. On the other hand, GTSDB and GTSRB split the detection and recognition tasks into two independent datasets, so the two tasks connect poorly (a detection model must be trained separately on GTSDB and a recognition model on GTSRB, and the two models must then be merged by some mechanism before traffic sign detection and recognition can be completed). Moreover, GTSDB provides few detectable traffic sign types, which cannot meet actual requirements; its traffic scene resolution is only 1360 × 800 and its traffic signs are relatively large, so it belongs to short-range detection. Meanwhile, in GTSRB, the traffic sign occupies 90% of the scene. Models trained on GTSDB and GTSRB therefore cannot adapt to long-distance traffic sign detection and recognition in practice.
Advantage analysis: based on the above analysis, the present invention uses the Tsinghua-Tencent 100K large-scale traffic sign dataset jointly released by Tsinghua University and Tencent. Its advantages are as follows:
(1) It is cropped from real Tencent street-view images and contains 100,000 scene pictures and 30,000 traffic signs in total, covering the three major categories of prohibition, warning, and indication signs, most illumination and weather variations, and many traffic sign types. Every traffic scene photo has a resolution of 2048 × 2048, and traffic signs of (0, 32] pixels and (32, 96] pixels occupy 41.6% and 49.1% of the dataset respectively, so it is very suitable for training a long-distance traffic sign detection and recognition model; (2) a target detection model trained on this dataset adapts far better to complex and changeable long-distance traffic sign detection and recognition, providing greater safety and a shorter response time for intelligent navigation equipment in driverless and assisted driving.
3. The invented fusion feature map (Fusion Feature Map): cause analysis and advantages of the invention.
Cause analysis: existing target detection models detect and recognize large targets in a picture well, but perform poorly on targets that occupy a very small proportion of the image. In the long-distance traffic sign detection and recognition problem, the traffic scene is large while the signs to be recognized are very small; existing target detection models have very low accuracy on this problem, and small object detection and recognition is currently a research difficulty of computer vision. The essential reason is that after a traffic sign passes forward through the convolutional neural network VGG-16 responsible for feature extraction in the target detection model, repeated convolution and pooling operations sharply lower the resolution of the feature map (the output of the last convolutional layer), whose size is only 1/16 of the original, while it also contains much background noise unrelated to the sign.
Advantage analysis: to address this deficiency, the present invention exploits the characteristics of different convolutional layers (shallow layers have high resolution but low semantic information, while deep layers have low resolution but high semantic information) and invents a fusion feature map technique that integrates the features of different convolutional layers into a fusion feature map through dilated convolution, pooling, regularization, and aggregation. The advantages of this fusion feature map are as follows:
(1) The technique of constructing the fusion feature map with dilated convolution and max pooling is first proposed by the present invention. Dilated convolution expands the deep convolutional layer without losing any high-level abstract semantic information; after expansion according to the dilation rate, the deep layer has the same dimensions as the shallower layers, so aggregation can be completed;
(2) The feature map of plain VGG-16 is 1/16 the size of the input image, whereas the fusion feature map is enlarged to 1/4 of the input image; at the same time, each fused pixel carries both high-level abstract semantic information and higher-resolution detail, which provides a great advantage for detecting small traffic signs.
4. Building the fully connected layer of F-RCNN with an extreme learning machine model: cause analysis and advantages of the invention.
Cause analysis: the present invention uses the extreme learning machine (Extreme Learning Machine, ELM) as the fully connected layer network model because the number of parameters of the F-RCNN fully connected layer is huge, reaching 4096 × 4096. With an ordinary neural network model such as a BP neural network, training is not only extremely slow and unstable but also prone to overfitting, which seriously affects model performance.
Advantage analysis: the extreme learning machine is a neural network model with fast learning ability. During training, after the input samples are mapped into the random-weight space of the hidden layer, the hidden layer output weights are quickly computed with the Moore-Penrose generalized inverse. Learning is extremely fast, which simplifies the training of the fully connected layer, effectively reduces the training time of the model, and improves its training speed.
5. The invented contribution-adaptive loss function (Contribution Adaptive Loss Function, CA): cause analysis and advantages of the invention.
Cause analysis: during target detection model training, most samples are easy positive samples that readily achieve high accuracy, while a few are hard negative samples that easily cause detection and recognition errors. This resembles a student's learning process: mastering one wrongly answered problem improves one's level far more than repeatedly practicing dozens of easy problems. A target detection model is a kind of artificial intelligence model whose training process is built by imitating human learning. However, the learning algorithms of existing target detection models do not distinguish hard negative samples from easy positive samples, and since easy positive samples greatly outnumber hard negative ones, these models, after many training iterations, achieve high training accuracy but still low recognition accuracy in practical application.
Advantage analysis: for the above reasons, the present invention invents a contribution-adaptive loss function (Contribution Adaptive Loss Function, CA) in the training algorithm of F-RCNN. The advantages of the algorithm are as follows:
(1) According to the detection and recognition outcome of each sample, it automatically adjusts that sample's contribution to the loss function, increasing the contribution of hard negative samples to the F-RCNN training loss and correspondingly reducing the influence of easy positive samples on model training.
(2) The CA loss function effectively distinguishes the two kinds of samples, so that the training of the target detection model F-RCNN focuses more on hard negative samples and continually strengthens the model on such samples until they are detected and recognized correctly, thereby continually and effectively improving the learning ability and recognition accuracy of the model.
(3) In addition, this adjustment is dynamic: once a hard negative sample is classified correctly, the CA loss function reclassifies it, through the contribution-adaptive coefficient, as an easy positive sample. Conversely, if training oscillation causes an easy positive sample to be misclassified, the CA loss function re-adjusts it back to a hard negative sample. This dynamic regulation effectively guarantees the convergence of the model training process.
Four, description of the drawings:
Fig. 1 is the internal structure diagram of the F-RCNN target detection model of the invention.
Fig. 2 is the flow chart of the long-distance traffic sign detection and recognition method of the invention.
Fig. 3 shows the 44 kinds of common traffic signs for long-distance detection and recognition of the invention, divided into three categories: indication, warning, and prohibition; * denotes a family of signs, where il*: il100, il60, il80; ph*: ph4, ph4.5, ph5; pm*: pm20, pm30, pm55; pl*: pl5, pl20, pl30, pl40, pl50, pl60, pl70, pl80, pl100, pl120.
Fig. 4 is an Accuracy-Recall comparison of the detection and recognition accuracy of the F-RCNN of the present invention against the common target detection model Faster R-CNN on small traffic signs of (0, 32] pixels.
Fig. 5 is the same Accuracy-Recall comparison on traffic signs of (32, 96] pixels.
Fig. 6 is the same Accuracy-Recall comparison on traffic signs of (96, 200] pixels.
Fig. 7 shows traffic sign detection and recognition results of the method of the invention in actual traffic scenes.
Five, specific embodiments:
The present invention is further described below with reference to the drawings:
The present invention proposes a novel target detection model, named the fusion region convolutional neural network (Fusion Region Convolutional Neural Networks, F-RCNN), and on the learning algorithm side proposes the contribution-adaptive loss function (Contribution Adaptive Loss Function, CA). The model structure mainly comprises 5 components, briefly described as follows (of which (2) and (5) differ significantly from other target detection models and are invented specifically for the long-distance traffic sign detection and recognition problem):
(1) Convolutional neural network VGG-16: mainly responsible for computing, layer by layer through convolution and pooling, the image features of the traffic scene input to the model;
(2) Fusion feature map (Fusion Feature Map): a technique proposed by the present invention specifically for long-distance traffic sign detection. After the traffic scene has passed through VGG-16, the fusion feature map generated by max pooling, dilated convolution, regularization, and aggregation possesses both high resolution and rich high-level semantic information, and provides the feature extraction for traffic signs in the target detection model. It is the most important component of the F-RCNN network and greatly influences the detection accuracy of the model;
(3) Region proposal network (Region Proposal Network, RPN): generates a certain number of target proposal regions from the fusion feature map;
(4) Region-of-interest pooling layer (Region of Interest Pooling Layer, RoI-Pooling): further extracts the features of traffic signs, pooling the target proposal regions computed by the RPN into fixed-length feature vectors;
(5) Fully connected network (Fully Connected Network, FC): responsible for computing the specific class and position of each traffic sign. To improve training efficiency, the present invention uses the extreme learning machine model as the FC, which saves computing resources and effectively shortens the training time of the model.
As shown in Fig. 1 and Fig. 2, this F-RCNN-based long-distance traffic sign detection and recognition method is as follows:
Step 1. Preprocess the traffic sign image sample set;
(1) Using the Tsinghua-Tencent 100K dataset jointly released by Tsinghua University and Tencent, 44 classes of common traffic signs are selected as the long-distance detection and recognition objects;
(2) The Tsinghua-Tencent 100K dataset is divided into a training set and a test set at a ratio of 1:2;
(3) To guarantee sample balance during model training, every traffic sign class has at least 100 scene instances in the training set; if a class has fewer than 100 scene instances, it is padded using repeated sampling;
Step 2. Pre-train the feature extraction network VGG-16 in F-RCNN using an image classification benchmark dataset;
This step plays a significant role in the training of the F-RCNN model. If VGG-16 is not pre-trained but is instead trained directly on the traffic sign data set, it is difficult for F-RCNN to reach high detection and recognition precision. Therefore the present invention pre-trains the VGG-16 network on the public ImageNet image data set, and the parameters after training serve as the initial state of the network. The network comprises 5 convolutional layers and 5 pooling layers, and uses ReLU as the activation function;
Step 3. The traffic sign training data set is input to VGG-16, and convolution and pooling operations are performed on the images to complete feature extraction;
The detailed process is as follows: using the mini-batch strategy, the training traffic sign image data set is split into several batches, each batch containing only a few pictures. This not only greatly reduces the computation of F-RCNN, but also lets the model parameters be updated by gradients computed from the error of each batch of data, which reduces randomness and accelerates training convergence;
Step 4. A fusion feature map is constructed using max pooling, dilated convolution, regularization and aggregation operations;
The fusion feature map proposed by the present invention is calculated in the following specific steps:
(1) The outputs of the convolutional stages of the VGG-16 model are denoted, from front to back, conv1, conv2, conv3, conv4 and conv5. Each traffic scene with 2048 × 2048 resolution is first converted into a 2048 × 2048 × 3 numerical matrix and input to VGG-16, in which the image size constantly shrinks while the channel number constantly grows. The output matrix dimensions of the stages are, in order, conv1: 2048 × 2048 × 64; conv2: 1024 × 1024 × 128; conv3: 512 × 512 × 256; conv4: 256 × 256 × 512; conv5: 128 × 128 × 512;
(2) conv5 is expanded using dilated convolution with dilation rate 3; the size after expansion is 512 × 512 × 512. Convolution is then carried out with kernel parameters: size 3 × 3, stride 1, padding 1, 256 kernels. The output after calculation is denoted dilated-conv5, with matrix dimension 512 × 512 × 256;
(3) conv1 is down-sampled using max pooling and its channels are replicated to expand 4-fold; the output after down-sampling is denoted pooling-conv1, with matrix dimension 512 × 512 × 256;
(4) L2 regularization is computed separately for pooling-conv1, conv3 and dilated-conv5 to eliminate scale effects, where the L2 regularization is computed as
$\hat{x} = x / \|x\|_2$, with $\|x\|_2 = \left(\sum_{i=1}^{d} x_i^2\right)^{1/2}$,
where $\hat{x}$ denotes the pixel after regularization and d denotes the number of channels of the pixel.
(5) The three feature matrices pooling-conv1, conv3 and dilated-conv5 are directly aggregated in space, generating a fusion feature map that simultaneously contains resolution and abstract semantic information, with matrix dimension 512 × 512 × 256.
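Steps (3)-(5) above can be sketched in plain NumPy. The dilated-convolution branch of step (2) is taken here as a precomputed input, and treating the spatial aggregation of step (5) as an element-wise sum is an assumption inferred from the equal input and output dimensions (concatenation would give 768 channels, not 256):

```python
import numpy as np

def l2_normalize(fmap, eps=1e-12):
    # Step (4): L2-normalize each pixel vector across the channel axis.
    norm = np.sqrt((fmap ** 2).sum(axis=-1, keepdims=True))
    return fmap / (norm + eps)

def max_pool(fmap, k):
    # Step (3): k x k max pooling with stride k over an (H, W, C) map.
    h, w, c = fmap.shape
    return fmap.reshape(h // k, k, w // k, k, c).max(axis=(1, 3))

def fuse(conv1, conv3, dilated_conv5):
    # conv1 (2048x2048x64) -> pool x4 -> 512x512x64 -> tile channels x4 -> 512x512x256,
    # then normalize all three branches and aggregate element-wise.
    p1 = np.tile(max_pool(conv1, 4), (1, 1, 4))
    maps = [l2_normalize(m) for m in (p1, conv3, dilated_conv5)]
    return sum(maps)
```

With the full-size inputs of the patent, `fuse` returns the 512 × 512 × 256 fusion feature map; the sketch works for any shapes with the same 4:1 spatial ratio.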
Step 5. The region proposal network RPN in F-RCNN performs region generation according to the fusion feature map to obtain the candidate regions of traffic signs;
(1) A 3 × 3 convolution kernel slides over the fusion feature map; for each pixel on the feature map, taking that point as the center, 12 anchor boxes are generated on the original input image using 3 aspect ratios (1:1, 1:2, 2:1) and 4 areas (16, 32, 64, 128);
(2) After sliding, the number of generated anchor boxes is 512 × 512 × 12;
(3) Anchor boxes exceeding the boundary of the original input image are removed;
(4) Anchor boxes with excessive overlap are removed using non-maximum suppression with threshold 0.7;
(5) Positive and negative samples are determined according to the intersection over union (IoU) of each anchor box with the real targets in the sample: anchor boxes with IoU > 0.7 are positive samples, anchor boxes with IoU < 0.3 are negative samples, and anchor boxes with IoU between 0.3 and 0.7 are removed. IoU is calculated as
$\mathrm{IoU} = \mathrm{area}(A \cap B) \,/\, \mathrm{area}(A \cup B)$,
where A is the anchor box and B is the ground-truth box;
(6) By translation invariance, each anchor box corresponds to one region proposal box on the fusion feature map;
(7) Finally, all region proposal boxes pass through the fully connected layer of the region proposal network RPN to obtain the target candidate regions.
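Steps (1) and (5) above can be sketched as follows. Interpreting the four scale values 16, 32, 64, 128 as box areas in pixels is an assumption (they could equally denote side lengths); the IoU formula is the standard one stated in step (5):

```python
def iou(box_a, box_b):
    # Boxes as (x1, y1, x2, y2); IoU = area(A ∩ B) / area(A ∪ B).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(box_a) + area(box_b) - inter
    return inter / union if union > 0 else 0.0

def anchors_at(cx, cy, ratios=((1, 1), (1, 2), (2, 1)), areas=(16, 32, 64, 128)):
    # 3 aspect ratios x 4 areas = 12 anchor boxes centered at (cx, cy).
    boxes = []
    for rw, rh in ratios:
        for a in areas:
            w = (a * rw / rh) ** 0.5  # choose w, h with w*h = a and w:h = rw:rh
            h = a / w
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes
```

An anchor is kept as a positive sample when `iou(anchor, gt) > 0.7` and as a negative sample when it is below 0.3, mirroring step (5).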
Step 6. All candidate regions are input to the RoI-Pooling layer in F-RCNN to generate fixed-size feature vectors;
(1) For each target candidate region, the RoI-Pooling layer divides it into 8 parts along the horizontal and vertical directions respectively, and performs max-pooling down-sampling on each part;
(2) In this way, even though the candidate regions differ in size, the sampling results are consistent, generating feature vectors of fixed dimension 8 × 8 × 256;
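The 8 × 8 max pooling over a candidate region can be sketched in NumPy as below; the even splitting of cell boundaries is an implementation assumption:

```python
import numpy as np

def roi_max_pool(fmap, roi, out_size=8):
    # fmap: (H, W, C) feature map; roi: (x1, y1, x2, y2) in feature-map coords.
    # Split the RoI into out_size x out_size cells and max-pool each cell,
    # so any RoI yields a fixed (out_size, out_size, C) output.
    x1, y1, x2, y2 = roi
    xs = np.linspace(x1, x2, out_size + 1).astype(int)
    ys = np.linspace(y1, y2, out_size + 1).astype(int)
    c = fmap.shape[2]
    out = np.zeros((out_size, out_size, c), dtype=fmap.dtype)
    for i in range(out_size):
        for j in range(out_size):
            cell = fmap[ys[i]:max(ys[i + 1], ys[i] + 1),
                        xs[j]:max(xs[j + 1], xs[j] + 1)]
            out[i, j] = cell.max(axis=(0, 1))
    return out
```

With the 256-channel fusion feature map this yields the fixed 8 × 8 × 256 feature vector of step (2) regardless of the RoI size.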
Step 7. The feature vectors are sent into the extreme learning machine networks used for classification and regression, which output the category and position of the traffic sign;
The extreme learning machine structure used by the present invention: (1) the ELM for traffic sign classification: 4096 hidden nodes and 44 output nodes, each output node representing one class of traffic sign, with values in (0, 1); during classification the node with the maximum output is taken as the traffic sign class; (2) the ELM for traffic sign position regression: 4096 hidden nodes and 4 output nodes, representing respectively the center point coordinates and the width and height of the traffic sign.
The learning algorithm of the extreme learning machine used by the present invention is specifically as follows:
(1) The input/output relation of the extreme learning machine ELM can be expressed as
$O_j = \sum_{i=1}^{4096} \beta_i \, g(w_i \cdot x_j + \theta_i), \quad j = 1, 2, \dots, N$,
where $X = (x_1, x_2, \dots, x_N)$ is the feature vectors output by the RoI-Pooling layer, the desired output for the j-th feature vector is $T_j = (t_{j1}, t_{j2}, \dots, t_{jk})^T$, and the actual output of the ELM is $O_j = [o_{j1}, o_{j2}, \dots, o_{jk}]^T$. For the classification ELM network k = 44; for the regression ELM network k = 4. $w_i = [w_{i1}, w_{i2}, \dots, w_{in}]^T$ is the weight vector between the i-th hidden neuron and the input neurons, $\beta_i = [\beta_{i1}, \beta_{i2}, \dots, \beta_{ik}]^T$ is the weight vector between the i-th hidden neuron and the k output neurons, $\theta_i$ is the threshold of the i-th hidden node, $i = 1, 2, \dots, 4096$, and $g(\cdot)$ is the activation function.
(2) The learning objective of the ELM is to minimize the error function E, where E is the sum of squared errors between the actual and desired outputs:
$E = \sum_{j=1}^{N} \| O_j - T_j \|^2$.
There exist $\beta_i$, $w_i$ and $\theta_i$ such that $\sum_{j=1}^{N} \| O_j - T_j \| = 0$. In addition, the hidden layer output matrix H of the ELM is
$H = \big[ g(w_i \cdot x_j + \theta_i) \big]_{N \times 4096}$.
Therefore the ELM output may be expressed as:
$H\beta = T$
(3) According to the least squares principle, the hidden layer output weights $\beta$ are calculated as
$\beta = H^T \left( I/C + H H^T \right)^{-1} T$,
where H is the output matrix of the ELM hidden layer, $H^T$ is the transpose of H, I is the identity matrix, C is a constant, and T is the desired output matrix corresponding to the actual output matrix O of the ELM.
The calculated output weights are substituted into $O = H\beta$ to compute the value of each output node of the ELM. For the ELM classifying traffic signs, the node number with the maximum output value among the 44 output nodes is the class of the traffic sign. For the ELM used for traffic sign detection, the 4 outputs represent the 4 localization parameters of the sign, namely the center point coordinates and the width and height.
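The closed-form training of step (3) can be sketched as below. The sigmoid activation and the small hidden size are illustrative choices (the invention uses 4096 hidden nodes); the random input weights are fixed and never trained, which is what makes ELM training a single least-squares solve:

```python
import numpy as np

def elm_train(X, T, hidden=64, C=1e3, seed=0):
    # Random input weights W and thresholds theta (never trained),
    # sigmoid hidden activation, then the regularized least-squares
    # output weights: beta = H^T (I/C + H H^T)^(-1) T.
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], hidden))
    theta = rng.standard_normal(hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + theta)))
    beta = H.T @ np.linalg.solve(np.eye(len(H)) / C + H @ H.T, T)
    return W, theta, beta

def elm_predict(X, W, theta, beta):
    # Forward pass O = H beta with the same random hidden mapping.
    H = 1.0 / (1.0 + np.exp(-(X @ W + theta)))
    return H @ beta
```

For the classification ELM, `np.argmax` over the 44 output columns of `elm_predict` gives the sign class, as described above.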
Step 8. The F-RCNN model is trained using the contribution adaptive loss function (Contribution Adaptive Loss Function, CA);
(1) The training objective of the region proposal network RPN in the F-RCNN model is to minimize the classification and localization loss. Its loss function can be formalized as
$L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{RPN\text{-}CA}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*)$,
where $p_i$ denotes the predicted probability that the i-th anchor box is a target object, $p_i^*$ denotes the true label of the target object, $t_i$ is the coordinate information of the predicted box, including the center coordinates $(x_i, y_i)$, width $w_i$ and height $h_i$, and $t_i^*$ is the coordinate information of the true box, likewise including the center coordinates $(x_i^*, y_i^*)$, width $w_i^*$ and height $h_i^*$. $L_{RPN\text{-}CA}$ denotes the contribution adaptive classification loss of the RPN network, $N_{cls}$ denotes the total number of anchor boxes, $N_{reg}$ denotes the size of the feature map, and $\lambda$ is a balancing coefficient. $L_{reg}$ denotes the regression loss over all bounding boxes; it uses the smooth L1 loss, specifically defined as
$L_{reg}(t_i, t_i^*) = \mathrm{smooth}_{L1}(t_i - t_i^*)$, where $\mathrm{smooth}_{L1}(x) = 0.5x^2$ if $|x| < 1$, and $|x| - 0.5$ otherwise.
Therefore, the contribution adaptive loss function of the RPN may be defined as
$L_{RPN\text{-}CA} = -(1 - p_t)^3 \log(p_t)$,
where $(1 - p_t)^3$ is the contribution adaptive loss adjustment coefficient and $p_t$ is the probability the network assigns to the true class of the anchor. For hard, easily misclassified negative samples, the class probability $p_t \to 0$, so the contribution adaptive loss adjustment coefficient tends to 1 and the contribution of such samples to the total loss is essentially unaffected. For easy positive samples, $p_t \to 1$, so the coefficient tends to 0 and their contribution to the total loss drops to 0. This coefficient adaptively and dynamically regulates the contribution of easy and hard samples to the total loss, letting F-RCNN training focus on hard negative samples and effectively improving training efficiency.
(2) The contribution adaptive loss function of the fully connected layer network is defined as
$L_{FC\text{-}CA} = -(1 - q_k)^3 \log(q_k)$,
where $L_{FC\text{-}CA}$ denotes the multi-class focal classification loss, k denotes the kth target class, and $q_k$ denotes the predicted probability that the sample belongs to the kth class of target.
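The contribution adaptive classification term with its cubic adjustment coefficient can be sketched as follows; the $p_t$ convention for negative anchors follows the focal-loss style formulation implied by the discussion above:

```python
import math

def ca_loss(p, is_positive, gamma=3):
    # Contribution adaptive classification loss: -(1 - p_t)^gamma * log(p_t),
    # where p_t is the probability assigned to the true class of the sample.
    p_t = p if is_positive else 1.0 - p
    p_t = min(max(p_t, 1e-12), 1.0 - 1e-12)  # numerical safety
    return -((1.0 - p_t) ** gamma) * math.log(p_t)
```

A confidently misclassified negative (`p` near 1, `is_positive=False`) keeps a near-full loss, while a confidently correct positive contributes almost nothing, which is exactly the hard-negative emphasis described above. The same expression with `q_k` for the true class gives the fully connected layer's multi-class form.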
Step 9. The color camera is started and the actual traffic scene is photographed. The scene must be pre-processed before being input to the model: the resolution is set to 2048 × 2048, the image is then input to the model, and steps 3 to 7 are repeated to complete the traffic sign detection and recognition of the actual scene.
Embodiment:
The present invention uses Tsinghua-Tencent 100K as the model training data set and performs detection and recognition on 44 kinds of common traffic signs (see Fig. 3). In this data set, the resolution of each traffic scene is 2048 × 2048; signs of 0-32 pixels and 32-96 pixels account for 41.6% and 49.1% of the data set respectively, i.e., 90.7% of the traffic signs occupy less than 1% of the traffic scene, which belongs to the remote traffic sign detection and recognition situation.
Explanation of F-RCNN model training and field test data (the test is intended to verify the feasibility of steps 1 to 8 of the present invention)
During training of the F-RCNN model, to eliminate the imbalance of the sample set, classes with fewer than 100 pictures are resampled at each training iteration so that the number of pictures exceeds 1000. The ratio of the training set to the test set is 2:1. In the specific implementation, the present invention uses F1-measure, a common metric for measuring accuracy, as the detection and recognition index; the larger this index, the higher the detection and recognition precision.
Table 1 gives the comparison between the F-RCNN of the present invention and the common target detection model Faster R-CNN in recognizing remote traffic signs. In addition, this embodiment successively compares the recognition precision after removing the relevant techniques of the present invention.
For convenience of description, the following notation is used:
(1) the F-RCNN detection model is denoted F0;
(2) if the fusion feature map technique is not used in the F-RCNN detection model, the framework is denoted F1; on the basis of F1, if the contribution adaptive loss function is additionally not used to train the network, the framework is denoted F2.
From the result data given in Table 1, the detection and recognition method of the invented F-RCNN achieves remote detection and recognition precision on the 44 kinds of common traffic signs clearly higher than that of the common Faster R-CNN, by 30-40%, which effectively demonstrates the validity and practicability of the inventive method. Fig. 4, Fig. 5 and Fig. 6 give the precision-recall curve comparisons of the detection and recognition results for traffic signs of different sizes; it is evident that the Accuracy-Recall curves of the inventive method at 0-32 pixels, 32-96 pixels and 96-200 pixels are substantially better than those of the Faster R-CNN model.
In addition, from the comparison of F0, F1 and F2, if the fusion feature map is not used (F1), the detection and recognition precision drops by about 10 percentage points on average; if neither the fusion feature map nor the contribution adaptive loss function is used (F2), the detection and recognition precision drops by about a further 16 percentage points. Therefore, the actual detection comparison shows that the proposed fusion feature map and contribution adaptive loss function have an obvious effect on improving the recognition precision of remote traffic signs.
Table 1. Comparison of recognition accuracy for the 44 kinds of common traffic signs (%)
In addition, this embodiment compares training time and detection time. The training time of Faster R-CNN is 107 hours, that of F0 (the F-RCNN of the present invention) is 68 hours, that of F1 is 66 hours, and that of F2 is 63 hours. This shows: (1) the F-RCNN model, by using the extreme learning machine ELM as the fully connected network, shortens the model training time by about 30%; (2) the training-time comparison of F0, F1 and F2 effectively shows that the fusion feature map and contribution adaptive loss function proposed by the present invention add very little computation to the F-RCNN model, effectively saving computing resources while improving remote recognition precision.
The invention discloses a remote traffic sign detection and recognition method based on Fusion Region Convolutional Neural Networks (F-RCNN), which mainly addresses the deficiencies of existing traffic sign detection and recognition methods: short detection distance, few detected types and low recognition precision. First, using deep learning and computer vision methods, the remote traffic sign detection and recognition tasks are uniformly integrated into the F-RCNN target detection model; and, addressing the shortage of detection types in existing methods (based on the GTSDB and GTSRB data sets or on hand-collected data sets), the present invention uses the Tsinghua-Tencent 100K data set jointly released by Tsinghua University and Tencent as the model training data set. Secondly, to improve the feature representation ability of F-RCNN for small-size traffic signs in remote detection, a fusion feature map technique is invented, which fuses the features of different convolutional layers into a new feature map via max pooling, dilated convolution, regularization and aggregation; this feature map possesses both high resolution and rich high-level semantic information. Meanwhile, to improve the learning ability of the F-RCNN model and further increase detection and recognition precision, a contribution adaptive loss function is invented, which distinguishes easy and hard samples by adjusting sample losses, letting the model focus on hard negative samples during learning and training and effectively improving training efficiency. In addition, an extreme learning machine is adopted as the fully connected layer network of the F-RCNN model, effectively shortening the training time of the model and saving computing resources. Practice proves that the present invention has high detection and recognition precision and can perform remote detection and recognition of 44 kinds of common traffic signs in actual life.

Claims (2)

1. A remote traffic sign detection and recognition method based on F-RCNN, characterized in that:
Step 1. The traffic sign image sample set is pre-processed;
Step 2. The feature extraction network VGG-16 in F-RCNN is pre-trained using an image classification benchmark data set;
Pre-training is performed on the feature extraction network VGG-16 using the public ImageNet image data set, and the parameters after training serve as the initial state of the network; the VGG-16 network comprises 5 convolutional layers and 5 pooling layers, and uses ReLU as the activation function;
Step 3. The traffic sign training data set is input to the pre-trained feature extraction network VGG-16; convolution and pooling operations are performed on the images to complete feature extraction;
Using the mini-batch strategy, the training traffic sign image data set is split into several batches;
Step 4. A fusion feature map is constructed using max pooling, dilated convolution, regularization and aggregation operations;
(1) the outputs of the convolutional stages of the feature extraction network VGG-16 are denoted, from front to back, conv1, conv2, conv3, conv4 and conv5; each traffic scene with 2048 × 2048 resolution is first converted into a 2048 × 2048 × 3 numerical matrix and input to VGG-16, in which the image size constantly shrinks while the channel number constantly grows; the output matrix dimensions of the stages are, in order, conv1: 2048 × 2048 × 64; conv2: 1024 × 1024 × 128; conv3: 512 × 512 × 256; conv4: 256 × 256 × 512; conv5: 128 × 128 × 512;
(2) conv5 is expanded using dilated convolution with dilation rate 3; the size after expansion is 512 × 512 × 512; convolution is then carried out with kernel parameters: size 3 × 3, stride 1, padding 1, 256 kernels; the output after calculation is denoted dilated-conv5, with matrix dimension 512 × 512 × 256;
(3) conv1 is down-sampled using max pooling and its channels are replicated to expand 4-fold; the output after down-sampling is denoted pooling-conv1, with matrix dimension 512 × 512 × 256;
(4) L2 regularization is computed separately for pooling-conv1, conv3 and dilated-conv5 to eliminate scale effects, where the L2 regularization is computed as
$\hat{x} = x / \|x\|_2$, with $\|x\|_2 = \left(\sum_{i=1}^{d} x_i^2\right)^{1/2}$,
where $\hat{x}$ denotes the pixel after regularization and d denotes the number of channels of the pixel;
(5) the three feature matrices pooling-conv1, conv3 and dilated-conv5 are directly aggregated in space, generating a fusion feature map that simultaneously contains resolution and abstract semantic information, with matrix dimension 512 × 512 × 256;
Step 5. The region proposal network RPN in F-RCNN performs region generation according to the fusion feature map to obtain the candidate regions of traffic signs;
(1) a 3 × 3 convolution kernel slides over the fusion feature map; for each pixel on the feature map, taking that point as the center, 12 anchor boxes are generated on the original input image using the 3 aspect ratios 1:1, 1:2, 2:1 and the 4 areas 16, 32, 64, 128;
(2) after sliding, the number of generated anchor boxes is 512 × 512 × 12;
(3) anchor boxes exceeding the boundary of the original input image are removed;
(4) anchor boxes with excessive overlap are removed using non-maximum suppression with threshold 0.7;
(5) positive and negative samples are determined according to the intersection over union (IoU) of each anchor box with the real targets in the sample: anchor boxes with IoU > 0.7 are positive samples, anchor boxes with IoU < 0.3 are negative samples, and anchor boxes with IoU between 0.3 and 0.7 are removed, where IoU is calculated as
$\mathrm{IoU} = \mathrm{area}(A \cap B) \,/\, \mathrm{area}(A \cup B)$,
with A the anchor box and B the ground-truth box;
(6) by translation invariance, each anchor box corresponds to one region proposal box on the fusion feature map;
(7) all region proposal boxes pass through the fully connected layer of the region proposal network RPN to obtain the target candidate regions;
Step 6. All candidate regions are input to the RoI-Pooling layer in F-RCNN to generate fixed-size feature vectors;
(1) for each target candidate region, the RoI-Pooling layer divides it into 8 parts along the horizontal and vertical directions respectively, and performs max-pooling down-sampling on each part;
(2) in this way, even though the candidate regions differ in size, the sampling results are consistent, generating feature vectors of fixed dimension 8 × 8 × 256;
Step 7. The feature vectors are sent into the extreme learning machine networks used for classification and regression, which output the category and position of the traffic sign;
The extreme learning machine structure used: (1) the ELM for traffic sign classification: 4096 hidden nodes and 44 output nodes, each output node representing one class of traffic sign, with values in (0, 1); during classification the node with the maximum output is taken as the traffic sign class; (2) the ELM for traffic sign position regression: 4096 hidden nodes and 4 output nodes, representing respectively the center point coordinates and the width and height of the traffic sign;
The learning algorithm of the extreme learning machine used is specifically as follows:
(1) the input/output relation of the extreme learning machine ELM is expressed as
$O_j = \sum_{i=1}^{4096} \beta_i \, g(w_i \cdot x_j + \theta_i), \quad j = 1, 2, \dots, N$,
where $X = (x_1, x_2, \dots, x_N)$ is the feature vectors output by the RoI-Pooling layer, the desired output for the j-th feature vector is $T_j = (t_{j1}, t_{j2}, \dots, t_{jk})^T$, and the actual output of the ELM is $O_j = [o_{j1}, o_{j2}, \dots, o_{jk}]^T$; for the classification ELM network k = 44, and for the regression ELM network k = 4; $w_i = [w_{i1}, w_{i2}, \dots, w_{in}]^T$ is the weight vector between the i-th hidden neuron and the input neurons, $\beta_i = [\beta_{i1}, \beta_{i2}, \dots, \beta_{ik}]^T$ is the weight vector between the i-th hidden neuron and the k output neurons, $\theta_i$ is the threshold of the i-th hidden node, $i = 1, 2, \dots, 4096$, and $g(\cdot)$ is the activation function;
(2) the learning objective of the ELM is to minimize the error function E, where E is the sum of squared errors between the actual and desired outputs, expressed as
$E = \sum_{j=1}^{N} \| O_j - T_j \|^2$;
there exist $\beta_i$, $w_i$ and $\theta_i$ such that $\sum_{j=1}^{N} \| O_j - T_j \| = 0$;
the hidden layer output matrix H of the ELM is
$H = \big[ g(w_i \cdot x_j + \theta_i) \big]_{N \times 4096}$;
therefore the ELM output is expressed as:
$H\beta = T$
(3) according to the least squares principle, the hidden layer output weights $\beta$ are calculated as
$\beta = H^T \left( I/C + H H^T \right)^{-1} T$,
where H is the output matrix of the ELM hidden layer, $H^T$ is the transpose of H, I is the identity matrix, C is a constant, and T is the desired output matrix corresponding to the actual output matrix O of the ELM.
The calculated output weights are substituted into $O = H\beta$ to compute the value of each output node of the ELM; for the ELM classifying traffic signs, the node number with the maximum output value among the 44 output nodes is the class of the traffic sign; for the ELM used for traffic sign detection, the 4 outputs represent the 4 localization parameters of the sign, namely the center point coordinates and the width and height;
Step 8. The F-RCNN model is trained using the contribution adaptive loss function;
(1) the training objective of the region proposal network RPN in the F-RCNN model is to minimize the classification and localization loss; its loss function is formalized as
$L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{RPN\text{-}CA}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*)$,
where $p_i$ denotes the predicted probability that the i-th anchor box is a target object, $p_i^*$ denotes the true label of the target object, $t_i$ is the coordinate information of the predicted box, including the center coordinates $(x_i, y_i)$, width $w_i$ and height $h_i$, and $t_i^*$ is the coordinate information of the true box, including the center coordinates $(x_i^*, y_i^*)$, width $w_i^*$ and height $h_i^*$; $L_{RPN\text{-}CA}$ denotes the contribution adaptive classification loss of the RPN network, $N_{cls}$ denotes the total number of anchor boxes, $N_{reg}$ denotes the size of the feature map, and $\lambda$ is a balancing coefficient; $L_{reg}$ denotes the regression loss over all bounding boxes and uses the smooth L1 loss, specifically defined as
$L_{reg}(t_i, t_i^*) = \mathrm{smooth}_{L1}(t_i - t_i^*)$, where $\mathrm{smooth}_{L1}(x) = 0.5x^2$ if $|x| < 1$, and $|x| - 0.5$ otherwise;
therefore, the contribution adaptive loss function of the RPN is defined as
$L_{RPN\text{-}CA} = -(1 - p_t)^3 \log(p_t)$,
where $(1 - p_t)^3$ is the contribution adaptive loss adjustment coefficient; for hard, easily misclassified negative samples the class probability $p_t \to 0$, so the contribution adaptive loss adjustment coefficient tends to 1 and the contribution of such samples to the total loss is unaffected; for easy positive samples $p_t \to 1$, so the coefficient tends to 0 and their contribution to the total loss drops to 0; using the contribution adaptive loss adjustment coefficient $(1 - p_t)^3$, the contribution of easy and hard samples to the total loss is adaptively and dynamically regulated, letting F-RCNN training focus on hard negative samples and effectively improving training efficiency;
(2) the contribution adaptive loss function of the fully connected layer network is defined as
$L_{FC\text{-}CA} = -(1 - q_k)^3 \log(q_k)$,
where $L_{FC\text{-}CA}$ denotes the multi-class focal classification loss, k denotes the kth target class, and $q_k$ denotes the predicted probability that the sample belongs to the kth class of target;
Step 9. The color camera is started and the actual traffic scene is photographed; the scene must be pre-processed before being input to the model: the resolution is set to 2048 × 2048, the image is then input to F-RCNN, and steps 3 to 7 are repeated to complete the traffic sign detection and recognition of the actual scene.
2. The remote traffic sign detection and recognition method based on F-RCNN according to claim 1, characterized in that said step 1 specifically comprises:
(1) using the Tsinghua-Tencent 100K data set jointly released by Tsinghua University and Tencent, 44 classes of common traffic signs are selected as remote detection and recognition objects;
(2) the Tsinghua-Tencent 100K data set is divided into a training set and a test set at a ratio of 1:2;
(3) to guarantee sample balance during model training, each class of traffic sign has at least 100 scene instances in the training set; if the scene instances of a class of sign are fewer than 100, they are padded by the method of repeated sampling.
CN201910474058.2A 2019-06-02 2019-06-02 F-RCNN-based remote traffic sign detection and identification method Active CN110163187B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910474058.2A CN110163187B (en) 2019-06-02 2019-06-02 F-RCNN-based remote traffic sign detection and identification method

Publications (2)

Publication Number Publication Date
CN110163187A true CN110163187A (en) 2019-08-23
CN110163187B CN110163187B (en) 2022-09-02


Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110619889A (en) * 2019-09-19 2019-12-27 Oppo广东移动通信有限公司 Sign data identification method and device, electronic equipment and storage medium
CN110781744A (en) * 2019-09-23 2020-02-11 杭州电子科技大学 Small-scale pedestrian detection method based on multi-level feature fusion
CN110826544A (en) * 2019-12-23 2020-02-21 深圳市豪恩汽车电子装备股份有限公司 Traffic sign detection and identification system and method
CN110956115A (en) * 2019-11-26 2020-04-03 证通股份有限公司 Scene recognition method and device
CN111062885A (en) * 2019-12-09 2020-04-24 中国科学院自动化研究所 Mark detection model training and mark detection method based on multi-stage transfer learning
CN111209975A (en) * 2020-01-13 2020-05-29 北京工业大学 Ship target identification method based on multitask learning
CN111310615A (en) * 2020-01-23 2020-06-19 天津大学 Small target traffic sign detection method based on multi-scale information and residual error network
CN111461060A (en) * 2020-04-22 2020-07-28 上海应用技术大学 Traffic sign identification method based on deep learning and extreme learning machine
CN111580151A (en) * 2020-05-13 2020-08-25 浙江大学 SSNet model-based earthquake event time-of-arrival identification method
CN111597899A (en) * 2020-04-16 2020-08-28 浙江工业大学 Scenic spot ground plastic bottle detection method
CN111611998A (en) * 2020-05-21 2020-09-01 中山大学 Adaptive feature block extraction method based on candidate region area and width and height
CN111723854A (en) * 2020-06-08 2020-09-29 杭州像素元科技有限公司 Method and device for detecting traffic jam of highway and readable storage medium
CN111738300A (en) * 2020-05-27 2020-10-02 复旦大学 Optimization algorithm for detecting and identifying traffic signs and signal lamps
CN111986125A (en) * 2020-07-16 2020-11-24 浙江工业大学 Method for multi-target task instance segmentation
CN112052778A (en) * 2020-09-01 2020-12-08 腾讯科技(深圳)有限公司 Traffic sign identification method and related device
CN112528977A (en) * 2021-02-10 2021-03-19 北京优幕科技有限责任公司 Target detection method, target detection device, electronic equipment and storage medium
CN113642430A (en) * 2021-07-29 2021-11-12 江苏大学 High-precision visual positioning method and system for underground parking lot based on VGG + NetVLAD
US20210357683A1 (en) * 2020-10-22 2021-11-18 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for determining target anchor, device and storage medium
WO2021244079A1 (en) * 2020-06-02 2021-12-09 苏州科技大学 Method for detecting image target in smart home environment
CN110941970B (en) * 2019-12-05 2023-05-30 深圳牛图科技有限公司 High-speed dimension code positioning and identifying system based on full convolution neural network
CN117274957A (en) * 2023-11-23 2023-12-22 西南交通大学 Road traffic sign detection method and system based on deep learning

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106372577A (en) * 2016-08-23 2017-02-01 北京航空航天大学 Deep learning-based traffic sign automatic identifying and marking method
CN107122776A (en) * 2017-04-14 2017-09-01 重庆邮电大学 A kind of road traffic sign detection and recognition methods based on convolutional neural networks
CN107301383A (en) * 2017-06-07 2017-10-27 华南理工大学 A kind of pavement marking recognition methods based on Fast R CNN
US20180144202A1 (en) * 2016-11-22 2018-05-24 Ford Global Technologies, Llc Brake Light Detection
CN109492526A (en) * 2018-09-27 2019-03-19 桂林电子科技大学 Traffic sign recognition method based on LDCNN model and NHE algorithm
CN109815906A (en) * 2019-01-25 2019-05-28 华中科技大学 Method for traffic sign detection and system based on substep deep learning

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106372577A (en) * 2016-08-23 2017-02-01 北京航空航天大学 Deep learning-based traffic sign automatic identifying and marking method
US20180144202A1 (en) * 2016-11-22 2018-05-24 Ford Global Technologies, Llc Brake Light Detection
CN107122776A (en) * 2017-04-14 2017-09-01 重庆邮电大学 A kind of road traffic sign detection and recognition methods based on convolutional neural networks
CN107301383A (en) * 2017-06-07 2017-10-27 华南理工大学 A kind of pavement marking recognition methods based on Fast R CNN
CN109492526A (en) * 2018-09-27 2019-03-19 桂林电子科技大学 Traffic sign recognition method based on LDCNN model and NHE algorithm
CN109815906A (en) * 2019-01-25 2019-05-28 华中科技大学 Method for traffic sign detection and system based on substep deep learning

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
QINGPENG LI等: "HSF-Net: Multiscale Deep Feature Embedding for", 《IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING》 *
ZHIGANG LIU等: "MR-CNN: A Multi-Scale Region-Based Convolutional Neural Network for Small Traffic Sign Recognition", 《IEEE ACCESS》 *
孙伟等: "基于CNN多层特征和ELM的交通标志识别", 《 电子科技大学学报》 *
黄知超等: "基于轻量WACNN的交通标志识别", 《应用激光》 *

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110619889B (en) * 2019-09-19 2022-03-15 Oppo广东移动通信有限公司 Sign data identification method and device, electronic equipment and storage medium
CN110619889A (en) * 2019-09-19 2019-12-27 Oppo广东移动通信有限公司 Sign data identification method and device, electronic equipment and storage medium
CN110781744A (en) * 2019-09-23 2020-02-11 杭州电子科技大学 Small-scale pedestrian detection method based on multi-level feature fusion
CN110956115B (en) * 2019-11-26 2023-09-29 证通股份有限公司 Scene recognition method and device
CN110956115A (en) * 2019-11-26 2020-04-03 证通股份有限公司 Scene recognition method and device
CN110941970B (en) * 2019-12-05 2023-05-30 深圳牛图科技有限公司 High-speed 2D-code positioning and recognition system based on a fully convolutional neural network
CN111062885A (en) * 2019-12-09 2020-04-24 中国科学院自动化研究所 Mark detection model training and mark detection method based on multi-stage transfer learning
CN111062885B (en) * 2019-12-09 2023-09-12 中国科学院自动化研究所 Mark detection model training and mark detection method based on multi-stage transfer learning
CN110826544A (en) * 2019-12-23 2020-02-21 深圳市豪恩汽车电子装备股份有限公司 Traffic sign detection and identification system and method
CN111209975A (en) * 2020-01-13 2020-05-29 北京工业大学 Ship target identification method based on multitask learning
CN111310615A (en) * 2020-01-23 2020-06-19 天津大学 Small target traffic sign detection method based on multi-scale information and residual error network
CN111597899A (en) * 2020-04-16 2020-08-28 浙江工业大学 Scenic spot ground plastic bottle detection method
CN111597899B (en) * 2020-04-16 2023-08-11 浙江工业大学 Scenic spot ground plastic bottle detection method
CN111461060A (en) * 2020-04-22 2020-07-28 上海应用技术大学 Traffic sign identification method based on deep learning and extreme learning machine
CN111580151A (en) * 2020-05-13 2020-08-25 浙江大学 SSNet model-based earthquake event time-of-arrival identification method
CN111580151B (en) * 2020-05-13 2021-04-20 浙江大学 SSNet model-based earthquake event time-of-arrival identification method
CN111611998A (en) * 2020-05-21 2020-09-01 中山大学 Adaptive feature block extraction method based on candidate region area and width and height
CN111738300A (en) * 2020-05-27 2020-10-02 复旦大学 Optimization algorithm for detecting and identifying traffic signs and signal lamps
WO2021244079A1 (en) * 2020-06-02 2021-12-09 苏州科技大学 Method for detecting image target in smart home environment
CN111723854A (en) * 2020-06-08 2020-09-29 杭州像素元科技有限公司 Method and device for detecting traffic jam of highway and readable storage medium
CN111723854B (en) * 2020-06-08 2023-08-29 杭州像素元科技有限公司 Expressway traffic jam detection method, equipment and readable storage medium
CN111986125A (en) * 2020-07-16 2020-11-24 浙江工业大学 Method for multi-target task instance segmentation
CN112052778A (en) * 2020-09-01 2020-12-08 腾讯科技(深圳)有限公司 Traffic sign identification method and related device
US20210357683A1 (en) * 2020-10-22 2021-11-18 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for determining target anchor, device and storage medium
US11915466B2 (en) * 2020-10-22 2024-02-27 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for determining target anchor, device and storage medium
WO2022170742A1 (en) * 2021-02-10 2022-08-18 北京优幕科技有限责任公司 Target detection method and apparatus, electronic device and storage medium
CN112528977A (en) * 2021-02-10 2021-03-19 北京优幕科技有限责任公司 Target detection method, target detection device, electronic equipment and storage medium
CN112528977B (en) * 2021-02-10 2021-07-02 北京优幕科技有限责任公司 Target detection method, target detection device, electronic equipment and storage medium
CN113642430A (en) * 2021-07-29 2021-11-12 江苏大学 High-precision visual positioning method and system for underground parking lot based on VGG + NetVLAD
CN113642430B (en) * 2021-07-29 2024-05-14 江苏大学 VGG+NetVLAD-based high-precision visual positioning method and system for underground parking garage
CN117274957A (en) * 2023-11-23 2023-12-22 西南交通大学 Road traffic sign detection method and system based on deep learning
CN117274957B (en) * 2023-11-23 2024-03-01 西南交通大学 Road traffic sign detection method and system based on deep learning

Also Published As

Publication number Publication date
CN110163187B (en) 2022-09-02

Similar Documents

Publication Publication Date Title
CN110163187A (en) Remote road traffic sign detection and recognition method based on F-RCNN
CN110188705B (en) Remote traffic sign detection and identification method suitable for vehicle-mounted system
Gao et al. Object classification using CNN-based fusion of vision and LIDAR in autonomous vehicle environment
CN109344736B (en) Static image crowd counting method based on joint learning
CN104978580B (en) Insulator recognition method for UAV inspection of power transmission lines
CN112488210A (en) Three-dimensional point cloud automatic classification method based on graph convolution neural network
CN112434672B (en) Marine human body target detection method based on improved YOLOv3
CN108805070A (en) Deep learning pedestrian detection method based on embedded terminals
CN108830188A (en) Vehicle detection method based on deep learning
CN106841216A (en) Automatic tunnel defect identification device based on panoramic image CNN
CN108647741A (en) Image classification method and system based on transfer learning
CN107945153A (en) Road surface crack detection method based on deep learning
CN112949647B (en) Three-dimensional scene description method and device, electronic equipment and storage medium
CN110163836A (en) Deep learning-based excavator detection method for high-altitude inspection
CN114842208B (en) Deep learning-based power grid harmful bird species target detection method
Ye et al. Real-time object detection network in UAV-vision based on CNN and transformer
CN110197152A (en) Road target recognition method for automated driving systems
CN106991666A (en) Disease image recognition method suitable for multi-size image information
CN112434723B (en) Day/night image classification and object detection method based on attention network
CN113807464A (en) Unmanned aerial vehicle aerial image target detection method based on improved YOLO V5
CN109376676A (en) Safety early-warning method for highway engineering site personnel based on a UAV platform
CN113870160B (en) Point cloud data processing method based on transformer neural network
CN113032613B (en) Three-dimensional model retrieval method based on interactive attention convolution neural network
CN106780546A (en) Identification method for motion-blurred coded points based on convolutional neural networks
CN113095251B (en) Human body posture estimation method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240205

Address after: 230000 Room 203, building 2, phase I, e-commerce Park, Jinggang Road, Shushan Economic Development Zone, Hefei City, Anhui Province

Patentee after: Hefei Jiuzhou Longteng Scientific and Technological Achievement Transformation Co., Ltd.

Country or region after: China

Address before: 163319 No. 99 Xuefu Street, Daqing Hi-tech Development Zone, Heilongjiang Province

Patentee before: Northeast Petroleum University

Country or region before: China
