CN116721403A - Road traffic sign detection method - Google Patents
- Publication number
- CN116721403A (application CN202310723460.6A)
- Authority
- CN
- China
- Prior art keywords
- image
- traffic sign
- layer
- defogging
- data set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V20/582—Recognition of traffic objects, e.g. traffic signs
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- G06N3/0475—Generative networks
- G06N3/048—Activation functions
- G06N3/094—Adversarial learning
- G06V10/273—Segmentation of patterns; removing elements interfering with the pattern to be recognised
- G06V10/82—Image or video recognition using neural networks
- G06V2201/09—Recognition of logos
Abstract
The invention provides a road traffic sign detection method, which belongs to the technical field of traffic sign image processing and comprises the following steps: acquiring a traffic sign image dataset; generating a corresponding foggy image dataset from the traffic sign image dataset; inputting the foggy image dataset into a generative adversarial network model for training, the trained model being used to convert foggy images into corresponding defogged images; training a network model for image recognition with the traffic sign image dataset to obtain a trained traffic sign detection and recognition model; and feeding the foggy image to be detected into the trained generative adversarial network model to generate the corresponding defogged image, which is then input into the trained traffic sign detection and recognition model to obtain the traffic sign recognition result. The method can detect traffic signs under complex road conditions.
Description
Technical Field
The invention belongs to the technical field of traffic sign image processing, and particularly relates to a road traffic sign detection method.
Background
With the great improvement in people's living standards, automobiles have rapidly become widespread. The number of automobiles has grown quickly in recent years; this huge number has saturated public transportation resources and led to more traffic safety accidents.
To cope with traffic accidents, engineers have developed driving assistance technologies, generally composed of automatic parking, reversing, braking, lane keeping and driving systems, in which many computer vision techniques are applied. After years of development, advanced driver assistance systems (ADAS) emerged against this background. ADAS uses various sensors to collect environmental data and analyzes it to help the driver perceive road information and anticipate danger in advance, thereby ensuring the safety of the automobile while driving.
In a traffic system, traffic signs provide drivers with a large amount of information during road use and standardize their driving behavior. Accurate detection and recognition of traffic sign information is therefore of great significance in helping drivers understand current road conditions and drive correctly, and also helps avoid traffic safety accidents.
At present, traffic sign detection and recognition has great research significance and bright application prospects, and automobile manufacturers, research institutions and universities in many countries have studied it intensively. After years of development, current traffic sign detection methods can generally be divided into the following two types:
(1) Traditional methods: these include target detection based on color features, on shape features, and on machine learning. These algorithms detect and locate traffic signs from characteristics such as their colors and shapes.
(2) Deep learning methods: these consist of detection algorithms based on candidate regions and detection algorithms based on regression. Classical candidate-region algorithms include R-CNN, Fast R-CNN and Faster R-CNN, which first generate a number of candidate regions from the original image and then classify them; they achieve higher accuracy but slower detection. Regression-based algorithms include the YOLO (You Only Look Once) series and the SSD (Single Shot MultiBox Detector) algorithm, which recast detection as regression and complete detection and classification of targets directly; they are fast but less precise.
Meanwhile, on the image processing side, image enhancement algorithms such as histogram equalization, Retinex image enhancement and homomorphic filtering can be used to increase the contrast and clarity of an image.
However, research on detecting and recognizing traffic signs in foggy images under heavy fog weather is lacking, which affects the accuracy of traffic sign recognition under complex road conditions.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention provides a road traffic sign detection method.
In order to achieve the above object, the present invention provides the following technical solutions:
a road traffic sign detection method, comprising:
acquiring a traffic sign image dataset;
generating a corresponding foggy image data set according to the traffic sign image data set;
inputting the foggy image dataset into a generative adversarial network model for training, the trained generative adversarial network model being used to convert foggy images into corresponding defogged images;
training a network model for image recognition by using the traffic sign image dataset to obtain a trained traffic sign detection recognition model;
and feeding the foggy image to be detected into the trained generative adversarial network model to generate a corresponding defogged image, and inputting the defogged image into the trained traffic sign detection and recognition model to obtain the traffic sign recognition result.
Further, generating a corresponding foggy image dataset from the traffic sign image dataset comprises the following steps:
randomly selecting a point in the traffic sign image as the center point, and selecting the atomization concentration and atomization size;
carrying out synthetic fog processing on each of the three RGB channels of the traffic sign image respectively, to obtain a foggy image dataset corresponding to the traffic sign image dataset;
wherein, the expression of the synthetic fog treatment is:
I(x)=J(x)t(x)+L(1-t(x))
wherein: i (x) is a hazy image; x is the coordinate value of the image pixel; j (x) is the defogging image to be restored; l is a global atmospheric light component; t (x) is transmittance.
Further, the method further comprises: stretching the images in the traffic sign dataset, adding salt-and-pepper noise, and adjusting the resolution to 600 × 600 to obtain a TD dataset;
dividing the TD dataset into a training set, a validation set and a test set, and training the network model for image recognition with them.
Further, the atomization size is set based on the image dimensions, where w and h are the width and height of the image, respectively.
Further, the atomization concentration is selected from {0.1, 0.2, …, 0.9}.
Further, an improved Faster R-CNN network is used as the network model for image recognition;
the improved Faster R-CNN network comprises:
replacing the feature extraction network in the Faster R-CNN network with a ResNet50 residual learning network;
adding an attention mechanism module in the residual blocks of the ResNet50 residual learning network, the attention mechanism module being used to weight the output values of the residual blocks.
Further, the attention mechanism module comprises:
a global average pooling layer, whose input is connected to the output of the feature layer in the residual block;
two fully connected layers in series, whose input is connected to the output of the global average pooling layer, the first fully connected layer having fewer neurons than the channels of the input feature layer and the second fully connected layer having the same number;
a Sigmoid activation function layer, whose input is connected to the output of the two serial fully connected layers and which maps the values into the range 0 to 1 to obtain the weights;
the weights are multiplied with the input feature layer to obtain the output of the ResNet50 residual block.
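To make the data flow of this module concrete, a minimal NumPy sketch of the squeeze-excite computation described above (global average pooling, two fully connected layers, sigmoid, channel-wise scaling) is given below. The reduction ratio, weight shapes and values are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def se_attention(feature, w1, b1, w2, b2):
    """SE-style channel attention over a (C, H, W) feature map.

    w1: (C // r, C) weights of the first FC layer (fewer neurons than channels),
    w2: (C, C // r) weights of the second FC layer (restores C neurons).
    """
    squeeze = feature.mean(axis=(1, 2))           # global average pooling -> (C,)
    hidden = np.maximum(w1 @ squeeze + b1, 0.0)   # first FC + ReLU
    logits = w2 @ hidden + b2                     # second FC
    weights = 1.0 / (1.0 + np.exp(-logits))       # sigmoid keeps weights in (0, 1)
    return feature * weights[:, None, None]       # reweight each channel
```

In the residual block, this reweighted output is what the skip connection adds back; the reduction ratio r and all weights here are placeholders for demonstration.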
Further, the generative adversarial network model comprises:
a generator with a U-Net network structure, comprising 8 encoding layers and 8 decoding layers;
each encoding layer comprises a convolution layer, a normalization layer and a LeakyReLU activation function layer connected in sequence; each decoding layer comprises a transposed convolution layer, a normalization layer, a LeakyReLU activation function layer and a tanh activation function layer connected in sequence; in both the encoding and decoding layers, all convolution kernels are 4×4 with a stride of 2;
a discriminator with a PatchGAN network structure, comprising 5 modules; the first 4 modules each comprise a convolution layer, a normalization layer and a LeakyReLU activation function layer connected in sequence; the 5th module comprises 1 convolution layer with a 4×4 kernel and a stride of 2.
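Since each encoding layer uses a 4×4 kernel with stride 2, the spatial resolution halves per layer, and 8 layers reduce a 256×256 input to a 1×1 bottleneck. A small sketch of this size trace (a padding of 1 is assumed here; the patent states only the kernel size and stride):

```python
def conv_out(n, k=4, s=2, p=1):
    """Spatial output size of a convolution; p=1 is assumed so each layer halves n."""
    return (n + 2 * p - k) // s + 1

def encoder_sizes(n=256, layers=8):
    """Trace the feature-map side length through the U-Net encoding layers."""
    sizes = [n]
    for _ in range(layers):
        sizes.append(conv_out(sizes[-1]))
    return sizes
```

Under these assumptions, encoder_sizes(256) traces 256 down to a 1×1 bottleneck, and the 8 decoding layers mirror the trace back up to 256×256 with transposed convolutions.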
The road traffic sign detection method provided by the invention has the following beneficial effects:
according to the invention, the traffic sign is converted into the corresponding hazy image, the anti-network model is generated by utilizing the training of the hazy image, so that the generator in the generated anti-network model can convert the hazy image into the corresponding defogging image, then the defogging image is identified by utilizing the image identification network model, and the high-precision detection and identification of the traffic sign under the foggy weather can be realized by associating the defogging model of the generator and the traffic sign detection and identification model. The method solves the problems that in the prior art, under the condition of heavy fog, the research on the detection and identification of the traffic sign in the foggy image is lacking, so that the identification precision of the traffic sign under the complex road condition is influenced.
Drawings
In order to more clearly illustrate the embodiments of the present invention and the design thereof, the drawings required for the embodiments will be briefly described below. The drawings in the following description are only some of the embodiments of the present invention and other drawings may be made by those skilled in the art without the exercise of inventive faculty.
FIG. 1 is a schematic diagram of a road traffic sign detection method according to the present invention;
FIG. 2 is an exemplary diagram of a traffic sign dataset according to an embodiment of the present invention;
FIG. 3 is an exemplary diagram of a defogging dataset according to an embodiment of the present invention;
FIG. 4 is a diagram of a residual module after attention mechanism addition according to the present invention;
FIG. 5 is a diagram of the generative adversarial network structure of the present invention;
FIG. 6 is a diagram of a network architecture of a generator of the present invention;
FIG. 7 is a diagram showing an example of defogging effect according to an embodiment of the present invention;
fig. 8 is a diagram of traffic sign recognition results according to an embodiment of the present invention.
Detailed Description
The present invention will be described in detail below with reference to the drawings and the embodiments, so that those skilled in the art can better understand the technical scheme of the present invention and can implement the same. The following examples are only for more clearly illustrating the technical aspects of the present invention, and are not intended to limit the scope of the present invention.
Examples:
the invention provides a road traffic sign detection method, shown in fig. 1, comprising the following steps: acquiring a traffic sign image dataset; generating a corresponding foggy image dataset from the traffic sign image dataset; inputting the foggy image dataset into a generative adversarial network model and using the trained model to convert foggy images into corresponding defogged images; training a network model for image recognition with the traffic sign image dataset to obtain a trained traffic sign detection and recognition model; and feeding the foggy image to be detected into the trained defogging model to generate the corresponding defogged image, which is input into the trained traffic sign detection and recognition model to obtain the traffic sign recognition result.
The following are specific details of the invention:
step one, acquiring a traffic sign dataset from the literature and generating a foggy image dataset by a synthetic fog method:
step S11, selecting a traffic sign dataset: the CCTSDB dataset is selected; fig. 2 shows some images from the traffic sign dataset.
Step S12, building the foggy image dataset: foggy images with different fog concentrations are obtained by synthesizing fog on the traffic sign dataset; together with the original images they form the foggy image dataset. Fig. 3 shows part of the foggy dataset (original images above, synthetic fog images below).
The fog is synthesized by a center-point method based on the standard optical model.
The step S12 specifically includes the following steps:
step S1201, selecting 1300 images from the traffic sign dataset TD for synthetic fog processing, defined as the initial dataset ID;
step S1202, adjusting the resolution of the initial dataset and labeling it;
step S1203 performs synthetic fog processing according to a standard optical model, which has the following formula:
I(x)=J(x)t(x)+L(1-t(x))
wherein I(x) is the foggy image; x is the coordinate of the image pixel; J(x) is the clear image to be restored; L is the global atmospheric light component; t(x) is the transmittance.
Step S1204, randomly selecting a point in an image as a center point;
step S1205, randomly selecting atomization concentration x;
step S1206, setting the atomization size based on the image dimensions, where w and h are the width and height of the image, respectively;
step S1207, performing fog synthesis on each of the RGB channels of the color image, thereby synthesizing fog over the whole color image.
Step S1208, batch-synthesizing the foggy images.
In this embodiment, in step S1205, the random atomization concentration x is selected from {0.1, 0.2, …, 0.9}.
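The center-point fog synthesis of steps S1203–S1207 can be sketched as below, using the model I(x) = J(x)t(x) + L(1 − t(x)). The transmittance schedule t(x) (exponential decay in the distance from the chosen center) and the default fog size are illustrative assumptions, since the explicit formulas are not reproduced in this text:

```python
import numpy as np

def synthesize_fog(img, center, concentration, fog_size=None, airlight=255.0):
    """Blend a clear RGB image J(x) toward the airlight L around a center point.

    img: (H, W, 3) uint8 clear image; center: (row, col); concentration: the
    atomization concentration x in {0.1, ..., 0.9}. fog_size defaults to an
    assumed value derived from the image dimensions, not the patent's formula.
    """
    h, w = img.shape[:2]
    if fog_size is None:
        fog_size = np.sqrt(max(h, w)) / 2.0   # assumption
    rows, cols = np.indices((h, w))
    dist = np.hypot(rows - center[0], cols - center[1])
    # Assumed transmittance: fog is thickest at the center and fades with distance.
    depth = np.maximum(fog_size - 0.04 * dist, 0.0)
    t = np.exp(-concentration * depth)
    foggy = img.astype(np.float64) * t[..., None] + airlight * (1.0 - t[..., None])
    return np.clip(foggy, 0, 255).astype(np.uint8)
```

Broadcasting t(x) over the last axis applies the same blend to each RGB channel independently, matching step S1207.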
Step two, image preprocessing:
step S21, stretching the 15000 images in the traffic sign dataset, adding salt-and-pepper noise and adjusting the resolution to 600 × 600; the result is defined as the TD dataset. The TD dataset is divided into a training set of 11000 images, a validation set of 1200 images and a test set, forming the TD-VOC dataset;
step S22, adjusting the resolution of the foggy image dataset to 256 × 256, defining the HD dataset;
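The salt-and-pepper corruption in step S21 can be sketched as follows; the noise amount and the random generator are assumptions, since the patent does not state them:

```python
import numpy as np

def add_salt_pepper(img, amount=0.02, seed=None):
    """Set a random fraction of pixels to salt (255) or pepper (0)."""
    rng = np.random.default_rng(seed)
    noisy = img.copy()
    h, w = img.shape[:2]
    n = int(amount * h * w)
    ys = rng.integers(0, h, size=n)
    xs = rng.integers(0, w, size=n)
    noisy[ys[: n // 2], xs[: n // 2]] = 255   # salt
    noisy[ys[n // 2 :], xs[n // 2 :]] = 0     # pepper
    return noisy
```

The same indexing works for grayscale (H, W) and color (H, W, 3) arrays, since assigning a scalar fills all channels of the selected pixels.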
Step three, training the models:
step S31, inputting the training set from step two into the traffic sign detection and recognition model for training to obtain the trained network model; and training the defogging network model on the foggy dataset to obtain the defogging model.
Since a traffic sign occupies only a small proportion of an image, the detection accuracy of Faster R-CNN is low for such small-target detection.
Specifically, in step S31 the improved Faster R-CNN algorithm is used for training: the backbone network is first replaced with a ResNet50 residual learning network, and a SENet attention mechanism is added to the ResNet50, as shown in fig. 4. In fig. 4, GlobalPooling denotes a global average pooling layer, FC a fully connected layer, ReLU and Sigmoid activation functions, and Residual a residual block. Adding the attention mechanism to the feature extraction network automatically adjusts the distribution of weights over the channels and improves the extraction of effective information.
The defogging network model is a generative adversarial network (GAN); a generator model is obtained through training, and a corresponding defogged image is generated from an input foggy image. Its structure is shown in fig. 5. The generator structure is shown in fig. 6: it comprises 8 encoding layers and 8 decoding layers, each encoding layer containing 1 convolution layer, a normalization layer and a LeakyReLU activation function, and each decoding layer containing a transposed convolution layer, a normalization layer, a LeakyReLU activation function and a tanh activation function. All convolution kernels are 4×4 with a stride of 2.
Step four, outputting the traffic sign label recognition result:
step S41, the foggy image to be detected is first fed into the generator model from step three, which generates the corresponding fog-free image (in fig. 7, the upper row shows the foggy images and the lower row shows the defogged images); the resolution is then adjusted to 600 × 600;
step S42, the defogged image obtained in step S41 is sent to the traffic sign detection and recognition model from step three for traffic sign recognition; in the output picture, traffic sign labels with confidence greater than 0.9 are marked and their categories are output. The recognition result is shown in fig. 8.
Traffic signs are recognized in the images to be recognized, with recognition confidence greater than 0.9.
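The confidence gate of step S42 amounts to a simple filter over the detector's outputs. The detection-record layout below (a dict with label, confidence and box) is an assumed illustration, not the model's actual output format:

```python
def filter_detections(detections, threshold=0.9):
    """Keep only detections whose confidence exceeds the threshold (step S42)."""
    return [d for d in detections if d["confidence"] > threshold]

# Hypothetical detector output for one image.
detections = [
    {"label": "speed_limit", "confidence": 0.97, "box": (120, 40, 60, 60)},
    {"label": "no_entry", "confidence": 0.55, "box": (300, 80, 40, 40)},
]
kept = filter_detections(detections)
```

Only the high-confidence label would be drawn on the output picture and have its category reported.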
The following is a comparative example of the present invention with the existing detection algorithm:
comparative example 1:
in the comparative example, a traffic sign detection method is provided, which is performed according to the following steps:
step one, acquiring a traffic sign dataset from the literature and generating a foggy image dataset by a synthetic fog method;
step two, an image preprocessing part, which is the same as the original embodiment;
step three, training the models:
step S31, inputting the training set from step two into the traffic sign detection and recognition model for training to obtain the trained network model; and training the defogging network model on the foggy dataset to obtain the defogging model.
The defogging process is the same as in the original embodiment; the original Faster R-CNN algorithm is used as the traffic sign recognition model.
Step four, outputting the traffic sign label recognition result, is the same as in the original embodiment.
Comparative example 2:
in the comparative example, a traffic sign detection method is provided, which is performed according to the following steps:
step one, acquiring a traffic sign dataset from the literature and generating a foggy image dataset by a synthetic fog method;
step two, the image preprocessing part, is the same as in the original embodiment;
step three, training the models, is the same as in the original embodiment;
step four, outputting the traffic sign label recognition result; in this comparative example the image is not defogged but sent directly to the traffic sign recognition model for recognition to obtain the result.
Experimental verification:
the multi-level association detection method for the complex road traffic sign is compared with the detection effect of the traffic sign by a general detection algorithm.
Experimental background: in order to prove the high efficiency and accuracy of the invention for detecting the traffic sign, the invention is compared with the detection result of the traffic sign by using the original FasterRCNN detection algorithm only.
Experimental environment: the operating system is Win10, the GPU is GeforceGTX1080Ti, and the software platform is PyCharm.
The experimental method comprises the following steps: the traffic sign results were compared using the same test set using the method of the present invention, the original FasterRCNN detection algorithm alone as given in comparative example 1, and the defogging process in the removal step S31 as given in comparative example 2. The test set comprises traffic sign signs in complex environments with different fog concentrations, and the proportion of the traffic sign signs to the pictures has a certain gap, so that the pictures of the test set are close to the pictures of actual scenes acquired by the vehicle-mounted cameras when the vehicle runs; the detection results include detection accuracy and average detection accuracy (mAP) of different types of traffic signs, and the final results are shown in Table 1.
Table 1 traffic sign detection results for different methods
Analysis of results:
(1) As shown in Table 1, when the conventional detection algorithm, i.e. the original Faster R-CNN, is used to detect traffic signs, the detection accuracy is not high. When foggy images from heavy fog weather are left unprocessed, the detection accuracy drops markedly, by about 20%, far below that of the present method with defogging; defogging foggy images therefore greatly improves recognition accuracy. The improved Faster R-CNN algorithm used in this method also recognizes better than the algorithm before improvement. Together, the improved detector and the defogging step reduce the influence of fog on recognition and effectively improve the detection and recognition of traffic signs in complex scenes, particularly heavy fog scenes.
(2) Tested on a test set collected in a real environment, the improved Faster R-CNN algorithm achieves high detection accuracy and satisfies the requirements of traffic sign detection and recognition in actual traffic scenes.
the above embodiments are merely preferred embodiments of the present invention, the protection scope of the present invention is not limited thereto, and any simple changes or equivalent substitutions of technical solutions that can be obviously obtained by those skilled in the art within the technical scope of the present invention disclosed in the present invention belong to the protection scope of the present invention.
Claims (8)
1. A method for detecting road traffic signs, comprising:
acquiring a traffic sign image dataset;
generating a corresponding foggy image data set according to the traffic sign image data set;
inputting the foggy image data set into a generative adversarial network model to train it, wherein the trained generative adversarial network model is used for converting a foggy image into a corresponding defogging image;
training a network model for image recognition by using the traffic sign image dataset to obtain a trained traffic sign detection recognition model;
and sending the foggy image to be detected into the trained generative adversarial network model to generate a corresponding defogging image, and inputting the defogging image into the trained traffic sign detection and recognition model to obtain the traffic sign recognition result.
2. The method of claim 1, wherein generating the corresponding foggy image data set according to the traffic sign image data set comprises:
randomly selecting a point in the traffic sign image as a central point, and selecting atomization concentration and atomization size;
performing synthetic fog processing on each of the three RGB channels of the traffic sign image to obtain a foggy image data set corresponding to the traffic sign image data set;
wherein, the expression of the synthetic fog treatment is:
I(x)=J(x)t(x)+L(1-t(x))
wherein: i (x) is a hazy image; x is the coordinate value of the image pixel; j (x) is the defogging image to be restored; l is a global atmospheric light component; t (x) is transmittance.
3. The road traffic sign detection method according to claim 1, further comprising: stretching the images in the traffic sign data set, adding salt-and-pepper noise, and adjusting the resolution to 600×600 to obtain a TD data set;
dividing the TD data set into a training set, a verification set and a test set, and training the network model for image recognition using these sets.
4. The road traffic sign detection method according to claim 2, wherein the fog size is set as a function of w and h, where w, h are the width and height of the image, respectively.
5. The method of claim 1, wherein the fog concentration is selected from {0.1, 0.2, …, 0.9}.
6. The road traffic sign detection method according to claim 1, wherein an improved FasterRCNN network is used as the network model for image recognition;
the improved FasterRCNN network is obtained by:
replacing the feature extraction network in the FasterRCNN network with a ResNet50 residual learning network; and
adding an attention mechanism module in the residual blocks of the ResNet50 residual learning network, the attention mechanism module being used for weighting the output values of the residual blocks.
7. The method of claim 6, wherein the attention mechanism module comprises:
a global average pooling layer, the input end of which is connected with the output end of the feature layer in the residual block;
two serially connected fully connected layers, the input end of which is connected with the output end of the global average pooling layer, wherein the number of neurons in the first fully connected layer is smaller than the number of channels of the input feature layer, and the number of neurons in the second fully connected layer is the same as the number of channels of the input feature layer;
a Sigmoid activation function layer, the input end of which is connected with the output end of the two serial fully connected layers, and which is used for mapping the value to between 0 and 1 to obtain a weight;
multiplying the weight by the input feature layer to obtain the output of the ResNet50 residual block.
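A minimal forward pass of the claim-7 attention module (an SE-style block) might look like this. The ReLU after the first fully connected layer and the weight shapes are assumptions; the claim specifies only the pooling, the two fully connected layers, the Sigmoid, and the channel-wise multiplication:

```python
import numpy as np

def se_attention(feature, w1, b1, w2, b2):
    """Forward pass of the claim-7 attention module on a (C, H, W) feature map.

    w1: (C//r, C) and w2: (C, C//r) are the two fully connected layers;
    the bottleneck ratio r and the ReLU between them are assumptions."""
    squeeze = feature.mean(axis=(1, 2))           # global average pooling -> (C,)
    hidden = np.maximum(0.0, w1 @ squeeze + b1)   # first FC layer, fewer neurons
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))  # Sigmoid -> values in (0, 1)
    return feature * weights[:, None, None]       # reweight the input feature layer
```

Because the Sigmoid output lies strictly in (0, 1), the block can only attenuate channels, never amplify them; the network learns which channels to suppress.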
8. The method of claim 1, wherein the generative adversarial network model comprises a generator and a discriminator:
the generator is of a U-Net network structure and comprises 8 encoding layers and 8 decoding layers;
each encoding layer comprises a convolution layer, a normalization layer and a LeakyReLU activation function layer connected in sequence; each decoding layer comprises a transposed convolution layer, a normalization layer, a LeakyReLU activation function layer and a tanh activation function layer connected in sequence; in the encoding and decoding layers, all convolution kernels are of size 4×4 with a stride of 2;
the discriminator is of a PatchGAN network structure and comprises 5 modules; each of the first 4 modules comprises a convolution layer, a normalization layer and a LeakyReLU activation function layer connected in sequence; the 5th module comprises 1 convolution layer with a convolution kernel size of 4×4 and a stride of 2.
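As a quick sanity check on the discriminator geometry of claim 8, the spatial size of the PatchGAN score map after five 4×4, stride-2 convolutions can be computed with the standard convolution output formula; padding of 1 per layer is an assumption, since the claim does not state it:

```python
def conv_out(size, kernel=4, stride=2, pad=1):
    """Spatial output size of one convolution layer (standard formula)."""
    return (size + 2 * pad - kernel) // stride + 1

def patchgan_output_size(size, layers=5):
    """Size of the PatchGAN score map after the five 4x4, stride-2 modules
    of claim 8; padding=1 per layer is an assumption."""
    for _ in range(layers):
        size = conv_out(size)
    return size
```

Under these assumptions, a 600×600 input (the resolution of claim 3) yields an 18×18 patch score map, each score judging one local region of the generated image as real or fake.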
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310723460.6A CN116721403A (en) | 2023-06-19 | 2023-06-19 | Road traffic sign detection method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116721403A true CN116721403A (en) | 2023-09-08 |
Family
ID=87865768
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310723460.6A Pending CN116721403A (en) | 2023-06-19 | 2023-06-19 | Road traffic sign detection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116721403A (en) |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106934374A (en) * | 2017-03-14 | 2017-07-07 | 潍坊学院 | The recognition methods of traffic signboard and system in a kind of haze scene |
CN111582029A (en) * | 2020-04-02 | 2020-08-25 | 天津大学 | Traffic sign identification method based on dense connection and attention mechanism |
CN111709888A (en) * | 2020-06-03 | 2020-09-25 | 中科九度(北京)空间信息技术有限责任公司 | Aerial image defogging method based on improved generation countermeasure network |
WO2020206861A1 (en) * | 2019-04-08 | 2020-10-15 | 江西理工大学 | Yolo v3-based detection method for key object at transportation junction |
CN112801902A (en) * | 2021-01-29 | 2021-05-14 | 福州大学 | Traffic image defogging method based on improved generation countermeasure network |
CN113052057A (en) * | 2021-03-19 | 2021-06-29 | 北京工业大学 | Traffic sign identification method based on improved convolutional neural network |
CN113095181A (en) * | 2021-03-31 | 2021-07-09 | 西南交通大学 | Traffic sign identification method based on Defense-GAN |
CN113361428A (en) * | 2021-06-11 | 2021-09-07 | 浙江澄视科技有限公司 | Image-based traffic sign detection method |
CN113723377A (en) * | 2021-11-02 | 2021-11-30 | 南京信息工程大学 | Traffic sign detection method based on LD-SSD network |
CN114202672A (en) * | 2021-12-09 | 2022-03-18 | 南京理工大学 | Small target detection method based on attention mechanism |
CN114863384A (en) * | 2022-03-22 | 2022-08-05 | 上海电力大学 | Traffic sign detection method based on YOLO v4 algorithm |
CN115187954A (en) * | 2022-06-09 | 2022-10-14 | 南京理工大学 | Image processing-based traffic sign identification method in special scene |
CN115330620A (en) * | 2022-08-11 | 2022-11-11 | 南京邮电大学 | Image defogging method based on cyclic generation countermeasure network |
CN115439847A (en) * | 2022-09-02 | 2022-12-06 | 成都视海芯图微电子有限公司 | Foggy day image defogging method and device based on generation countermeasure network |
CN115984568A (en) * | 2022-10-31 | 2023-04-18 | 北京尊冠科技有限公司武汉分公司 | Target detection method in haze environment based on YOLOv3 network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |