CN114973207B

CN114973207B - Road sign identification method based on target detection

Info

Publication number: CN114973207B
Application number: CN202210913244.3A
Authority: CN
Inventors: 金长江; 冉燕辉
Original assignee: Chengdu Aeronautic Polytechnic
Current assignee: Chengdu Aeronautic Polytechnic
Priority date: 2022-08-01
Filing date: 2022-08-01
Publication date: 2022-10-21
Anticipated expiration: 2042-08-01
Also published as: CN114973207A

Abstract

The invention discloses a road sign identification method based on target detection, which comprises the following steps: s1, collecting road sign images, and preprocessing the road sign images to obtain contour data; s2, extracting feature data of the contour data through an LSTM feature extraction module; s3, constructing the feature data and the corresponding labels into a training data set; s4, training the target detection model by adopting a training data set to obtain a trained target detection model; s5, processing the feature data of the road sign image to be recognized by adopting the trained target detection model to obtain a corresponding road sign type; the invention solves the problem of low identification accuracy of the existing target detection method.

Description

Road sign identification method based on target detection

Technical Field

The invention relates to the technical field of image processing, in particular to a road sign identification method based on target detection.

Background

With the development of society, more and more people begin to pay attention to the development of unmanned technology, and as one of the components of comprehensive technology, the technology of identifying objects through algorithm detection has kept its own status. Because deep learning technology has made a certain breakthrough in recent years, the application technology of convolutional neural networks is more and more mature. Meanwhile, because related neural networks such as CNN have own unique advantages in the intelligent identification process, the development of a target detection algorithm combined with deep learning in recent years becomes an important development direction of the current identification detection algorithm. However, most of the existing target detection methods directly process the target image by using the CNN neural network, and the recognition accuracy is not high.

Disclosure of Invention

Aiming at the defects in the prior art, the road sign identification method based on target detection provided by the invention solves the problem of low identification accuracy of the existing target detection method.

In order to achieve the purpose of the invention, the invention adopts the technical scheme that: a road sign identification method based on target detection comprises the following steps:

s1, collecting road sign images, and preprocessing the road sign images to obtain contour data;

s2, extracting feature data of the contour data through an LSTM feature extraction module;

s3, constructing the feature data and the corresponding labels into a training data set;

s4, training the target detection model by adopting a training data set to obtain a trained target detection model;

and S5, processing the feature data of the road sign image to be recognized by adopting the trained target detection model to obtain the corresponding road sign type.

Further, the step S1 includes the following sub-steps:

s11, collecting road sign images;

and S12, extracting the contour of the road sign image to obtain contour data.

Further, the step S12 includes the following sub-steps:

s121, selecting any pixel point from the road sign image as a standard point;

s122, calculating the color distance between other pixel points in the road sign image and the standard point to obtain a plurality of color distance values;

s123, carrying out gray processing on the pixel points corresponding to the color distance values lower than the color threshold, and giving the same gray value to the pixel points corresponding to the color distance values lower than the color threshold;

s124, taking the pixel point corresponding to the color distance value higher than the color threshold value as a new standard point;

s125, calculating the color distance value between the pixel point of the rest part after graying and the new standard point, and skipping to the step S123 until the road sign image is grayed into a gray image with different gray value areas;

s126, selecting pixel points in a non-edge area in the gray image as undetermined points;

s127, judging whether the gray value of 9 pixel points close to the undetermined point is the same as the gray value of the undetermined point, if so, determining the undetermined point as a point to be deleted, and jumping to the step S128, otherwise, reserving the undetermined point, and jumping to the step S129;

s128, randomly finding a pixel point from the neighborhood of the point to be deleted as a new point to be deleted, and jumping to the step S127 until all pixel points in the non-edge area in all gray level images are traversed;

and S129, deleting the points to be deleted, wherein all the pixels of the points to be deleted and the edge area form contour data.

The beneficial effects of the above further scheme are as follows: the method comprises the steps that the color presenting degrees of the road sign image are slightly different due to the influence of illumination, a gray value is given to the area with the same type of color by setting a color threshold value, the color range to which the pixel point does not belong is found in an iterative mode, and therefore another gray value is given to the pixel point, the graying of the road sign image is realized, and each color area is grayed into different gray values; by searching 9 adjacent pixel points around the undetermined point, if the gray values of the 9 pixel points are the same as the gray value of the undetermined point, the non-contour area is indicated, and finally, all the points in the non-contour area need to be deleted, and the rest is contour data.

Further, the calculation formula of the color distance value in step S122 is:

wherein the content of the first and second substances,

is the color distance value between the pixel point and the standard point,

of pixel colour

The passage is provided with a plurality of channels,

of standard dot colour

The passage is provided with a plurality of channels,

of pixel colour

The passage is provided with a plurality of channels,

of standard dot colour

The passage is provided with a plurality of channels,

of pixel colour

The passage is provided with a plurality of channels,

of standard dot colour

A channel.

Further, the structure of the target detection model in step S4 includes: the first residual block, the second residual block, the third residual block, the fourth residual block, the first Maxpool, the second Maxpool, the third Maxpool, the Concat layer, the first Conv, the BN layer and the second Conv;

the input end of the first residual block is used as the input end of the target detection model, and the output end of the first residual block is respectively connected with the input end of the second residual block and the input end of the first Maxpool; the output end of the second residual block is respectively connected with the input end of a third residual block and the input end of a second Maxpool; the output end of the third residual block is respectively connected with the input end of the fourth residual block and the input end of the third Maxpool; a first input end of the Concat layer is connected with an output end of the first Maxpool, a second input end of the Concat layer is connected with an output end of the second Maxpool, a third input end of the Concat layer is connected with an output end of the third Maxpool, a fourth input end of the Concat layer is connected with an output end of the fourth residual block, and an output end of the Concat layer is connected with an input end of the first Conv; the input end of the BN layer is connected with the output layer of the first Conv, and the output end of the BN layer is connected with the input end of the second Conv; and the output end of the second Conv is used as the output end of the target detection model.

Further, the window size of the first Maxpool is 3 x 3; the window size of the second Maxpool is 5 x 5; the window size of the third Maxpool is 7 × 7.

The beneficial effects of the above further scheme are as follows: the characteristics are extracted layer by layer through the first residual block, the second residual block, the third residual block and the fourth residual block, the extracted characteristics of each layer are input into the maximum pooling layer, the characteristics of different degrees are reserved through windows of different sizes, the characteristics are collected through the Concat layer, the richness of the characteristics is reserved to the maximum degree, and the accuracy of target identification is improved.

Further, the loss function of the training process in step S4 is:

wherein the content of the first and second substances,

in order to obtain the value of the loss,

is the actual output of the object detection model,

is the predicted output of the object detection model,

the abscissa of the geometric center of the region output for prediction of the object detection model,

the ordinate of the geometric center of the region that is the prediction output of the object detection model,

is the abscissa of the geometric center of the region of the actual output of the object detection model,

is the ordinate of the geometric center of the region of the actual output of the object detection model,

to cover the actual output of the target detection model

And the predicted output of the object detection model

The linear distance between the two farthest pixels in the region of (a),

detecting the actual output of the model for the object

And the prediction output of the target detection model

Overlap rate of change.

The beneficial effects of the above further scheme are as follows: according to the method, the difference between the actual output and the predicted output in the training process is measured through the intersection of the actual output and the predicted output, the union ratio of the actual output and the predicted output, the ratio of the distance between the actual output center and the actual output center to the linear distance between the two farthest pixel points in the area covering the actual output and the predicted output, and the change rate of the overlapping area, so that the actual output approaches the predicted output.

In conclusion, the beneficial effects of the invention are as follows: according to the method, the road sign image is preprocessed, the key outline data is extracted, the LSTM characteristic extraction module is used for extracting the characteristic data, the characteristic data and the corresponding label are used for training the target detection model, on one hand, the data volume is reduced, on the other hand, the target detection model accurately captures the corresponding relation between the characteristic data and the corresponding label through the characteristic data and the corresponding label, and the target identification accuracy is improved.

Drawings

FIG. 1 is a flow chart of a landmark identification method based on target detection;

FIG. 2 is a schematic diagram of the structure of a cell unit of the LSTM feature extraction module;

fig. 3 is a schematic structural diagram of an object detection model.

Detailed Description

The following description of the embodiments of the present invention is provided to facilitate the understanding of the present invention by those skilled in the art, but it should be understood that the present invention is not limited to the scope of the embodiments, and it will be apparent to those skilled in the art that various changes may be made without departing from the spirit and scope of the invention as defined and defined in the appended claims, and all matters produced by the invention using the inventive concept are protected.

As shown in fig. 1, a landmark identification method based on target detection includes the following steps:

the step S1 comprises the following sub-steps:

s11, collecting road sign images;

and S12, extracting the contour of the road sign image to obtain contour data.

The step S12 includes the following sub-steps:

s121, selecting any pixel point from the road sign image as a standard point;

the formula for calculating the color distance value in step S122 is:

wherein, the first and the second end of the pipe are connected with each other,

is the color distance value between the pixel point and the standard point,

of pixel colour

The passage is provided with a plurality of channels,

of standard dot colour

The passage is provided with a plurality of channels,

of pixel colour

The passage is provided with a plurality of channels,

of standard dot colour

The passage is provided with a plurality of channels,

of pixel colour

The passage is provided with a plurality of channels,

of standard dot colour

A channel.

S123, carrying out gray processing on the pixel points corresponding to the color distance values lower than the color threshold value, and giving the same gray value to the pixel points corresponding to the color distance values lower than the color threshold value;

s127, judging whether the gray value of the adjacent 9 pixel points of the undetermined point is the same as the gray value of the undetermined point, if so, determining the undetermined point as a point to be deleted, and jumping to the step S128, otherwise, reserving the undetermined point and jumping to the step S129;

s128, randomly finding a pixel point from the neighborhood of the point to be deleted to serve as a new point to be deleted, and skipping to the step S127 until all pixel points in the non-edge area in all the gray level images are traversed;

The road sign image has slightly different color presenting degrees due to the influence of illumination, one gray value is given to the area of the same type of color by setting a color threshold value, and the color range to which the pixel point does not belong is found in an iterative mode, so that the other gray value is given to the pixel point, the graying of the road sign image is realized, and each color area is grayed into different gray values; by searching 9 adjacent pixel points around the undetermined point, if the gray values of the 9 pixel points are the same as the gray value of the undetermined point, the non-contour area is indicated, and finally, all the points in the non-contour area need to be deleted, and the rest is contour data.

fig. 2 shows a cell unit of the LSTM feature extraction module in step S2, where input and output relationships of the cell unit are as follows:

resetting a gate:

an input gate:

a memory gate:

an output gate:

wherein the content of the first and second substances,

in order to reset the output of the gate,

is composed of

The output of the cell unit at the time of day,

is the input of the cell unit and is,

in order to reset the weight of the gate,

in order to reset the bias of the gate,

is the output of the input gate or gates,

to enter the weight of the gate(s),

in order to input the offset of the gate,

in order to memorize the output of the gate,

in order to memorize the weight of the gate,

in order to memorize the offset of the door,

is composed of

The output of the cell unit at the time of day,

to be the weight of the output gate,

in order to output the offset of the gate,

for the purpose of the hyperbolic tangent activation function,

is an activation function.

as shown in fig. 3, the structure of the target detection model in step S4 includes: a first residual block, a second residual block, a third residual block, a fourth residual block, a first Maxpool, a second Maxpool, a third Maxpool, a Concat layer, a first Conv, a BN layer, and a second Conv;

the input end of the first residual block is used as the input end of the target detection model, and the output end of the first residual block is respectively connected with the input end of the second residual block and the input end of the first Maxpool; the output end of the second residual block is respectively connected with the input end of the third residual block and the input end of the second Maxpool; the output end of the third residual block is respectively connected with the input end of the fourth residual block and the input end of the third Maxpool; a first input end of the Concat layer is connected with an output end of the first Maxpool, a second input end of the Concat layer is connected with an output end of the second Maxpool, a third input end of the Concat layer is connected with an output end of the third Maxpool, a fourth input end of the Concat layer is connected with an output end of the fourth residual block, and an output end of the Concat layer is connected with an input end of the first Conv; the input end of the BN layer is connected with the output layer of the first Conv, and the output end of the BN layer is connected with the input end of the second Conv; and the output end of the second Conv is used as the output end of the target detection model.

The window size of the first Maxpool is 3 x 3; the window size of the second Maxpool is 5 x 5; the window size of the third Maxpool is 7 × 7.

The characteristics are extracted layer by layer through the first residual block, the second residual block, the third residual block and the fourth residual block, the extracted characteristics of each layer are input into the maximum pooling layer, the characteristics of different degrees are reserved through windows of different sizes, the characteristics are collected through the Concat layer, the richness of the characteristics is reserved to the maximum degree, and the accuracy of target identification is improved.

The loss function of the training process in step S4 is:

wherein the content of the first and second substances,

in order to obtain the value of the loss,

is the actual output of the object detection model,

for the predicted output of the object detection model,

the abscissa of the geometric center of the region that is the prediction output of the object detection model,

region table for prediction output of target detection modelThe ordinate of which center is,

to cover the actual output of the target detection model

And the predicted output of the object detection model

The linear distance between the two farthest pixels in the region of (a),

detecting the actual output of the model for the object

And the prediction output of the target detection model

The area of (2) is the image area formed by the pixel data.

According to the method, the difference between the actual output and the predicted output in the training process is measured through the intersection of the actual output and the predicted output, the union ratio of the actual output and the predicted output, the ratio of the distance between the actual output center and the actual output center to the linear distance between the two farthest pixel points in the area covering the actual output and the predicted output, and the change rate of the overlapping area, so that the actual output approaches the predicted output.

Claims

1. A road sign identification method based on target detection is characterized by comprising the following steps:

s5, processing the feature data of the road sign image to be recognized by adopting the trained target detection model to obtain a corresponding road sign type;

the step S1 comprises the following sub-steps:

s11, collecting road sign images;

s12, extracting the contour of the road sign image to obtain contour data;

the step S12 comprises the following sub-steps:

s121, selecting any pixel point from the road sign image as a standard point;

s125, calculating the color distance value between the pixel point of the rest part after graying and the new standard point, and jumping to the step S123 until the road sign image is grayed into a grayscale image with different grayscale value areas;

s129, deleting the points to be deleted, wherein all the points to be deleted and the pixels of the edge area form contour data;

the calculation formula of the color distance value in step S122 is:

wherein d is the color distance value between the pixel point and the standard point, P _1,R R channel, P, of pixel color _2,R R channel, P, for standard dot color _1,G G channel, P, of pixel color _2,G G channel, P, of standard dot color _1,B B channel, P, of pixel color _2,B A B channel of standard dot color;

the loss function of the training process in step S4 is:

wherein L is a loss value,

is the actual output of the target detection model, y is the predicted output of the target detection model, x _* Abscissa, y, of geometric center of region output for prediction of object detection model _* The ordinate of the geometric center of the region output for prediction of the object detection model,

the abscissa of the geometric center of the region that is the actual output of the object detection model,

ordinate of geometric center of region as actual output of target detection model, di as actual output of target detection model

And the linear distance between the farthest two pixel points in the region of the predicted output y of the target detection model, and v is the actual output of the target detection model

And the area of the predicted output y of the object detection model.

2. A landmark recognition method based on object detection according to claim 1, wherein the structure of the object detection model in step S4 includes: the first residual block, the second residual block, the third residual block, the fourth residual block, the first Maxpool, the second Maxpool, the third Maxpool, the Concat layer, the first Conv, the BN layer and the second Conv;

3. A method as claimed in claim 2, wherein the window size of the first Maxpool is 3 x 3; the window size of the second Maxpool is 5 x 5; the third Maxpool has a window size of 7 × 7.