CN112101283A - Intelligent identification method and system for traffic signs


Info

Publication number
CN112101283A
CN112101283A
Authority
CN
China
Prior art keywords
image
traffic sign
connected domain
standard
identified
Prior art date
Legal status
Pending
Application number
CN202011023022.1A
Other languages
Chinese (zh)
Inventor
黄北贵
赵汇诗
张锦鑫
Current Assignee
Shenzhen Technology University
Original Assignee
Shenzhen Technology University
Priority date
Filing date
Publication date
Application filed by Shenzhen Technology University filed Critical Shenzhen Technology University
Priority to CN202011023022.1A priority Critical patent/CN112101283A/en
Publication of CN112101283A publication Critical patent/CN112101283A/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/582 - Recognition of traffic signs
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/22 - Matching criteria, e.g. proximity measures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/30 - Noise filtering
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/32 - Normalisation of the pattern dimensions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/56 - Extraction of image or video features relating to colour

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an intelligent traffic sign identification method for identifying traffic signs. It addresses the technical problem in the prior art that distortion of the acquired traffic sign images makes the subsequent processing results inaccurate and reduces the accuracy of traffic sign identification. The method comprises the following steps: performing noise-reduction preprocessing on the acquired image to obtain a preprocessed image; detecting and segmenting the traffic sign in the preprocessed image to obtain an image to be identified of the traffic sign image; and identifying the image to be identified. By performing noise-reduction processing on the acquired images, the probability of image distortion is reduced, which improves the accuracy of the subsequent processing results and hence the accuracy of traffic sign identification.

Description

Intelligent identification method and system for traffic signs
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an intelligent identification method and system for a traffic sign.
Background
The traffic sign recognition technology is an important component of an intelligent traffic system, and has wide application prospects in the fields of driving assistance systems, unmanned automobiles and the like.
The traffic sign recognition technology relies on image processing to eliminate useless information from the collected traffic sign image.
However, traffic signs in natural scenes are affected by many complex outdoor environments, by motion blur and by the geometry of the sign itself, so the acquired traffic sign image is distorted; the subsequent processing results are then inaccurate, and the accuracy of traffic sign identification is reduced.
Disclosure of Invention
The invention mainly aims to provide an intelligent traffic sign identification method and system, and aims to solve the technical problems that in the prior art, acquired traffic sign images are distorted, so that subsequent processing results are inaccurate, and the accuracy of traffic sign identification is reduced.
In order to achieve the above object, a first aspect of the present invention provides an intelligent traffic sign recognition method, comprising: performing noise-reduction preprocessing on the acquired image to obtain a preprocessed image; detecting and segmenting the traffic sign in the preprocessed image to obtain an image to be identified of the traffic sign image; and identifying the image to be identified.
A second aspect of the present application provides a traffic sign intelligent recognition system, including: the preprocessing module is used for carrying out noise reduction preprocessing on the acquired image to obtain a preprocessed image; the detection and segmentation module is used for detecting and segmenting the traffic sign of the preprocessed image to obtain an image to be identified of the traffic sign image; and the image identification module is used for identifying the image to be identified.
The invention provides an intelligent identification method and system for traffic signs, which have the beneficial effects that: by carrying out noise reduction processing on the acquired images, the probability of acquired image distortion is reduced, so that the accuracy of subsequent processing results is improved, and the accuracy of traffic sign identification is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of a traffic sign intelligent recognition method according to an embodiment of the present invention;
FIG. 2 is a schematic interpolation diagram of a size reduction method of the traffic sign intelligent recognition method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating the comparison between an original image and a reduced image of a size reduction method of an intelligent traffic sign recognition method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a triangular sign detection algorithm of the traffic sign intelligent recognition method according to the embodiment of the invention;
fig. 5 is an original image, a connected domain mark schematic diagram and a final positioning diagram obtained by the traffic sign intelligent identification method according to the embodiment of the invention;
FIG. 6 is a final segmentation chart and an image to be recognized of the traffic sign intelligent recognition method according to the embodiment of the invention;
fig. 7 is a schematic diagram of standardization of a standard image of the traffic sign intelligent recognition method according to the embodiment of the invention.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, an intelligent traffic sign recognition method includes: s1, carrying out noise reduction preprocessing on the collected image to obtain a preprocessed image; s2, detecting and segmenting the traffic sign of the preprocessed image to obtain an image to be identified of the traffic sign image; and S3, identifying the image to be identified.
In the process of identifying the traffic sign, the collected image is subjected to noise reduction processing, so that the probability of distortion of the collected image is reduced, the accuracy of a subsequent processing result is improved, and the accuracy of traffic sign identification is improved.
The noise-reduction preprocessing of the acquired image comprises the following steps: reducing the size of the acquired image by bilinear interpolation; enhancing the brightness of the reduced image by histogram equalization; and performing noise reduction on the brightness-enhanced image by median filtering to obtain the preprocessed image.
In order to improve the speed of detection and identification, three measures are taken. First, on the premise of preserving the completeness of the sign information, the acquired traffic sign live-action image is down-sampled; this lowers the image resolution and the amount of data to be processed, while the image is also normalized to facilitate the subsequent detection and identification. Second, to address the insufficient contrast of acquired traffic sign images, and based on a study of common image-enhancement algorithms, histogram equalization is used to enhance the brightness of the image. Finally, image-denoising methods were studied, and a comparison experiment led to the choice of median filtering to denoise the traffic sign live-action image, which effectively reduces image noise.
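Before each step is examined in detail, the whole chain can be sketched with standard OpenCV calls. The sketch below is a minimal illustration, not the patent's reference implementation: the input file name, the 1/6 scale factor (chosen later in this section) and the 3 × 3 median kernel are assumptions.

```python
import cv2

img = cv2.imread("road_scene.jpg")                      # acquired live-action image (assumed path)

# 1. Size reduction with bilinear interpolation (down-sampling).
small = cv2.resize(img, None, fx=1/6, fy=1/6, interpolation=cv2.INTER_LINEAR)

# 2. Brightness enhancement by histogram equalization. equalizeHist works on
#    single-channel images, so the V channel of the HSV representation is used.
hsv = cv2.cvtColor(small, cv2.COLOR_BGR2HSV)
hsv[:, :, 2] = cv2.equalizeHist(hsv[:, :, 2])
enhanced = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

# 3. Noise reduction with a median filter (kernel size assumed).
preprocessed = cv2.medianBlur(enhanced, 3)
```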
The down-sampling method is a size-reduction method with two functions: first, size reduction makes the image consistent with the size of the display area; second, the scaled image is of clearer quality than the original, which greatly improves the detection rate.
The principle and process of size reduction are as follows: an original image M of size a × b is down-sampled by a factor of N to obtain a scaled image of size (a/N) × (b/N), where N is a common divisor of a and b. When the reduced image is represented in matrix form, each pixel corresponds to an N × N window of the original image and takes the average value of all pixels in that window:

\bar{f} = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} f(i, j)
In this embodiment, bilinear interpolation is adopted as the size-reduction method. Its principle and implementation are as follows: the pixel values of four points in the original image are used to obtain each pixel of the scaled image. Assume the unknown function f is to be evaluated at the point P = (x, y); that is, the pixel value at P is estimated from the pixel values of the four points Q shown in the interpolation diagram of fig. 2, where Q11 = (x1, y1), Q12 = (x1, y2), Q21 = (x2, y1) and Q22 = (x2, y2) are the four known points.
Linear interpolation in the x-direction gives:

f(R_1) \approx \frac{x_2 - x}{x_2 - x_1} f(Q_{11}) + \frac{x - x_1}{x_2 - x_1} f(Q_{21}), \quad R_1 = (x, y_1)

f(R_2) \approx \frac{x_2 - x}{x_2 - x_1} f(Q_{12}) + \frac{x - x_1}{x_2 - x_1} f(Q_{22}), \quad R_2 = (x, y_2)
Interpolation in the y-direction then gives:

f(P) \approx \frac{y_2 - y}{y_2 - y_1} f(R_1) + \frac{y - y_1}{y_2 - y_1} f(R_2)
Combining the three formulas, f(x, y) is obtained:

f(x, y) \approx \frac{f(Q_{11})(x_2 - x)(y_2 - y) + f(Q_{21})(x - x_1)(y_2 - y) + f(Q_{12})(x_2 - x)(y - y_1) + f(Q_{22})(x - x_1)(y - y_1)}{(x_2 - x_1)(y_2 - y_1)}
Bilinear interpolation of an image uses only the four neighbouring points, so the denominators above equal 1; on the unit square this reduces to

f(x, y) \approx f(0,0)(1 - x)(1 - y) + f(1,0)\,x(1 - y) + f(0,1)(1 - x)\,y + f(1,1)\,xy
Or, expressed in matrix form:

f(x, y) \approx \begin{bmatrix} 1 - x & x \end{bmatrix} \begin{bmatrix} f(0,0) & f(0,1) \\ f(1,0) & f(1,1) \end{bmatrix} \begin{bmatrix} 1 - y \\ y \end{bmatrix}
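For a concrete check of the formula, the unit-square expression can be written out directly in NumPy. The helper below is an illustrative sketch (the function name and the edge clamping are assumptions, not from the patent text); it evaluates one output pixel from its four neighbours:

```python
import numpy as np

def bilinear(img, x, y):
    """Interpolate the pixel value at the fractional position (x, y)."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1 = min(x0 + 1, img.shape[1] - 1)          # clamp at the image border
    y1 = min(y0 + 1, img.shape[0] - 1)
    dx, dy = x - x0, y - y0
    # f(x,y) ~ f(0,0)(1-dx)(1-dy) + f(1,0)dx(1-dy) + f(0,1)(1-dx)dy + f(1,1)dxdy
    return (img[y0, x0] * (1 - dx) * (1 - dy) + img[y0, x1] * dx * (1 - dy)
            + img[y1, x0] * (1 - dx) * dy + img[y1, x1] * dx * dy)
```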
the image size is reduced through a down-sampling process, and after a road image in an actual scene is subjected to linear operation mainly by using a bilinear interpolation algorithm, it is found through experiments that the original image size 2500 × 1550 is reduced to 500 × 310, that is, to 1/6 times the original size, (N is 6), and the effect is the best, so in this embodiment, the collected image is reduced by 6 times, the size reduction is as shown in fig. 3(a) and 3(b), fig. 3(a) is an original image, and the four images in fig. 3(b) are reduced images of original 1/2(N is 2), 1/4(N is 4), 1/6(N is 6), and 1/8(N is 8), respectively.
The image-brightening method above uses histogram equalization, i.e. the grey levels of the image are stretched nonlinearly so that the number of pixels in each grey-scale range becomes approximately equal. The histogram is essentially a statistical description that represents the frequency with which each grey level occurs in the image. It has two main uses: first, it reflects the grey-level distribution of the whole image; second, the brightness contrast and average brightness can be read from the grey-level frequencies.
The grey-level histogram reflects the number of pixels at each grey level in an image and is a function of the grey level. The abscissa represents the grey level r of the image and the ordinate the frequency pr(r) of each grey level; the complete coordinate system shows the grey-level distribution characteristics of the image. If most pixels are concentrated in a low-grey region, the image appears dark; if the high-grey region has more pixels, the image is very bright.
Histogram equalization of a grey-scale image uses the OpenCV image-processing function cv2.equalizeHist. After histogram equalization, much detail information in the image becomes visible and the image becomes clearer.
In order to compare the frequency change of each grey level before and after preprocessing, histograms are drawn for the images before and after equalization; the equalized histogram is distributed more uniformly than the original one. Grey levels originally occupied by relatively few pixels are reassigned to other grey levels, and the pixels become less concentrated. After equalization the grey-scale range, the contrast and the sharpness are all enlarged, effectively enhancing the image brightness. Therefore histogram equalization is used to brighten the traffic sign image during preprocessing.
In this embodiment, the noise reduction uses median filtering, a nonlinear filter with a good denoising effect in image smoothing. Its basic principle is as follows: the pixel values in a neighbourhood are sorted by grey level, and the central pixel value is replaced by the median of the nearby pixels. No new pixel values are created; the output is taken from the surrounding pixel values. Pixels in the image are first scanned with a window W, then sorted, giving the formula g(x, y) = med{ f(x − k, y − l), (k, l) ∈ W }. The number of pixels in the window is usually odd so that a single middle value exists; if it is even, the median is the average of the middle two pixels.
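A direct NumPy rendering of the windowed-median formula above may help; it is a naive sketch for illustration (cv2.medianBlur is the practical choice):

```python
import numpy as np

def median_filter(gray, k=3):
    """g(x, y) = med{ f(x - k, y - l), (k, l) in W } over a k x k window."""
    pad = k // 2
    padded = np.pad(gray, pad, mode="edge")      # replicate the border pixels
    out = np.empty_like(gray)
    for y in range(gray.shape[0]):
        for x in range(gray.shape[1]):
            # np.median averages the two middle values when the count is even,
            # matching the even-window rule stated above.
            out[y, x] = np.median(padded[y:y + k, x:x + k])
    return out
```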
Median filtering is a classic smoothing and denoising method, and it works well on images with blurred edges because it preserves edges and other detail information well.
In other embodiments, the filtering method may also be mean filtering, bilateral filtering or Gaussian filtering.
The method for detecting and segmenting the traffic sign of the preprocessed image to obtain the image to be identified of the traffic sign image comprises the following steps: converting the preprocessed image from an RGB color model to an HSV model representation; detecting the color characteristics of the preprocessed image by using an HSV model and carrying out threshold segmentation to obtain a color threshold segmentation graph with a connected domain; acquiring corner points of each connected domain on the color threshold segmentation graph; determining whether each connected domain is in the shape of a traffic sign or not according to the corner points, if so, keeping the connected domain on the color threshold segmentation graph, and if not, removing the connected domain from the color threshold segmentation graph; finding out a connected domain of the traffic sign image from the connected domain on the color threshold segmentation map; and segmenting and processing the traffic sign image contained in the connected domain from the preprocessed image to obtain the image to be recognized of the traffic sign image.
The RGB model is machine-oriented, whereas the HSI model corresponds one-to-one to human colour perception, so conversion between the RGB and HSI models is often used in practical applications.
First, the value ranges of the R, G and B components are set to [0, 1]; the corresponding H, S and I component values are then calculated according to the following formulas:

H = \begin{cases} \theta, & B \le G \\ 360^\circ - \theta, & B > G \end{cases}

\theta = \arccos\left( \frac{\tfrac{1}{2}\left[(R - G) + (R - B)\right]}{\sqrt{(R - G)^2 + (R - B)(G - B)}} \right)

S = 1 - \frac{3\,\min(R, G, B)}{R + G + B}

I = \frac{R + G + B}{3}
at present, the RGB color model and the HSV color model are widely used in image detection and recognition. Computers typically use the RGB color model to store and display images. Therefore, the color information of the image in the HSV space must be obtained through conversion between the RGB color model and the HSV color model. The HSV and RGB color models use three channels to represent color information, and the transformation can be realized by the following method:
converting the preprocessed image from the RGB color model to the HSV model representation includes: normalizing the three components R, G, B of RGB to the range of [0, 1], and defining the value ranges of the three components H, S, V of HSV as H [0,360 ], S [0, 1], V [0, 1 ]; the conversion relationship between the RGB color model and the HSV model is as follows:
V = max(R, G, B)

S = (max(R, G, B) − min(R, G, B)) / max(R, G, B)

If R = max(R, G, B): H′ = (G − B) / (max(R, G, B) − min(R, G, B))

If G = max(R, G, B): H′ = 2 + (B − R) / (max(R, G, B) − min(R, G, B))

If B = max(R, G, B): H′ = 4 + (R − G) / (max(R, G, B) − min(R, G, B))

where H = H′ × 60 when H′ ≥ 0, and H = H′ × 60 + 360 when H′ < 0.
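A per-pixel sketch of this piecewise conversion follows; the guards for grey and black pixels (max = min or max = 0), which the formulas leave undefined, are assumptions:

```python
def rgb_to_hsv(r, g, b):
    """RGB components in [0, 1] to HSV with H in [0, 360), S and V in [0, 1]."""
    mx, mn = max(r, g, b), min(r, g, b)
    v = mx
    s = 0.0 if mx == 0 else (mx - mn) / mx
    if mx == mn:
        return 0.0, s, v                     # hue undefined for grey pixels; 0 by convention
    if mx == r:
        hp = (g - b) / (mx - mn)
    elif mx == g:
        hp = 2 + (b - r) / (mx - mn)
    else:
        hp = 4 + (r - g) / (mx - mn)
    h = hp * 60 if hp >= 0 else hp * 60 + 360
    return h, s, v
```

For example, a pure red pixel (1, 0, 0) maps to H = 0°, S = 1, V = 1.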
In this embodiment, the HSV colour model is used for sign detection and segmentation; the threshold ranges of the parameters H, S and V are listed in Table 1 (reproduced only as an image in the original document, so the numerical thresholds are not available here).
The image to be detected must also undergo morphological processing, region filling and segmentation. The basic operations of morphology are erosion, dilation, opening and closing. Applying these basic operations to the noisy binary image produced by the colour segmentation of the traffic sign yields a morphologically processed image. Region filling starts from a given region boundary; filling the image requires deleting non-target items by a region-area threshold. A region-filled map is obtained after filling, and the segmentation is finally performed according to this map.
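A sketch of the thresholding and morphological clean-up follows; the HSV bounds below are illustrative values for red signs, not the thresholds of Table 1, and the 5 × 5 kernel is likewise an assumption:

```python
import cv2
import numpy as np

img = cv2.imread("preprocessed.jpg")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, (0, 100, 80), (10, 255, 255))    # assumed red band

kernel = np.ones((5, 5), np.uint8)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)    # erosion then dilation: removes specks
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)   # dilation then erosion: fills small holes
```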
During segmentation, the shape of the traffic sign on the image to be recognized must first be detected; traffic signs are generally triangular, rectangular or circular. The sign is finally segmented.
The method for detecting the traffic sign on the image to be recognized is based on the HSV colour model together with corner-based shape detection; it builds on colour-based and shape-based traffic sign detection methods, combining the two and improving on approaches that use colour features and shape cues simultaneously. Corner detection means finding corners in the image and then determining the position of the shape from suitably located corners.
The detection mode of the triangular mark is as follows:
First, appropriate masks are created for the three corners of the triangular sign. The sign is a regular (equilateral) triangle with the apex pointing upward, so each angle is 60°. Masks such as those in Tables 2 and 3 below can be established. Since a triangle is an axisymmetric pattern, its lower-left and lower-right corners are symmetric, so their masks are also symmetric; only the lower-left mask is given here, and the lower-right mask can be derived by symmetry.
-6 -11 -11 -6 0 -6 -11 -11 -6
-11 -18 -18 -11 0 -11 -18 -18 -11
-11 -18 -18 -11 0 -11 -18 -18 -11
-6 -11 -11 -6 0 -6 -11 -11 -6
-6 -11 -11 12 12 12 -11 -11 -6
-11 -18 -19 19 20 19 -18 -18 -11
-11 -18 17 19 20 19 17 -18 -11
-6 -11 10 12 12 12 10 -11 -6
0 12 20 20 12 20 20 12 0
TABLE 2 triangle symbol upper vertex angle mask table
-6 -11 -11 -6 0 -6 7 7 4
-11 -18 -18 -11 0 -11 13 13 8
-11 -18 -18 -11 0 10 17 17 10
-6 -11 -11 -6 0 12 19 19 12
0 0 0 0 0 12 20 20 12
-6 -11 -11 -6 0 -6 -11 -11 -6
-11 -18 -18 -11 0 -11 -18 -18 -11
-11 -18 -18 -11 0 -11 -18 -18 -11
-6 -11 -11 -6 0 -6 -11 -11 -6
TABLE 3 triangle symbol lower left corner mask sheet
The steps for detecting the triangular mark are as follows:
(1) Find the upper vertex, the lower-left corner and the lower-right corner with the masks.
(2) Examine the positions of these three corners and judge whether they can form a triangle.
The detection algorithm steps are as follows:
(1) Scan the entire image from left to right and top to bottom. The upper vertex is found from the convolution of the image with the upper-vertex mask of the triangle.
(2) From this upper vertex, draw two rays with slopes of 52° and 68° through it. Search for the lower-left corner with the lower-left mask of the triangle in the region bounded by these two lines and by the sign height or the image edge. If the lower-left corner is found, go to step (3); if it has not been found when the region search finishes, return to step (1).
(3) After the lower-left corner is found, draw two rays with slopes of −52° and −68° through the upper vertex, just as when searching for the lower-left corner. Then search for the lower-right corner with the lower-right mask of the triangle in the region bounded by these two rays and the height limit or the image edge. If it is found, the three corners form a triangle and the triangular sign is detected; if it has not been found when the region search finishes, return to step (2).
The whole algorithm is shown in fig. 4: the left grey area is the search region for the lower-left corner and the right grey area the search region for the lower-right corner. Corners 1, 2 and 3 are the upper vertex, the lower-left corner and the lower-right corner of the triangle, respectively. Height min is the minimum sign height, Height max the maximum sign height, and Height cor2 the height of the lower-left corner point; a is the width of the search region to the left of the lower-left corner point.
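As an illustration of the mask search, the Table 2 mask can be applied with a single 2-D sliding correlation over the binary segmentation map. The sketch below is an assumed rendering: the file name and the response threshold of 300 are illustrative, and cv2.filter2D computes correlation, which is exactly what the sliding-mask comparison needs:

```python
import cv2
import numpy as np

# Upper-vertex mask of the triangular sign, transcribed from Table 2 above.
upper_vertex_mask = np.array([
    [-6,  -11, -11, -6,  0,  -6,  -11, -11, -6],
    [-11, -18, -18, -11, 0,  -11, -18, -18, -11],
    [-11, -18, -18, -11, 0,  -11, -18, -18, -11],
    [-6,  -11, -11, -6,  0,  -6,  -11, -11, -6],
    [-6,  -11, -11, 12,  12, 12,  -11, -11, -6],
    [-11, -18, -19, 19,  20, 19,  -18, -18, -11],
    [-11, -18,  17, 19,  20, 19,   17, -18, -11],
    [-6,  -11,  10, 12,  12, 12,   10, -11, -6],
    [0,    12,  20, 20,  12, 20,   20,  12,  0],
], dtype=np.float32)

binary = cv2.imread("threshold_map.png", cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0
response = cv2.filter2D(binary, -1, upper_vertex_mask)
ys, xs = np.where(response > 300)            # candidate upper vertices (assumed threshold)
```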
The detection mode of the rectangular mark is as follows:
The four corners of a rectangle are all right angles, and a mask as in Table 4 below is established for the lower-left corner of the rectangle. The rectangle is also an axisymmetric figure, so the masks for the other three corners can be obtained by symmetric transformations of the lower-left mask.
-6 -11 -11 -6 0 -6 7 7 4
-11 -18 -18 -11 0 -11 13 13 8
-11 -18 -18 -11 0 10 17 17 10
-6 -11 -11 -6 0 12 19 19 12
0 0 0 0 0 12 20 20 12
-6 -11 -11 -6 0 -6 -11 -11 -6
-11 -18 -18 -11 0 -11 -18 -18 -11
-11 -18 -18 -11 0 -11 -18 -18 -11
-6 -11 -11 -6 0 -6 -11 -11 -6
TABLE 4 lower left corner mask with rectangular marks
The steps for detecting the rectangular marker are as follows:
(1) Find the four corners of the rectangular sign with the masks.
(2) Examine the positions of these four corners and judge whether they can form a rectangle.
The detection algorithm steps are as follows:
(1) Scan the image from left to right and top to bottom and compute the convolution with the upper-left-corner mask of the rectangle. When the upper-left corner is found, go to step (2).
(2) Two rays with slopes of-85 deg. and-95 deg. are taken through this point. The lower left corner of the rectangle is then found through the rectangle lower left corner mask in the area bounded by the two rays and the marker height or image edge. And (5) if the step (3) is carried out, returning to the step (1) if the lower left corner is not found after the area search is finished.
(3) Draw two rays with slopes of +5° and −5° through the lower-left vertex, then search for the lower-right corner of the rectangle with the rectangle's lower-right mask in the area bounded by these two rays and the sign width or image edge. If it is found, go to step (4); if it has not been found when the area search finishes, return to step (2).
(4) Draw two rays with slopes of 85° and 95° through the lower-right vertex. Then search for the upper-right corner of the rectangle with the rectangle's upper-right mask in the area bounded by the two rays and the height limit or the image edges. If it is found, the rectangular sign is considered detected; if it has not been found when the area search finishes, return to step (3).
The detection mode of the circular mark is as follows:
Circular signs are detected in the same way as rectangular signs; the rectangular masks can even be used directly. Of course, what is detected is no longer corners but the circular arcs at 45°, 135°, 225° and 315° on the circumference. Because thresholds are used and the shapes in real images are rarely ideal, many points near a corner or arc position may reach the convolution threshold, in which case the centroid must be computed. Rectangular and circular signs are distinguished by the distribution of the points that meet the threshold: for a rectangular sign they tend to be concentrated, whereas for a circular sign they are spread along an elongated arc. In this way it can finally be decided whether a detection is a rectangular sign or a circular sign.
Thus, after the colour-feature threshold segmentation of the image to be recognized, a number of connected domains are obtained; corner-based shape detection on the colour threshold segmentation map then reduces the number of connected domains considerably, and at this point the connected domain belonging to a traffic sign has the largest area. When one image contains several traffic signs, the connected domains are sorted by area: if the area of the current connected domain is larger than 50% of the area of the previous one, the current domain is still considered the connected domain of a traffic sign; if it is smaller than 50% of the previous one, the optimization of the connected domains in the image is considered finished, and the current domain and all those after it are connected domains of non-traffic-sign content.
Therefore, finding the traffic sign image from the connected domain on the color threshold segmentation map includes: arranging the connected domains in descending order according to the area; if the area of the current connected domain is larger than 50% of the area of the previous connected domain, taking the current connected domain as a traffic sign image; and if the area of the current connected domain is less than 50% of the area of the previous connected domain, taking the current connected domain and the connected domains behind the current connected domain as non-traffic sign images.
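A compact sketch of this preference rule, using OpenCV's connected-component labelling; the input file name is an assumption:

```python
import cv2
import numpy as np

mask = cv2.imread("color_threshold_map.png", cv2.IMREAD_GRAYSCALE)
num, labels, stats, _ = cv2.connectedComponentsWithStats(mask, connectivity=8)

areas = stats[1:, cv2.CC_STAT_AREA]           # label 0 is the background
order = np.argsort(areas)[::-1]               # descending by area

kept = [order[0]]
for idx in order[1:]:
    if areas[idx] > 0.5 * areas[kept[-1]]:
        kept.append(idx)                      # still a traffic-sign domain
    else:
        break                                 # this and all smaller domains are non-sign
sign_labels = [int(i) + 1 for i in kept]      # +1 restores the original label ids
```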
Taking fig. 5(a) as an example: after the connected domains are sorted in descending order of area, the largest has area 8457, the next 8338 and the next 307. According to the connected-domain optimization algorithm introduced above, the image contains two traffic signs, which agrees with visual inspection. The smallest of the qualifying connected domains is then selected, its area is taken as a threshold, and all connected domains with smaller areas are deleted; the final positioning map is shown in fig. 5(b). In fig. 5(a) the original image is on the left and the connected-domain labelling diagram on the right; fig. 5(b) is the final positioning map of the traffic signs represented by connected domains.
Based on the final positioning map, the traffic sign can be segmented from the original image according to the bounding-box principle, giving the final segmentation image shown in fig. 6(a).
The final segmentation map still differs from the images in the template library, which hinders the subsequent template-matching identification; the image to be identified for the traffic sign, shown in fig. 6(b), is therefore obtained by binarizing, colour-inverting and size-normalizing the final segmentation map.
Therefore, segmenting and processing the traffic sign image contained in the connected domain from the preprocessed image to obtain the image to be recognized comprises: segmenting the traffic sign image from the preprocessed image by means of a minimum bounding rectangle; and binarizing, colour-inverting and size-normalizing the traffic sign image to obtain the image to be identified for traffic sign identification.
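A minimal sketch of this segmentation and normalization chain; the file names, the use of Otsu thresholding and the nearest-neighbour resize are illustrative assumptions, while the 86 × 86 target matches the standard-library size quoted below:

```python
import cv2

pre = cv2.imread("preprocessed.jpg")
mask = cv2.imread("sign_domain_mask.png", cv2.IMREAD_GRAYSCALE)

contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
patch = pre[y:y + h, x:x + w]                                   # final segmentation image

gray = cv2.cvtColor(patch, cv2.COLOR_BGR2GRAY)
_, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
bw = cv2.bitwise_not(bw)                                        # reverse colour
to_recognize = cv2.resize(bw, (86, 86), interpolation=cv2.INTER_NEAREST)
```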
The method for identifying the image to be identified provided by the embodiment has three methods, which are respectively as follows: the identification method based on the Pearson correlation coefficient, the identification method based on the SURF characteristic and the identification method based on the BP neural network.
The three identification methods make use of a standard library. The standard library is established by selecting 131 traffic sign images, comprising all prohibition-sign images, all indication-sign images and all warning-sign images; these are the standard images. The standard library is obtained after binarization, colour inversion, edge erasure and normalization. Fig. 7 shows the various forms of a standard image after this processing (in the original images, the other sign parts are red except the human-shaped symbol).
When the Pearson-correlation-coefficient identification method is used, identifying the image to be identified comprises the following steps: extracting the grey-level distributions of the standard image and the image to be identified in four directions (horizontal, vertical, 45° and 135°), the standard image being a standard traffic sign image in the standard library; calculating the Pearson correlation coefficients of the standard image and the image to be recognized in the four directions; calculating the average of the Pearson correlation coefficients over the four directions; and comparing the maximum of the averages with a preset average threshold: if the maximum is greater than the threshold, the image to be recognized matches the standard image, and that standard image is taken as the recognition result.
The Pearson-correlation-coefficient identification method extracts the grey-level distributions of the standard image and the image to be identified in the horizontal, vertical, 45° and 135° directions, calculates the Pearson correlation coefficients of the source sign image and the template sign image in the four directions, and sets a threshold on the average coefficient; the two sign images are judged to match successfully when the calculated correlation coefficient is the maximum and exceeds the set threshold.
The standard image size in the standard library used in this design was 86 x 86, i.e., 86 pixels long and 86 pixels wide. When matching is performed, the size of the image to be recognized is firstly normalized to 86 × 86, and then the Pearson correlation coefficient is calculated:
(1) in the vertical direction
The Pearson correlation coefficients of the corresponding columns of the standard image and the image to be recognized are calculated one by one, 86 column correlation coefficient values are obtained because the images have 86 columns in total, and the vertical direction correlation coefficient of the image to be recognized and the standard image is obtained by calculating the average value of the 86 values. Since there are 131 standard images, a 1 × 131 matrix of correlation coefficients is obtained.
(2) In the horizontal direction
The Pearson correlation coefficients of the corresponding rows of the standard image and the image to be recognized are calculated one by one, 86 row correlation coefficient values can be obtained because the images have 86 rows in total, and the horizontal direction correlation coefficient of the image to be recognized and the standard image is obtained by calculating the average value of the 86 values. Since there are a total of 131 standard images, a second matrix of 1 × 131 correlation coefficients is obtained.
(3) In the 45 deg. direction
The calculation of the Pearson correlation coefficient in the 45° direction is relatively complicated: the pixel arrangement of the image in the 45° direction cannot be extracted directly, so the image is rotated 45° clockwise, which turns the task into extracting the pixel arrangement in the horizontal direction. A third 1 × 131 correlation-coefficient matrix is finally obtained.
(4)135 deg. direction
As in the 45° case, the image is rotated 135° clockwise before processing, and a fourth 1 × 131 correlation-coefficient matrix is finally obtained. Stacking the four rows gives a 4 × 131 correlation-coefficient matrix.
For the image to be recognized and each standard image, the average of the 4 directional correlation coefficients is computed and the averages are sorted in descending order, with the threshold taken as 0.4. If the maximum of the sorted correlation coefficients is greater than 0.4, that maximum is taken, and the image it represents is the recognized traffic sign; if the maximum is still below 0.4, the image to be recognized is deemed to have no match in the standard library.
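A sketch of the four-direction matching, assuming both images are already 86 × 86 grey-scale arrays; the clockwise rotation via warpAffine and the nanmean guard for constant rows are illustrative choices not fixed by the text:

```python
import cv2
import numpy as np

def directional_corr(a, b):
    """Mean per-row Pearson coefficient of two equal-size images."""
    coeffs = [np.corrcoef(a[i], b[i])[0, 1] for i in range(a.shape[0])]
    return float(np.nanmean(coeffs))          # skip rows with zero variance

def rotate_cw(img, angle):
    h, w = img.shape
    m = cv2.getRotationMatrix2D((w / 2, h / 2), -angle, 1.0)  # negative = clockwise
    return cv2.warpAffine(img, m, (w, h))

def match_score(query, template):
    """Average of the horizontal, vertical, 45-degree and 135-degree coefficients."""
    return float(np.mean([
        directional_corr(query, template),                    # rows (horizontal)
        directional_corr(query.T, template.T),                # columns (vertical)
        directional_corr(rotate_cw(query, 45), rotate_cw(template, 45)),
        directional_corr(rotate_cw(query, 135), rotate_cw(template, 135)),
    ]))
```

Scoring the image to be recognized against all 131 standard images and taking the largest match_score then implements the 0.4-threshold decision above.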
When the SURF-feature-based identification method is used, identifying the image to be identified comprises: roughly classifying the image to be detected; extracting SURF feature points from the roughly classified image to be detected; obtaining in advance the standard feature points and descriptions of the SURF feature points of the roughly classified standard images; and matching the SURF feature points of the image to be detected against the standard feature points; if the matching succeeds, the description of the standard image represented by the standard feature points is taken as the identification result.
The SURF (speeded up robust feature) algorithm is a robust local feature point detection and description algorithm, and the principle and implementation process are as follows:
1. Constructing the Hessian matrix
All interest points for feature extraction are generated by constructing a Hessian matrix, which is built as follows:
H(f(x, y)) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\,\partial y} \\ \dfrac{\partial^2 f}{\partial x\,\partial y} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix}
constructing a Hessian matrix, and performing image filtering through Gaussian convolution, wherein the filtered Hessian matrix is expressed as:
H(\mathbf{x}, \sigma) = \begin{bmatrix} L_{xx}(\mathbf{x}, \sigma) & L_{xy}(\mathbf{x}, \sigma) \\ L_{xy}(\mathbf{x}, \sigma) & L_{yy}(\mathbf{x}, \sigma) \end{bmatrix}
When the discriminant of this matrix reaches a local maximum, the current point is brighter or darker than the other points in its neighbourhood; the locations of the key points are found in this way.
In a discrete digital image, the first derivative is the grey-level difference between adjacent pixels:

Dx = f(x + 1, y) − f(x, y)
In fact, the discriminant of the Hessian matrix is the product of the second-order partial derivatives of the current point in the horizontal and vertical directions, minus the square of the mixed second-order partial derivative:

det(H) = Dxx · Dyy − Dxy · Dxy
f(x, y) in the Hessian discriminant is the Gaussian convolution of the original image. Since the Gaussian kernel follows a normal distribution, its coefficients decrease gradually from the centre point outwards. To improve running speed, the SURF algorithm uses a box-filter approximation instead of the Gaussian filter, so a weighting factor of 0.9 is applied to Dxy to balance the error introduced by the approximation:

det(H) = Dxx · Dyy − (0.9 · Dxy)²
2. Constructing the scale space
The SURF scale space, like SIFT's, is organized into groups (octaves) of layers. In the construction of the SURF scale space, images in different groups are all the same size; what changes is the size of the box-filter template, which grows as the scale space is built, so the blur coefficient becomes progressively larger from layer to layer.
3. Feature point localization
The positioning of feature points in SURF is identical to SIFT: each pixel processed by the Hessian matrix is compared with the 26 neighbouring points in its two-dimensional image neighbourhood and the adjacent scales to locate key points initially; weak key points and wrongly positioned key points are removed by thresholding, and the strong key points finally retained are the stable feature points.
4. Feature point principal direction assignment
SIFT feature-point direction assignment takes as the main direction the maximum bin of the statistical gradient histogram (together with any bin exceeding 80% of that maximum).
SURF feature-point direction assignment computes Haar wavelet responses in the horizontal and vertical directions for all points within a 60° sector of the region around the feature point and sums them; the sector is then rotated in steps of 0.2 radians and the sums recomputed, and the sector with the largest total response gives the main direction of the feature point.
5. Generating feature point descriptors
The SIFT feature-point descriptor is constructed by collecting a 4 × 4 grid of block regions around the feature point and computing 8 gradient directions within each block, so the SIFT descriptor is a 4 × 4 × 8 = 128-dimensional vector.
In the construction of the SURF feature point descriptor, the feature points are surrounded by a square, the square is divided into 16 area blocks, and the Haar wavelet features of 25 pixels in each small block in four directions are calculated. The Haar wavelet features have four directions: and taking the obtained 4 direction values as the feature vector of each small region block, wherein the descriptors of the common SURF features are 4 × 4-64 dimensional vectors which are 2 times smaller than SIFT feature descriptors.
SURF judges the degree of matching between two feature points by the magnitude of the Euclidean distance between them: the closer the distance, the better the match, and vice versa. The signs of the traces of the two Hessian matrices are also compared: the same sign means the two features have contrast changes in the same direction, while different signs mean opposite contrast changes, and such a pair is excluded directly even if its Euclidean distance is 0.
The identification method based on SURF characteristics comprises the following steps:
the first step is as follows: the traffic sign images in the 131 template libraries are subjected to feature extraction and description through SURF features, so that 131 corresponding feature vector sets are obtained, and the sets are divided into four categories, namely yellow triangle, red circle, blue rectangle and blue circle according to color and shape features, so that 4 feature vector sub-libraries are obtained.
The second step: the traffic sign image is roughly classified based on HSV colour and corner-shape detection and segmentation.
The third step: SURF features are extracted and described for the detected and segmented traffic sign image. The similarity between its feature vectors and the corresponding roughly classified feature-vector sub-library in the template library is computed from the similarities of groups of feature vectors between pairs of feature points; the several groups with the greatest similarity are summed into Sum, and the sample image with the largest Sum is displayed as the type represented by the sign. Finally, the matching result Sum and the meaning of the sign are output.
When the BP-neural-network identification method is used, identifying the image to be identified comprises: classifying the image with a pre-improved BP neural network, the classified categories comprising the prohibition signs, warning signs and indication signs among traffic signs; and refining the classified image with two further pre-improved BP neural networks and a sample library of standard images until the image to be detected corresponds to one standard image in the sample library, which is taken as the recognition result. The improvements to the BP neural network comprise: taking (0.01, 0.8) as the learning-rate selection interval, halving the learning rate after two successive iterations with opposite gradient directions and doubling it after two successive iterations with the same gradient direction; adding a small random number to the momentum-term weight; and modifying the output range (0, 1) of the Sigmoid function to (−1/2, 1/2).
The BP neural network operates in two passes: a forward pass and a backward (error back-propagation) pass. The universal approximation theorem, proposed by Robert Hecht-Nielsen in 1988, states that a BP network with one hidden layer can approximate any continuous function on a closed interval, so a three-layer BP network can realize any mapping from m dimensions to n dimensions.
In a BP neural network, the numbers of nodes in the input and output layers can be determined in advance, but the number of hidden-layer nodes cannot be specified beforehand, although it affects the performance of the network. The number of hidden-layer nodes can be obtained from the following formula:
h = \sqrt{m + n} + a
In the formula above, h is the number of hidden-layer nodes, m and n are the numbers of input-layer and output-layer nodes respectively, and a is an adjustment constant between the input and output layers with value range [1, 10].
Forward pass sub-process: each node j outputs a value xj, which is determined by the weights wij between all nodes i of the previous layer and node j, the threshold bj of node j, and the activation function f.
The specific calculation method is as follows:
S_j = \sum_{i} w_{ij} x_i + b_j, \qquad x_j = f(S_j)
In the formula, the function f is generally an S-shaped (sigmoid) function.
Reverse transfer sub-process: for a BP neural network, there is no threshold on the input-layer nodes. Let dj be the expected output of output node j; the error signal is propagated backwards according to the Widrow-Hoff learning rule. The error function is as follows:

E(w, b) = \frac{1}{2} \sum_{j} (d_j - x_j)^2
the BP neural network mainly aims at reducing the error function value to enable the error function value to reach the minimum value, and the used method is to continuously adjust the threshold value and the weight value. The Widrow-Hoff learning rule continuously adjusts the threshold and the weight mainly according to the direction that the sum of the square errors is reduced fastest. As known from the gradient descent method, the weight correction is proportional to the gradient of the current position E (w, b). For the jth output node, the following formula is present:
\Delta w_{ij} = -\eta \, \frac{\partial E}{\partial w_{ij}}
assume that the selection activation function is:
f(x) = \frac{1}{1 + e^{-x}}, \qquad f'(x) = f(x)\,(1 - f(x))
For wij there is:

\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial S_j} \cdot \frac{\partial S_j}{\partial w_{ij}}, \qquad \frac{\partial S_j}{\partial w_{ij}} = x_i

\frac{\partial E}{\partial w_{ij}} = -(d_j - x_j)\, f'(S_j)\, x_i = -\delta_j x_i, \qquad \delta_j = (d_j - x_j)\, f'(S_j)
In essence, the Widrow-Hoff learning rule (also called the error-correction learning rule or δ learning rule) reduces the error between the actual output and the expected output of the system by changing the connection weights between the neurons.
The formulas above give the weights between the hidden layer and the output layer and the adjustment of the output-layer thresholds. Calculating the adjustments for the weights and thresholds between the input layer and the hidden layer is more complicated. Let wki be the weight between the kth node of the input layer and the ith node of the hidden layer; then:
\frac{\partial E}{\partial w_{ki}} = -\delta_i x_k
where:

\delta_i = f'(S_i) \sum_{j} \delta_j w_{ij}
according to the principle of a gradient descent method, the weight and the threshold value between the hidden layer and the output layer are adjusted as follows:
w_{ij} \leftarrow w_{ij} + \eta\, \delta_j x_i

b_j \leftarrow b_j + \eta\, \delta_j
the weight and threshold between the input layer and the hidden layer are adjusted as follows:
w_{ki} \leftarrow w_{ki} + \eta\, \delta_i x_k

b_i \leftarrow b_i + \eta\, \delta_i
However, the convergence of the traditional BP neural network is slow. The embodiment of the present application improves the BP neural network in three respects: adjusting the learning rate adaptively, adding a momentum-term setting, and modifying the output range of the Sigmoid function.
1. The learning rate is adjusted in an adaptive manner:
The learning rate of the BP algorithm is usually set manually and has no uniform standard. If it is chosen too large, the corrections are excessive and the network cannot converge; conversely, if it is too small, the training time is prolonged and the convergence rate drops. Generally, (0.01, 0.8) is taken as the rate selection interval, and the learning rate is adjusted adaptively as follows:
w(k+1) = w(k) + \alpha(k) D(k); \qquad \alpha(k) = 2^{\lambda} \alpha(k-1); \qquad \lambda = \operatorname{sign}[D(k) D(k-1)]
When the gradient directions of two successive iterations are opposite, the step is oscillating and the learning rate must be halved; when the gradient directions of the two iterations are the same, the descent is too slow and the learning rate is doubled.
2. Adding momentum item setting:
In general, the BP algorithm seeks the optimum along the direction of gradient descent of the error function. The weight is usually adjusted along the gradient direction at the current moment, without considering the gradient direction at the previous moment, which causes slow convergence, over-long training time and similar problems. Adding a small random number to the originally set momentum-term weight reduces the influence of local details of the error surface on the network and alleviates the problem of local minimum points. It is expressed as follows:
w(k+1)=w(k)+α[(1-η)D(k)+ηD(k-1)]
w(k) and w(k + 1) are the weight vectors at times k and k + 1; D(k − 1) and D(k) are the negative gradients at times k − 1 and k; α is the learning rate (α > 0), and η is the momentum factor (0 < η < 1).
3. Modification of the Sigmoid function output range:
The output range of the Sigmoid function is (0, 1), and half of this range is close to 0, which reduces the weight adjustment and can even prevent adjustment altogether, making training longer. Reducing the range of the Sigmoid function therefore shortens the training time, lowers the number of convergence cycles and effectively improves the convergence speed. Using the function

f(x) = \frac{1}{1 + e^{-x}} - \frac{1}{2}

the output range becomes (−1/2, 1/2).
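To make the three modifications concrete, here is a self-contained toy sketch in NumPy. The XOR-like data, the layer sizes, clipping α to the (0.01, 0.8) interval and applying momentum to the weights only are illustrative assumptions, not the patent's reference implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):                                    # shifted sigmoid, output in (-1/2, 1/2)
    return 1.0 / (1.0 + np.exp(-x)) - 0.5

def f_prime(x):                              # derivative of the shifted sigmoid
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)

# Toy data: m = 2 inputs, n = 1 output; targets shifted into (-1/2, 1/2).
X = rng.uniform(-1, 1, (100, 2))
T = (X[:, :1] * X[:, 1:] > 0).astype(float) - 0.5

m, n = 2, 1
h = int(np.sqrt(m + n) + 3)                  # hidden nodes from h = sqrt(m+n) + a, a = 3
W1 = rng.normal(0, 0.5, (m, h)); b1 = np.zeros(h)
W2 = rng.normal(0, 0.5, (h, n)); b2 = np.zeros(n)

alpha, eta = 0.1, 0.3                        # learning rate and momentum factor
G1_prev = np.zeros_like(W1); G2_prev = np.zeros_like(W2)

for epoch in range(2000):
    S1 = X @ W1 + b1; H = f(S1)              # forward pass
    S2 = H @ W2 + b2; Y = f(S2)
    d2 = (T - Y) * f_prime(S2)               # output-layer delta
    d1 = (d2 @ W2.T) * f_prime(S1)           # hidden-layer delta
    G2 = H.T @ d2; G1 = X.T @ d1             # negative gradients D(k)

    # Adaptive learning rate: lambda = sign[D(k)D(k-1)], alpha(k) = 2^lambda * alpha(k-1),
    # kept inside the (0.01, 0.8) selection interval.
    lam = np.sign(np.sum(G2 * G2_prev) + np.sum(G1 * G1_prev))
    alpha = float(np.clip(alpha * 2.0 ** lam, 0.01, 0.8))

    # Momentum with a small random perturbation of the momentum-term weight.
    eta_k = eta + rng.normal(0.0, 0.01)
    W2 += alpha * ((1 - eta_k) * G2 + eta_k * G2_prev)
    W1 += alpha * ((1 - eta_k) * G1 + eta_k * G1_prev)
    b2 += alpha * d2.sum(axis=0); b1 += alpha * d1.sum(axis=0)  # plain updates for thresholds
    G1_prev, G2_prev = G1, G2
```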
The embodiment of the present application further provides an intelligent traffic sign recognition system, including: the preprocessing module is used for carrying out noise reduction preprocessing on the acquired image to obtain a preprocessed image; the detection and segmentation module is used for detecting and segmenting the traffic sign of the preprocessed image to obtain an image to be identified of the traffic sign image; and the image identification module is used for identifying the image to be identified.
The preprocessing module comprises: a size-reduction unit, a brightness-enhancement unit and a noise-reduction unit. The size-reduction unit is used for reducing the size of the acquired image by bilinear interpolation; the brightness-enhancement unit is used for enhancing the brightness of the reduced image by histogram equalization; and the noise-reduction unit is used for denoising the brightness-enhanced image by median filtering to obtain the preprocessed image.
The detection and segmentation module includes: the device comprises a model conversion unit, a threshold segmentation unit, an angular point acquisition unit, a shape judgment unit, a connected domain acquisition unit and a segmentation processing unit; the model conversion unit is used for converting the preprocessed image from the RGB color model into HSV model representation; the threshold segmentation unit is used for detecting the color characteristics of the preprocessed image by using an HSV model and carrying out threshold segmentation to obtain a color threshold segmentation graph with a connected domain; the corner point acquisition unit is used for acquiring corner points of all connected domains on the color threshold segmentation graph; the shape judging unit is used for determining whether each connected domain is in the shape of the traffic sign or not according to the corner points, if so, the connected domain is kept on the color threshold segmentation graph, and if not, the connected domain is removed from the color threshold segmentation graph; the connected domain acquiring unit is used for finding out a connected domain of the traffic sign image from the connected domain on the color threshold segmentation map; the segmentation processing unit is used for segmenting and processing the traffic sign image contained in the connected domain from the preprocessed image to obtain an image to be identified of the traffic sign image.
The model conversion unit comprises: an RGB component normalization subunit, an HSV definition subunit and a conversion subunit. The RGB component normalization subunit is used for normalizing the three RGB components R, G, B to the range [0, 1]; the HSV definition subunit is used for defining the value ranges of the three HSV components as H ∈ [0, 360), S ∈ [0, 1] and V ∈ [0, 1]; and the conversion subunit is used for converting the RGB colour model into the HSV model according to the conversion relationship.
Wherein the conversion relationship is as follows:

V = max(R, G, B)

S = (max(R, G, B) − min(R, G, B)) / max(R, G, B)

If R = max(R, G, B): H′ = (G − B) / (max(R, G, B) − min(R, G, B))

If G = max(R, G, B): H′ = 2 + (B − R) / (max(R, G, B) − min(R, G, B))

If B = max(R, G, B): H′ = 4 + (R − G) / (max(R, G, B) − min(R, G, B))

where H = H′ × 60 when H′ ≥ 0, and H = H′ × 60 + 360 when H′ < 0.
The connected domain acquiring unit comprises: a sorting subunit and a judging subunit. The sorting subunit is used for sorting the connected domains in descending order of area. The judging subunit is used for judging whether the area of the current connected domain is larger than 50% of the area of the previous connected domain: if so, the current connected domain is taken as the connected domain of a traffic sign image; if not, the current connected domain and the connected domains after it are taken as the connected domains of non-traffic-sign images.
The division processing unit comprises: a minimum-bounding-rectangle subunit and a processing subunit. The minimum-bounding-rectangle subunit is used for segmenting the traffic sign image from the preprocessed image by means of a minimum bounding rectangle; the processing subunit is used for binarizing, colour-inverting and size-normalizing the traffic sign image to obtain the image to be identified for traffic sign identification.
In this embodiment, the image recognition module comprises: a gray level extraction unit, a Pearson correlation coefficient calculation unit, an average value calculation unit and a comparison unit. The gray level extraction unit is used for extracting the gray level distributions of a standard image and of the image to be identified in four directions, namely the horizontal, vertical, 45-degree and 135-degree directions, the standard image being a standard traffic sign image in a standard library; the Pearson correlation coefficient calculation unit is used for calculating the Pearson correlation coefficients between the standard image and the image to be identified in each of the four directions; the average value calculation unit is used for calculating, for each standard image, the average value over the four directions of these Pearson correlation coefficients; and the comparison unit is used for comparing the maximum of the average values with a preset average value threshold: if the maximum exceeds the threshold, the image to be identified is matched with the corresponding standard image, and that standard image is taken as the recognition result.
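A sketch of the four-direction matching follows. Interpreting the 45-degree and 135-degree gray level distributions as means along the image diagonals is one plausible reading, not a detail fixed by the patent; the images are assumed square after size normalization.

```python
import numpy as np

def direction_profiles(img):
    # Mean gray level along each of the four directions (square image assumed).
    a = img.astype(np.float64)
    n = a.shape[0]
    flipped = np.fliplr(a)
    return [
        a.mean(axis=1),   # horizontal
        a.mean(axis=0),   # vertical
        np.array([a.diagonal(k).mean() for k in range(-n + 1, n)]),        # 45 deg
        np.array([flipped.diagonal(k).mean() for k in range(-n + 1, n)]),  # 135 deg
    ]

def match_score(candidate, standard):
    # Mean of the four directional Pearson correlation coefficients.
    pairs = zip(direction_profiles(candidate), direction_profiles(standard))
    return float(np.mean([np.corrcoef(p, q)[0, 1] for p, q in pairs]))
```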
In other embodiments, the image recognition module comprises: a rough classification unit, a SURF feature point extraction unit for the image to be detected, a SURF feature point acquisition unit for the standard images, and a matching unit. The rough classification unit is used for roughly classifying the image to be detected; the SURF feature point extraction unit is used for extracting SURF feature points from the roughly classified image to be detected; the SURF feature point acquisition unit is used for obtaining in advance the standard feature points and descriptions of the SURF feature points of the roughly classified standard images; and the matching unit is used for matching the SURF feature points of the image to be detected with the standard feature points, the description of the standard image represented by the matched standard feature points being taken as the recognition result if the matching succeeds.
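A hedged sketch of SURF matching with OpenCV is given below. SURF lives in the opencv-contrib package (cv2.xfeatures2d) and is patent-encumbered, so it is absent from some builds; ORB would be a free drop-in substitute. The Hessian threshold and Lowe ratio are illustrative assumptions.

```python
import cv2

def surf_match_count(query_gray, standard_gray, ratio=0.75):
    # SURF requires opencv-contrib built with nonfree support.
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    _, desc_q = surf.detectAndCompute(query_gray, None)
    _, desc_s = surf.detectAndCompute(standard_gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    # Lowe's ratio test keeps only distinctive correspondences.
    good = [m for m, n in matcher.knnMatch(desc_q, desc_s, k=2)
            if m.distance < ratio * n.distance]
    return len(good)  # higher means a more plausible match
```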
In other embodiments, the image recognition module comprises: a classification unit and a refinement unit. The classification unit is used for classifying the image to be identified with a pre-improved BP neural network, the classified categories comprising the prohibition signs, warning signs and indication signs among the traffic signs; the refinement unit is used for refining the classified image to be identified with two pre-improved BP neural networks and a sample library of standard images until the image to be detected corresponds to one standard image in the sample library, that standard image then being taken as the recognition result.
Wherein, the improvement of the BP neural network comprises: setting (0.01, 0.8) as the learning rate selection interval of the BP neural network, halving the learning rate after two successive iterations whose gradient directions are opposite, and doubling it after two successive iterations whose gradient directions are the same; acquiring the momentum term weight of the BP neural network and adding a random number to it; and modifying the output range of the Sigmoid function of the BP neural network from (0, 1) to (-1/2, 1/2).
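The three training tweaks can be sketched as below; everything here is an illustrative reconstruction (the jitter magnitude and base momentum weight, in particular, are assumptions the patent does not specify).

```python
import numpy as np

def shifted_sigmoid(x):
    # Standard sigmoid shifted so the output lies in (-1/2, 1/2).
    return 1.0 / (1.0 + np.exp(-x)) - 0.5

def adapt_learning_rate(lr, grad, prev_grad):
    # Double on agreeing gradient directions, halve on opposing ones,
    # and keep the rate inside the (0.01, 0.8) selection interval.
    lr = lr * 2.0 if np.vdot(grad, prev_grad) > 0 else lr * 0.5
    return float(np.clip(lr, 0.01, 0.8))

def momentum_weight(base=0.9, jitter=1e-3):
    # The momentum term weight gets a small random number added each step
    # (base value and jitter magnitude are assumptions).
    return base + np.random.uniform(-jitter, jitter)
```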
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules is only a logical division, and in actual implementation there may be other divisions; for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, devices or modules, and may be electrical, mechanical or in another form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules; they may be located in one place or distributed over a plurality of network nodes. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing module, each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
It should be noted that, for simplicity of description, the above method embodiments are described as a series of action combinations, but those skilled in the art should understand that the present invention is not limited by the order of actions described, since according to the present invention some steps may be performed in other orders or simultaneously. Further, those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The method and system for intelligent traffic sign identification provided by the present invention have been described above. Those skilled in the art may make changes to the specific implementation and the scope of application according to the concept of the embodiments of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. An intelligent traffic sign identification method is characterized by comprising the following steps:
carrying out noise reduction preprocessing on the acquired image to obtain a preprocessed image;
detecting and segmenting the traffic sign of the preprocessed image to obtain an image to be identified of the traffic sign image;
and identifying the image to be identified.
2. The intelligent traffic sign recognition method according to claim 1,
the denoising preprocessing of the acquired image comprises:
reducing the size of the acquired image by using a bilinear interpolation method;
the reduced image is subjected to increment by using a histogram equalization method;
and performing noise reduction processing on the incremental image by using a median filtering method to obtain a preprocessed image.
3. The intelligent traffic sign recognition method according to claim 1,
the detecting and segmenting of the traffic sign on the preprocessed image to obtain the image to be identified of the traffic sign image comprises the following steps:
converting the preprocessed image from an RGB color model to an HSV model representation;
detecting the color characteristics of the preprocessed image by using an HSV model and carrying out threshold segmentation to obtain a color threshold segmentation graph with a connected domain;
acquiring corner points of each connected domain on the color threshold segmentation graph;
determining whether each connected domain is in the shape of a traffic sign or not according to the corner points, if so, keeping the connected domain on the color threshold segmentation graph, and if not, removing the connected domain from the color threshold segmentation graph;
finding out a connected domain of the traffic sign image from the connected domain on the color threshold segmentation map;
and segmenting and processing the traffic sign image contained in the connected domain from the preprocessed image to obtain the image to be recognized of the traffic sign image.
4. The intelligent traffic sign recognition method according to claim 3,
the converting the pre-processed image from the RGB color model to the HSV model representation comprises:
normalizing the three components R, G, B of RGB to the range [0, 1], and defining the value ranges of the three components H, S, V of HSV as H ∈ [0, 360), S ∈ [0, 1], V ∈ [0, 1];
the conversion relationship between the RGB color model and the HSV model is as follows:
V = max(R, G, B)
S = (max(R, G, B) - min(R, G, B)) / max(R, G, B)
If R = max(R, G, B), then
H′ = (G - B) / (max(R, G, B) - min(R, G, B))
If G = max(R, G, B), then
H′ = 2 + (B - R) / (max(R, G, B) - min(R, G, B))
If B = max(R, G, B), then
H′ = 4 + (R - G) / (max(R, G, B) - min(R, G, B))
wherein H = H′ × 60 when H′ ≥ 0, and H = H′ × 60 + 360 when H′ < 0.
5. The intelligent traffic sign recognition method according to claim 4,
the finding of the connected domain of the traffic sign image from the connected domain on the color threshold segmentation map comprises the following steps:
arranging the connected domains in descending order according to the area;
if the area of the current connected domain is larger than 50% of the area of the previous connected domain, taking the current connected domain as the connected domain of the traffic sign image;
and if the area of the current connected domain is less than 50% of the area of the previous connected domain, taking the current connected domain and the connected domains behind the current connected domain as the connected domains of the non-traffic sign images.
6. The intelligent traffic sign recognition method according to claim 5,
the image segmentation and processing of the traffic sign image contained in the connected domain from the preprocessed image to obtain the image to be recognized of the traffic sign image comprises the following steps:
segmenting the traffic sign image from the preprocessed image by means of a minimum bounding rectangle;
and carrying out binarization, color reversal and size normalization processing on the traffic sign image to obtain an image to be identified for traffic sign identification.
7. The intelligent traffic sign recognition method according to claim 1,
the identifying the image to be identified comprises:
extracting the gray level distributions of a standard image and of the image to be identified in four directions, namely the horizontal, vertical, 45-degree and 135-degree directions, wherein the standard image is a standard traffic sign image in a standard library;
calculating the Pearson correlation coefficients between the standard image and the image to be identified in each of the four directions;
calculating, for each standard image, the average value over the four directions of the Pearson correlation coefficients;
and comparing the maximum of the average values with a preset average value threshold, and if the maximum exceeds the threshold, matching the image to be identified with the corresponding standard image and taking that standard image as the recognition result.
8. The intelligent traffic sign recognition method according to claim 1,
the identifying the image to be identified comprises:
roughly classifying the image to be detected;
performing SURF feature point extraction on the roughly classified image to be detected;
obtaining in advance the standard feature points and descriptions of the SURF feature points of the roughly classified standard images;
and matching the SURF feature points of the image to be detected with the standard feature points, and if the matching succeeds, taking the description of the standard image represented by the matched standard feature points as the recognition result.
9. The intelligent traffic sign recognition method according to claim 1,
the identifying the image to be identified comprises:
classifying the image to be identified by using a pre-improved BP neural network, wherein the classified categories comprise the prohibition signs, warning signs and indication signs among the traffic signs;
refining the classified image to be identified by using two pre-improved BP neural networks and a sample library of standard images until the image to be detected corresponds to one standard image in the sample library, and taking that standard image as the recognition result;
the improvement of the BP neural network comprises:
setting (0.01, 0.8) as a learning rate selection interval of the BP neural network, halving the learning rate after the BP neural network has two successive iterations with opposite gradient directions, and doubling the learning rate after the BP neural network has two successive iterations with the same gradient direction;
acquiring a momentum term weight of a BP neural network, and adding a random number to the momentum term weight;
the output range (0,1) of the Sigmoid function of the BP neural network is modified to (-1/2, 1/2).
10. An intelligent traffic sign recognition system, comprising:
the preprocessing module is used for carrying out noise reduction preprocessing on the acquired image to obtain a preprocessed image;
the detection and segmentation module is used for detecting and segmenting the traffic sign of the preprocessed image to obtain an image to be identified of the traffic sign image;
and the image identification module is used for identifying the image to be identified.
CN202011023022.1A 2020-09-25 2020-09-25 Intelligent identification method and system for traffic signs Pending CN112101283A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011023022.1A CN112101283A (en) 2020-09-25 2020-09-25 Intelligent identification method and system for traffic signs


Publications (1)

Publication Number Publication Date
CN112101283A true CN112101283A (en) 2020-12-18

Family

ID=73756227



Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103544484A (en) * 2013-10-30 2014-01-29 广东工业大学 Traffic sign identification method and system based on SURF
CN103971128A (en) * 2014-05-23 2014-08-06 北京理工大学 Traffic sign recognition method for driverless car
CN107909059A (en) * 2017-11-30 2018-04-13 中南大学 It is a kind of towards cooperateing with complicated City scenarios the traffic mark board of bionical vision to detect and recognition methods
CN108875454A (en) * 2017-05-11 2018-11-23 比亚迪股份有限公司 Traffic sign recognition method, device and vehicle


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
赵晓娜 (Zhao Xiaona): "Research and Application of Video-Based Road Traffic Sign Detection and Recognition Methods" *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114782427A (en) * 2022-06-17 2022-07-22 南通格冉泊精密模塑有限公司 Modified plastic mixing evaluation method based on data identification and artificial intelligence system
CN115346373A (en) * 2022-08-16 2022-11-15 白犀牛智达(北京)科技有限公司 Traffic light identification method and device


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination