CN113420633A

CN113420633A - Traffic sign identification method based on UM enhancement and SIFT feature extraction

Info

Publication number: CN113420633A
Application number: CN202110675582.3A
Authority: CN
Inventors: 郭朦; 陈紫强
Original assignee: Guilin University of Electronic Technology
Current assignee: Guilin University of Electronic Technology
Priority date: 2021-06-18
Filing date: 2021-06-18
Publication date: 2021-09-21
Anticipated expiration: 2041-06-18
Also published as: CN113420633B

Abstract

The invention discloses a traffic sign identification method based on UM enhancement and SIFT feature extraction, which is characterized by comprising the following steps: 1) enhancing the image; 2) detecting the shape; 3) extracting SIFT characteristics; 4) and (5) SVM classification and identification. The method can improve the identification accuracy rate in a complex outdoor environment.

Description

Traffic sign identification method based on UM enhancement and SIFT feature extraction

Technical Field

The invention relates to a digital image processing technology and an artificial intelligence technology, in particular to a traffic sign recognition method based on UM enhancement and SIFT feature extraction.

Background

With the rapid development of domestic economy and science and technology, automobiles are more and more popular in people's lives, great convenience is brought to daily trips of people, and a lot of traffic problems are caused as the number of automobiles increases day by day. The main objective of modern traffic control is to gradually improve driving safety, and the detection and identification of traffic signs are the main methods and approaches for achieving the purpose. The traffic sign mainly has two functions: firstly, the traffic sign can help a driver or a pedestrian to drive or walk more conveniently; secondly, the traffic signs reflect local road condition information and can help people to safely drive or walk, thereby reducing and avoiding traffic accidents. The fact proves that a driver neglects the traffic signs due to factors such as fatigue and the like, so that a plurality of major accidents occur, a rapid and accurate traffic sign detection and identification algorithm is developed, a driving auxiliary system is continuously improved, and the method is a main means for reducing the traffic accident rate and improving the driving safety.

The road traffic sign recognition consists of two parts, namely detection and recognition. Firstly, after an image is acquired by image acquisition equipment, a detection and recognition algorithm is adopted to judge the specific category of the road traffic marking method. The road traffic sign can provide indication and warning information, standardizes the driving behavior of a driver, is used as an important component of road facilities and an important carrier of road traffic information, and can provide road condition information, safety warning, urge the driver to drive cautiously and safely and the like for the driver in a daily driving environment. In order to distinguish from natural or artificial scenes, traffic signs are generally designed to have a specific color and shape, and thus there are two main strategies for detecting traffic signs, namely, a color segmentation method and a shape detection method. Because color is the main visual characteristic of the traffic sign, most of the current traffic sign detection systems are realized by directly adopting a color segmentation method and then combining other methods, neglecting the image preprocessing function in the aspect of image enhancement, and have insufficient identification capability on common complex and various outdoor conditions such as illumination influence, target shielding, sundry scenes and the like.

Disclosure of Invention

The invention aims to provide a traffic sign identification method based on UM enhancement and SIFT feature extraction aiming at the defects of the prior art. The method can improve the identification accuracy rate in a complex outdoor environment.

The technical scheme for realizing the purpose of the invention is as follows:

a traffic sign identification method based on UM enhancement and SIFT feature extraction comprises the following steps:

1) image enhancement: the method comprises the steps of collecting an original image, wherein the original image contains traffic sign information under a natural scene, enhancing the original image by adopting an Unsharp Masking UM (UM) method based on wavelet transformation, so that the positions of edges and the like of the original image are clearer and the original image is more convenient to detect, wherein the Unsharp Masking method is one of the most commonly used edge detail enhancing methods, the wavelet transformation has multi-resolution analysis capability and time-frequency localization capability, the application effect in the field of image processing is outstanding, although the traditional wavelet transformation algorithm based on the wavelet transformation obtains better enhancing effect, the methods only determine the enhancing degree simply according to the absolute value of a wavelet coefficient, the influence of edge amplitudes is not considered, the edge enhancing effect difference with the same contrast and different amplitudes is always larger, and the Unsharp Masking method based on the wavelet transformation can obtain better enhancing effect on the edge details of the image, it is more suitable for the situation where traffic signs will appear at random positions in the acquired images,

let the coordinates of each pixel in the picture be represented by (i, j), then the expression is shown as equation (1):

g₂(i,j)＝g₁(i,j)+λ×z(i,j) (1)，

wherein, g₁(i, j) is the pixel with the original image coordinates (i, j), g₂(i, j) is the pixel with enhanced image coordinates (i, j), z (i, j) is the gradient value obtained by calculating the laplacian mask, and z (i, j) is expressed as formula (2):

wherein, λ is the adjustment coefficient, and the pixel has different λ values in different contrast regions, and the value of the middle contrast region λ is relatively large, and the value λ of the low and high contrast regions is relatively small, and the judgment of the pixel in different contrast regions adopts the estimation local variance method as shown in formula (3):

the calculated variance is shown in equation (4):

wherein, g₁Is the mean value, g_1varIs the variance of the received signal and the variance,

is an average number, and the threshold values of the pixel points at high and low contrast are respectively set as T₂And T₁And T is₂＞T₁If g is_1var＜T₁The pixel point is in the low contrast area if T₁＜g_1var＜T₂If the pixel point is in the middle contrast area, if g_1var＞T₂If so, the pixel point is in a high-contrast area;

2) and (3) shape detection: the method comprises the following steps:

1-2) expansion and corrosion processes: respectively carrying out expansion processing and corrosion processing on the image, wherein the process is closed operation and is used for eliminating fine objects in the image, namely separating the objects at fine positions, connecting adjacent objects with smooth image boundaries and filling fine cavities in the objects;

2-2) removing small objects process: removing small objects from the image, setting the number of pixels to be 50, removing the small objects with the number of pixels smaller than 50 in the image, and eliminating redundant noise points of the image;

3-2) connected domain labeling process: marking a white pixel point connected region in the binary image so as to determine the contour geometric parameters of an object in the image;

4-2) determination of aspect ratio range process: determining the length-width ratio range of the traffic sign, removing regions with small areas of the connected regions and reducing screening difficulty, thereby determining a plurality of regions possibly appearing in the traffic sign and positioning the regions to a target region;

5-2) target cutting process: cutting the determined target area to form a plurality of well-divided pixel areas;

3) SIFT feature extraction: the method comprises the following steps:

1-3) feature point detection: performing a gaussian convolution operation on the image, wherein the scale space of the image I (x, y) is defined as L (x, y, sigma), and the image I (x, y, sigma) is obtained by performing the convolution operation on the original image and Gaussian functions G (x, y, sigma) with different scales:

L(x,y,σ)＝G(x,y,σ)*I(x,y) (5)，

the method comprises the following steps that (x, y) is a space coordinate, sigma is a scale coordinate, an image pyramid is constructed through reduction sampling, Gaussian difference sum and image convolution of different scales are adopted, two adjacent Gaussian scale space images are subtracted, a Gaussian difference scale space DOG is generated, so that detected key points are more stable, and the expression of a Gaussian difference operator DOG is shown in a formula (7):

D(x,y,σ)＝(G(x,y,kσ)-G(x,y,σ))*I(x,y,kσ)-L(x,y,σ) (7)，

searching a local extreme point in a DOG scale space, and when the detected sampling point is the local extreme point in the 3D space, confirming that the point is an SIFT feature point (x, y) to be searched;

2-3) description of characteristic points: finding local extreme points in space, determining characteristic points and positions and dimensions thereof, describing the characteristic points to make the characteristic points have rotation invariance, and comprising the following steps:

1-2-3) solving gradient module value and direction of key points: assuming that L is the scale of the key point in the Gaussian image, the modulus m (x, y) of the gradient of the point L (x, y) is shown in formula (8):

the direction θ (x, y) of the gradient of the point L (x, y) is shown in equation (9):

θ(x,y)＝tan^-1(L(x,y+1)-L(x,y-1)/L(x,y+1)-L(x,y-1)) (9)；

2-2-3) assigning directions to the feature points: after the characteristic points are determined, assigning a main direction and an auxiliary direction to the characteristic points through a gradient histogram to enhance the robustness of SIFT characteristics, counting the gradient direction of a neighborhood window of the characteristic points by adopting the gradient histogram, wherein the gradient histogram comprises 36 columns, each column is 10 degrees, the main direction of the characteristic points is determined by the peak value of the 36 columns, and when a value which is equal to 80% of the main peak value exists, the auxiliary direction of the characteristic points is positioned;

3-2-3) generating feature vectors: generating a 128-dimensional feature vector, namely taking an 8 × 8 window with a feature point as a center, dividing the 8 × 8 window into 4 × 4 small blocks, calculating a gradient direction histogram, drawing an accumulated value of each gradient direction to form a seed point, wherein each seed point has vector information of 8 directions, and thus, the 128-dimensional SIFT feature vector is generated;

4-2-3) feature vector normalization: in order to further eliminate the influence of illumination, the 128-dimensional SIFT feature vectors of the feature points are normalized, and at the moment, the feature vectors have better robustness on geometric deformation factors such as scale change, rotation and the like, and the influence of illumination change on the feature points is also removed;

3-3) feature point matching: after SIFT feature vectors of two images are generated, the Euclidean distance is adopted to measure the similarity between feature points, the feature vectors are matched, for a certain feature point in the images, two points which are closest to the Euclidean distance in the other image are found, if the closest distance is divided by the next closest distance to be less than a proportional threshold value of 0.5, the pair of matching points is accepted, and when the number of the matching points of a certain cut image is obviously higher than that of other cut images, the image can be judged to be an SIFT feature matching image;

4) SVM classification and recognition: the SVM algorithm adopts two-stage classification, namely the first-stage classification mainly extracts the outline characteristics of the traffic sign, the second-stage classification is used for training and identifying, but the SVM algorithm is difficult to implement on large-scale training samples, and the multi-classification problem is solved by using the SVM, so that SIFT feature extraction and the use of a cascade classifier are adopted to realize the extraction of the outline characteristics of the traffic sign, the scale of the training samples of the SVM algorithm is greatly reduced, the complexity of calculation is simplified, the algorithm accuracy is higher, the method comprises the training and identifying processes,

the SVM training process is as follows:

1-4) establishing a sample database of standard traffic signs;

2-4) extracting an image recognition result output in the SIFT feature extraction process;

3-4) carrying out SVM training on the result extracted in the step 2-4) and a sample database;

the SVM recognition process is as follows:

4-4) extracting SIFT features in the step 3) and extracting a feature identification result of the traffic sign in the natural scene;

5-4) identifying by adopting a characteristic database;

6-4) obtaining an image to be identified, a standard image and a character paraphrase;

7-4) displaying the corresponding character paraphrases.

According to the technical scheme, the image is enhanced by adopting the unsharp mask method based on wavelet transformation, so that the problem that the image acquired by a camera has many uncertain factors in a natural scene, such as the situation that the position of a traffic sign in the image is easy to appear at the edge and the like, can be better solved, and a better enhancement effect is achieved on the details of the image edge.

According to the technical scheme, SIFT feature extraction and SVM classification recognition are combined for use, invariance of SIFT in local features, rotation, scale scaling and brightness change of an image can be brought into play, certain stability on perspective change, affine transformation and noise is kept, meanwhile, the scale of training samples of an SVM algorithm is greatly reduced, the complexity of calculation is simplified, and the algorithm accuracy is higher.

The method can improve the identification accuracy rate in a complex outdoor environment.

Drawings

FIG. 1 is a schematic flow diagram of an example method;

FIG. 2 is a schematic diagram illustrating a shape detection process in an embodiment;

FIG. 3 is a schematic diagram of an SIFT feature extraction flow in the embodiment;

FIG. 4 is a diagram illustrating an SVM classification recognition process according to an embodiment;

FIG. 5 is a schematic diagram illustrating the contrast between the enhancement and the histogram equalization of the unsharp masking method based on wavelet transform in the embodiment;

FIG. 6 is a diagram of SIFT feature recognition results in the embodiment;

fig. 7 is a second graph of SIFT feature recognition results in the embodiment.

Detailed Description

The invention will be further elucidated with reference to the drawings and examples, without however being limited thereto.

Example (b):

referring to fig. 1, a traffic sign recognition method based on UM enhancement and SIFT feature extraction includes the following steps:

1) image enhancement: the method comprises the steps of collecting an original image, wherein the image contains traffic sign information under a natural scene, enhancing the original image by adopting an unsharp mask UM method based on wavelet transformation, enhancing the original image to enable the positions of edges and the like to be clearer and be more beneficial to detection, wherein the unsharp mask method is one of the most commonly used edge detail enhancement methods, the wavelet transformation has multi-resolution analysis capability and time-frequency localization capability, and has a prominent application effect in the field of image processing, the traditional wavelet transformation algorithm based on the wavelet transformation obtains a better enhancement effect, but the methods simply determine the enhancement degree according to the absolute value of a wavelet coefficient, do not consider the influence of edge amplitude, always enable the edge enhancement effects with the same contrast and different amplitudes and angles to be larger, and the unsharp mask method based on the wavelet transformation can obtain a better enhancement effect on the edge details of the image, it is more suitable for the situation where traffic signs will appear at random positions in the acquired images,

g₂(i,j)＝g₁(i,j)+λ×z(i,j) (1)，

wherein, g₁(i, j) is the original image coordinate of (i, j)Pixel, g₂(i, j) is the pixel with enhanced image coordinates (i, j), z (i, j) is the gradient value obtained by calculating the laplacian mask, and z (i, j) is expressed as formula (2):

the calculated variance is shown in equation (4):

2) and (3) shape detection: as shown in fig. 2, includes:

3) SIFT feature extraction: as shown in fig. 3, includes:

L(x,y,σ)＝G(x,y,σ)*I(x,y) (5)，

D(x,y,σ)＝(G(x,y,kσ)-G(x,y,σ))*I(x,y,kσ)-L(x,y,σ) (7)，

θ(x,y)＝tan^-1(L(x,y+1)-L(x,y-1)/L(x,y+1)-L(x,y-1)) (9)；

4) SVM classification and recognition: as shown in fig. 4, the SVM algorithm adopts a two-stage classification, i.e. the first-stage classification mainly extracts the contour features of the traffic signs, the second-stage classification is to perform training and recognition, but the SVM algorithm is difficult to implement on large-scale training samples, and the SVM is difficult to solve the multi-classification problem, so the SIFT feature extraction and the use of the cascade classifier are adopted to realize the extraction of the contour features of the traffic signs, thereby greatly reducing the scale of the training samples of the SVM algorithm, simplifying the complexity of calculation, and improving the accuracy of the algorithm, including the training and recognition processes,

the SVM training process is as follows:

1-4) establishing a sample database of standard traffic signs;

the SVM recognition process is as follows:

5-4) identifying by adopting a characteristic database;

7-4) displaying the corresponding character paraphrases.

Referring to fig. 5, fig. 5 is a comparison graph of the result of the original image after step 1), in which a gray histogram in the second row describes the number of pixels in each gray level in the image, reflecting the frequency of occurrence of each gray level, wherein the abscissa represents the gray level, the ordinate represents the frequency of occurrence of the gray level, the peak of the gray histogram in the first row of the original image appears near the low value and the middle value, indicating that the overall brightness of the image is dark, the second row is the image after histogram equalization, the gray value with more pixels in the image is expanded, the gray value with less pixels is merged, so that the equalized gray histogram is uniformly distributed, thereby enhancing the contrast and definition of the image, the third row is the image after inverse sharpening mask, the original image is wavelet transformed to decompose the low frequency coefficient and the high frequency coefficient, and expand the low frequency coefficient information, the method comprises the steps of obtaining a fuzzy image after inhibiting or removing a high-frequency coefficient, subtracting the fuzzy image from an original image to obtain a mask model, multiplying the mask model by an enhancement factor, and adding the enhancement factor and an original image to obtain an enhanced image, wherein the histogram of the enhanced image is uniform in distribution, low in gray frequency and small in numerical difference, the times and complexity of algorithm calculation are reduced, and the image effectively eliminates noise, is strong in layering sense and enhances the outline and detail parts of the image.

As shown in fig. 6 and 7, it can be seen from the drawings that the simpler the texture of the target identification area of the image, i.e., the traffic sign, the fewer the number of existing feature points and the fewer the number of correct matching points; conversely, the more complex the texture, the more feature points exist, and the more correct matching points.

Claims

1. A traffic sign identification method based on UM enhancement and SIFT feature extraction is characterized by comprising the following steps:

1) image enhancement: collecting an original image, wherein the original image contains traffic sign information under a natural scene, and performing image enhancement on the original image by adopting an unsharp mask UM method based on wavelet transformation:

g₂(i,j)＝g₁(i,j)+λ×z(i,j) (1)，

wherein, λ is the adjustment coefficient, the pixel has different λ values in different contrast regions, the judgment of the pixel in different contrast regions adopts the estimation local variance method as shown in formula (3):

the calculated variance is shown in equation (4):

2) and (3) shape detection: the method comprises the following steps:

1-2) expansion and corrosion processes: respectively carrying out expansion treatment and corrosion treatment on the image, namely separating an object at a fine position, connecting an adjacent object with a smooth image boundary, and filling fine cavities in the object;

3) SIFT feature extraction: the method comprises the following steps:

L(x,y,σ)＝G(x,y,σ)*I(x,y) (5)，

wherein (x, y) is a spatial coordinate, σ is a scale coordinate, an image pyramid is constructed by reduction sampling, two adjacent gaussian scale space images are subtracted by adopting gaussian difference sum and convolution with the images in different scales to generate a gaussian difference scale space DOG, and a gaussian difference operator DOG expression is shown in a formula (7):

D(x,y,σ)＝(G(x,y,kσ)-G(x,y,σ))*I(x,y,kσ)-L(x,y,σ) (7)，

2-3) description of characteristic points: the method comprises the following steps:

θ(x,y)＝tan^-1(L(x,y+1)-L(x,y-1)/L(x,y+1)-L(x,y-1)) (9)；

2-2-3) assigning directions to the feature points: counting the gradient direction of a characteristic point neighborhood window by adopting a gradient histogram, wherein the gradient histogram comprises 36 columns, each column is 10 degrees, the main direction of the characteristic point is determined by the peak value of the 36 columns, and when a value which is equal to 80% of the main peak value exists, the auxiliary direction of the characteristic point is positioned;

3-2-3) generating feature vectors: generating a 128-dimensional feature vector, namely taking an 8 × 8 window with a feature point as a center, dividing the 8 × 8 window into 4 × 4 small blocks, calculating a gradient direction histogram, drawing an accumulated value of each gradient direction to form a seed point, wherein each seed point has vector information of 8 directions, and thus, a 128-dimensional SIFT feature vector is generated;

4-2-3) feature vector normalization: normalizing the 128-dimensional SIFT feature vectors of the feature points;

4) SVM classification and recognition: including a training and recognition process in which,

the SVM training process is as follows:

1-4) establishing a sample database of standard traffic signs;

the SVM recognition process is as follows:

5-4) identifying by adopting a characteristic database;

7-4) displaying the corresponding character paraphrases.