CN113128500A - Mask-RCNN-based non-motor vehicle license plate recognition method and system - Google Patents
- Publication number: CN113128500A
- Application number: CN202110378119.2A
- Authority: CN (China)
- Prior art keywords: license plate, image, motor vehicle, character, data
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V20/63: Scene text, e.g. street names
- G06N3/044: Recurrent networks, e.g. Hopfield networks
- G06N3/045: Combinations of networks
- G06N3/084: Backpropagation, e.g. using gradient descent
- G06V10/242: Aligning, centring, orientation detection or correction of the image by image rotation, e.g. by 90 degrees
- G06V10/243: Aligning, centring, orientation detection or correction of the image by compensating for image skew or non-uniform image deformations
- G06V10/267: Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
- G06V10/28: Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
- G06V30/153: Segmentation of character regions using recognition of characters or words
- G06V20/625: License plates
- G06V30/10: Character recognition
Abstract
A Mask-RCNN-based non-motor vehicle license plate recognition method achieves accurate license plate localization with a convolutional neural network (Mask-RCNN), segments the license plate characters by a vertical projection method after tilt correction and binarization, and recognizes the characters with a neural network; its recognition accuracy is higher than that of license plate recognition methods based on traditional computer image processing. The invention also comprises a system for implementing the Mask-RCNN-based non-motor vehicle license plate recognition method. The method maintains a high recognition rate in complex scenes, such as strong illumination or poor image resolution, and greatly improves the robustness of the non-motor vehicle license plate recognition system.
Description
Technical Field
The invention relates to the fields of computer image processing, deep learning and intelligent transportation, and in particular to a Mask-RCNN-based method and system for recognizing non-motor vehicle license plates.
Background
With the development of the times, the number of non-motor vehicles in society is increasing year by year, and the demand for obtaining non-motor vehicle information is growing accordingly, for reasons of safety, reasonable resource allocation and the like. The license plate is the "identification card" of a vehicle: once the plate number is known, other information about the vehicle is generally easy to obtain, so license plate information plays a very important role in vehicle identification within a traffic system. However, as the number of non-motor vehicles grows, the recognition and recording of license plate information, once done manually, suffers from high labor intensity and low efficiency and can no longer meet society's demand for high quality and high efficiency; at the same time, computer technology has developed rapidly in recent years. Automatic license plate recognition technology for non-motor vehicles is therefore attracting increasing attention.
Non-motor vehicle license plate recognition is a comprehensive technology built on computer image processing, and its processing flow comprises two main functional modules: license plate image positioning and license plate character recognition. The positioning module's task is to locate the license plate in a captured image of a non-motor vehicle and extract it accurately for subsequent character recognition. Accurate positioning of the license plate image is the precondition and basis for character recognition, and is the first key problem a non-motor vehicle license plate recognition technique must solve.
Although existing non-motor vehicle license plate recognition techniques based on traditional image processing work well in simple scenes, in open, complex scenes existing algorithms struggle to achieve satisfactory detection and recognition results. In open scenes the main difficulties are text interference from the natural scene (billboards, road signs and the like) and uncontrolled shooting conditions (varying illumination, distortion, occlusion, blurring and so on).
Disclosure of Invention
The invention aims to overcome the above defects in the prior art by providing a convolutional neural network (Mask-RCNN) based method and system for recognizing non-motor vehicle license plates, so as to address the difficulty of license plate recognition in complex scenes.
The invention adopts the following technical scheme:
a convolutional neural network (Mask-RCNN) based non-motor vehicle license plate recognition method comprises the following steps:
s1: training a convolutional neural network (Mask-RCNN) -based non-motor vehicle license plate positioning network;
s2: detecting non-motor vehicle license plate image areas in the captured picture with the license plate positioning network of S1, and acquiring the suspected license plate images and their corresponding position coordinates;
s3: accurately screening the suspected non-motor vehicle license plate images obtained in the step S2 to finish accurate positioning of the license plate images;
s4: performing tilt correction on the non-motor vehicle license plate image obtained after the accurate screening of S3, for use in subsequent character recognition; the tilt angle of the inclined plate is extracted with the Hough transform method and the rotation of the plate image is completed;
s5: performing character segmentation on the license plate image subjected to inclination correction in the S4;
s6: and training a character recognition method based on a BP neural network, and recognizing the license plate characters segmented in S5 by adopting a trained recognition model.
A convolutional neural network (Mask-RCNN) based non-motor vehicle license plate recognition system, comprising:
License plate detection and positioning module: trains a Mask-RCNN-based non-motor vehicle license plate positioning model and uses it to locate the license plate region of a captured image;
License plate character segmentation module: binarizes the located license plate image, selecting a suitable threshold to separate the characters from the plate background; pixels with gray value greater than the threshold are set to 0 (black) and pixels below it to 255 (white). The binary image is then gradient-sharpened with the Robert gradient operator to restore clarity to blurred strokes; noise points (including rivets) are removed from the sharpened image; finally the image is scanned row by row and column by column to determine the exact width and height range of each character, the characters are segmented, and each segmented character is normalized to a uniform 12 × 12 resolution;
License plate character recognition module: recognizes the segmented non-motor vehicle license plate characters with a BP neural network.
The advantage of the invention is that it maintains a high recognition rate in complex scenes, for example under strong illumination or with poor image resolution, greatly improving the robustness of the non-motor vehicle license plate recognition system.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The invention provides a convolutional neural network (Mask-RCNN) based non-motor vehicle license plate recognition method that combines neural networks with classical computer image processing to achieve fast and accurate recognition, meeting the accuracy and real-time requirements of a license plate recognition system.
The overall algorithm flow of the invention is shown in fig. 1. A trained non-motor vehicle license plate positioning network first coarsely locates the license plate region in the captured image; several screening strategies then refine the localization, the distorted license plate is corrected, and the plate region is output. The output plate is binarized, its characters are segmented, the segmented characters are recognized with a trained BP neural network, and the license plate number of the captured image is finally output.
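The flow just described can be sketched as a pipeline skeleton. Every function body below is an illustrative placeholder (the names and the dummy logic are assumptions, not from the patent); only the composition of the stages follows fig. 1.

```python
import numpy as np

def locate_plates(image):
    # A Mask-RCNN detector would return candidate plate crops + coordinates;
    # here a fixed central crop stands in for the network output.
    h, w = image.shape[:2]
    return [(image[h // 3:2 * h // 3, w // 4:3 * w // 4], (w // 4, h // 3))]

def screen_candidates(candidates):
    # Aspect-ratio coarse screen (the SVM fine screen of S3.2 is omitted).
    return [c for c, _ in candidates if 2.5 <= c.shape[1] / c.shape[0] <= 7.5]

def correct_tilt(plate):
    # Hough-based skew estimation + rotation (S4); identity placeholder.
    return plate

def segment_chars(plate):
    # Binarization + projection segmentation (S5); dummy: six equal slices.
    w = plate.shape[1]
    return [plate[:, i * w // 6:(i + 1) * w // 6] for i in range(6)]

def recognize_char(char_img):
    # BP-network classification (S6); dummy constant label.
    return "0"

def recognize_plate(image):
    plates = screen_candidates(locate_plates(image))
    if not plates:
        return None
    chars = segment_chars(correct_tilt(plates[0]))
    return "".join(recognize_char(c) for c in chars)

img = np.zeros((120, 480), dtype=np.uint8)
print(recognize_plate(img))
```

With the dummy stages above, any 120 × 480 input yields a six-character string of "0"s; each placeholder would be replaced by the corresponding step of the method.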
S1: training a convolutional neural network (Mask-RCNN) based non-motor vehicle license plate positioning network, comprising the following steps:
s1.1: collecting a training data set: the training data come from two sources; one part is collected from the internet and the other from real-world street photography, giving a total of 3200 images containing non-motor vehicle license plates;
s1.2: data enhancement: the number of samples from S1.1 is still insufficient for deep learning, and too little data easily causes overfitting; data enhancement is therefore applied to the collected image samples, which both expands the data set and prevents overfitting, improving the generalization ability of the model. After enhancement, the 3200 original images containing non-motor vehicle license plates are increased to 6500;
s1.3: and (3) labeling the data set obtained in the step (S1.2) to obtain the position of the license plate image in the original image data, wherein the method comprises the following steps:
s1.3.1: data set preprocessing, the preprocessing comprising: removing invalid data, and uniformly setting the resolution of the data set picture to 1280 x 960;
s1.3.2: bounding-box labeling: drawing a box around each non-motor vehicle license plate to be detected and recording the coordinates of its four vertices;
s1.3.3: dividing the S1.3.2 labeled data set into a training set, a verification set and a test set according to the ratio of 8:1: 1;
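The 8:1:1 split of S1.3.3 can be sketched as follows (the function name and fixed seed are illustrative assumptions):

```python
import random

def split_dataset(samples, ratios=(8, 1, 1), seed=0):
    """Shuffle and split a labeled data set into train / validation / test
    subsets according to the given ratio (8:1:1 as in S1.3.3)."""
    rng = random.Random(seed)
    items = list(samples)
    rng.shuffle(items)
    total = sum(ratios)
    n = len(items)
    n_train = n * ratios[0] // total
    n_val = n * ratios[1] // total
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

train, val, test = split_dataset(range(6500))
print(len(train), len(val), len(test))
```

For the 6500 enhanced images of S1.2 this yields 5200 training, 650 validation and 650 test samples.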
s1.4: training on the S1.3.3 segmented data set by adopting a Mask-RCNN convolutional neural network structure to obtain a detection model for detecting a non-motor vehicle license plate image area in a shot image;
s2: detecting a license plate image area of the non-motor vehicle in the shot picture by using the detection model in the S1.4, and acquiring images of some suspected license plates of the non-motor vehicle and corresponding position coordinates;
s3: and accurately screening the suspected non-motor vehicle license plate image obtained in the step S2 to finish the accurate positioning of the license plate image, wherein the method comprises the following steps:
s3.1: the width-to-height ratio of a non-motor vehicle license plate lies approximately in the range 2.5 to 7.5; according to this ratio, a coarse region screen is first applied to the images of S2, removing regions that are much too large or too small;
s3.2: accurately screening candidate license plates by adopting a Support Vector Machine (SVM) classifier, and outputting a final candidate non-motor vehicle license plate image;
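The coarse screen of S3.1 reduces to a simple aspect-ratio filter; the sketch below assumes candidate regions are given as (x, y, w, h) boxes (the SVM fine screen of S3.2 is not included here):

```python
def coarse_screen(candidates, lo=2.5, hi=7.5):
    """Keep candidate regions whose width/height ratio is plausible for a
    non-motor vehicle license plate; bounds follow the 2.5-7.5 range of S3.1.

    Each candidate is an (x, y, w, h) box.
    """
    kept = []
    for (x, y, w, h) in candidates:
        ratio = w / h
        if lo <= ratio <= hi:
            kept.append((x, y, w, h))
    return kept

boxes = [(0, 0, 300, 60),    # ratio 5.0  -> plausible plate
         (0, 0, 100, 100),   # ratio 1.0  -> rejected
         (0, 0, 900, 100)]   # ratio 9.0  -> rejected
print(coarse_screen(boxes))
```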
s4: performing tilt correction on the non-motor vehicle license plate image of S3.2 to facilitate subsequent character recognition; the tilt angle of the inclined plate is extracted with the Hough transform method and the rotation of the plate image is completed, as follows:
s4.1: performing edge detection on the non-motor vehicle license plate image and detecting straight lines in the horizontal direction with the Sobel operator;
s4.2: assuming the image corresponds to an X-o-Y space, define an S-o-θ space (θ ranging over 1 to 180 degrees). For each point in the image whose pixel value is 1, apply the formula s = x·cos θ + y·sin θ and draw its curve in the S-θ plane. The S-θ plane is divided into equal grid cells of size 1 × 1, which correspond to a counting matrix; every cell a curve passes through has the corresponding matrix element incremented by 1. After all such points have been processed, each element of the counting matrix equals the number of collinear points, and the maximum element corresponds to the longest straight line in the original image;
s4.3: finding the column coordinate θ of the maximum element of the counting matrix; θ is the angle between the normal of the line and the X axis;
s4.4: determining the tilt angle of the line from the θ found in S4.3, rotating the image accordingly to complete the tilt correction, and outputting the final license plate image for license plate recognition.
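A minimal numpy rendering of the counting-matrix procedure of S4.2-S4.3. The 1-degree × 1-pixel accumulator resolution and the θ range of 1 to 180 follow the text; the offset used to keep s non-negative is an implementation detail added here.

```python
import numpy as np

def hough_tilt_angle(binary):
    """Estimate the dominant line angle via the counting-matrix Hough
    transform of S4.2: s = x*cos(theta) + y*sin(theta), theta = 1..180 deg.
    Returns theta (degrees), the angle between the line normal and X axis."""
    ys, xs = np.nonzero(binary)
    thetas = np.deg2rad(np.arange(1, 181))
    diag = int(np.ceil(np.hypot(*binary.shape)))   # offset so s >= 0
    acc = np.zeros((2 * diag + 1, len(thetas)), dtype=np.int32)
    for x, y in zip(xs, ys):
        s = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int) + diag
        acc[s, np.arange(len(thetas))] += 1        # vote along the curve
    _, col = np.unravel_index(np.argmax(acc), acc.shape)
    return col + 1                                 # column index -> theta

# A perfectly horizontal line y = 5: its normal is vertical, so theta = 90.
img = np.zeros((10, 60), dtype=np.uint8)
img[5, :] = 1
print(hough_tilt_angle(img))
```

The returned θ gives the skew of the longest detected line; rotating the plate by the corresponding tilt angle completes S4.4.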
S5: and S4.4, performing character segmentation on the license plate image subjected to the inclination correction, wherein the method comprises the following steps:
s5.1: binarizing the non-motor vehicle license plate image of S4.4: a suitable threshold is selected to separate the characters from the plate background; pixels with gray value greater than the threshold are set to 0 (black) and pixels below it to 255 (white). The invention performs the binarization with an iterative threshold method, as follows:
s5.1.1: selecting an initial estimate T of the global threshold (the average gray level of the image);
s5.1.2: segmenting the image with T to produce two groups of pixels: G1, the pixels with gray value greater than T, and G2, the pixels with gray value less than or equal to T;
s5.1.3: computing the average gray values m1 and m2 of the pixels in G1 and G2;
s5.1.4: computing a new threshold T = (m1 + m2) / 2;
s5.1.5: repeating steps S5.1.2 through S5.1.4 until the value of T stabilizes between successive iterations;
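The iterative threshold of S5.1.1-S5.1.5 can be sketched directly in numpy (the stopping tolerance `eps` is an assumed parameter; the patent only says T must stabilize):

```python
import numpy as np

def iterative_threshold(gray, eps=0.5):
    """Iterative global threshold of S5.1.1-S5.1.5: start from the mean
    gray level, split into G1 (> T) and G2 (<= T), then iterate
    T = (m1 + m2) / 2 until T stabilizes."""
    t = gray.mean()                       # S5.1.1: initial estimate
    while True:
        g1 = gray[gray > t]               # S5.1.2: two pixel groups
        g2 = gray[gray <= t]
        m1 = g1.mean() if g1.size else t  # S5.1.3: group means
        m2 = g2.mean() if g2.size else t
        t_new = (m1 + m2) / 2.0           # S5.1.4: new threshold
        if abs(t_new - t) < eps:          # S5.1.5: stable -> stop
            return t_new
        t = t_new

# Bimodal toy image: dark characters (~40) on a bright background (~200).
img = np.array([[40, 42, 200, 205], [38, 41, 198, 202]], dtype=float)
t = iterative_threshold(img)
binary = np.where(img > t, 0, 255)        # patent's inverted binarization
print(40 < t < 200)
```

Note the patent maps pixels above T to black (0) and below T to white (255), so dark characters end up white, which is what the later white-pixel scans rely on.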
s5.2: gradient-sharpening the binarized image of S5.1 with the Robert gradient operator, restoring clarity to blurred strokes;
s5.3: removing noise points (including rivets) from the image of S5.2, as follows: scan the whole image and count the number SUM(i) of white pixels in each row i, then compute a threshold P,
where M denotes the number of rows and N the number of columns; SUM(i) is compared with the threshold P:
if SUM(i) < P, all pixels of row i are set to black; if SUM(i) > P, the row is kept unchanged;
s5.4: and (5) performing character segmentation on the image subjected to the denoising processing in S5.3, wherein the character segmentation method comprises the following steps:
s5.4.1: first scan the image row by row from bottom to top until the first white pixel is met, and record its position; then scan row by row from top to bottom until the first white pixel is met, thereby determining the height range of the image;
s5.4.2: within the height range determined in S5.4.1, scan column by column from left to right; the first column containing a white pixel is taken as the start of a character, and scanning continues until a column contains no white pixel, which is taken as the end of that segment; scanning proceeds in this way to the rightmost end of the image, yielding a fairly accurate width range for each character;
s5.4.3: in the known width range of each character, acquiring the height range of each character according to the method S5.4.1 to perform accurate character segmentation;
s5.4.4: normalizing the characters segmented by S5.4.3, and setting the uniform resolution size to be 12 × 12;
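A sketch of the projection segmentation of S5.4 on a white-on-black binary image. The per-character height refinement of S5.4.3 is omitted for brevity, and nearest-neighbour resizing stands in for the 12 × 12 normalization of S5.4.4.

```python
import numpy as np

def normalize(char, size=12):
    """Nearest-neighbour resize to size x size (stand-in for S5.4.4)."""
    h, w = char.shape
    ri = np.arange(size) * h // size
    ci = np.arange(size) * w // size
    return char[np.ix_(ri, ci)]

def segment_characters(binary):
    """Vertical-projection segmentation of S5.4: find the global height
    range, then scan columns left to right; a column containing white
    pixels starts a character, the next all-black column ends it."""
    white = (binary == 255)
    rows = np.nonzero(white.any(axis=1))[0]
    top, bottom = rows[0], rows[-1] + 1          # S5.4.1 height range
    cols = white[top:bottom].any(axis=0)
    chars, start = [], None
    for j, has_white in enumerate(list(cols) + [False]):  # sentinel column
        if has_white and start is None:
            start = j                            # S5.4.2 character start
        elif not has_white and start is not None:
            chars.append(normalize(binary[top:bottom, start:j]))
            start = None                         # S5.4.2 character end
    return chars

img = np.zeros((20, 30), dtype=np.uint8)
img[4:16, 3:8] = 255     # character 1
img[4:16, 12:20] = 255   # character 2
chars = segment_characters(img)
print(len(chars), chars[0].shape)
```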
s6: the invention adopts a character recognition method based on a BP neural network, which comprises the following steps:
s6.1: characters are recognized with a fusion of image gray-scale features and character four-corner features, described as follows:
Gray-scale features: the normalized 12 × 12 pixel lattice of each character is scanned pixel by pixel, top to bottom and left to right, and the 0/1 pixel values are stored by position in a one-dimensional array, producing the 144 gray-scale features of the character image;
Character four-corner features: for similar characters such as '0' and 'D', the invention uses four-corner features on top of the gray-scale features to distinguish them, as follows:
dividing the normalized pixel dot matrix into 12 rows and 12 columns according to pixels;
scanning along the main diagonal from the top-left vertex and counting the number of points before the first white pixel is reached;
counting the number of points to the first white pixel from the top-right, bottom-left and bottom-right vertices in the same way;
By extracting the four-corner features, the four-corner feature vectors of '0' and 'D' are obtained as:
V0 = {0, 0, 0, 0},  VD = {1, 0, 1, 0}
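The four-corner feature of S6.1 can be sketched as a diagonal scan from each vertex; the all-ones toy input below is illustrative, not one of the patent's character images, and the 144 gray-scale features are simply the flattened lattice.

```python
import numpy as np

def four_corner_features(char):
    """Count the steps along each corner diagonal until the first white
    pixel is met (S6.1); char is a 12x12 array of 0/1 values."""
    n = char.shape[0]
    diagonals = {
        "top-left":     [(i, i) for i in range(n)],
        "top-right":    [(i, n - 1 - i) for i in range(n)],
        "bottom-left":  [(n - 1 - i, i) for i in range(n)],
        "bottom-right": [(n - 1 - i, n - 1 - i) for i in range(n)],
    }
    feats = []
    for path in diagonals.values():
        steps = 0
        for r, c in path:
            if char[r, c] == 1:   # first white pixel reached
                break
            steps += 1
        feats.append(steps)
    return feats

# A character whose strokes touch every corner diagonal immediately
# yields {0, 0, 0, 0}, matching the patent's V0 example qualitatively.
ring = np.ones((12, 12), dtype=int)
gray_features = ring.flatten()    # the 144 gray-scale features of S6.1
print(four_corner_features(ring), gray_features.size)
```

Concatenating the 144 gray-scale values with the 4 corner counts gives the fused input vector used for BP-network training in S6.2.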
s6.2: training by taking the characteristics of S6.1 as an input vector of a BP neural network to obtain a character recognition model;
s6.3: and calling the recognition model obtained in S6.2 for the S5.4.4 normalized characters to perform character recognition, and outputting the number of the non-motor vehicle license plate.
The sample data enhancement in step S1.2 includes the following steps:
s1.2.1: scale transformation: applying Gaussian filtering to the data collected in S1.1 and varying the degree of image blur, so as to enlarge the sample data set;
s1.2.2: position transformation: the data samples of S1.1 are transformed with three different strategies, namely translation, rotation and flipping; for each sample picture one strategy is chosen at random for data enhancement;
Specifically, image translation shifts the image content, with parameters such as translation direction and step length set at random; image rotation turns the image content by a certain angle, changing its orientation; image flipping mirrors the image in the horizontal or vertical direction;
s1.2.3: contrast and saturation transformation: the sample picture is first converted from the RGB color space to the HSV color space; then the saturation component (S) and the brightness component (V) are changed while the hue component (H) is kept unchanged.
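A sketch of the random position transformation of S1.2.2. Rotation is restricted to 90-degree steps here to keep the example dependency-free (the patent rotates by arbitrary angles, which needs interpolation); the parameter ranges are assumptions.

```python
import random
import numpy as np

def augment_position(img, rng=random.Random(0)):
    """Randomly apply one of the three position strategies of S1.2.2:
    translation, rotation, or horizontal/vertical flipping."""
    choice = rng.choice(["translate", "rotate", "flip"])
    if choice == "translate":
        # random direction and step length, as in the patent
        dx, dy = rng.randint(-5, 5), rng.randint(-5, 5)
        out = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    elif choice == "rotate":
        # 90-degree steps only (simplification of arbitrary-angle rotation)
        out = np.rot90(img, k=rng.randint(1, 3))
    else:
        # mirror horizontally or vertically
        out = img[:, ::-1] if rng.random() < 0.5 else img[::-1, :]
    return choice, out

img = np.arange(12).reshape(3, 4)
kind, out = augment_position(img)
print(kind, out.shape)
```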
In step S1.2.3, the color space conversion is calculated as follows:
R' = R / 255,  G' = G / 255,  B' = B / 255
where R, G, B are the picture color components;
Cmax = max(R', G', B'),  Cmin = min(R', G', B')
Δ = Cmax − Cmin
HSV color component calculation:
V = Cmax
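The text shows only the V component explicitly. The sketch below combines the patent's Cmax / Cmin / Δ quantities with the standard H and S formulas, which are assumed here since they are not reproduced in this text.

```python
def rgb_to_hsv(r, g, b):
    """RGB (0-255) -> HSV. V = Cmax as in the patent; the H and S
    formulas are the standard conversion (assumed, not shown in the text).
    Returns (H in degrees, S in [0, 1], V in [0, 1])."""
    rp, gp, bp = r / 255.0, g / 255.0, b / 255.0       # R', G', B'
    cmax, cmin = max(rp, gp, bp), min(rp, gp, bp)
    delta = cmax - cmin
    if delta == 0:
        h = 0.0
    elif cmax == rp:
        h = 60 * (((gp - bp) / delta) % 6)
    elif cmax == gp:
        h = 60 * ((bp - rp) / delta + 2)
    else:
        h = 60 * ((rp - gp) / delta + 4)
    s = 0.0 if cmax == 0 else delta / cmax
    v = cmax                                           # V = Cmax
    return h, s, v

print(rgb_to_hsv(255, 0, 0))
```

Perturbing S and V while keeping H fixed, as S1.2.3 prescribes, changes saturation and brightness without altering the apparent color of the plate.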
the implementation method of the support vector machine classifier in the step S3.2 comprises the following steps:
The method selects 1000 non-motor vehicle license plate samples (positive samples) and 3000 non-license-plate samples (negative samples) and divides the data into training data (traindata) and testing data (testdata). In the experiment, 80% of the license plate samples are used as training data and 20% as testing data; the non-license-plate samples are likewise split 80% / 20%. Finally the data are input into an SVM for training, and the trained final model is stored in 'xml' form to be called from the program.
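A dependency-free stand-in for the S3.2 classifier: a linear SVM trained by sub-gradient descent on the hinge loss. The bias term is dropped and the features are synthetic, so this is only a sketch of the protocol; the patent trains an SVM on real plate / non-plate samples and persists the model as xml.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=50, seed=0):
    """Tiny linear SVM (no bias) trained by sub-gradient descent on the
    regularized hinge loss; y must be in {-1, +1}."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            if y[i] * (X[i] @ w) < 1:            # margin violated
                w += lr * (y[i] * X[i] - lam * w)
            else:                                 # only shrink (regularize)
                w -= lr * lam * w
    return w

# Toy 80/20 split mirroring the patent's protocol (features are dummies).
rng = np.random.default_rng(1)
pos = rng.normal(+2, 0.5, (100, 2))   # "plate" samples
neg = rng.normal(-2, 0.5, (300, 2))   # "non-plate" samples
X = np.vstack([pos, neg]); y = np.array([1] * 100 + [-1] * 300)
idx = rng.permutation(len(X)); cut = int(0.8 * len(X))
tr, te = idx[:cut], idx[cut:]
w = train_linear_svm(X[tr], y[tr])
acc = np.mean(np.sign(X[te] @ w) == y[te])
print(acc > 0.9)
```

On this cleanly separable toy data the held-out accuracy is essentially perfect; in the patent's setting the features would come from the candidate plate regions of S2.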
Claims (9)
1. A Mask-RCNN-based non-motor vehicle license plate recognition method is characterized by comprising the following steps: the method comprises the following steps:
s1: training a convolutional neural network (Mask-RCNN) based non-motor vehicle license plate detection model;
s2: detecting non-motor vehicle license plate image areas in the captured picture with the detection model of S1, and acquiring the suspected license plate images and their corresponding position coordinates;
s3: accurately screening the suspected non-motor vehicle license plate images obtained in the step S2 to finish accurate positioning of the license plate images;
s4: performing tilt correction on the non-motor vehicle license plate image of S3, for use in subsequent character recognition; the tilt angle of the inclined plate is extracted with the Hough transform method and the rotation of the plate image is completed;
s5: performing character segmentation on the license plate image subjected to the inclination correction in the step S4, and performing normalization on the character image;
s6: and recognizing the normalized character image of S5 by using a character recognition method based on a BP neural network.
2. The Mask-RCNN-based non-motor vehicle license plate recognition method of claim 1, wherein: step S1 specifically includes:
s1.1: collecting a training data set: the training data come from two sources; one part is collected from the internet and the other from real-world street photography, giving a total of 3200 images containing non-motor vehicle license plates;
s1.2: data enhancement: the number of samples from S1.1 is still insufficient for deep learning, and too little data easily causes overfitting; data enhancement is therefore applied to the collected image samples, which both expands the data set and prevents overfitting, improving the generalization ability of the model; after enhancement, the 3200 original images containing non-motor vehicle license plates are increased to 6500;
s1.3: and (3) labeling the data set obtained in the step (S1.2) to obtain the position of the license plate image in the original image data, wherein the method comprises the following steps:
s1.3.1: data set preprocessing, the preprocessing comprising: removing invalid data, and uniformly setting the resolution of the data set picture to 1280 x 960;
s1.3.2: bounding-box labeling: drawing a box around each non-motor vehicle license plate to be detected and recording the coordinates of its four vertices;
s1.3.3: dividing the S1.3.2 labeled data set into a training set, a verification set and a test set according to the ratio of 8:1: 1;
s1.4: and training on the S1.3.3 segmented data set by adopting a Mask-RCNN convolutional neural network structure to obtain a detection model for detecting the image area of the license plate of the non-motor vehicle in the shot image.
3. The Mask-RCNN-based non-motor vehicle license plate recognition method of claim 2, wherein: the sample data enhancement in step S1.2 comprises the following steps:
s1.2.1: and (3) scale transformation: performing Gaussian filtering on the data acquired in the S1.1, and changing the blurring degree of the image so as to increase the sample data scale;
S1.2.2: position transformation: performing position transformation on the data samples obtained in S1.1 using three different strategies, namely translation, rotation and flipping, with one strategy randomly adopted for each sample picture;
the image translation refers to translating the image content, with parameters such as the translation direction and step length set randomly; the image rotation refers to rotating the image content by a certain angle, changing its orientation; the image flipping refers to flipping the image in the horizontal or vertical direction;
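The three position-transform strategies above can be sketched as follows. This is a minimal NumPy illustration, not part of the claims: arbitrary-angle rotation is simplified to 90-degree steps (a real pipeline would typically use `cv2.warpAffine` or `scipy.ndimage.rotate`), and shifts are assumed smaller than the image size.

```python
import random
import numpy as np

def translate(img, dx, dy):
    """Shift image content by (dx, dy), padding the exposed border with zeros.
    Assumes |dx| and |dy| are smaller than the image dimensions."""
    out = np.zeros_like(img)
    h, w = img.shape[:2]
    out[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)] = \
        img[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
    return out

def random_position_transform(img, rng=random):
    """Randomly apply one of the three strategies: translation, rotation, flip."""
    strategy = rng.choice(["translate", "rotate", "flip"])
    if strategy == "translate":
        dx, dy = rng.randint(-10, 10), rng.randint(-10, 10)
        return translate(img, dx, dy)
    if strategy == "rotate":
        return np.rot90(img, k=rng.randint(1, 3))  # simplification: 90-degree steps
    return np.flip(img, axis=rng.randint(0, 1))    # horizontal or vertical flip
```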
S1.2.3: contrast and saturation conversion: first, the sample picture is converted from the RGB color space to the HSV color space; then the saturation component (S) and the brightness component (V) are changed while the hue component (H) is kept unchanged.
In the step S1.2.3, the specific calculation formulas of the color space conversion are as follows:
R' = R/255, G' = G/255, B' = B/255
wherein: R, G, B are the picture color components;
Cmax = max(R', G', B'), Cmin = min(R', G', B')
Δ = Cmax − Cmin
HSV color component calculation:
V = Cmax
S = Δ/Cmax (S = 0 when Cmax = 0)
H = 0 when Δ = 0; H = 60° × (((G' − B')/Δ) mod 6) when Cmax = R'; H = 60° × ((B' − R')/Δ + 2) when Cmax = G'; H = 60° × ((R' − G')/Δ + 4) when Cmax = B'
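The conversion can be sketched per pixel as follows. This is a minimal illustration following the standard RGB-to-HSV formulas (V = Cmax as above), with H in degrees and S, V in [0, 1]:

```python
def rgb_to_hsv(r, g, b):
    """Convert 8-bit RGB components to HSV (H in degrees, S and V in [0, 1])."""
    rp, gp, bp = r / 255.0, g / 255.0, b / 255.0   # R', G', B'
    cmax, cmin = max(rp, gp, bp), min(rp, gp, bp)
    delta = cmax - cmin
    v = cmax                                        # V = Cmax
    s = 0.0 if cmax == 0 else delta / cmax          # S = delta / Cmax
    if delta == 0:
        h = 0.0
    elif cmax == rp:
        h = 60 * (((gp - bp) / delta) % 6)
    elif cmax == gp:
        h = 60 * ((bp - rp) / delta + 2)
    else:
        h = 60 * ((rp - gp) / delta + 4)
    return h, s, v
```

Saturation/brightness augmentation then scales S and V and converts back, leaving H untouched.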
4. The Mask-RCNN-based non-motor vehicle license plate recognition method of claim 1, wherein: step S3 specifically comprises:
S3.1: the width-to-height ratio of a non-motor vehicle license plate lies between 2.5 and 7.5; according to this ratio, coarse region screening is first performed on the image obtained in step S2, removing regions that are too large or too small;
S3.2: accurately screening the candidate license plates with a support vector machine (SVM) classifier and outputting the final candidate non-motor vehicle license plate image.
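The S3.1 coarse screening can be sketched as follows. This is a minimal illustration; the (x, y, w, h) region format is an assumption, not stated in the claims:

```python
def coarse_screen(regions, min_ratio=2.5, max_ratio=7.5):
    """Keep candidate regions (x, y, w, h) whose width-to-height ratio is
    plate-like, i.e. within [min_ratio, max_ratio]."""
    return [r for r in regions
            if r[3] > 0 and min_ratio <= r[2] / r[3] <= max_ratio]
```

Regions surviving this cheap geometric filter are then passed to the SVM for accurate screening (S3.2).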
5. The Mask-RCNN-based non-motor vehicle license plate recognition method of claim 1, wherein: the implementation method of the support vector machine classifier in the step S3.2 comprises the following steps:
selecting 1000 non-motor vehicle license plate samples (positive samples) and 3000 non-license-plate samples (negative samples); dividing the data into training data and test data, with 80% of the license plate samples used for training and 20% for testing, and the non-license-plate samples split in the same 80%/20% proportion; finally inputting the data into the SVM for training, and saving the trained final model in "xml" form for calling in a program.
6. The Mask-RCNN-based non-motor vehicle license plate recognition method of claim 1, wherein: the step S4 specifically includes:
S4.1: performing edge detection on the non-motor vehicle license plate image and detecting straight lines in the horizontal direction of the image with the Sobel operator;
S4.2: assuming the image corresponds to an X-o-Y space, an S-o-θ space is defined (θ ranging from 1° to 180°); for each point in the image whose pixel value is 1, S = x cos θ + y sin θ is calculated and the corresponding curve is drawn; meanwhile the S-θ plane is divided into equally spaced small grids (1 × 1) corresponding to a counting matrix, and each time a point of the original image is calculated, the value of the corresponding counting-matrix element is incremented by 1, so that the value of an element equals the number of collinear points, and the maximum element of the counting matrix corresponds to the longest straight line in the original image;
S4.3: detecting the column coordinate θ corresponding to the maximum element of the counting matrix, where θ is the angle between the normal of the straight line and the X axis;
S4.4: determining the inclination angle of the line from the θ angle found in S4.3, rotating the image accordingly to complete tilt correction, and outputting the final license plate image for license plate recognition.
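The S4.2-S4.4 voting procedure can be sketched in NumPy as follows. This is an illustrative reconstruction, not the patent's implementation: the Sobel edge detection of S4.1 is assumed already done, so the input is a binary edge map with edge pixels equal to 1.

```python
import numpy as np

def detect_tilt_angle(binary_edges):
    """Return the tilt angle (degrees) of the longest line in a binary edge map.
    Votes S = x cos(theta) + y sin(theta) into a counting matrix over theta in
    1..180 degrees; the column of the maximum element gives the normal angle."""
    ys, xs = np.nonzero(binary_edges)
    thetas = np.deg2rad(np.arange(1, 181))           # theta grid, 1..180 degrees
    rhos = np.rint(xs[:, None] * np.cos(thetas) +    # S for every point/theta pair
                   ys[:, None] * np.sin(thetas)).astype(int)
    diag = int(np.ceil(np.hypot(*binary_edges.shape)))
    acc = np.zeros((2 * diag + 1, 180), dtype=int)   # counting matrix (1x1 grids)
    for col in range(180):                           # vote; offset S by diag >= 0
        np.add.at(acc[:, col], rhos[:, col] + diag, 1)
    theta_deg = int(np.argmax(np.max(acc, axis=0))) + 1
    return theta_deg - 90    # normal at 90 degrees means a horizontal line
```

A horizontal line yields θ = 90° (tilt 0), so rotating the image by the returned angle completes the tilt correction of S4.4.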
7. The Mask-RCNN-based non-motor vehicle license plate recognition method of claim 1, wherein: the step S5 specifically includes:
S5.1: binarizing the non-motor vehicle license plate image from S4.4: a suitable threshold is selected to separate the characters from the license plate background, pixels with gray values greater than the threshold are set to 0 (black), and pixels with gray values less than the threshold are set to 255 (white); the binarization uses an iterative threshold method comprising the following steps:
S5.1.1: selecting an initial estimate T for the global threshold, where T is the average gray level of the image;
S5.1.2: segmenting the image with T to produce two sets of pixels: G1 consists of pixels with gray values greater than T, and G2 consists of pixels with gray values less than or equal to T;
S5.1.3: calculating the average gray values m1 and m2 of the pixels in G1 and G2;
S5.1.4: calculating a new threshold T = (m1 + m2)/2;
S5.1.5: repeating steps S5.1.2 through S5.1.4 until the value of T stabilizes between successive iterations;
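The iterative threshold selection of S5.1.1-S5.1.5 can be sketched as (the stopping tolerance `eps` is an assumption):

```python
import numpy as np

def iterative_threshold(gray, eps=0.5):
    """Basic global threshold selection: iterate T = (m1 + m2) / 2 until stable."""
    t = gray.mean()                       # S5.1.1: initial estimate = mean gray
    while True:
        g1 = gray[gray > t]               # S5.1.2: pixels above T
        g2 = gray[gray <= t]              #          pixels at or below T
        m1 = g1.mean() if g1.size else t  # S5.1.3: mean gray of each group
        m2 = g2.mean() if g2.size else t
        t_new = (m1 + m2) / 2             # S5.1.4: updated threshold
        if abs(t_new - t) < eps:          # S5.1.5: stop when T stabilizes
            return t_new
        t = t_new
```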
S5.2: performing gradient sharpening on the binarized image from S5.1 to make blurred images clear, using the Roberts gradient operator;
S5.3: removing noise points from the image in S5.2, including rivet removal, as follows: the whole image is scanned, the number SUM(i) of white pixels in each row is counted, and a threshold P is computed:
wherein M is the number of rows of the image and N is the number of columns; SUM(i) is compared with the threshold P: if SUM(i) < P, all pixels of that row are set to black; if SUM(i) > P, the row is kept unchanged;
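The row-wise noise removal of S5.3 can be sketched as follows. Note the patent's exact formula for P (in terms of the rows M and columns N) is not reproduced in the text above, so the mean white-pixel count per row is used here purely as an illustrative stand-in:

```python
import numpy as np

def remove_row_noise(binary):
    """Blank out rows (e.g. rivet rows) whose white-pixel count SUM(i) falls
    below the threshold P. Input is a binary image with white pixels == 1.
    P here is the mean row count -- an assumed stand-in for the patent's formula."""
    out = binary.copy()
    sums = out.sum(axis=1)     # SUM(i): white pixels in each row
    p = sums.mean()            # assumed threshold; see note above
    out[sums < p] = 0          # rows with SUM(i) < P are set entirely to black
    return out
```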
s5.4: and (5) performing character segmentation on the image subjected to the denoising processing in S5.3, wherein the character segmentation method comprises the following steps:
S5.4.1: first, the image is scanned row by row from bottom to top until the first white pixel is met and the position is recorded; then the image is scanned row by row from top to bottom until the first white pixel is met, thereby determining the height range of the image;
S5.4.2: within the height range determined in S5.4.1, the image is scanned column by column from left to right; the first white pixel encountered is taken as the starting point of character segmentation, scanning then continues until a column contains no white pixel, which is taken as the end position of that segment; scanning proceeds in this way to the rightmost end of the image, yielding a more accurate width range for each character;
S5.4.3: within the known width range of each character, the height range of each character is obtained according to the method of S5.4.1 to perform accurate character segmentation;
S5.4.4: the characters segmented in S5.4.3 are normalized to a uniform resolution of 12 × 12.
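The projection scans of S5.4.1-S5.4.3 can be sketched as follows (a minimal illustration assuming a binary image where white pixels equal 1; the returned box format is an assumption):

```python
import numpy as np

def segment_characters(binary):
    """Return (top, bottom, left, right) boxes, one per character, by scanning
    projections: first the overall height range, then column gaps, then a
    refined height range per character."""
    rows = np.nonzero(binary.any(axis=1))[0]
    if rows.size == 0:
        return []
    top, bottom = rows[0], rows[-1]             # S5.4.1: height range of the text
    band = binary[top:bottom + 1]
    boxes, start = [], None
    for col in range(band.shape[1] + 1):        # S5.4.2: scan columns left to right
        has_white = col < band.shape[1] and band[:, col].any()
        if has_white and start is None:
            start = col                         # first white column starts a char
        elif not has_white and start is not None:
            char = band[:, start:col]           # S5.4.3: refine height per character
            crows = np.nonzero(char.any(axis=1))[0]
            boxes.append((top + crows[0], top + crows[-1], start, col - 1))
            start = None
    return boxes
```

Each box would then be cropped and resized to the uniform 12 × 12 resolution of S5.4.4.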
8. The Mask-RCNN-based non-motor vehicle license plate recognition method of claim 1, wherein: the character recognition method based on the BP neural network described in step S6 includes the following steps:
S6.1: the method recognizes characters by fusing the gray-scale features of the image with the four-corner features of the character; the specific features are described as follows:
gray-scale features: the normalized 12 × 12 pixel dot-matrix of each character is scanned pixel by pixel from top to bottom and left to right, and the pixel values (0 or 1) are stored by position into a one-dimensional array, generating 144 gray-scale features per character image;
four-corner features: for similar characters, such as '0' and 'D', the invention uses the four-corner features of the character, on top of the gray-scale features, to distinguish them; the specific method is as follows:
dividing the normalized pixel dot-matrix into 12 rows and 12 columns by pixel;
scanning along the main diagonal direction from the top-left vertex and counting the number of pixels from the top-left vertex of the image to the first white pixel;
counting the number of pixels from the top-right, bottom-left and bottom-right vertices to the first white pixel in the same way;
by extracting the four-corner features, the four-corner feature vectors of '0' and 'D' can be obtained as follows:
V0 = {0, 0, 0, 0}, VD = {1, 0, 1, 0}
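The four-corner feature extraction can be sketched as follows (an illustration of the description above: from each corner, step along the diagonal toward the interior and count the pixels passed before reaching the first white pixel):

```python
import numpy as np

def four_corner_features(char):
    """Return [top-left, top-right, bottom-left, bottom-right] counts of pixels
    traversed along each diagonal before the first white pixel (char: square
    binary bitmap, e.g. the normalized 12x12 dot-matrix, white == 1)."""
    n = char.shape[0]
    corners = [(0, 0, 1, 1), (0, n - 1, 1, -1),       # (row, col, row step, col step)
               (n - 1, 0, -1, 1), (n - 1, n - 1, -1, -1)]
    feats = []
    for r, c, dr, dc in corners:
        count = 0
        while 0 <= r < n and 0 <= c < n and char[r, c] == 0:
            count += 1
            r, c = r + dr, c + dc
        feats.append(count)
    return feats
```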
s6.2: training by taking the characteristics of S6.1 as an input vector of a BP neural network to obtain a character recognition model;
S6.3: calling the recognition model obtained in S6.2 on the characters normalized in S5.4.4 to perform character recognition, and outputting the number of the non-motor vehicle license plate.
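The BP-network training and recognition of S6.2-S6.3 can be sketched with a small NumPy network. This is an illustrative one-hidden-layer implementation: the hidden size, learning rate and epoch count are assumptions, and in the patent the inputs would be the fused feature vectors (144 gray-scale + 4 corner features = 148 dimensions) with one output class per character:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bp(x, y, hidden=16, lr=1.0, epochs=3000, seed=0):
    """Train a one-hidden-layer BP network by batch gradient descent on the
    squared error; x: (n, features), y: one-hot (n, classes)."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, (x.shape[1], hidden))
    w2 = rng.normal(0.0, 0.5, (hidden, y.shape[1]))
    n = len(x)
    for _ in range(epochs):
        h = sigmoid(x @ w1)                      # forward pass
        out = sigmoid(h @ w2)
        d_out = (out - y) * out * (1 - out)      # gradient of squared error at output
        d_h = (d_out @ w2.T) * h * (1 - h)       # backpropagated to hidden layer
        w2 -= lr * (h.T @ d_out) / n             # mean-gradient descent step
        w1 -= lr * (x.T @ d_h) / n
    return w1, w2

def predict(x, w1, w2):
    """S6.3: class index of the largest output unit per sample."""
    return np.argmax(sigmoid(sigmoid(x @ w1) @ w2), axis=1)
```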
9. The system for implementing the Mask-RCNN-based non-motor vehicle license plate recognition method of claim 1, comprising:
license plate detection and positioning module: training a non-motor vehicle license plate positioning model based on the Mask-RCNN convolutional neural network to locate the non-motor vehicle license plate region in the shot image;
license plate character segmentation module: binarizing the located non-motor vehicle license plate image by selecting a suitable threshold to separate the characters from the license plate background, setting pixels with gray values greater than the threshold to 0 (black) and pixels with gray values less than the threshold to 255 (white); then performing gradient sharpening on the binarized image with the Roberts gradient operator to make blurred images clear; next removing noise points (including rivet removal) from the sharpened image; finally scanning the image row by row and column by column to determine the accurate width and height range of each character, completing character segmentation, and normalizing the segmented characters to a uniform resolution of 12 × 12;
license plate character recognition module: recognizing the segmented non-motor vehicle license plate characters with a BP neural network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110378119.2A CN113128500A (en) | 2021-04-08 | 2021-04-08 | Mask-RCNN-based non-motor vehicle license plate recognition method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113128500A true CN113128500A (en) | 2021-07-16 |
Family
ID=76775585
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113128500A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114898352A (en) * | 2022-06-29 | 2022-08-12 | 松立控股集团股份有限公司 | Method for simultaneously realizing image defogging and license plate detection |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105447852A (en) * | 2015-11-12 | 2016-03-30 | 四川浩特通信有限公司 | Vehicle license plate detection and identification method based on Hough transform |
CN106709530A (en) * | 2017-01-17 | 2017-05-24 | 中国科学院上海高等研究院 | License plate recognition method based on video |
CN111310862A (en) * | 2020-03-27 | 2020-06-19 | 西安电子科技大学 | Deep neural network license plate positioning method based on image enhancement in complex environment |
Non-Patent Citations (2)
Title |
---|
Jian Borui: "Research on License Plate Recognition Methods Based on Deep Learning", Master's Theses Electronic Journal, Engineering Science and Technology II (Information Technology), 15 January 2021 (2021-01-15), pages 1-70 *
Lu Chen et al.: "License Plate Character Recognition Method Based on BP Neural Network", Journal of Shandong Agricultural University (Natural Science Edition), vol. 48, no. 1, 3 March 2017 (2017-03-03), pages 113-116 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107545239B (en) | Fake plate detection method based on license plate recognition and vehicle characteristic matching | |
CN111325203B (en) | American license plate recognition method and system based on image correction | |
CN109886896B (en) | Blue license plate segmentation and correction method | |
CN109784344B (en) | Image non-target filtering method for ground plane identification recognition | |
CN112686812B (en) | Bank card inclination correction detection method and device, readable storage medium and terminal | |
CN105373794B (en) | A kind of licence plate recognition method | |
CN105046196B (en) | Front truck information of vehicles structuring output method based on concatenated convolutional neutral net | |
CN102375982B (en) | Multi-character characteristic fused license plate positioning method | |
Yousif et al. | Toward an optimized neutrosophic K-means with genetic algorithm for automatic vehicle license plate recognition (ONKM-AVLPR) | |
Wen et al. | An algorithm for license plate recognition applied to intelligent transportation system | |
CN108596166A (en) | A kind of container number identification method based on convolutional neural networks classification | |
CN107844683B (en) | Method for calculating concentration of digital PCR (polymerase chain reaction) liquid drops | |
CN108427946B (en) | Driver license detection and identification method based on internal features and text field layout in complex scene | |
CN107506765B (en) | License plate inclination correction method based on neural network | |
CN106815583B (en) | Method for positioning license plate of vehicle at night based on combination of MSER and SWT | |
CN111582339B (en) | Vehicle detection and recognition method based on deep learning | |
CN110738139B (en) | NIN license plate identification method fusing Min-Max target | |
CN110598566A (en) | Image processing method, device, terminal and computer readable storage medium | |
CN111160328A (en) | Automatic traffic marking extraction method based on semantic segmentation technology | |
CN110689003A (en) | Low-illumination imaging license plate recognition method and system, computer equipment and storage medium | |
CN112183325B (en) | Road vehicle detection method based on image comparison | |
CN105404868A (en) | Interaction platform based method for rapidly detecting text in complex background | |
CN110516666B (en) | License plate positioning method based on combination of MSER and ISODATA | |
US9332154B2 (en) | Image binarization using dynamic sub-image division | |
Shi et al. | License plate localization in complex environments based on improved GrabCut algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||