CN111027539A - License plate character segmentation method based on spatial position information - Google Patents
- Publication number: CN111027539A
- Application number: CN201910989195.XA
- Authority
- CN
- China
- Prior art keywords
- license plate
- position information
- character
- neural network
- deep neural
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06V20/63—Scene text, e.g. street names
- G06F18/253—Fusion techniques of extracted features
- G06N3/045—Combinations of networks
- G06V30/153—Segmentation of character regions using recognition of characters or words
- G06V20/625—License plates
- G06V30/10—Character recognition
- Y02T10/40—Engine management systems
Abstract
A license plate character segmentation method based on spatial position information addresses the poor performance of existing segmentation methods on low-quality license plate images that are stained, have adhering or missing characters, or are inaccurately localized. S1, build a deep neural network model; S2, optimize the model parameters on labeled training sample data to obtain an optimal deep neural network model; S3, read the license plate image and extract the spatial position information features among the license plate characters with the optimal model, thereby obtaining the position of each character of the plate. The invention uses deep learning to predict the character positions directly, adopts an anchor-free training mechanism that reduces the difficulty of model training, and jointly exploits the global and local features of the license plate characters; the resulting character segmentation is more accurate and more robust to stained, adhering, character-missing, and inaccurately localized low-quality license plate images.
Description
Technical Field
The invention relates to the technical field of license plate recognition, in particular to a license plate character segmentation method based on spatial position information.
Background
License plate recognition is a core technology of intelligent transportation and comprises three parts: license plate localization, character segmentation, and character recognition. Character segmentation is the most critical part; its quality directly affects the subsequent character recognition and thus the overall recognition performance.
Character segmentation means accurately separating each individual character from an image in which the license plate position is already known. For clear license plate images, many mature methods exist and produce good segmentation results. In real environments, however, complex conditions such as changing illumination, shooting angle, and plate contamination cause blurred, missing, or adhering characters that mature methods struggle to segment accurately, so the final license plate recognition fails. How to segment characters accurately in low-quality license plate images therefore remains a problem that limits license plate recognition technology.
At present, license plate character segmentation mainly uses the following methods:
(1) Vertical projection: the edge position of each character is obtained from the peaks and troughs of the vertical projection curve of the license plate characters. The algorithm is simple and fast and segments clear plates well; its drawback is that segmentation quality drops sharply, or fails entirely, on low-quality plates that are stained, have adhering characters, or are inaccurately localized.
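The projection rule described above can be sketched in a few lines (an illustrative example on a synthetic binary plate image; the function name and layout are ours, not part of any cited method):

```python
import numpy as np

def vertical_projection_segments(binary_plate):
    """Find character column spans from the vertical projection profile.

    binary_plate: 2-D array with 1 where a character stroke pixel is present.
    Returns a list of (start_col, end_col) spans where the column sum is
    non-zero; the troughs (zero runs) of the profile mark the cut points.
    """
    profile = binary_plate.sum(axis=0)          # column-wise pixel counts
    cols = profile > 0
    segments, start = [], None
    for x, on in enumerate(cols):
        if on and start is None:
            start = x                           # entering a peak region
        elif not on and start is not None:
            segments.append((start, x - 1))     # leaving a peak region
            start = None
    if start is not None:                       # character touching right edge
        segments.append((start, len(cols) - 1))
    return segments

# Two "characters" separated by empty columns (a trough in the profile).
plate = np.zeros((8, 10), dtype=int)
plate[2:6, 1:4] = 1
plate[2:6, 6:9] = 1
print(vertical_projection_segments(plate))      # [(1, 3), (6, 8)]
```

On a stained plate, noise pixels fill the troughs and the spans merge, which is exactly the failure mode noted above.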
(2) Connected-region analysis: the plate image is first binarized, then analyzed using the property that each single character forms a single connected region, finally yielding the character positions. The method adapts well to low-quality images and is fast, but fails on missing or adhering characters.
(3) Machine learning, e.g. license plate character segmentation based on a support vector machine: layout-rule features of the plate are extracted and a classifier is trained to complete the segmentation. The method recognizes clear plates well and has some resistance to low-quality images; its drawbacks are that suitable layout-rule features are hard to select, some deformed plates violate the layout rules, and computing the features is relatively complex.
(4) Deep learning: in recent years, deep neural networks, which can perform accurate nonlinear prediction, have attracted wide attention and application in many fields, and a group of classical object detection frameworks such as Faster R-CNN, SSD, and YOLO has emerged; through transfer learning, these frameworks can detect the positions of license plate characters well.
Disclosure of Invention
The invention provides a license plate character segmentation method based on spatial position information that addresses the poor performance of existing methods on low-quality license plate images that are stained, have adhering or missing characters, or are inaccurately localized.
In order to achieve the purpose, the invention adopts the following technical scheme:
a license plate character segmentation method based on spatial position information comprises the following steps:
s1, establishing a deep neural network model;
s2, optimizing the deep neural network model parameters through the marked training sample data to obtain an optimal deep neural network model;
S3, reading license plate image information and extracting the spatial position information features among the license plate characters through the optimal deep neural network model, thereby obtaining the position of each character of the license plate.
Further, the S1, establishing the deep neural network model specifically includes:
s11, designing an input image of the deep neural network model;
s12, designing a fast descent network for acquiring high-level features of the input image;
S13, designing a spatial position information network that obtains a character spatial position information feature map from the high-level features of the input image;
S14, designing a character prediction network that further improves the expressive power of the feature network on top of the character spatial position information feature map and finally predicts the accurate position of each character of the license plate.
Further, in step S11 the input image is an RGB image of size 512 × 256.
Further, the fast descent network includes a convolutional layer conv0, a max-pooling layer maxpool0, two residual network building blocks resnetblock0 and resnetblock1, a merging layer eltsum, and a convolutional layer conv2;
the kernel size of conv0 is 7 × 7 with a stride of 4 × 4;
the kernel size of maxpool0 is 2 × 2 with a stride of 2 × 2;
each residual network building block includes convolutional layers convresnet0, convresnet1_0, convresnet1_1, and convresnet1_2;
convresnet0 has a kernel size of 3 × 3 and a stride of 2 × 2; convresnet1_0 has a kernel size of 1 × 1 and a stride of 1 × 1 and serves to reduce the number of feature map channels and hence the computation of subsequent convolutional layers; convresnet1_1 has a kernel size of 3 × 3 and a stride of 2 × 2; convresnet1_2 has a kernel size of 1 × 1 and a stride of 1 × 1 and serves to increase the number of feature map channels and enrich the features;
eltsum is a merging layer that adds its two input feature maps pixel by pixel;
conv2 is a convolutional layer with a kernel size of 3 × 3 and a stride of 1 × 1 that fuses the merged features.
Further, the spatial position information network comprises a height-direction spatial position information feature map (height) and a width-direction spatial position information feature map (width);
the width-direction spatial position information feature map is obtained as follows, assuming the output feature map of step S12 has size 8 × 16 × 128:
S131, slicing the feature map column by column along the width direction, each slice having size 8 × 1 × 128 and named cut0, cut1, cut2, …, cut15;
S132, convolving the first slice cut0 with 128 convolution kernels of kernel size 3 × 128 and stride 1 × 1 to obtain an output slice cut0-out of size 8 × 1 × 128;
S133, adding the output slice cut0-out from step S132 to the slice cut1 pixel by pixel to obtain a new slice cut1_new;
S134, applying the operations of steps S132 and S133 to cut1_new to obtain cut2_new, and repeating until the last new slice cut15_new is obtained;
S135, collecting all the new slices obtained in steps S131 to S134 and splicing them along the width dimension; the output feature map is the width-direction spatial position information feature map.
Further, the character prediction network in step S14 includes two branch networks, a LocX branch and a LocY branch, where the LocX branch predicts the segmentation position of each license plate character along the X coordinate axis and the LocY branch predicts it along the Y coordinate axis;
the LocX and LocY branches share the same structure: convrecog0 is a convolutional layer with a kernel size of 3 × 3 and a stride of 2 × 2, convrecog1 has a kernel size of 3 × 3 and a stride of 2 × 1, and convrecog2 has a kernel size of 2 × 2 and a stride of 1 × 1; the output feature map has size 1 × 1 × 14, where 14 is the number of regression values predicted by each branch network.
Further, step S2, optimizing the deep neural network model parameters through the labeled training sample data to obtain an optimal deep neural network model, specifically comprises:
S21, acquiring training sample images: collecting license plate images under various scenes, lighting conditions, and angles, extracting the local plate region with an existing license plate detection method, and then labeling the position information of the license plate characters;
s22, designing a target loss function of the deep neural network model;
S23, training the deep neural network model: feeding the labeled license plate character sample image set into the defined model and learning the model parameters.
Further, the method for labeling the character position information in step S21 is:
first obtain the minimum bounding rectangle of each single character on the plate, then record its 4 border coordinates, namely the left border coordinate left_X, the right border coordinate right_X, the upper border coordinate up_Y, and the lower border coordinate bottom_Y; finally, concatenate the left and right borders of all characters on the plate in order as the label value along the X coordinate axis.
Further, step S3, reading license plate image information and extracting the spatial position information features among the characters through the optimal deep neural network model to obtain the position of each character, specifically comprises:
reading any given local license plate image and running it forward through the deep neural network model; each output feature map is the set of segmentation positions of every character on the plate along a single coordinate axis, and combining the output feature maps of the two branch networks in order yields the segmentation position of each character on the plate.
According to the above technical scheme, the license plate character segmentation method based on spatial position information has the following beneficial effects:
The method predicts the license plate character positions directly with deep learning. The efficient fast descent network lowers the model's memory consumption and greatly increases running speed; the anchor-free training mechanism reduces training difficulty and speeds up convergence; and the joint use of the global and local features of the license plate characters makes the segmentation more accurate and more robust to stained, adhering, character-missing, and inaccurately localized low-quality license plate images.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a diagram of a deep neural network model architecture;
FIG. 3 is a diagram of a residual network infrastructure architecture;
FIG. 4 is a diagram of a network architecture for spatial position information in the width direction;
wherein the alphanumerics beside each module denote the name of the current feature layer and its feature map size, namely feature map height × feature map width × number of feature map channels.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention.
In this embodiment, common single-layer blue and single-layer yellow license plates are taken as examples; such a plate has 7 characters, and a fixed arrangement rule exists among the characters.
Specifically, in this embodiment, as shown in fig. 1, a license plate character segmentation method based on spatial position information includes the following specific steps:
s1, establishing a deep neural network model;
s2, optimizing the deep neural network model parameters through the marked training sample data to obtain an optimal deep neural network model;
S3, reading license plate image information and extracting the spatial position information features among the license plate characters through the optimal deep neural network model, thereby obtaining the position of each character of the license plate.
Each of the above steps is described in detail below:
S1, designing the deep neural network model. The model designed in this embodiment extracts the spatial position information features among the license plate characters in order to obtain the position of each character accurately. Considering both the particularity of the character segmentation task and the computational capability of convolutional neural networks, the adopted model is shown in fig. 2 and comprises a fast descent network, a spatial position information network, and a character prediction network. The embodiment uses a convolutional neural network (CNN); a feature map size denotes feature map height × feature map width × number of channels, a kernel size denotes kernel width × kernel height, and a stride denotes the stride in the width direction × the stride in the height direction.
The specific design steps of the deep neural network model are as follows:
S11, designing the input image of the deep neural network model. The embodiment uses an RGB image of size 512 × 256. A larger input image contains more detail, which helps segment the license plate characters accurately, but increases the storage and computation of the model.
S12, designing the fast descent network. This network quickly obtains high-level features of the input image that are highly abstract and expressive; the quality of this feature extraction directly affects the accuracy of the subsequent character segmentation. As noted in step S11, the input image is relatively large, which hinders fast operation of the deep neural network model, so an efficient feature extraction network is needed to quickly offset the computation caused by the large input. The fast descent network is shown in fig. 2: conv0 is a convolutional layer with a kernel size of 7 × 7 and a stride of 4 × 4; the large kernel and large stride quickly reduce the feature map size, greatly cutting subsequent computation while retaining more image detail. maxpool0 is a max-pooling layer with a kernel size of 2 × 2 and a stride of 2 × 2. resnetblock0 and resnetblock1 are two residual network building blocks; within each block, as shown in fig. 3, convresnet0 is a convolutional layer with a kernel size of 3 × 3 and a stride of 2 × 2; convresnet1_0 has a kernel size of 1 × 1 and a stride of 1 × 1 and reduces the number of feature map channels, and thus the computation of the subsequent convolutional layers; convresnet1_1 has a kernel size of 3 × 3 and a stride of 2 × 2; convresnet1_2 has a kernel size of 1 × 1 and a stride of 1 × 1 and increases the number of feature map channels, enriching the features. eltsum is a merging layer that adds its two input feature maps pixel by pixel, and conv2 is a convolutional layer with a kernel size of 3 × 3 and a stride of 1 × 1 that fuses the merged features.
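The stride bookkeeping of the fast descent network can be checked with a short sketch (our illustration, not part of the patent; we read the 512 × 256 input as 256 rows × 512 columns and assume "same" padding, so that each layer divides the spatial size exactly by its stride):

```python
def downsample(size, strides):
    """Spatial size after a chain of strided layers, assuming 'same'
    padding so each layer divides the size exactly by its stride."""
    h, w = size
    for s in strides:
        h, w = h // s, w // s
    return h, w

# conv0 (stride 4), maxpool0 (2), resnetblock0 (2), resnetblock1 (2)
print(downsample((256, 512), [4, 2, 2, 2]))  # (8, 16)
```

The result matches the 8 × 16 × 128 output feature map assumed in step S13, which is how a single 7 × 7 stride-4 convolution plus three stride-2 stages removes most of the computation caused by the large input.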
S13, designing the spatial position information network. License plate character segmentation in this embodiment differs from generic object detection: the character positions follow a fixed arrangement rule, both over the whole character region of the plate and between any two adjacent characters. Accurate segmentation therefore depends not only on the global features of the license plate characters but also on the local features between them. The invention adopts a novel spatial position information network that exploits both. As shown in fig. 2, the network comprises a height-direction spatial position information feature map (height) and a width-direction spatial position information feature map (width), obtained analogously. Taking the width direction as an example, as shown in fig. 4, and assuming the output feature map of step S12 has size 8 × 16 × 128, the specific steps are:
S131, slicing the feature map column by column along the width direction; each slice has size 8 × 1 × 128, and the slices are named cut0, cut1, cut2, …, cut15.
S132, convolving the first slice cut0 with 128 convolution kernels of kernel size 3 × 128 and stride 1 × 1, obtaining an output slice cut0-out of size 8 × 1 × 128.
S133, adding the output slice cut0-out from step S132 to the slice cut1 pixel by pixel, obtaining a new slice cut1_new.
S134, applying the operations of steps S132 and S133 to cut1_new to obtain cut2_new, and repeating until the last new slice cut15_new is obtained.
S135, collecting all the new slices obtained in steps S131 to S134 and splicing them along the width dimension; the output feature map is the width-direction spatial position information feature map.
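Steps S131 to S135 amount to a left-to-right recurrence over width slices, so each slice accumulates information from all slices to its left. The following numpy sketch illustrates that data flow; the 3 × 128 convolution of step S132 is abbreviated to a single channel-mixing matrix, purely for illustration (all names here are ours):

```python
import numpy as np

def width_spatial_info(feat, mix):
    """Sequential slice accumulation along the width axis (cf. S131-S135).

    feat: (H, W, C) feature map; mix: (C, C) stand-in for the per-slice
    convolution of step S132.  The previous transformed slice is added
    pixel-wise to the next slice, so slice w carries information from
    slices 0..w-1; the new slices are re-spliced along width at the end.
    """
    H, W, C = feat.shape
    slices = [feat[:, w, :] for w in range(W)]   # S131: W slices of (H, C)
    new = [slices[0]]
    for w in range(1, W):
        carried = new[-1] @ mix                  # S132: transform previous slice
        new.append(carried + slices[w])          # S133: add to the next slice
    return np.stack(new, axis=1)                 # S135: splice back to (H, W, C)

feat = np.random.default_rng(0).normal(size=(8, 16, 128))
out = width_spatial_info(feat, np.eye(128))
print(out.shape)                                 # (8, 16, 128)
```

With the identity as the mixing matrix, slice w of the output is simply the running sum of input slices 0..w, which makes the "information flows rightward" property easy to verify.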
S14, designing the character prediction network. This network further improves the expressive power of the features on top of the character spatial position information feature map obtained in step S13 and finally predicts the accurate position of each character of the license plate. It contains two branch networks, a LocX branch and a LocY branch: LocX predicts the segmentation position of each character along the X coordinate axis, and LocY predicts it along the Y coordinate axis. The two branches share the same structure; as shown in fig. 2, convrecog0 is a convolutional layer with a kernel size of 3 × 3 and a stride of 2 × 2, convrecog1 has a kernel size of 3 × 3 and a stride of 2 × 1, and convrecog2 has a kernel size of 2 × 2 and a stride of 1 × 1. The output feature map has size 1 × 1 × 14, where 14 is the number of regression values predicted by each branch network, set as follows: each branch must predict the segmentation positions of the 7 characters of the plate, and along each coordinate axis 2 coordinates are needed to represent the segmentation position of one character, giving 7 × 2 = 14 values.
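How the 14 regression values of one branch map onto the 7 characters can be shown with a small decoding helper (a hypothetical illustration, not part of the patent):

```python
def decode_axis(values):
    """Split one branch's 14 regression values into 7 (start, end)
    coordinate pairs, one per license plate character, along one axis."""
    assert len(values) == 14, "each branch regresses 7 characters x 2 coords"
    return [(values[i], values[i + 1]) for i in range(0, 14, 2)]

print(decode_axis(list(range(14))))
# [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9), (10, 11), (12, 13)]
```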
S2, training the deep neural network model: the model parameters are optimized on a large amount of labeled training sample data so that the recognition performance of the model is optimal. The specific steps are as follows:
S21, acquiring training sample images: license plate images are collected under various scenes, lighting conditions, and angles; the local plate region is extracted with an existing license plate detection method; then the position information of the license plate characters is labeled. The specific labeling method is: first obtain the minimum bounding rectangle of each single character on the plate, then record its 4 border coordinates, namely the left border coordinate left_X, the right border coordinate right_X, the upper border coordinate up_Y, and the lower border coordinate bottom_Y; finally, concatenate the left and right borders of all characters on the plate in order as the label value along the X coordinate axis.
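The X-axis labeling rule can be sketched as follows (a hypothetical helper; the box layout (left_X, up_Y, right_X, bottom_Y) follows the border coordinates named above):

```python
def x_axis_label(char_boxes):
    """Build the X-direction label vector from per-character minimum
    bounding rectangles, each given as (left_x, up_y, right_x, bottom_y):
    the left and right borders of all characters, concatenated in order."""
    label = []
    for left_x, up_y, right_x, bottom_y in char_boxes:
        label.extend([left_x, right_x])
    return label

# Two character boxes on a plate, left to right.
boxes = [(10, 5, 30, 60), (35, 5, 55, 60)]
print(x_axis_label(boxes))   # [10, 30, 35, 55]
```

For a full 7-character plate this yields the 14-value target that the LocX branch regresses; the Y-axis label would presumably be built the same way from the upper and lower borders.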
S22, designing the target loss function of the deep neural network model; a mean squared error loss function is used.
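A minimal sketch of the mean squared error objective over one branch's regression values (illustrative only; the patent does not give an implementation):

```python
import numpy as np

def mse_loss(pred, target):
    """Mean squared error between predicted and labeled coordinates."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))

print(mse_loss([0.0, 2.0], [1.0, 3.0]))  # 1.0
```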
S23, training the deep neural network model: the labeled license plate character sample image set is fed into the defined model and the relevant model parameters are learned.
S3, using the deep neural network model. After training, the model is applied in the actual environment: for any given local license plate image, after a forward pass through the model, each output feature map is the set of segmentation positions of every character on the plate along one coordinate axis; combining the output feature maps of the two branch networks in order yields the segmentation position of each character on the plate.
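Combining the two branch outputs into per-character rectangles, as described, might look like this (hypothetical helper names; the coordinate order matches the labeling convention of step S21):

```python
def character_boxes(locx_out, locy_out):
    """Zip the LocX and LocY branch outputs (14 values each) into 7
    per-character rectangles (left_x, up_y, right_x, bottom_y)."""
    assert len(locx_out) == len(locy_out) == 14
    boxes = []
    for i in range(0, 14, 2):
        boxes.append((locx_out[i], locy_out[i],
                      locx_out[i + 1], locy_out[i + 1]))
    return boxes

# Example branch outputs for a 7-character plate (made-up numbers).
x = [10, 30, 35, 55, 60, 80, 90, 110, 115, 135, 140, 160, 165, 185]
y = [5, 60] * 7          # all characters share the same vertical extent
print(character_boxes(x, y)[0])   # (10, 5, 30, 60)
```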
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. A license plate character segmentation method based on spatial position information, characterized by comprising the following steps:
S1, establishing a deep neural network model;
S2, optimizing the deep neural network model parameters with the labeled training sample data to obtain an optimal deep neural network model;
and S3, reading license plate image information and extracting the spatial position information features among the license plate characters through the operation of the optimal deep neural network model, thereby obtaining the position of each character of the license plate.
2. The license plate character segmentation method based on the spatial position information as claimed in claim 1, wherein the step S1 of establishing the deep neural network model specifically includes:
S11, designing the input image of the deep neural network model;
S12, designing a fast descent network for acquiring high-level features of the input image;
S13, designing a spatial position information network to acquire a character spatial position information feature map based on the high-level features of the input image;
S14, designing a character prediction network to further improve the expressive capacity of the feature network on the basis of the acquired character spatial position information feature map, and finally predict the accurate position of each character of the license plate.
3. The license plate character segmentation method based on the spatial position information as claimed in claim 2, wherein the input image employed in step S11 is an RGB image of size 512 × 256.
4. The license plate character segmentation method based on the spatial position information as claimed in claim 2, wherein: the fast descent network comprises a convolutional layer conv0, a max-pooling downsampling layer maxpool0, two residual network basic building blocks resnetblock0 and resnetblock1, a merging layer eltsum and a convolutional layer conv2;
the convolutional layer conv0 has a kernel size of 7 × 7 and a stride of 4 × 4;
the max-pooling layer maxpool0 has a kernel size of 2 × 2 and a stride of 2 × 2;
each residual network basic building block includes convolutional layers convresnet0, convresnet1_0, convresnet1_1 and convresnet1_2;
the convolutional layer convresnet0 has a kernel size of 3 × 3 and a stride of 2 × 2; convresnet1_0 has a kernel size of 1 × 1 and a stride of 1 × 1, and serves to reduce the number of feature map channels and thus the computation of subsequent convolutional layers; convresnet1_1 has a kernel size of 3 × 3 and a stride of 2 × 2; convresnet1_2 has a kernel size of 1 × 1 and a stride of 1 × 1, and serves to increase the number of feature map channels and enrich the features;
the merging layer eltsum adds its two input feature maps pixel by pixel;
the convolutional layer conv2 has a kernel size of 3 × 3 and a stride of 1 × 1, and performs feature fusion after merging.
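Under the assumption of "same" padding (so each stride-s layer divides a spatial dimension by s), this fast descent network reduces a 512 × 256 input to a 16 × 8 feature map, consistent with the 8 × 16 × 128 output size assumed in claim 5; each residual block downsamples by 2 overall, since its stride-2 convolutions sit on parallel paths:

```python
# Shape walk-through of the fast descent network (sketch; assumes "same"
# padding so that a stride-s layer divides each spatial dimension by s).
def downsample(size, stride):
    (w, h), (sw, sh) = size, stride
    return (w // sw, h // sh)

size = (512, 256)                 # input RGB image, W x H
size = downsample(size, (4, 4))   # conv0: 7x7 kernel, stride 4x4  -> 128 x 64
size = downsample(size, (2, 2))   # maxpool0: 2x2, stride 2x2      -> 64 x 32
size = downsample(size, (2, 2))   # resnetblock0: overall stride 2 -> 32 x 16
size = downsample(size, (2, 2))   # resnetblock1: overall stride 2 -> 16 x 8
# conv2: 3x3, stride 1x1 leaves the spatial size unchanged
```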
5. The license plate character segmentation method based on the spatial position information as claimed in claim 2, wherein: the spatial position information network comprises a height-direction spatial position information feature map height and a width-direction spatial position information feature map width;
the width-direction spatial position information feature map is obtained as follows, assuming the output feature map of step S12 has a size of 8 × 16 × 128:
S131, slicing the feature map column by column along the width direction, each slice feature map having a size of 8 × 1 × 128 and being named cut0, cut1, cut2, ..., cut15;
S132, convolving the first slice feature map cut0 with 128 convolution kernels of kernel size 3 × 128 and stride 1 × 1 to obtain an output feature map cut0-out of size 8 × 1 × 128;
S133, adding the output feature map cut0-out obtained in step S132 and the slice feature map cut1 pixel by pixel to obtain a new slice feature map cut1_new;
S134, applying the operations of steps S132 and S133 to the new slice feature map cut1_new to obtain a new slice feature map cut2_new, and repeating steps S132 and S133 until the last new slice feature map cut15_new is obtained;
S135, collecting all the new slice feature maps obtained in steps S131 to S134 and splicing them along the width dimension; the resulting output feature map is the width-direction spatial position information feature map.
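The slice-and-accumulate recurrence of steps S131 to S135 can be sketched as follows; `transform` is a placeholder for the real 128-kernel convolution of step S132 (here an identity, purely to show the data flow):

```python
import numpy as np

def spatial_info_width(feat):
    """Width-direction recurrence over an H x W x C feature map (sketch)."""
    h, w, c = feat.shape

    def transform(s):
        # Placeholder for the 3 x 128 convolution of S132; assumed to
        # preserve the (h, 1, c) slice shape.
        return s

    slices = [feat[:, i:i + 1, :] for i in range(w)]   # S131: cut0..cut{w-1}
    new_slices = [slices[0]]
    prev = slices[0]
    for i in range(1, w):
        prev = transform(prev) + slices[i]             # S132 + S133
        new_slices.append(prev)                        # S134: repeat
    return np.concatenate(new_slices, axis=1)          # S135: splice by width

out = spatial_info_width(np.ones((8, 16, 128)))
```

With the identity placeholder, each output column is the running sum of all slices up to that position, which makes the left-to-right information flow of the recurrence visible.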
6. The license plate character segmentation method based on the spatial position information as claimed in claim 5, wherein:
the character prediction network in step S14 includes two branch networks, namely a LocX branch and a LocY branch, where the LocX branch predicts the segmentation position of each license plate character in the X coordinate axis direction and the LocY branch predicts the segmentation position of each license plate character in the Y coordinate axis direction;
the LocX branch and the LocY branch have the same network structure: convrecog0 is a convolutional layer with a kernel size of 3 × 3 and a stride of 2 × 2, convrecog1 is a convolutional layer with a kernel size of 3 × 3 and a stride of 2 × 1, and convrecog2 is a convolutional layer with a kernel size of 2 × 2 and a stride of 1 × 1; the output feature map size is 1 × 1 × 14, where 14 is the number of prediction regression values of each branch network.
7. The license plate character segmentation method based on the spatial position information as claimed in claim 1, wherein the step S2 of optimizing the deep neural network model parameters with the labeled training sample data to obtain the optimal deep neural network model specifically includes:
S21, acquiring training sample images, including collecting license plate images under various scenes, lighting conditions and angles, acquiring a local license plate region image with an existing license plate detection method, and then labeling the position information of the license plate characters;
s22, designing a target loss function of the deep neural network model;
and S23, training the deep neural network model, including feeding the labeled license plate character sample image set into the defined deep neural network model and learning to determine the model parameters.
8. The license plate character segmentation method based on the spatial position information as set forth in claim 7, wherein: the target loss function in step S22 is a mean square error loss function.
9. The license plate character segmentation method based on the spatial position information as set forth in claim 7, wherein: the method for labeling the position information of the license plate character in the step S21 includes:
the method comprises the steps of firstly obtaining the minimum circumscribed rectangle of a single character on the license plate, then obtaining the coordinates of the borders of 4 minimum circumscribed rectangles, namely the left border coordinate left _ X, the right border coordinate right _ X, the upper border coordinate up _ Y and the lower border coordinate bottom _ Y, and finally serially connecting the left border and the right border of all characters on the license plate in sequence to be used as the marked value in the X coordinate axis direction.
10. The license plate character segmentation method based on the spatial position information as set forth in claim 6, wherein the step S3 of reading license plate image information and extracting the spatial position information features among the license plate characters through the operation of the optimal deep neural network model to obtain the position of each character of the license plate specifically includes:
reading any given local license plate image information and performing a forward pass through the deep neural network model; each output feature map is the set of segmentation positions of every character on the license plate along a single coordinate axis, and taking the output feature maps of the two branch networks together in order yields the segmentation position of each character on the license plate.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910989195.XA CN111027539B (en) | 2019-10-17 | 2019-10-17 | License plate character segmentation method based on spatial position information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111027539A true CN111027539A (en) | 2020-04-17 |
CN111027539B CN111027539B (en) | 2023-11-07 |
Family
ID=70201152
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910989195.XA Active CN111027539B (en) | 2019-10-17 | 2019-10-17 | License plate character segmentation method based on spatial position information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111027539B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111582261A (en) * | 2020-04-30 | 2020-08-25 | 浙江大华技术股份有限公司 | License plate recognition method and device for non-motor vehicle |
CN111681259A (en) * | 2020-05-17 | 2020-09-18 | 天津理工大学 | Vehicle tracking model establishing method based on Anchor-free mechanism detection network |
CN112016556A (en) * | 2020-08-21 | 2020-12-01 | 中国科学技术大学 | Multi-type license plate recognition method |
CN112232351A (en) * | 2020-11-09 | 2021-01-15 | 浙江工业职业技术学院 | License plate recognition system based on deep neural network |
CN112330743A (en) * | 2020-11-06 | 2021-02-05 | 安徽清新互联信息科技有限公司 | High-altitude parabolic detection method based on deep learning |
CN112418208A (en) * | 2020-12-11 | 2021-02-26 | 华中科技大学 | Tiny-YOLO v 3-based weld film character recognition method |
CN112926588A (en) * | 2021-02-24 | 2021-06-08 | 南京邮电大学 | Large-angle license plate detection method based on convolutional network |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109753914A (en) * | 2018-12-28 | 2019-05-14 | 安徽清新互联信息科技有限公司 | A kind of license plate character recognition method based on deep learning |
CN109766805A (en) * | 2018-12-28 | 2019-05-17 | 安徽清新互联信息科技有限公司 | A kind of double-deck license plate character recognition method based on deep learning |
CN109815956A (en) * | 2018-12-28 | 2019-05-28 | 安徽清新互联信息科技有限公司 | A kind of license plate character recognition method based on adaptive location segmentation |
Non-Patent Citations (1)
Title |
---|
LIU, Jianguo; DAI, Fang; ZHAN, Tao: "License Plate Recognition Technology Based on Convolutional Neural Networks" * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111027539B (en) | License plate character segmentation method based on spatial position information | |
CN111696094B (en) | Immunohistochemical PD-L1 membrane staining pathological section image processing method, device and equipment | |
JP7246104B2 (en) | License plate identification method based on text line identification | |
CN111104903B (en) | Depth perception traffic scene multi-target detection method and system | |
CN111784685A (en) | Power transmission line defect image identification method based on cloud edge cooperative detection | |
CN111008633B (en) | License plate character segmentation method based on attention mechanism | |
CN111008632B (en) | License plate character segmentation method based on deep learning | |
CN112041851A (en) | Text recognition method and terminal equipment | |
CN114841244B (en) | Target detection method based on robust sampling and mixed attention pyramid | |
EP4252148A1 (en) | Lane line detection method based on deep learning, and apparatus | |
CN111191611A (en) | Deep learning-based traffic sign label identification method | |
CN112364855A (en) | Video target detection method and system based on multi-scale feature fusion | |
CN110599453A (en) | Panel defect detection method and device based on image fusion and equipment terminal | |
CN111401421A (en) | Image category determination method based on deep learning, electronic device, and medium | |
CN111126401A (en) | License plate character recognition method based on context information | |
CN117037119A (en) | Road target detection method and system based on improved YOLOv8 | |
CN114550135B (en) | Lane line detection method based on attention mechanism and feature aggregation | |
CN114820679A (en) | Image annotation method and device, electronic equipment and storage medium | |
CN114241344A (en) | Plant leaf disease and insect pest severity assessment method based on deep learning | |
CN111881914B (en) | License plate character segmentation method and system based on self-learning threshold | |
CN111914596A (en) | Lane line detection method, device, system and storage medium | |
CN111709377B (en) | Feature extraction method, target re-identification method and device and electronic equipment | |
CN112380970B (en) | Video target detection method based on local area search | |
CN115240163A (en) | Traffic sign detection method and system based on one-stage detection network | |
CN117651976A (en) | Defect detection method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||