CN111027539B - License plate character segmentation method based on spatial position information - Google Patents
License plate character segmentation method based on spatial position information
- Publication number: CN111027539B (application CN201910989195.XA)
- Authority
- CN
- China
- Prior art keywords
- license plate
- character
- neural network
- deep neural
- position information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
- G06V20/63—Scene text, e.g. street names
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
- G06V20/625—License plates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
A license plate character segmentation method based on spatial position information can solve the technical problem that existing license plate character segmentation methods perform poorly on low-quality license plate images with staining, character adhesion, missing characters, and inaccurate positioning. S1, establish a deep neural network model; S2, optimize the parameters of the deep neural network model with labeled training sample data to obtain an optimal deep neural network model; S3, read the license plate image information and extract the spatial position information features among the license plate characters through a forward pass of the optimal deep neural network model, thereby obtaining the position of each license plate character. The invention adopts deep learning to directly predict the positions of license plate characters and uses an anchor-free training mechanism, which reduces the difficulty of model training. By comprehensively using both the global features and the local features of the license plate characters, the segmentation result is more accurate and more robust on low-quality license plate images with staining, character adhesion, missing characters, and inaccurate positioning.
Description
Technical Field
The invention relates to the technical field of license plate recognition, in particular to a license plate character segmentation method based on spatial position information.
Background
License plate recognition is a core technology of intelligent traffic and comprises three parts: license plate positioning, character segmentation and character recognition. The character segmentation is the most important part of the whole technology, and the quality of the character segmentation directly influences the subsequent character recognition, thereby influencing the overall recognition performance.
Character segmentation refers to precisely segmenting each single character in an image whose license plate position is already accurately known. For clear license plate images, many mature methods exist and yield good segmentation results. In real environments, however, complex conditions such as lighting changes, shooting angle, and license plate offset cause blurred, missing, or adhered characters, which the current mature methods struggle to segment accurately, so the final license plate recognition fails. How to accurately segment characters in low-quality license plate images therefore remains a difficult problem that limits license plate recognition technology.
Currently, license plate character segmentation mainly comprises the following methods:
(1) Vertical projection methods. These obtain the vertical projection curve of the license plate characters and take the peak and trough positions of the curve as the edge positions of each character. The advantages are a simple algorithm, high speed, and good segmentation of clear license plates; the drawback is that the segmentation quality drops sharply, or fails entirely, for low-quality license plates with staining, adhesion, or inaccurate positioning.
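The projection analysis described above can be sketched in a few lines. The following is a minimal illustration on a toy binarized plate; the function name, image, and sizes are hypothetical, not from the patent:

```python
import numpy as np

def segment_by_projection(binary_plate):
    """Split a binarized plate image (H x W, character pixels = 1) into
    character column ranges using the vertical projection profile:
    columns whose projection is zero (troughs) separate the characters."""
    profile = binary_plate.sum(axis=0)   # column-wise ink count
    in_char = profile > 0
    segments, start = [], None
    for x, flag in enumerate(in_char):
        if flag and start is None:
            start = x                    # entering a character run
        elif not flag and start is not None:
            segments.append((start, x))  # [start, x) spans one character
            start = None
    if start is not None:                # character touching the right edge
        segments.append((start, len(in_char)))
    return segments

# toy plate: two "characters" of width 3 separated by a blank column
plate = np.zeros((5, 9), dtype=int)
plate[:, 1:4] = 1
plate[:, 5:8] = 1
print(segment_by_projection(plate))  # [(1, 4), (5, 8)]
```

This also makes the stated drawback concrete: adhered characters produce one unbroken run with no trough, so they come back as a single segment.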
(2) Connected-region methods. The license plate image is first binarized; each single character is then analyzed as a single connected region, finally yielding the position of each character. The advantages are good adaptability to low-quality images and high speed; however, the method fails on missing or adhered characters.
(3) Machine learning methods, such as license plate character segmentation based on a support vector machine. Layout-rule features of the license plate are extracted and a classifier is trained to complete the character segmentation. The advantages are good results on clear license plates and some resistance to low-quality images; the drawbacks are that suitable layout-rule features are hard to select, some deformed license plates do not satisfy the layout rules, and the feature computation is relatively complex.
(4) Deep learning methods. In recent years, deep learning, which can mimic the human brain's neural networks and make accurate nonlinear predictions, has attracted wide attention and application in many fields, and a group of classic object detection frameworks such as Faster R-CNN, SSD, and YOLO have emerged that can detect license plate character positions well through transfer learning. However, these models consume much memory, require heavy computation, and their anchor-box-based training parameters are complex and hard to converge, which severely limits the application of deep learning in license plate character segmentation.
Disclosure of Invention
The license plate character segmentation method based on spatial position information of the present invention can solve the technical problem that existing license plate character segmentation methods perform poorly on low-quality license plate images with staining, character adhesion, missing characters, and inaccurate positioning.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
a license plate character segmentation method based on spatial position information comprises the following steps:
s1, establishing a deep neural network model;
s2, optimizing parameters of the deep neural network model through marked training sample data to obtain an optimal deep neural network model;
and S3, reading license plate image information, extracting spatial position information features among license plate characters through the optimal deep neural network model operation, and further obtaining the position of each character of the license plate.
Further, the step S1 of establishing the deep neural network model specifically includes:
s11, designing an input image of a deep neural network model;
s12, designing a rapid descent network for acquiring high-level features of an input image;
s13, designing a spatial position information network, and acquiring a character spatial position information feature map based on high-level features of the input image;
S14, designing a character prediction network, which further improves the expression capacity of the feature network on the basis of the obtained character spatial position information feature map, and finally predicts the accurate position of each license plate character.
Further, the input image size adopted in the step S11 is an RGB image of 512×256.
Further, the fast descent network comprises a convolution layer conv0, a maximum-value downsampling layer maxpool0, two residual network basic blocks resnetblock0 and resnetblock1, a merging layer eltsum, and a convolution layer conv2;
the kernel size of the convolutional layer conv0 is 7×7 and the span is 4×4;
the maximum downsampling layer maxpool0 kernel size is 2×2, span is 2×2;
the residual network basic block includes convolution layers convresnet0, convresnet1_0, convresnet1_1, and convresnet1_2;
the kernel size of convolution layer convresnet0 is 3×3 with span 2×2; the kernel size of convresnet1_0 is 1×1 with span 1×1, and its function is to reduce the number of feature map channels and thus the computation of subsequent convolution layers; the kernel size of convresnet1_1 is 3×3 with span 2×2; the kernel size of convresnet1_2 is 1×1 with span 1×1, and its function is to increase the number of feature map channels and enrich the features;
the merging layer eltsum adds the two input feature maps pixel by pixel;
the convolution layer conv2 has kernel size 3×3 and span 1×1, and its function is to fuse the merged features.
Further, the spatial position information network comprises a height-direction spatial position information feature map and a width-direction spatial position information feature map;
the width-direction spatial position information feature map is obtained as follows, with the output feature map of step S12 set to size 8×16×128;
the method specifically comprises the following steps:
S131, slicing the feature map slice by slice along the width direction, wherein each slice feature map has size 8×1×128 and the slices are named cut0, cut1, cut2, …, cut15;
S132, convolving the first slice feature map cut0 with 128 convolution kernels of kernel size 3×128 and span 1×1, giving an output feature map cut0_out of size 8×1×128;
S133, adding the output feature map cut0_out obtained in step S132 and the slice feature map cut1 pixel by pixel to obtain a new slice feature map cut1_new;
S134, applying the operations of steps S132 and S133 to the new slice feature map cut1_new to obtain a new slice feature map cut2_new, and repeating steps S132 and S133 until the last new slice feature map cut15_new is obtained;
S135, collecting all the new slice feature maps obtained in steps S131 to S134 and splicing them along the width dimension; the resulting output feature map is the width-direction spatial position information feature map.
Further, the character prediction network in step S14 includes two branch networks, namely a LocX branch and a LocY branch, where the LocX branch is used to predict a segmentation position of each character of the license plate in the X coordinate axis direction, and the LocY branch is used to predict a segmentation position of each character of the license plate in the Y coordinate axis direction;
the LocX and LocY branches share the same network structure: convreg0 is a convolution layer with kernel size 3×3 and span 2×2, convreg1 is a convolution layer with kernel size 3×3 and span 2×1, and convreg2 is a convolution layer with kernel size 2×2 and span 1×1; the output feature map size is 1×1×14, where 14 is the number of predicted regression values per branch network.
Further, step S2 optimizes the parameters of the deep neural network model through labeled training sample data to obtain the optimal deep neural network model, and specifically comprises the following steps:
S21, acquiring training sample images: collecting license plate images under various scenes, lighting conditions, and angles, extracting the local license plate region image with an existing license plate detection method, and then labeling the position information of the license plate characters;
s22, designing a target loss function of the deep neural network model;
s23, training the deep neural network model, namely sending the marked license plate character sample image set into the defined deep neural network model, and learning and determining model parameters.
Further, the labeling method for labeling the position information of the license plate character in the step S21 includes:
firstly, acquiring the minimum circumscribed rectangle of each single character on the license plate; then acquiring the rectangle's 4 border coordinates, namely the left border coordinate left_x, right border coordinate right_x, upper border coordinate up_y, and lower border coordinate bottom_y; finally, concatenating the left and right borders of all characters on the license plate in sequence as the labeling values along the X coordinate axis, and similarly concatenating the upper and lower borders of all characters in sequence as the labeling values along the Y coordinate axis.
Further, S3, license plate image information is read, spatial position information features among license plate characters are extracted through the optimal deep neural network model operation, and then the position of each character of the license plate is obtained; the method specifically comprises the following steps:
reading the local image information of any given license plate and performing a forward pass of the deep neural network model; the output feature map of each branch is the set of segmentation positions of each character on the license plate along a single coordinate axis, and the segmentation position of each character is obtained by combining the output feature maps of the two branch networks in sequence.
According to the technical scheme, the license plate character segmentation method based on the spatial position information has the following beneficial effects:
the invention adopts deep learning technology to directly predict the positions of license plate characters, adopts high-efficiency rapid descent network, reduces the memory consumption of the model, greatly improves the running speed of the system, adopts an anchor-free training mechanism, reduces the difficulty of model training, simultaneously ensures that the convergence speed of the training model is faster, comprehensively utilizes the integral characteristics and the local characteristics of the license plate characters, has more accurate license plate character segmentation results, and has stronger robustness on low-quality license plate images with pollution, adhesion, character deletion and inaccurate positioning.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a block diagram of a deep neural network model;
FIG. 3 is a block diagram of a residual network infrastructure;
FIG. 4 is a diagram of a widthwise spatial location information network;
In the figures, the label next to each module gives the name of the current feature layer and the size of its feature map, namely: feature height × feature width × number of feature channels.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention.
In this embodiment, common single-layer blue and single-layer yellow license plates are taken as examples; such a license plate has 7 characters, and a fixed arrangement rule exists among the characters.
Specifically, in the license plate character segmentation method based on spatial location information according to this embodiment, as shown in fig. 1, specific steps are as follows:
s1, establishing a deep neural network model;
s2, optimizing parameters of the deep neural network model through marked training sample data to obtain an optimal deep neural network model;
and S3, reading license plate image information, extracting spatial position information features among license plate characters through the optimal deep neural network model operation, and further obtaining the position of each character of the license plate.
The following specifically describes an embodiment of the present invention with respect to each of the above steps:
s1, designing a deep neural network model, wherein the deep neural network model designed by the embodiment of the invention mainly aims at extracting spatial position information characteristics among license plate characters by means of the deep neural network model so as to accurately acquire the position of each character of the license plate. The specificity of the license plate character segmentation task and the computing capacity of a convolutional neural network are comprehensively considered, a deep neural network model adopted by the embodiment of the invention is shown in figure 2, and the deep neural network model comprises a rapid descent network, a spatial position information network, a character prediction network and the like. The embodiment of the invention adopts a Convolutional Neural Network (CNN), wherein the dimension of a feature map refers to the height of the feature map, the width of the feature map and the number of channels of the feature map, the dimension of a kernel refers to the width of the kernel and the height of the kernel, and the span refers to the span in the width direction and the span in the height direction.
The specific design steps of the deep neural network model are as follows:
s11, designing an input image of the deep neural network model, wherein the input image adopted by the embodiment of the invention is an RGB image with the size of 512 multiplied by 256, and the larger the input image size is, the more details are contained, thereby being beneficial to accurately dividing license plate characters, but simultaneously increasing the storage space and the operation amount of the deep neural network model.
S12, designing the fast descent network. The fast descent network quickly acquires high-level features of the input image that are highly abstract and rich in expressive power; the quality of this high-level feature extraction directly affects the accuracy of the subsequent character segmentation. As noted in step S11, the input image adopted in this embodiment is relatively large, which hinders fast operation of the deep neural network model, so an efficient feature extraction network is needed to quickly offset the computation caused by the large input size. As shown in FIG. 2, conv0 in the fast descent network is a convolution layer with kernel size 7×7 and span 4×4; this large-kernel, large-span convolution rapidly reduces the feature map size, greatly cutting the computation of subsequent operations while still retaining more image detail. maxpool0 is a maximum-value downsampling layer with kernel size 2×2 and span 2×2. resnetblock0 and resnetblock1 are two residual network basic blocks; as shown in FIG. 3, convresnet0 is a convolution layer with kernel size 3×3 and span 2×2; convresnet1_0 is a convolution layer with kernel size 1×1 and span 1×1, whose function is to reduce the number of feature map channels and thus the computation of subsequent convolution layers; convresnet1_1 is a convolution layer with kernel size 3×3 and span 2×2; convresnet1_2 is a convolution layer with kernel size 1×1 and span 1×1, whose function is to increase the number of feature map channels. eltsum is a merging layer that adds the two input feature maps pixel by pixel, and conv2 is a convolution layer with kernel size 3×3 and span 1×1 that fuses the merged features.
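As a quick sanity check of the strides listed above, the feature map sizes can be traced through the fast descent network. This sketch assumes 'same'-style padding, so each layer's output size is ceil(input/stride) — a padding choice the patent does not state explicitly:

```python
import math

def out_size(size, stride):
    """Spatial size after a 'same'-padded conv/pool layer of the given stride."""
    h, w = size
    return (math.ceil(h / stride), math.ceil(w / stride))

size = (256, 512)          # input RGB image from step S11: height 256, width 512
size = out_size(size, 4)   # conv0: 7x7 kernel, span 4x4
size = out_size(size, 2)   # maxpool0: 2x2, span 2x2
size = out_size(size, 2)   # resnetblock0 (stride-2 convolutions inside)
size = out_size(size, 2)   # resnetblock1
size = out_size(size, 1)   # conv2: 3x3, span 1x1 (fusion, size unchanged)
print(size)  # (8, 16), matching the 8x16x128 feature map assumed in step S13
```

The total downsampling factor is 4×2×2×2 = 32, which is how 512×256 becomes the 8×16 spatial grid used by the spatial position information network.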
S13, designing the spatial position information network. License plate character segmentation in this embodiment differs from general object detection: a regular arrangement exists among all license plate character positions, both over the whole region of all characters and between any two adjacent characters. Accurate segmentation of the license plate characters therefore depends on both the global features of all the characters and the local features of each character, and the invention adopts a novel spatial position information network that comprehensively uses these global and local features. As shown in FIG. 2, the spatial position information network includes a height-direction spatial position information feature map and a width-direction spatial position information feature map (widthcontext). The two feature maps are obtained in a similar way; taking the width-direction network as an example, as shown in FIG. 4, the specific steps are as follows, where the output feature map of step S12 has size 8×16×128:
S131, slicing the feature map slice by slice along the width direction, where each slice feature map has size 8×1×128 and the slices are named cut0, cut1, cut2, …, cut15.
S132, convolving the first slice feature map cut0 with 128 convolution kernels of kernel size 3×128 and span 1×1, giving an output feature map cut0_out of size 8×1×128.
S133, adding the output feature map cut0_out obtained in step S132 and the slice feature map cut1 pixel by pixel to obtain a new slice feature map cut1_new;
S134, applying the same operations as steps S132 and S133 to the new slice feature map cut1_new to obtain a new slice feature map cut2_new, and repeating until the last new slice feature map cut15_new is obtained.
S135, collecting all the new slice feature maps obtained in steps S131 to S134 and splicing them along the width dimension; the resulting output feature map is the width-direction spatial position information feature map.
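Steps S131 to S135 can be sketched as follows. The channel count is a toy value, and a plain per-pixel channel mix stands in for the 3×128 convolution, so this only illustrates the recurrent slice accumulation that propagates information across the width, not the exact kernel:

```python
import numpy as np

H, W, C = 8, 16, 4   # toy sizes; the patent uses 8 x 16 x 128

def slice_conv(s, weight):
    """Stand-in for the 128-kernel convolution applied to one H x 1 x C
    slice in step S132; here a simple per-pixel channel mix."""
    return s @ weight                 # (H, 1, C) @ (C, C) -> (H, 1, C)

rng = np.random.default_rng(0)
feat = rng.normal(size=(H, W, C))    # output of the fast descent network
weight = rng.normal(size=(C, C)) * 0.1

cuts = [feat[:, x:x+1, :] for x in range(W)]   # S131: width-wise slices cut0..cut15
new = cuts[0]
out_slices = [new]
for x in range(1, W):                          # S132-S134: recurrent accumulation
    new = slice_conv(new, weight) + cuts[x]    # conv previous result, add next slice
    out_slices.append(new)
widthcontext = np.concatenate(out_slices, axis=1)  # S135: splice along the width
print(widthcontext.shape)  # (8, 16, 4)
```

Each output slice thus mixes its own column with a transformed summary of every column to its left, which is how the network combines local character features with global plate-level layout.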
S14, designing the character prediction network. The character prediction network further improves the expressive power of the feature network on the basis of the character spatial position information feature map obtained in step S13, and finally predicts the accurate position of each license plate character. It comprises two branch networks, a LocX branch and a LocY branch: the LocX branch predicts the segmentation position of each character along the X coordinate axis, and the LocY branch predicts the segmentation position of each character along the Y coordinate axis. The two branches share the same structure; as shown in FIG. 2, convreg0 is a convolution layer with kernel size 3×3 and span 2×2, convreg1 is a convolution layer with kernel size 3×3 and span 2×1, and convreg2 is a convolution layer with kernel size 2×2 and span 1×1. The output feature map size is 1×1×14, where 14 is the number of predicted regression values per branch network, set as follows: each branch must predict the segmentation positions of the 7 license plate characters, and along each coordinate axis 2 coordinates are needed to represent the segmentation position of one character, giving 7×2=14 values;
S2, training the deep neural network model, namely optimizing its parameters with a large amount of labeled training sample data so that the model achieves optimal recognition performance. The specific steps are as follows:
S21, acquiring training sample images: collecting license plate images under various scenes, lighting conditions, and angles, extracting the local license plate region image with an existing license plate detection method, and then labeling the position information of the license plate characters. The specific labeling method is as follows: firstly, acquire the minimum circumscribed rectangle of each single character on the license plate; then acquire the rectangle's 4 border coordinates, namely the left border coordinate left_x, right border coordinate right_x, upper border coordinate up_y, and lower border coordinate bottom_y; finally, concatenate the left and right borders of all characters on the license plate in sequence as the labeling values along the X coordinate axis, and similarly concatenate the upper and lower borders of all characters in sequence as the labeling values along the Y coordinate axis.
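The labeling scheme above — serializing left/right and upper/lower borders across the 7 characters — can be sketched as follows. The function name and the box values are made up for illustration:

```python
def make_labels(char_boxes):
    """Serialize per-character bounding boxes into the two 14-value label
    vectors: one along the X axis (left_x/right_x pairs) and one along
    the Y axis (up_y/bottom_y pairs).

    char_boxes: list of 7 (left_x, right_x, up_y, bottom_y) tuples,
    ordered left to right along the plate."""
    labels_x, labels_y = [], []
    for left_x, right_x, up_y, bottom_y in char_boxes:
        labels_x += [left_x, right_x]   # left/right borders in series
        labels_y += [up_y, bottom_y]    # upper/lower borders in series
    return labels_x, labels_y

# toy plate: 7 characters, each 20 px wide with a 5 px gap, spanning rows 10..40
boxes = [(5 + 25 * i, 25 + 25 * i, 10, 40) for i in range(7)]
lx, ly = make_labels(boxes)
print(len(lx), len(ly))  # 14 14
print(lx[:4])            # [5, 25, 30, 50]
```

The two 14-value vectors match the 1×1×14 outputs of the LocX and LocY branches, so the network can regress them directly.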
S22, designing a target loss function of the deep neural network model, wherein the target loss function is a mean square error loss function.
S23, training the deep neural network model: the labeled license plate character sample image set is fed into the defined deep neural network model, and the model parameters are learned;
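A toy stand-in for steps S22 and S23: a single linear layer fitted by gradient descent under the mean squared error target loss of step S22 replaces the full deep network, so the sketch stays self-contained. All sizes and data are synthetic:

```python
import numpy as np

def mse_loss(pred, target):
    """Mean squared error, the target loss function of step S22."""
    return np.mean((pred - target) ** 2)

rng = np.random.default_rng(1)
x = rng.normal(size=(64, 8))        # 64 labeled samples, 8 features each
true_w = rng.normal(size=(8, 14))   # synthetic "ground truth" mapping
y = x @ true_w                      # 14 labeled coordinates per sample

w = np.zeros((8, 14))               # model parameters to learn
for _ in range(500):                # plain gradient descent on the MSE loss
    grad = 2 * x.T @ (x @ w - y) / len(x)   # d(MSE)/dw
    w -= 0.1 * grad
print(round(mse_loss(x @ w, y), 6))
```

In the actual method the same MSE objective is minimized over the deep network's parameters by backpropagation rather than over a single weight matrix.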
S3, after training, the deep neural network model is used in the actual environment. For any given local license plate image, the model performs a forward pass; the output feature map of each branch is the set of segmentation positions of each character along a single coordinate axis, and the segmentation position of each character on the license plate is obtained by combining the output feature maps of the two branch networks in sequence.
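The final merge-and-take-values step can be sketched as follows; the coordinates are toy values and the function name is hypothetical:

```python
def combine_branches(locx_out, locy_out):
    """Merge the LocX and LocY branch outputs (14 values each) into
    7 per-character boxes (left_x, right_x, up_y, bottom_y)."""
    boxes = []
    for i in range(7):
        left_x, right_x = locx_out[2 * i], locx_out[2 * i + 1]
        up_y, bottom_y = locy_out[2 * i], locy_out[2 * i + 1]
        boxes.append((left_x, right_x, up_y, bottom_y))
    return boxes

# toy branch outputs for a 7-character plate
locx = [5, 25, 30, 50, 55, 75, 80, 100, 105, 125, 130, 150, 155, 175]
locy = [10, 40] * 7
print(combine_branches(locx, locy)[0])  # (5, 25, 10, 40)
```

Each resulting box is the segmentation position of one license plate character; in practice the regressed values would still be scaled back to the original image resolution.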
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (6)
1. A license plate character segmentation method based on spatial position information, characterized by comprising the following steps:
S1, establishing a deep neural network model;
S2, optimizing parameters of the deep neural network model through labeled training sample data to obtain an optimal deep neural network model;
S3, reading license plate image information, extracting spatial position information features among license plate characters through the operation of the optimal deep neural network model, and further obtaining the position of each character of the license plate;
S1, establishing a deep neural network model, which specifically comprises the following steps:
S11, designing the input image of the deep neural network model;
S12, designing a fast descent network for acquiring high-level features of the input image;
S13, designing a spatial position information network, and acquiring a character spatial position information feature map based on the high-level features of the input image;
S14, designing a character prediction network, further improving the expression capacity of the feature network on the basis of the acquired character spatial position information feature map, and finally predicting the accurate position of each character of the license plate;
the fast descent network comprises a convolution layer conv0, a maximum downsampling layer maxpool0, two residual network basic building blocks resnetblock0 and resnetblock1, a merging layer eltsum, and a convolution layer conv2;
the kernel size of the convolutional layer conv0 is 7×7 and the span is 4×4;
the maximum downsampling layer maxpool0 kernel size is 2×2, span is 2×2;
the residual network basic building block includes convolution layers convresnet0, convresnet1_0, convresnet1_1 and convresnet1_2;
the kernel size of the convolution layer convresnet0 is 3×3 and the span is 2×2; the kernel size of convresnet1_0 is 1×1 and the span is 1×1, and the effect of the convolution layer convresnet1_0 is to reduce the number of channels of the feature map and thereby the operation amount of the subsequent convolution layers; the kernel size of convresnet1_1 is 3×3 and the span is 2×2; the kernel size of the convolution layer convresnet1_2 is 1×1 and the span is 1×1, and the function of the convolution layer convresnet1_2 is to increase the number of channels of the feature map and the feature richness;
the merging layer eltsum adds the two input feature maps pixel by pixel;
the convolution layer conv2 is a convolution layer with a kernel size of 3×3 and a span of 1×1, and its function is to fuse the merged features;
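As a sanity check on the layer strides above, the spatial size of the fast descent network's output can be traced in a few lines. This sketch assumes 'same'-style padding and that each residual block reduces the resolution by a factor of 2 (reading convresnet0 as a stride-2 shortcut in parallel with the convresnet1_0/1_1/1_2 path); neither detail is stated explicitly in the text:

```python
def after_stride(size, stride):
    """Spatial size after a stride-s layer with 'same'-style padding."""
    return -(-size // stride)  # ceiling division

def fast_descent_output(height, width):
    """Trace a resolution through conv0 (stride 4), maxpool0 (stride 2),
    two residual blocks (assumed stride 2 each) and conv2 (stride 1)."""
    for stride in (4, 2, 2, 2, 1):
        height, width = after_stride(height, stride), after_stride(width, stride)
    return height, width

# A 512x256 input (claim 2) reduces to the 8x16 spatial size of the
# 8x16x128 feature map quoted for the output of step S12.
size = fast_descent_output(256, 512)
```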
the spatial position information network comprises a height-direction spatial position information feature map heightcontext and a width-direction spatial position information feature map widthcontext;
the width-direction spatial position information feature map is obtained as follows, with the output feature map of step S12 set to a size of 8×16×128; the method specifically comprises the following steps:
S131, slicing the feature map slice by slice along the width direction, wherein the size of each slice feature map is 8×1×128 and the slice feature maps are named cut0, cut1, cut2, ..., cut15;
S132, convolving the first slice feature map cut0 with 128 convolution kernels of size 3×1×128 and span 1×1, wherein the size of the obtained output feature map cut0-out is 8×1×128;
S133, adding the output feature map cut0-out obtained in step S132 and the slice feature map cut1 pixel by pixel to obtain a new slice feature map cut1_new;
S134, performing the operations of steps S132 and S133 on the new slice feature map cut1_new to obtain a new slice feature map cut2_new, and repeating steps S132 and S133 until the last new slice feature map cut15_new is obtained;
S135, collecting all the new slice feature maps obtained in steps S131 to S134 and splicing them along the width dimension, wherein the resulting output feature map is the width-direction spatial position information feature map;
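Steps S131 to S135 amount to a left-to-right recurrence over the width slices. The sketch below abstracts each 8×1×128 slice to a single number and the per-slice convolution to a generic `transform`, purely to make the context propagation visible; with an identity transform each output slice becomes the running sum of all slices to its left:

```python
def width_context(slices, transform):
    """Recurrence of steps S132-S134: transform the latest accumulated
    slice and add it pixel-wise to the next raw slice."""
    out = [slices[0]]                         # cut0 enters unchanged
    for cut in slices[1:]:
        out.append(transform(out[-1]) + cut)  # cutN_new = conv(prev) + cutN
    return out                                # S135: splice along width

# Toy slices with an identity transform:
ctx = width_context([1, 2, 3, 4], transform=lambda x: x)
# Each position aggregates everything to its left.
```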
the character prediction network in step S14 includes two branch networks, namely a LocX branch and a LocY branch, where the LocX branch is used to predict a segmentation position of each character of the license plate in the X coordinate axis direction, and the LocY branch is used to predict a segmentation position of each character of the license plate in the Y coordinate axis direction;
the LocX branch and the LocY branch share the same network structure, which includes a convolution layer with a kernel size of 3×3 and a span of 2×2, a convolution layer convrelog1 with a kernel size of 3×3 and a span of 2×1, and a convolution layer convrelog2 with a kernel size of 2×2 and a span of 1×1; the output feature map size is 1×1×14, where 14 represents the number of predicted regression values per branch network.
2. The license plate character segmentation method based on the spatial location information according to claim 1, wherein: the input image size adopted in the step S11 is an RGB image of 512×256.
3. The license plate character segmentation method based on the spatial location information according to claim 1, wherein: s2, optimizing parameters of the deep neural network model through marked training sample data to obtain an optimal deep neural network model;
the method specifically comprises the following steps:
S21, acquiring training sample images, namely collecting license plate images under various scenes, lighting conditions and angles, acquiring license plate local area images by using an existing license plate detection method, and then labeling the position information of the license plate characters;
S22, designing the target loss function of the deep neural network model;
S23, training the deep neural network model, namely sending the labeled license plate character sample image set into the defined deep neural network model, and learning and determining the model parameters.
4. The license plate character segmentation method based on the spatial location information according to claim 3, wherein: the target loss function in step S22 is a mean square error loss function.
5. The license plate character segmentation method based on the spatial location information according to claim 4, wherein: the labeling method of the position information of the license plate characters in the step S21 comprises the following steps:
firstly, acquiring the minimum circumscribed rectangle of each single character on the license plate; then acquiring the four border coordinates of the minimum circumscribed rectangle, namely the left border coordinate left_x, the right border coordinate right_x, the upper border coordinate up_y and the lower border coordinate bottom_y; finally, concatenating the left and right borders of all characters on the license plate in sequence as the label values in the X coordinate axis direction, and similarly concatenating the upper and lower borders of all characters on the license plate in sequence as the label values in the Y coordinate axis direction.
6. The license plate character segmentation method based on the spatial location information according to claim 5, wherein: s3, reading license plate image information, extracting spatial position information features among license plate characters through the optimal deep neural network model operation, and further obtaining the position of each character of the license plate;
the method specifically comprises the following steps:
reading the local image information of an arbitrarily given license plate and performing a forward operation with the deep neural network model, wherein the output feature maps are the sets of segmentation positions of each character on the license plate in the single coordinate axis directions, and the segmentation position of each character on the license plate is obtained by combining the values of the output feature maps of the two branch networks in sequence.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910989195.XA CN111027539B (en) | 2019-10-17 | 2019-10-17 | License plate character segmentation method based on spatial position information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111027539A CN111027539A (en) | 2020-04-17 |
CN111027539B true CN111027539B (en) | 2023-11-07 |
Family
ID=70201152
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910989195.XA Active CN111027539B (en) | 2019-10-17 | 2019-10-17 | License plate character segmentation method based on spatial position information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111027539B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111582261B (en) * | 2020-04-30 | 2024-01-19 | 浙江大华技术股份有限公司 | License plate recognition method and license plate recognition device for non-motor vehicle |
CN111681259B (en) * | 2020-05-17 | 2023-05-05 | 天津理工大学 | Vehicle tracking model building method based on Anchor mechanism-free detection network |
CN112016556B (en) * | 2020-08-21 | 2022-07-15 | 中国科学技术大学 | Multi-type license plate recognition method |
CN112330743B (en) * | 2020-11-06 | 2023-03-10 | 安徽清新互联信息科技有限公司 | High-altitude parabolic detection method based on deep learning |
CN112232351B (en) * | 2020-11-09 | 2023-10-10 | 浙江工业职业技术学院 | License plate recognition system based on deep neural network |
CN112418208B (en) * | 2020-12-11 | 2022-09-16 | 华中科技大学 | Tiny-YOLO v 3-based weld film character recognition method |
CN112926588B (en) * | 2021-02-24 | 2022-07-22 | 南京邮电大学 | Large-angle license plate detection method based on convolutional network |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109753914A (en) * | 2018-12-28 | 2019-05-14 | 安徽清新互联信息科技有限公司 | A kind of license plate character recognition method based on deep learning |
CN109766805A (en) * | 2018-12-28 | 2019-05-17 | 安徽清新互联信息科技有限公司 | A kind of double-deck license plate character recognition method based on deep learning |
CN109815956A (en) * | 2018-12-28 | 2019-05-28 | 安徽清新互联信息科技有限公司 | A kind of license plate character recognition method based on adaptive location segmentation |
Non-Patent Citations (1)
Title |
---|
Liu Jianguo; Dai Fang; Zhan Tao. License plate recognition technology based on convolutional neural network. Logistics Technology. 2018, (10), full text. * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||