CN112308092B - Light-weight license plate detection and identification method based on multi-scale attention mechanism - Google Patents
- Publication number
- CN112308092B (application number CN202011316603.4A)
- Authority
- CN
- China
- Prior art keywords
- license plate
- network
- convolution
- data set
- feature map
- Prior art date
- Legal status: Active (an assumption, not a legal conclusion)
Classifications
- G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
- G06F18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/24: Classification techniques
- G06N3/08: Learning methods (neural networks)
- G06V20/625: License plates
- G06V30/10: Character recognition
Abstract
The invention provides a light-weight license plate detection and identification method based on a multi-scale attention mechanism. The license plate detection and identification network is constructed by the following steps. Step S1: acquire pictures as an original data set. Step S2: process the original data set to obtain a data set A for training the license plate detection model and a data set B for training the license plate recognition model. Step S3: construct a deep neural network for detecting license plates. Step S4: input an original image P1 of data set A into the network constructed in step S3 to obtain a license plate detection area P2 and the four corner points of the license plate. Step S5: perform perspective transformation on P2 according to the license plate corner points to obtain a corrected image P3. Step S6: construct a deep neural network for recognizing the license plate. Step S7: input P3 into the network constructed in step S6 to obtain the license plate number of the detected plate. While maintaining network accuracy, the invention simultaneously achieves a low parameter count and a low computational cost.
Description
Technical Field
The invention relates to the technical field of machine vision, in particular to a light-weight license plate detection and identification method based on a multi-scale attention mechanism.
Background
With the gradual development of the economy and the year-by-year increase in the number of automobiles, urban traffic pressure keeps growing, and efficient traffic management has become an urgent problem. License plate detection and recognition technology plays an important role in traffic management: the ability to automatically detect and recognize license plates, from traffic violations to accident monitoring, is one of the key tools used by law enforcement agencies in all regions. License plate detection and recognition is widely applied not only in road traffic management, but also in parking lots, community security, and the tracking of stolen and wanted vehicles.
The traditional license plate detection and recognition method comprises four stages: image acquisition, license plate localization, character segmentation, and character recognition. Its drawbacks are that too many steps are required, the error of each step accumulates, and an inaccurate prediction at any step causes the license plate to be recognized incorrectly. Moreover, the images used in traditional license plate detection and recognition are all captured at fixed angles, so accuracy is poor for plates viewed at large angles in natural scenes. Deep-learning license plate detection and recognition methods perform well, but because they use deeper networks, the model has too many parameters and too large a computational cost, making it difficult to deploy and run on mobile devices.
Disclosure of Invention
The invention provides a light-weight license plate detection and recognition method based on a multi-scale attention mechanism, which completes license plate detection and recognition in only three stages, effectively reduces recognition errors caused by character segmentation mistakes, improves the accuracy of character recognition, and simultaneously achieves a low parameter count and computational cost while maintaining network accuracy.
The invention adopts the following technical scheme.
A light-weight license plate detection and identification method based on a multi-scale attention mechanism is characterized in that a license plate detection network and a license plate identification network are adopted to identify license plates; the construction of the license plate detection and identification network comprises the following steps;
step S1: acquiring a picture with a license plate and a license plate label as an original data set required by training;
step S2: processing the original data set to obtain a data set A for training a model for detecting the license plate and a data set B for training a model for recognizing the license plate;
step S3: constructing a deep neural network for detecting a license plate;
step S4: inputting the original image P1 of the data set A into the network constructed in step S3 to obtain a license plate detection area P2 and four corner points of the license plate;
step S5: performing perspective transformation on the obtained license plate detection area P2 according to the corner points of the license plate to obtain a corrected license plate image P3;
Step S6: constructing a deep neural network for identifying the license plate;
step S7: and inputting the corrected license plate image P3 into the network constructed in the step S6 to obtain the license plate number corresponding to the detected license plate.
The original data set used in the step S1 is a CCPD license plate data set.
In step S2, the license plate is corrected by perspective transformation using the license plate corner annotations in the CCPD data set; the corrected license plate image data set serves as the data set B for training the license plate recognition network, while the original CCPD data set serves as the data set A for training the license plate detection network.
Step S3, constructing a deep neural network for detecting the license plate, specifically comprises the following steps:
step S31: constructing a deep neural network for detecting a license plate, wherein the neural network consists of five parts, namely a backbone, a feature pyramid network FPN, a receptive field module RFB, an attention mechanism module CBAM, and a detection head;
step S32: constructing the loss function of the license plate detection network, using a multi-task loss function for joint optimization:

L = L_cls(p_i, p_i*) + λ1·L_box(t_i, t_i*) + λ2·L_pts(l_i, l_i*)

where L_cls(p_i, p_i*) is the license plate classification loss, for which the softmax loss is used; p_i denotes the probability that the ith anchor is a license plate, and p_i* equals 1 for a positive anchor and 0 for a negative anchor; L_box(t_i, t_i*) = R(t_i − t_i*) is the regression loss of the license plate bounding box, where R denotes the smooth-L1 loss function, and t_i = {t_x, t_y, t_w, t_h}_i and t_i* respectively denote the bounding-box coordinates predicted at the ith anchor and the ground-truth bounding-box coordinates corresponding to that anchor; L_pts(l_i, l_i*) is the regression loss of the license plate corner points, which likewise uses the smooth-L1 loss, where l_i and l_i* respectively denote the 4 license plate corner points predicted at the ith anchor and the 4 ground-truth corner points corresponding to that anchor; λ1 and λ2 denote the weights of the bounding-box prediction and the corner-point prediction in the license plate detection task.
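The multi-task loss above can be sketched as follows. This is a minimal numpy illustration, not the patent's training code; the equal default weights λ1 = λ2 = 1 and the restriction of the regression terms to positive anchors are assumptions:

```python
import numpy as np

def smooth_l1(x):
    """Element-wise smooth-L1 loss R(.) used for box and corner regression."""
    ax = np.abs(x)
    return np.where(ax < 1.0, 0.5 * x**2, ax - 0.5)

def detection_loss(cls_logits, labels, box_pred, box_gt, pts_pred, pts_gt,
                   lam1=1.0, lam2=1.0):
    """Multi-task loss: softmax classification + smooth-L1 box regression
    + smooth-L1 corner regression, the regressions on positive anchors only."""
    # softmax cross-entropy over the (background, plate) logits
    z = cls_logits - cls_logits.max(axis=1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    l_cls = -log_p[np.arange(len(labels)), labels].mean()
    pos = labels == 1                       # p_i* = 1 selects positive anchors
    l_box = smooth_l1(box_pred[pos] - box_gt[pos]).sum(axis=1).mean() if pos.any() else 0.0
    l_pts = smooth_l1(pts_pred[pos] - pts_gt[pos]).sum(axis=1).mean() if pos.any() else 0.0
    return l_cls + lam1 * l_box + lam2 * l_pts
```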
The step S31 specifically includes the following steps;
step S311: in the license plate detection network, the lightweight network model MobileNet is used as the backbone to extract features. The feature pyramid network FPN up-samples the feature maps extracted by the backbone and fuses them with feature maps of the same size, forming several feature maps of different sizes; all of the resulting feature maps are sent to the receptive field module RFB. The RFB splits each feature map into three branches, and each branch performs sliding convolution with convolution kernels and dilated convolutions of different sizes; the feature map produced by each branch is then sent to the CBAM. The feature pyramid network enhances the ability to detect license plates of different sizes, the RFB improves license plate detection by integrating context information of the target area, and the CBAM lets the network focus on important features while suppressing unimportant ones, thereby improving the detection performance of the network. The detection head consists of three parts, corresponding to the three tasks of license plate classification, license plate bounding-box regression, and regression of the four license plate corner points; each part applies a 1 × 1 convolution to adjust the channel dimension so that it matches the corresponding task, and the output is reshaped so that it matches the subsequent loss computation.
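As an illustration of the detection head's 1 × 1 convolutions, the sketch below maps a C-channel feature map to per-anchor outputs with 2 classification, 4 bounding-box and 8 corner channels; the anchor count A and the random placeholder weights are assumptions of this sketch, not values stated in the text:

```python
import numpy as np

def conv1x1(F, W):
    """1x1 convolution on F (C,H,W): a per-pixel linear map over channels."""
    return np.tensordot(W, F, axes=([1], [0]))  # (C_out, H, W)

def detection_head(F, A=2):
    """Per anchor: 2 class scores, 4 box offsets, 8 corner-point offsets."""
    C = F.shape[0]
    rng = np.random.default_rng(0)  # untrained placeholder weights
    cls = conv1x1(F, rng.standard_normal((2 * A, C)))
    box = conv1x1(F, rng.standard_normal((4 * A, C)))
    pts = conv1x1(F, rng.standard_normal((8 * A, C)))
    return cls, box, pts
```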
Step S4 specifically includes the following steps;
step S41: at the input of the license plate detection network, the input image P1 is uniformly scaled to 640 × 640; to prevent image distortion, the image is padded with black borders; to improve the generalization of the model, the target picture is randomly cropped, its brightness, contrast and saturation are changed, flipping and tilting are applied, and the result is normalized;
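The scaling-with-black-borders (letterbox) preprocessing of step S41 can be sketched as follows; the nearest-neighbour resampling is a dependency-free stand-in for a real image resize, and returning the scale and offset (needed to map predictions back to the original image) is an assumption of this sketch:

```python
import numpy as np

def letterbox(img, size=640):
    """Scale an HxWx3 image so its longer side equals `size`, then pad the
    shorter side with black to a square size x size canvas (no distortion)."""
    h, w = img.shape[:2]
    scale = size / max(h, w)
    nh, nw = int(round(h * scale)), int(round(w * scale))
    # nearest-neighbour resize via index sampling (keeps the sketch dependency-free)
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]
    canvas = np.zeros((size, size, 3), dtype=img.dtype)  # black borders
    top, left = (size - nh) // 2, (size - nw) // 2
    canvas[top:top + nh, left:left + nw] = resized
    return canvas, scale, (left, top)
```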
step S42: the picture is input into the backbone network of the license plate detection network to obtain three feature maps of different sizes F1, F2, F3; through convolution, up-sampling and fusion of these three feature maps, three multi-scale fused feature maps F′1, F′2, F′3 are obtained. F′1 is convolved using three different convolution branches: each branch first applies a 1 × 1 convolution to adjust the channel dimension of the feature map, then applies a 1 × 1, 3 × 3 or 5 × 5 convolution respectively, and afterwards applies a 3 × 3 dilated convolution with dilation rate 1, 3 or 5 respectively, yielding feature maps with receptive fields of different sizes; finally, the three resulting feature maps are concatenated (concat) and passed through a 1 × 1 convolution to obtain a feature map F″1 whose size is exactly the same as that of the input feature map. F′2 and F′3 are processed in the same way as F′1, yielding F″2 and F″3;
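The "receptive fields of different sizes" produced by the three RFB branches follow from the effective kernel size of a dilated convolution, k_eff = k + (k − 1)(d − 1); a quick check:

```python
def effective_kernel(k, d):
    """Effective (receptive) kernel size of a k x k convolution with dilation d."""
    return k + (k - 1) * (d - 1)

# The three RFB branches use 3x3 dilated convolutions with rates 1, 3 and 5,
# giving progressively larger receptive fields on the same feature map.
branch_fields = [effective_kernel(3, d) for d in (1, 3, 5)]
```

So the three branches cover 3 × 3, 7 × 7 and 11 × 11 neighbourhoods of the target area before concatenation.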
Step S43: f' obtained in step S42 1 、F″ 2 、F″ 3 Respectively sending the information to an attention mechanism module CBAM, namely a CBAM convolution attention module, wherein the CBAM consists of two parts, a channel attention module and a space attention module; firstly, the feature map is subjected to operations such as global average pooling, maximum pooling, convolution, activation function and the like to obtain a feature map M of channel attention weight c ∈R C×1×1 And is combined with the initial characteristic diagram F ″) 1 Multiply by one to obtain a product of sizeA characteristic diagram of (2); obtained characteristic diagram F' 1 Obtaining a feature map M of the spatial attention weight through operations of average pooling, maximum pooling, splicing, activation and the like s (F)∈R H×W And is combined with F' 1 Multiply to obtain the sizeThe characteristic diagram of (1).
The step S43 specifically includes the following steps:
step S431: the input feature map is compressed through an average pooling layer and a maximum pooling layer to obtain the channel descriptors F^c_avg ∈ R^(C×1×1) and F^c_max ∈ R^(C×1×1);
step S432: inputting the obtained two channel characteristics into a multilayer perceptron to carry out convolution operation;
step S433: the two outputs are added element by element and activated with a Sigmoid function to obtain the channel attention weight map M_c ∈ R^(C×1×1):

M_c(F) = σ(W_1(W_0(F^c_avg)) + W_1(W_0(F^c_max)))

where σ denotes the Sigmoid activation function, and W_0 and W_1 denote the weights of the multilayer perceptron;
step S434: the original feature maps F″1, F″2, F″3 are each multiplied element by element with the channel attention weight map M_c ∈ R^(C×1×1) to obtain the new feature maps F‴1, F‴2, F‴3, i.e. F‴_k = M_c(F″_k) ⊗ F″_k, k = 1, 2, 3;
Step S435: the weighted new feature maps F‴ are passed through an average pooling layer and a maximum pooling layer to obtain the spatial descriptors F^s_avg ∈ R^(1×H×W) and F^s_max ∈ R^(1×H×W);
step S436: the two spatial features are concatenated into a new spatial feature map with 2 channels, which is then convolved by a convolution layer and activated with a Sigmoid function, yielding the spatial attention weight map M_s(F) ∈ R^(H×W); the calculation formula is:

M_s(F) = σ(f^(7×7)([F^s_avg; F^s_max]))

where f^(7×7) denotes a convolution layer with a 7 × 7 convolution kernel;
Step S437: the feature maps F‴1, F‴2, F‴3 are each multiplied element by element with the spatial attention weight map M_s ∈ R^(H×W) to obtain the new feature maps F⁗1, F⁗2, F⁗3, i.e. F⁗_k = M_s(F‴_k) ⊗ F‴_k, k = 1, 2, 3.
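Steps S431 to S437 can be sketched end to end in numpy as below. The shared two-layer MLP with a ReLU between W_0 and W_1, the naive sliding-window 7 × 7 convolution, and the random placeholder weights are assumptions of this illustration (the patent does not give trained parameters):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F, W0, W1):
    """M_c(F) = sigmoid(W1(W0(AvgPool(F))) + W1(W0(MaxPool(F)))), F: (C,H,W)."""
    f_avg = F.mean(axis=(1, 2))          # F^c_avg, shape (C,)
    f_max = F.max(axis=(1, 2))           # F^c_max, shape (C,)
    mc = sigmoid(W1 @ np.maximum(W0 @ f_avg, 0) + W1 @ np.maximum(W0 @ f_max, 0))
    return mc[:, None, None]             # (C,1,1), broadcasts over H, W

def spatial_attention(F, kernel):
    """M_s(F) = sigmoid(conv7x7(concat[AvgPool_c(F); MaxPool_c(F)])), F: (C,H,W)."""
    f_avg = F.mean(axis=0)               # pooled over channels, (H,W)
    f_max = F.max(axis=0)
    stacked = np.stack([f_avg, f_max])   # 2-channel spatial map
    k = kernel.shape[-1]; pad = k // 2
    padded = np.pad(stacked, ((0, 0), (pad, pad), (pad, pad)))
    H, W = f_avg.shape
    out = np.zeros((H, W))
    for i in range(H):                   # naive sliding-window convolution
        for j in range(W):
            out[i, j] = (padded[:, i:i + k, j:j + k] * kernel).sum()
    return sigmoid(out)                  # (H,W)

def cbam(F, W0, W1, kernel):
    """F'''' = M_s(F''') * F''' with F''' = M_c(F'') * F''."""
    F3 = channel_attention(F, W0, W1) * F
    return spatial_attention(F3, kernel) * F3
```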
step S44: F⁗1, F⁗2, F⁗3 obtained in step S43 are sent to the detection head, which simultaneously outputs the predicted license plate bounding-box (bbox) coordinates and the coordinates of the license plate corner points.
Step S6, constructing a deep neural network for recognizing the license plate, specifically comprises the following steps:
step S61: constructing a deep neural network for license plate recognition. The network consists of a convolutional neural network and a bidirectional gated recurrent network: the convolutional neural network extracts image features of the license plate to obtain a feature map, the feature map is converted into a feature sequence and fed into the bidirectional gated recurrent network to further extract sequence features, and finally the connectionist temporal classification (CTC) algorithm converts the extracted feature sequence into a label sequence;
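Converting the feature map into a feature sequence for the bidirectional recurrent network amounts to slicing the map column by column; a minimal sketch (the column-wise, left-to-right ordering is the usual convention and an assumption here):

```python
import numpy as np

def feature_map_to_sequence(F):
    """Convert a (C,H,W) conv feature map into a length-W sequence of
    (C*H)-dimensional feature vectors, one per horizontal position,
    ready to feed a bidirectional recurrent network."""
    C, H, W = F.shape
    return F.transpose(2, 0, 1).reshape(W, C * H)
```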
s62, constructing a loss function of the license plate recognition network, and using the CTC loss function as follows:
in the formula, S represents a training data set, Z is input data of a network, G is label information, (Z, G) is a set of data input into the network in the training set, and p (G | Z) is a probability of obtaining a label G when the input data is Z.
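At inference time, the CTC output is typically turned into a license plate string by best-path decoding: merge consecutive repeats, then remove blanks. A minimal sketch (reserving index 0 for the blank symbol is a common convention and an assumption here):

```python
BLANK = 0  # index reserved for the CTC blank symbol (assumption of this sketch)

def ctc_greedy_decode(frame_argmax):
    """Collapse a per-frame best path into a label sequence:
    merge consecutive repeats, then drop blanks (best-path CTC decoding)."""
    out, prev = [], None
    for c in frame_argmax:
        if c != prev and c != BLANK:
            out.append(c)
        prev = c
    return out
```

Because repeated characters in a plate are separated by a blank frame, this scheme recognizes plates with different numbers of characters without any segmentation.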
Compared with the prior art, the invention has the following beneficial effects:
1. The license plate recognition method requires no character segmentation, effectively avoiding recognition errors caused by segmentation mistakes;
2. The invention uses MobileNet as the backbone network, significantly reducing the parameter count and computational cost of the network while maintaining detection accuracy;
3. The invention uses the RFB module in the network, which integrates the context information of the target area and improves detection performance;
4. The invention uses an attention mechanism in the network, increasing the network's attention to important features while suppressing attention to unimportant features;
5. The invention optimizes with a multi-task loss function to which a corner-point loss is added; this loss enhances license plate localization accuracy, the localized corner points are used for perspective transformation to correct the license plate, and recognizing the corrected plate improves the accuracy of character recognition;
6. By using the CTC loss function, the invention can recognize license plates with different numbers of characters.
The invention completes license plate detection and recognition in only three stages: image acquisition, license plate localization, and character recognition. This removes the license plate character segmentation step and effectively reduces the recognition errors it introduces. The license plate localization part is optimized with a multi-task loss that adds a corner-point localization loss, which enhances localization accuracy and allows the predicted corner points to be used for subsequent correction of the plate, improving the accuracy of character recognition. In the character recognition part, a recurrent neural network further extracts features from the feature sequence, and the license plate characters are recognized using the CTC loss. Because the lightweight model MobileNet is used as the network backbone, a low parameter count and computational cost are obtained while network accuracy is maintained.
Drawings
The invention is described in further detail below with reference to the following figures and detailed description:
FIG. 1 is a schematic diagram of a structural block of an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating an effect of a CCPD partial data set downloaded in step S1 according to an embodiment of the present invention;
FIG. 3 is a diagram of a txt file for storing different path addresses of two data sets obtained in step S2 according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the license plate picture captured in step S2 in the example of the present invention;
FIG. 5 is a schematic diagram of a network frame for license plate detection constructed in step S3 in the embodiment of the present invention;
FIG. 6 is a schematic diagram of the frame of the RFB module constructed in step S31 in the embodiment of the present invention;
FIG. 7 is a schematic diagram of the CBAM module frame map constructed in step S31 according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a channel attention module framework in the CBAM module constructed in step S31 according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of a spatial attention module framework in the CBAM module constructed in step S31 according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of a network frame for license plate recognition constructed in step S32 in the embodiment of the present invention;
FIG. 11 is a schematic view of an uncorrected license plate;
FIG. 12 is a schematic view of a license plate corrected after perspective transformation.
Detailed Description
As shown in the figure, a light-weight license plate detection and recognition method based on a multi-scale attention mechanism adopts a license plate detection network and a license plate recognition network to recognize license plates; the construction of the license plate detection and identification network comprises the following steps;
step S1: acquiring a picture with a license plate and a license plate label as an original data set required by training;
step S2: processing the original data set to obtain a data set A for training a model for detecting the license plate and a data set B for training a model for recognizing the license plate;
step S3: constructing a deep neural network for detecting a license plate;
step S4: inputting the original image P1 of the data set A into the network constructed in step S3 to obtain a license plate detection area P2 and four corner points of the license plate;
step S5: performing perspective transformation on the obtained license plate detection region P2 according to the corner points of the license plate to obtain a corrected license plate image P3;
Step S6: constructing a deep neural network for identifying the license plate;
step S7: and inputting the corrected license plate image P3 into the network constructed in the step S6 to obtain the license plate number corresponding to the detected license plate.
The original data set used in the step S1 is a CCPD license plate data set.
In step S2, the license plate is corrected by perspective transformation using the license plate corner annotations in the CCPD data set; the corrected license plate image data set serves as the data set B for training the license plate recognition network, while the original CCPD data set serves as the data set A for training the license plate detection network.
Step S3, constructing a deep neural network for detecting the license plate, specifically comprises the following steps:
step S31: constructing a deep neural network for detecting a license plate, wherein the neural network consists of five parts, namely a backbone, a feature pyramid network FPN (Feature Pyramid Network), a receptive field module RFB (Receptive Field Block), an attention mechanism module CBAM (Convolutional Block Attention Module), and a detection head;
step S32: constructing the loss function of the license plate detection network, using a multi-task loss function for joint optimization:

L = L_cls(p_i, p_i*) + λ1·L_box(t_i, t_i*) + λ2·L_pts(l_i, l_i*)

where L_cls(p_i, p_i*) is the license plate classification loss, for which the softmax loss is used; p_i denotes the probability that the ith anchor is a license plate, and p_i* equals 1 for a positive anchor and 0 for a negative anchor; L_box(t_i, t_i*) = R(t_i − t_i*) is the regression loss of the license plate bounding box, where R denotes the smooth-L1 loss function, and t_i = {t_x, t_y, t_w, t_h}_i and t_i* respectively denote the bounding-box coordinates predicted at the ith anchor and the ground-truth bounding-box coordinates corresponding to that anchor; L_pts(l_i, l_i*) is the regression loss of the license plate corner points, which likewise uses the smooth-L1 loss, where l_i and l_i* respectively denote the 4 license plate corner points predicted at the ith anchor and the 4 ground-truth corner points corresponding to that anchor; λ1 and λ2 denote the weights of the bounding-box prediction and the corner-point prediction in the license plate detection task.
The step S31 specifically includes the following steps;
step S311: in the license plate detection network, the lightweight network model MobileNet is used as the backbone to extract features. The feature pyramid network FPN up-samples the feature maps extracted by the backbone and fuses them with feature maps of the same size, forming several feature maps of different sizes; all of the resulting feature maps are sent to the receptive field module RFB. The RFB splits each feature map into three branches, and each branch performs sliding convolution with convolution kernels and dilated convolutions of different sizes; the feature map produced by each branch is then sent to the CBAM. The feature pyramid network enhances the ability to detect license plates of different sizes, the RFB improves license plate detection by integrating context information of the target area, and the CBAM lets the network focus on important features while suppressing unimportant ones, thereby improving the detection performance of the network. The detection head consists of three parts, corresponding to the three tasks of license plate classification, license plate bounding-box regression, and regression of the four license plate corner points; each part applies a 1 × 1 convolution to adjust the channel dimension so that it matches the corresponding task, and the output is reshaped so that it matches the subsequent loss computation.
Step S4 specifically includes the following steps;
step S41: at the input of the license plate detection network, the input image P1 is uniformly scaled to 640 × 640; to prevent image distortion, the image is padded with black borders; to improve the generalization of the model, the target picture is randomly cropped, its brightness, contrast and saturation are changed, flipping and tilting are applied, and the result is normalized;
step S42: the picture is input into the backbone network of the license plate detection network to obtain three feature maps of different sizes F1, F2, F3; through convolution, up-sampling and fusion of these three feature maps, three multi-scale fused feature maps F′1, F′2, F′3 are obtained. F′1 is convolved using three different convolution branches: each branch first applies a 1 × 1 convolution to adjust the channel dimension of the feature map, then applies a 1 × 1, 3 × 3 or 5 × 5 convolution respectively, and afterwards applies a 3 × 3 dilated convolution with dilation rate 1, 3 or 5 respectively, yielding feature maps with receptive fields of different sizes; finally, the three resulting feature maps are concatenated (concat) and passed through a 1 × 1 convolution to obtain a feature map F″1 whose size is exactly the same as that of the input feature map. F′2 and F′3 are processed in the same way as F′1, yielding F″2 and F″3;
Step S43: f' obtained in step S42 1 、F″ 2 、F″ 3 Respectively sending the information to an attention mechanism module CBAM, namely a CBAM convolution attention module, wherein the CBAM consists of two parts, a channel attention module and a space attention module; firstly, the feature map is subjected to operations such as global average pooling, maximum pooling, convolution, activation function and the like to obtain a feature map M of channel attention weight c ∈R C×1×1 And is compared with the initial characteristic diagram F ″) 1 Multiply by one to obtain a product of sizeA characteristic diagram of (1); obtained characteristic diagram F' 1 Obtaining a feature map M of the spatial attention weight through operations of average pooling, maximum pooling, splicing, activation and the like s (F)∈R H×W And is combined with F' 1 Multiply to obtain the sizeA characteristic diagram of (c).
The step S43 specifically includes the following steps:
step S431: the input feature map is compressed through an average pooling layer (Average pooling) and a maximum pooling layer (Max pooling) to obtain the channel descriptors F^c_avg ∈ R^(C×1×1) and F^c_max ∈ R^(C×1×1);
step S432: inputting the obtained two channel characteristics into a multilayer perceptron to carry out convolution operation;
step S433: the two outputs are added element by element and activated with a Sigmoid function to obtain the channel attention weight map M_c ∈ R^(C×1×1):

M_c(F) = σ(W_1(W_0(F^c_avg)) + W_1(W_0(F^c_max)))

where σ denotes the Sigmoid activation function, and W_0 and W_1 denote the weights of the multilayer perceptron;
step S434: the original feature maps F″1, F″2, F″3 are each multiplied element by element with the channel attention weight map M_c ∈ R^(C×1×1) to obtain the new feature maps F‴1, F‴2, F‴3, i.e. F‴_k = M_c(F″_k) ⊗ F″_k, k = 1, 2, 3;
Step S435: the weighted new feature maps F‴ are passed through an average pooling layer and a maximum pooling layer to obtain the spatial descriptors F^s_avg ∈ R^(1×H×W) and F^s_max ∈ R^(1×H×W);
step S436: the two spatial features are concatenated into a new spatial feature map with 2 channels, which is then convolved by a convolution layer and activated with a Sigmoid function, yielding the spatial attention weight map M_s(F) ∈ R^(H×W); the calculation formula is:

M_s(F) = σ(f^(7×7)([F^s_avg; F^s_max]))

where f^(7×7) denotes a convolution layer with a 7 × 7 convolution kernel;
Step S437: the feature maps F‴1, F‴2, F‴3 are each multiplied element by element with the spatial attention weight map M_s ∈ R^(H×W) to obtain the new feature maps F⁗1, F⁗2, F⁗3, i.e. F⁗_k = M_s(F‴_k) ⊗ F‴_k, k = 1, 2, 3.
step S44: F⁗1, F⁗2, F⁗3 obtained in step S43 are sent to the detection head, which simultaneously outputs the predicted license plate bounding-box (bbox) coordinates and the coordinates of the license plate corner points.
Step S6, constructing a deep neural network for recognizing the license plate, specifically comprises the following steps:
step S61: constructing a deep neural network for license plate recognition. The network consists of a convolutional neural network and a bidirectional gated recurrent network: the convolutional neural network extracts image features of the license plate to obtain a feature map, the feature map is converted into a feature sequence and fed into the bidirectional gated recurrent network to further extract sequence features, and finally the connectionist temporal classification (CTC) algorithm converts the extracted feature sequence into a label sequence;
step S62: constructing the loss function of the license plate recognition network, using the CTC loss function: L_CTC = −Σ_{(Z,G)∈S} ln p(G|Z),
where S denotes the training data set, Z is the input data of the network, G is the label information, (Z, G) is a pair of data from the training set fed into the network, and p(G|Z) is the probability of obtaining label G when the input data is Z.
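The probability p(G|Z) in the CTC loss can be computed with the standard forward (alpha) recursion over the blank-augmented label sequence. The sketch below (pure Python; the three-frame per-symbol probabilities are hypothetical) illustrates the quantity L = −ln p(G|Z) that step S62 minimizes:

```python
import math

def ctc_prob(probs, labels, blank=0):
    """Forward-algorithm probability p(G|Z) for CTC.
    probs: per-frame distributions over the symbol set (T x V).
    labels: target label sequence G (without blanks)."""
    # Interleave blanks: G = [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for l in labels:
        ext += [l, blank]
    S = len(ext)
    alpha = [0.0] * S                  # alpha[s]: prob. of prefixes ending at s
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]
    for t in range(1, len(probs)):
        new = [0.0] * S
        for s in range(S):
            a = alpha[s]
            if s > 0:
                a += alpha[s - 1]
            # skipping a blank is allowed between different non-blank labels
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                a += alpha[s - 2]
            new[s] = a * probs[t][ext[s]]
        alpha = new
    return alpha[S - 1] + (alpha[S - 2] if S > 1 else 0.0)

def ctc_loss(probs, labels, blank=0):
    # L = -ln p(G|Z); summing this over (Z, G) in S gives the training loss
    return -math.log(ctc_prob(probs, labels, blank))

# Hypothetical 3-frame network output over {blank, 'A', 'B'} and target "A"
probs = [[0.1, 0.8, 0.1],
         [0.2, 0.6, 0.2],
         [0.7, 0.2, 0.1]]
loss = ctc_loss(probs, [1])
```

For this toy input, p(G|Z) sums the probability of every frame-level path (e.g. "A--", "AA-", "-A-") that collapses to the single label "A", which is what lets CTC train the recognizer without per-frame character alignments.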
The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.
Claims (4)
1. A lightweight license plate detection and recognition method based on a multi-scale attention mechanism, in which a license plate detection network and a license plate recognition network are used to recognize license plates, characterized in that the construction of the license plate detection and recognition networks comprises the following steps:
step S1: acquiring a picture with a license plate and a license plate label as an original data set required by training;
step S2: processing the original data set to obtain a data set A for training a model for detecting a license plate and a data set B for training a model for recognizing the license plate;
step S3: constructing a deep neural network for detecting the license plate;
step S4: inputting the original image P1 of data set A into the network constructed in step S3 to obtain the license plate detection region P2 and the four corner points of the license plate;
step S5: performing a perspective transformation on the obtained license plate detection region P2 according to the license plate corner points to obtain a corrected license plate image P3;
step S6: constructing a deep neural network for recognizing the license plate;
step S7: inputting the corrected license plate image P3 into the network constructed in step S6 to obtain the license plate number corresponding to the detected license plate;
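The perspective correction of step S5 maps the four detected corner points onto an axis-aligned rectangle. A minimal sketch of the underlying homography computation (NumPy; the corner coordinates and the 140 × 40 target rectangle are hypothetical):

```python
import numpy as np

def find_homography(src, dst):
    """Solve for the 3x3 perspective matrix H mapping src[i] -> dst[i]
    for four point correspondences (the h22 entry is fixed to 1)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, pt):
    # Apply H in homogeneous coordinates and normalize by w
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return (x / w, y / w)

# Hypothetical corners of a tilted plate, mapped to a 140x40 rectangle
src = [(30, 50), (180, 70), (175, 115), (25, 95)]   # TL, TR, BR, BL
dst = [(0, 0), (140, 0), (140, 40), (0, 40)]
H = find_homography(src, dst)
```

Sampling the source image at the pre-images of every rectangle pixel under H⁻¹ yields the corrected plate image P3; in practice this warp would typically be done with a library routine such as OpenCV's `warpPerspective`.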
s3, constructing a deep neural network for detecting the license plate, and specifically comprising the following steps:
step S31: constructing a deep neural network for detecting a license plate, wherein the neural network consists of five parts, namely a backbone, a characteristic pyramid network FPN, a receptive field module RFB, an attention mechanism module CBAM and a detection head;
step S32: constructing the loss function of the license plate detection network, using the multi-task loss function L for joint optimization according to formula (1):
L = L_cls(p_i, p_i*) + λ_1 p_i* L_box(t_i, t_i*) + λ_2 p_i* L_pts(l_i, l_i*)  (1)
where L_cls(p_i, p_i*) is the license plate classification loss, for which the softmax loss is used; p_i denotes the probability that the i-th anchor point is a license plate, and p_i* takes the value 1 for a positive anchor and 0 for a negative anchor; L_box(t_i, t_i*) denotes the license plate frame regression loss, L_box(t_i, t_i*) = R(t_i − t_i*), where R denotes the smooth-L1 loss function, and t_i = {t_x, t_y, t_w, t_h}_i and t_i* respectively denote the predicted license plate frame coordinates at the i-th anchor point and the real frame coordinates corresponding to that anchor point; L_pts(l_i, l_i*) denotes the license plate corner point regression loss, which still uses the smooth-L1 loss, where l_i and l_i* respectively denote the predicted values of the 4 license plate corner points at the i-th anchor point and the real values of the four corner points corresponding to that anchor point; λ_1 and λ_2 denote the weights of the license plate frame prediction and the corner point prediction in the license plate detection task;
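The multi-task loss for a single anchor can be sketched as follows (pure Python; the λ values and the two-class logits are illustrative, since the patent does not fix them):

```python
import math

def smooth_l1(pred, target):
    # R in formula (1): 0.5*d^2 if |d| < 1 else |d| - 0.5, summed over coords
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total

def softmax_cls_loss(logits, is_plate):
    # L_cls: softmax cross-entropy over {background, license plate}
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    p = exps[1 if is_plate else 0] / sum(exps)
    return -math.log(p)

def detection_loss(logits, is_plate, t_pred, t_true, l_pred, l_true,
                   lam1=1.0, lam2=0.5):
    # lam1, lam2 are illustrative weights (lambda_1, lambda_2 in formula (1))
    loss = softmax_cls_loss(logits, is_plate)
    if is_plate:  # p_i* gates the regression terms: positive anchors only
        loss += lam1 * smooth_l1(t_pred, t_true)  # box: (t_x, t_y, t_w, t_h)
        loss += lam2 * smooth_l1(l_pred, l_true)  # 8 corner coordinates
    return loss
```

For negative anchors (p_i* = 0) only the classification term contributes, which is exactly the gating that p_i* performs in formula (1).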
s6, constructing a deep neural network for identifying the license plate, which specifically comprises the following steps:
step S61: constructing a deep neural network for license plate recognition. The network consists of a convolutional neural network and a bidirectional gated recurrent network: the convolutional neural network extracts image features of the license plate to obtain a feature map; the feature map is converted into a feature sequence and sent into the bidirectional gated recurrent network to further extract sequence features; finally, the connectionist temporal classification (CTC) algorithm converts the extracted feature sequence into a label sequence;
step S62: constructing the loss function of the license plate recognition network, using the CTC loss function: L_CTC = −Σ_{(Z,G)∈S} ln p(G|Z),
where S denotes the training data set, Z is the input data of the network, G is the label information, (Z, G) is a pair of data from the training set fed into the network, and p(G|Z) is the probability of obtaining label G when the input data is Z;
the step S31 specifically includes the following steps;
step S311: in the license plate detection network, the lightweight network model MobileNet is used as the backbone to extract features. The feature pyramid network FPN upsamples the feature maps extracted by the backbone and fuses feature maps of the same size, forming several feature maps of different sizes; all of the resulting feature maps are sent to the receptive field module RFB. The RFB processes each scale's feature map with three branches; each branch performs sliding convolutions with convolution kernels of different scales and dilated convolutions, and the feature map obtained from each branch is sent to the CBAM. The detection head consists of three parts, corresponding to the three tasks of license plate detection, license plate frame regression and regression of the four license plate corner points; each part performs a convolution with a 1 × 1 kernel to adjust the channel dimension so that it matches the corresponding task, and the output is reshaped so that it matches the subsequent loss function computation;
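The per-task 1 × 1 convolutions of the detection head are, per pixel, just linear maps over the channel dimension. A minimal NumPy sketch (the channel counts 2/4/8 follow the three tasks; real heads usually multiply these by the number of anchors per location, and the weights here are random placeholders):

```python
import numpy as np

def conv1x1(F, W):
    """1x1 convolution: F (C_in, H, W) -> (C_out, H, W).
    At every spatial position it is a matrix-vector product over channels,
    which is how the detection head adjusts dimensions per task."""
    return np.einsum('oc,chw->ohw', W, F)

rng = np.random.default_rng(1)
F = rng.standard_normal((64, 20, 20))   # shared feature map from the CBAM
W_cls = rng.standard_normal((2, 64))    # 2 channels: background / plate
W_box = rng.standard_normal((4, 64))    # 4 channels: t_x, t_y, t_w, t_h
W_pts = rng.standard_normal((8, 64))    # 8 channels: four (x, y) corners
cls_map = conv1x1(F, W_cls)
box_map = conv1x1(F, W_box)
pts_map = conv1x1(F, W_pts)
```

Reshaping each output map to (H·W, C_out) then lines the predictions up with the per-anchor terms of the multi-task loss.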
step S4 specifically includes the following steps;
step S41: at the input end of the license plate detection network, the input image P1 is uniformly scaled to 640 × 640 with black-edge padding; the target image is cropped; brightness, contrast and saturation are randomly changed; flipping and tilting operations are applied; and normalization is performed;
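The uniform scaling with black-edge padding of step S41 (a "letterbox" resize) determines a scale factor and padding offsets that must also be applied to the label coordinates. A small sketch of that coordinate arithmetic (pure Python; centered padding is an assumption, since the patent does not state where the black edges are placed):

```python
def letterbox_params(w, h, size=640):
    """Uniform scaling of a w x h image to size x size with black-edge
    padding (step S41). Returns the scale factor and the left/top padding
    of the resized image inside the square output."""
    scale = size / max(w, h)
    new_w, new_h = round(w * scale), round(h * scale)
    pad_left = (size - new_w) // 2   # assumption: padding split evenly
    pad_top = (size - new_h) // 2
    return scale, pad_left, pad_top

def to_letterbox(x, y, scale, pad_left, pad_top):
    # Map an original-image coordinate (e.g. a plate corner) into the
    # padded 640x640 network input.
    return x * scale + pad_left, y * scale + pad_top

scale, pl, pt = letterbox_params(1280, 720)
```

Because both axes share one scale factor, the aspect ratio of the license plate is preserved, which matters for the corner-point regression.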
step S42: inputting the picture into the backbone network of the license plate detection network to obtain three feature maps of different sizes F_1, F_2, F_3; through convolution and up-sampling operations on these three feature maps, three fused feature maps of different scales F′_1, F′_2, F′_3 are obtained. F′_1 is convolved by three different convolution branches: each branch first performs a 1 × 1 convolution to adjust the dimension of the feature map, then performs convolutions with 1 × 1, 3 × 3 and 5 × 5 kernels respectively, and after these convolutions are completed performs 3 × 3 dilated convolutions with dilation rates 1, 3 and 5 respectively, thereby obtaining feature maps with receptive fields of different sizes; finally the three resulting feature maps are combined by a concat operation and a 1 × 1 convolution to obtain a feature map F″_1 whose size is completely consistent with that of the input feature map. F′_2 and F′_3 undergo the same operations as F′_1, giving F″_2 and F″_3;
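The reason the three RFB branches see receptive fields of different sizes is that dilation spreads a kernel's taps apart: a k × k kernel with dilation rate d covers an extent of d·(k − 1) + 1 pixels. A tiny sketch of that arithmetic, including the "same" padding that keeps the branch output the same size as its input as step S42 requires:

```python
def dilated_kernel_extent(k, d):
    """Effective spatial extent of a k x k kernel with dilation rate d."""
    return d * (k - 1) + 1

def same_padding(k, d):
    """Padding that keeps output size equal to input size for a
    stride-1 dilated convolution (needed so F'' matches F' in size)."""
    return (d * (k - 1)) // 2

# The three branches end with 3x3 dilated convolutions, rates 1, 3, 5:
extents = [dilated_kernel_extent(3, d) for d in (1, 3, 5)]
paddings = [same_padding(3, d) for d in (1, 3, 5)]
```

So the three branches effectively look at 3 × 3, 7 × 7 and 11 × 11 neighborhoods while keeping the 3 × 3 parameter count, and concatenating them mixes small- and large-context evidence before the 1 × 1 fusion convolution.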
Step S43: f' obtained in step S42 1 、F″ 2 、F″ 3 Respectively sending the information to an attention mechanism module CBAM, namely a CBAM convolution attention module, wherein the CBAM consists of two parts, a channel attention module and a space attention module; firstly, the feature map is subjected to global average pooling, maximum pooling, convolution and activation function operation to obtain a feature map M of channel attention weight c ∈R C×1×1 And is compared with the initial characteristic diagram F ″) 1 Multiplied by another to give a size of F' 1 H×W×C A characteristic diagram of (1); obtained byCharacteristic map F' 1 Obtaining a feature map M of the spatial attention weight through average pooling, maximum pooling, splicing and activation operations s (F)∈R H×W And is combined with F' 1 Multiply to obtain the size F "" 1 H×W×C A characteristic diagram of (c).
2. The method for detecting and identifying the light-weight license plate based on the multi-scale attention mechanism as claimed in claim 1, wherein: the original data set used in the step S1 is a CCPD license plate data set.
3. The method for detecting and identifying the light-weight license plate based on the multi-scale attention mechanism as claimed in claim 1, wherein: and S2, correcting the license plate through perspective transformation by marking license plate corner points in the CCPD data set, wherein the license plate image data set obtained after correction is used as a data set B for training a license plate recognition network, and the original CCPD data set is used as a data set A for training a license plate detection network.
4. The method for detecting and identifying the light-weight license plate based on the multi-scale attention mechanism as claimed in claim 1, wherein: the step S43 specifically includes the following steps:
step S431: compressing the input feature map through an average pooling layer and a maximum pooling layer to obtain F^c_avg ∈ R^(C×1×1) and F^c_max ∈ R^(C×1×1);
step S432: inputting the two obtained channel features into a multilayer perceptron for the convolution operation;
step S433: adding the outputs obtained after this operation element by element and activating with the Sigmoid function to obtain the channel attention weight map M_c ∈ R^(C×1×1), M_c(F) = σ(W_1(W_0(F^c_avg)) + W_1(W_0(F^c_max))),
where σ denotes the Sigmoid activation function, and W_0 and W_1 denote the weights of the multilayer perceptron;
step S434: the original feature maps F″_1, F″_2, F″_3 are each multiplied element by element with the channel attention weight map M_c ∈ R^(C×1×1) to obtain new feature maps F‴_1, F‴_2, F‴_3, according to the formula F‴ = M_c(F″) ⊗ F″;
step S435: the weighted new feature map F‴ is passed through an average pooling layer and a maximum pooling layer to obtain F^s_avg ∈ R^(1×H×W) and F^s_max ∈ R^(1×H×W);
step S436: the two spatial features are spliced together to form a new spatial feature map with 2 channels, which is then convolved by a convolution layer and activated with the Sigmoid function, yielding the spatial attention weight map M_s(F) ∈ R^(H×W), calculated as M_s(F) = σ(f^(7×7)([F^s_avg; F^s_max])),
where f^(7×7) denotes a convolution layer using a 7 × 7 convolution kernel;
step S437: the feature maps F‴_1, F‴_2, F‴_3 are each multiplied element by element with the spatial attention weight map M_s(F) ∈ R^(H×W) to obtain new feature maps F⁗_1, F⁗_2, F⁗_3, according to the formula F⁗ = M_s(F‴) ⊗ F‴;
step S44: the F⁗_1, F⁗_2, F⁗_3 obtained in step S43 are sent to the detection head, which simultaneously outputs the predicted license plate bbox coordinates and the license plate corner point coordinates.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011316603.4A CN112308092B (en) | 2020-11-20 | 2020-11-20 | Light-weight license plate detection and identification method based on multi-scale attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112308092A CN112308092A (en) | 2021-02-02 |
CN112308092B true CN112308092B (en) | 2023-02-28 |
Family
ID=74335448
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109508715A (en) * | 2018-10-30 | 2019-03-22 | 南昌大学 | A kind of License Plate and recognition methods based on deep learning |
CN109740653A (en) * | 2018-12-25 | 2019-05-10 | 北京航空航天大学 | A kind of vehicle recognition methods again for merging visual appearance and space-time restriction |
CN111325203A (en) * | 2020-01-21 | 2020-06-23 | 福州大学 | American license plate recognition method and system based on image correction |
CN111553205A (en) * | 2020-04-12 | 2020-08-18 | 西安电子科技大学 | Vehicle weight recognition method, system, medium and video monitoring system without license plate information |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10140553B1 (en) * | 2018-03-08 | 2018-11-27 | Capital One Services, Llc | Machine learning artificial intelligence system for identifying vehicles |
Non-Patent Citations (3)
Title |
---|
Madhusree Mondal et al., "Automatic number plate recognition using CNN based self synthesized feature learning," 2017 IEEE Calcutta Conference (CALCON), 2018, pp. 378-381. *
Zhang Shihao et al., "Lightweight multi-scale attention fusion license plate detection algorithm," Computer Engineering and Applications, 2021, vol. 57, no. 22, pp. 208-214. *
Xu Fuyong et al., "Arbitrary-shape scene text recognition based on deep learning," Journal of Sichuan University (Natural Science Edition), 2020, vol. 57, no. 02, pp. 255-263. *
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
20230725 | TR01 | Transfer of patent right | Address after: Room 203, No. 397, Xihong, Hongshan Town, Gulou District, Fuzhou City, Fujian Province 350025; Patentee after: FUZHOU IVISIONIC TECHNOLOGY Co.,Ltd.; Address before: Fuzhou University, No.2, wulongjiang North Avenue, Fuzhou University Town, Minhou County, Fuzhou City, Fujian Province; Patentee before: FUZHOU University