CN112308092B - Light-weight license plate detection and identification method based on multi-scale attention mechanism - Google Patents


Info

Publication number
CN112308092B
CN112308092B (application number CN202011316603.4A)
Authority
CN
China
Prior art keywords
license plate
network
convolution
data set
feature map
Prior art date
Legal status
Active
Application number
CN202011316603.4A
Other languages
Chinese (zh)
Other versions
CN112308092A (en)
Inventor
吴林煌
张世豪
杨绣郡
陈志峰
Current Assignee
Fuzhou Ivisionic Technology Co ltd
Original Assignee
Fuzhou University
Priority date
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN202011316603.4A
Publication of CN112308092A
Application granted
Publication of CN112308092B


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/60 - Type of objects
    • G06V20/62 - Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/625 - License plates
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition

Abstract

The invention provides a light-weight license plate detection and identification method based on a multi-scale attention mechanism, wherein the license plate detection and identification network is constructed by the following steps. Step S1: acquiring pictures as an original data set. Step S2: processing the original data set to obtain a data set A for training the license plate detection model and a data set B for training the license plate recognition model. Step S3: constructing a deep neural network for detecting the license plate. Step S4: inputting an original image P1 of data set A into the network constructed in step S3 to obtain a license plate detection area P2 and the four corner points of the license plate. Step S5: performing a perspective transformation on P2 according to the license plate corner points to obtain a corrected image P3. Step S6: constructing a deep neural network for recognizing the license plate. Step S7: inputting P3 into the network constructed in step S6 to obtain the license plate number of the detected plate. Under the condition of ensuring network accuracy, the invention simultaneously achieves a low network parameter count and low computational cost.

Description

Light-weight license plate detection and identification method based on multi-scale attention mechanism
Technical Field
The invention relates to the technical field of machine vision, in particular to a light-weight license plate detection and identification method based on a multi-scale attention mechanism.
Background
With the gradual development of the economy and the year-by-year increase in the number of automobiles, urban traffic pressure keeps growing, and efficient traffic management has become a problem that urgently needs to be solved. License plate detection and recognition technology plays an important role in traffic management: the ability to automatically detect and recognize license plates, from traffic violations to accident monitoring, makes it one of the key tools used by law enforcement agencies in various regions. License plate detection and recognition is not only widely applied in road traffic management, but also increasingly applied in parking lots, community security, tracking robbed and wanted vehicles, and other areas.
The traditional license plate detection and recognition method comprises four stages: image acquisition, license plate localization, character segmentation, and character recognition. Its drawbacks are that too many steps are required and the error of each step accumulates; an inaccurate prediction at any step causes the license plate recognition to fail. Moreover, the images used in traditional license plate detection and recognition are all captured at fixed angles, so accuracy is poor for large-angle license plates in natural scenes. Deep-learning-based license plate detection and recognition methods perform well, but because they use deeper networks, the parameter count and computational cost of the models are excessive, making deployment and operation on mobile devices difficult.
Disclosure of Invention
The invention provides a light-weight license plate detection and recognition method based on a multi-scale attention mechanism, which completes the detection and recognition of a license plate in only three stages, effectively reducing recognition errors caused by character segmentation mistakes and improving the accuracy of character recognition, while simultaneously achieving a low network parameter count and low computational cost under the condition of ensuring network accuracy.
The invention adopts the following technical scheme.
A light-weight license plate detection and identification method based on a multi-scale attention mechanism, characterized in that a license plate detection network and a license plate recognition network are adopted to recognize license plates; the construction of the license plate detection and recognition network comprises the following steps:
step S1: acquiring a picture with a license plate and a license plate label as an original data set required by training;
step S2: processing the original data set to obtain a data set A for training a model for detecting the license plate and a data set B for training a model for recognizing the license plate;
step S3: constructing a deep neural network for detecting the license plate;
step S4: inputting the original image P1 of data set A into the network constructed in step S3 to obtain a license plate detection area P2 and the four corner points of the license plate;
step S5: performing a perspective transformation on the obtained license plate detection area P2 according to the license plate corner points to obtain a corrected license plate image P3;
step S6: constructing a deep neural network for recognizing the license plate;
step S7: and inputting the corrected license plate image P3 into the network constructed in the step S6 to obtain the license plate number corresponding to the detected license plate.
The original data set used in the step S1 is a CCPD license plate data set.
In step S2, the license plate is rectified through a perspective transformation using the license plate corner points annotated in the CCPD data set; the rectified license plate images form the data set B for training the license plate recognition network, and the original CCPD data set serves as the data set A for training the license plate detection network.
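The rectification in step S2 (and later in step S5) amounts to estimating a perspective transform that maps the four annotated corner points onto an axis-aligned rectangle and warping the plate accordingly. As a minimal sketch, assuming nothing about the patent's actual implementation (a production pipeline would typically call OpenCV's getPerspectiveTransform and warpPerspective), the homography can be solved directly from the four point correspondences; the corner coordinates and the 94 × 24 target rectangle below are illustrative values only:

```python
import numpy as np

def homography_from_corners(src, dst):
    """Solve the 3x3 perspective transform H mapping each src corner to dst.

    src, dst: four (x, y) pairs. Builds the standard 8x8 linear system of
    the direct linear transform with the last entry of H fixed to 1.
    """
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, pt):
    """Apply homography H to a single (x, y) point."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return np.array([x / w, y / w])

# Hypothetical corner annotations of a tilted plate, mapped onto a
# 94 x 24 upright rectangle (a common Chinese plate aspect ratio).
plate = [(120, 80), (300, 110), (295, 170), (115, 135)]
rect = [(0, 0), (94, 0), (94, 24), (0, 24)]
H = homography_from_corners(plate, rect)
```

Warping the full crop then amounts to sampling every destination pixel through the inverse of H.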
Step S3, constructing a deep neural network for detecting the license plate, specifically comprises the following steps:
step S31: constructing a deep neural network for detecting the license plate, wherein the network consists of five parts: a backbone, a feature pyramid network FPN, a receptive field module RFB, an attention mechanism module CBAM, and a detection head;
step S32: constructing a loss function for the license plate detection network; the multitask loss function

L = L_cls(p_i, p_i*) + λ_1 p_i* L_box(t_i, t_i*) + λ_2 p_i* L_pts(l_i, l_i*)

is used for joint optimization, wherein:

L_cls(p_i, p_i*) is the license plate classification loss; L_cls uses the softmax loss, p_i denotes the probability that the i-th anchor contains a license plate, and p_i* is the label of the i-th anchor, with value 1 for a positive example and 0 for a negative example;

L_box(t_i, t_i*) denotes the regression loss of the license plate bounding box, L_box(t_i, t_i*) = R(t_i − t_i*), where R denotes the smooth-L1 loss function, and t_i = {t_x, t_y, t_w, t_h}_i and t_i* = {t_x*, t_y*, t_w*, t_h*}_i denote, respectively, the bounding-box coordinates predicted at the i-th anchor and the ground-truth bounding-box coordinates corresponding to that anchor;

L_pts(l_i, l_i*) denotes the regression loss of the license plate corner points and likewise uses the smooth-L1 loss, where l_i and l_i* denote, respectively, the 4 license plate corner points predicted at the i-th anchor and the ground-truth four corner points corresponding to that anchor; λ_1 and λ_2 denote the weights of bounding-box prediction and corner-point prediction in the license plate detection task.
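The multitask loss described above can be sketched numerically. The smooth-L1 definition and the weighted combination follow the text, while the anchor layout, the normalization, and the default weights λ_1 = λ_2 = 1 are placeholder assumptions, not values from the patent:

```python
import numpy as np

def smooth_l1(x):
    """Elementwise smooth-L1: 0.5*x^2 for |x| < 1, |x| - 0.5 otherwise."""
    ax = np.abs(x)
    return np.where(ax < 1.0, 0.5 * ax ** 2, ax - 0.5)

def detection_loss(p, p_star, t, t_star, l, l_star, lam1=1.0, lam2=1.0):
    """Classification + bbox regression + corner regression.

    p: predicted plate probability per anchor; p_star: 0/1 anchor labels.
    t, t_star: (N, 4) box offsets; l, l_star: (N, 8) corner offsets.
    Regression terms are counted only on positive anchors.
    """
    eps = 1e-9
    cls = -np.mean(p_star * np.log(p + eps) + (1 - p_star) * np.log(1 - p + eps))
    pos = p_star.astype(bool)
    box = smooth_l1(t[pos] - t_star[pos]).sum(axis=1).mean() if pos.any() else 0.0
    pts = smooth_l1(l[pos] - l_star[pos]).sum(axis=1).mean() if pos.any() else 0.0
    return cls + lam1 * box + lam2 * pts

# One positive and one negative anchor with perfect regression outputs:
p = np.array([0.9, 0.1])
p_star = np.array([1.0, 0.0])
t = np.zeros((2, 4)); l = np.zeros((2, 8))
loss = detection_loss(p, p_star, t, t, l, l)   # only the cls term remains
```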
The step S31 specifically includes the following steps:
step S311: in the license plate detection network, the lightweight network model MobileNet is used as the backbone for feature extraction. The feature pyramid network FPN upsamples the feature maps extracted by the backbone and fuses feature maps of the same size, forming several feature maps of different sizes; all of the resulting feature maps are fed into the receptive field module RFB. The RFB processes each feature map along three branches, each branch performing sliding convolution with convolution kernels and dilated convolutions of different sizes, and the feature map produced by each branch is fed into the CBAM. The feature pyramid network strengthens the ability to detect license plates of different sizes; the RFB improves the detection effect by integrating context information of the target region; and the CBAM lets the network focus on important features while suppressing unimportant ones, thereby improving the detection performance of the network. The detection head consists of three parts, corresponding to the three tasks of license plate classification, license plate bounding-box regression, and license plate four-corner-point regression; each part applies a 1 × 1 convolution kernel to adjust the channel dimension so that it matches the corresponding task, and the output is reshaped so that it aligns with the subsequent loss computation.
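For the detection head described in step S311, the 1 × 1 convolutions leave each task with A·k output channels per spatial cell (k = 2 for classification, 4 for the bounding box, 8 for the four corner points, with A anchors per cell), and the output is then reshaped so that each row holds one anchor's prediction for the loss computation. A small sketch of that reshape; the anchor count A = 2 and the 20 × 20 map size are illustrative assumptions:

```python
import numpy as np

def flatten_head(feat, num_anchors, k):
    """Reshape a (A*k, H, W) head output into (H*W*A, k) per-anchor rows."""
    c, h, w = feat.shape
    assert c == num_anchors * k
    # channel-last layout, then one row per (cell, anchor) pair
    return feat.transpose(1, 2, 0).reshape(h * w * num_anchors, k)

A = 2                                  # anchors per cell (illustrative)
cls_map = np.zeros((A * 2, 20, 20))    # classification head, k = 2
box_map = np.zeros((A * 4, 20, 20))    # bbox regression head, k = 4
pts_map = np.zeros((A * 8, 20, 20))    # corner regression head, k = 8
```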
Step S4 specifically includes the following steps:
step S41: at the input of the license plate detection network, the input image P1 is uniformly scaled to 640 × 640; to prevent image distortion, the image is padded with black borders. To improve the generalization performance of the model, the target image is cropped; brightness, contrast, and saturation are altered; operations such as flipping and tilting are applied; and the image is normalized;
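The "uniform scaling plus black borders" of step S41 is a letterbox transform. A dependency-free sketch with nearest-neighbour resizing (a real pipeline would normally use cv2.resize; only the 640 input size comes from the patent):

```python
import numpy as np

def letterbox(img, size=640):
    """Scale img into a size x size canvas, preserving aspect, padding black."""
    h, w = img.shape[:2]
    scale = size / max(h, w)
    nh, nw = int(round(h * scale)), int(round(w * scale))
    ys = np.minimum((np.arange(nh) / scale).astype(int), h - 1)
    xs = np.minimum((np.arange(nw) / scale).astype(int), w - 1)
    resized = img[ys][:, xs]                     # nearest-neighbour resize
    canvas = np.zeros((size, size) + img.shape[2:], dtype=img.dtype)
    top, left = (size - nh) // 2, (size - nw) // 2
    canvas[top:top + nh, left:left + nw] = resized
    return canvas, scale, (left, top)

# A tall 320 x 160 white test image lands centred with black side borders:
img = np.full((320, 160, 3), 255, dtype=np.uint8)
canvas, scale, (left, top) = letterbox(img)
```

The returned scale and (left, top) offset allow predicted boxes and corner points to be mapped back to the original image coordinates.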
step S42: the image is fed into the backbone of the license plate detection network to obtain three feature maps of different sizes, F_1, F_2, F_3; through convolution, upsampling, and related operations these are fused into three feature maps of different scales, F′_1, F′_2, F′_3. F′_1 is convolved by three different convolution branches: each branch first applies a 1 × 1 convolution kernel to adjust the channel dimension of the feature map; the branches then convolve with 1 × 1, 3 × 3, and 5 × 5 convolution kernels respectively; after that, each branch convolves with a 3 × 3 dilated convolution kernel, using dilation rates 1, 3, and 5 respectively, yielding feature maps with receptive fields of different sizes. Finally, the three resulting feature maps are concatenated (concat) and passed through a 1 × 1 convolution to obtain a feature map F″_1 whose size is exactly the same as that of the input feature map. F′_2 and F′_3 undergo the same operations as F′_1, yielding F″_2 and F″_3.
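For the RFB branches of step S42, under stride 1 a k × k convolution followed by a 3 × 3 convolution with dilation d (effective kernel size 1 + 2d) yields a composed receptive field of k + 2d. A quick check that the three branch configurations produce receptive fields of growing size (the initial 1 × 1 channel-adjustment convolution adds nothing):

```python
def branch_receptive_field(k, d):
    """Receptive field of a k x k conv followed by a 3 x 3 conv, dilation d.

    With stride 1 throughout, receptive fields compose additively:
    k + (effective_kernel - 1), where effective_kernel = 1 + 2 * d.
    """
    return k + 2 * d

# (kernel size, dilation rate) of the three RFB branches from step S42
branches = [(1, 1), (3, 3), (5, 5)]
fields = [branch_receptive_field(k, d) for k, d in branches]
```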
Step S43: f' obtained in step S42 1 、F″ 2 、F″ 3 Respectively sending the information to an attention mechanism module CBAM, namely a CBAM convolution attention module, wherein the CBAM consists of two parts, a channel attention module and a space attention module; firstly, the feature map is subjected to operations such as global average pooling, maximum pooling, convolution, activation function and the like to obtain a feature map M of channel attention weight c ∈R C×1×1 And is combined with the initial characteristic diagram F ″) 1 Multiply by one to obtain a product of size
Figure BDA0002790598990000043
A characteristic diagram of (2); obtained characteristic diagram F' 1 Obtaining a feature map M of the spatial attention weight through operations of average pooling, maximum pooling, splicing, activation and the like s (F)∈R H×W And is combined with F' 1 Multiply to obtain the size
Figure BDA0002790598990000044
The characteristic diagram of (1).
The step S43 specifically includes the following steps:
step S431: the input feature map is compressed through an average pooling layer and a max pooling layer to obtain F^c_avg ∈ R^{C×1×1} and F^c_max ∈ R^{C×1×1};
step S432: the two channel features thus obtained are fed into a multilayer perceptron;
step S433: the two outputs are added element by element and activated with a Sigmoid function to obtain the channel attention weight map M_c ∈ R^{C×1×1}. The calculation formula is as follows:

M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W_1(W_0(F^c_avg)) + W_1(W_0(F^c_max)))

where σ denotes the Sigmoid activation function, and W_0 and W_1 denote the weights of the multilayer perceptron;
step S434: the original feature maps F″_1, F″_2, F″_3 are each multiplied element by element with the channel attention weight map M_c ∈ R^{C×1×1} to obtain new feature maps F‴_1, F‴_2, F‴_3, with the formula:

F‴ = M_c(F″) ⊗ F″

where ⊗ denotes element-by-element multiplication of corresponding elements;
step S435: the weighted new feature map F‴ is passed through an average pooling layer and a max pooling layer along the channel dimension to obtain F^s_avg ∈ R^{1×H×W} and F^s_max ∈ R^{1×H×W};
step S436: the two spatial features are spliced together into a new spatial feature map with 2 channels, convolved by a convolution layer, and activated with a Sigmoid function to obtain the spatial attention weight map M_s(F) ∈ R^{H×W}. The calculation formula is as follows:

M_s(F) = σ(f^{7×7}([F^s_avg; F^s_max]))

where f^{7×7} denotes a convolution layer with a 7 × 7 convolution kernel;
step S437: the feature maps F‴_1, F‴_2, F‴_3 are each multiplied element by element with the spatial attention weight map M_s(F) ∈ R^{H×W} to obtain new feature maps F″″_1, F″″_2, F″″_3, with the formula:

F″″ = M_s(F‴) ⊗ F‴
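Steps S431 to S437 can be sketched with NumPy. The shared MLP weights below are random stand-ins, and the learned 7 × 7 spatial convolution is replaced by a plain average of the two pooled maps, so this illustrates only the data flow of the module, not the trained behaviour:

```python
import numpy as np

rng = np.random.default_rng(0)

def channel_attention(F, W0, W1):
    """CBAM channel attention: Mc = sigmoid(MLP(avg) + MLP(max)); F' = Mc * F."""
    avg, mx = F.mean(axis=(1, 2)), F.max(axis=(1, 2))   # (C,) vectors
    mlp = lambda v: W1 @ np.maximum(W0 @ v, 0.0)        # shared 2-layer MLP
    Mc = 1.0 / (1.0 + np.exp(-(mlp(avg) + mlp(mx))))    # sigmoid, in (0, 1)
    return Mc[:, None, None] * F

def spatial_attention(F):
    """Spatial attention; the 7x7 conv is replaced by a pooled average here."""
    avg, mx = F.mean(axis=0), F.max(axis=0)             # (H, W) maps
    Ms = 1.0 / (1.0 + np.exp(-0.5 * (avg + mx)))        # stand-in for f7x7
    return Ms[None, :, :] * F

C, H, W, r = 8, 5, 5, 2
F2 = rng.standard_normal((C, H, W))        # plays the role of F''
W0 = rng.standard_normal((C // r, C))      # channel-reduction weights
W1 = rng.standard_normal((C, C // r))      # channel-restoration weights
F3 = channel_attention(F2, W0, W1)         # F'''
F4 = spatial_attention(F3)                 # F''''
```

Because both attention maps lie strictly in (0, 1), the module can only rescale activations downward, which is how it suppresses unimportant features.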
step S44: f "", obtained in step S43 1 、F″″ 2 、F″″ 3 And (4) sending the license plate to a detection head, and obtaining the predicted bbox coordinate of the license plate and the coordinates of the corner points of the license plate at the same time.
Step S6, constructing a deep neural network for recognizing the license plate, specifically comprises the following steps:
step S61: constructing a deep neural network for license plate recognition, consisting of a convolutional neural network and a bidirectional gated recurrent network. The convolutional neural network extracts image features of the license plate to obtain a feature map; the feature map is converted into a feature sequence and fed into the bidirectional gated recurrent network, which further extracts sequence features; finally, the connectionist temporal classification (CTC) algorithm converts the extracted feature sequence into a label sequence;
step S62: constructing the loss function of the license plate recognition network, using the CTC loss function:

L_CTC = − Σ_{(Z,G)∈S} ln p(G | Z)

where S denotes the training data set, Z is the input data of the network, G is the label information, (Z, G) is a data-label pair from the training set, and p(G | Z) is the probability of obtaining the label G when the input data is Z.
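The CTC loss above sums −ln p(G | Z) over the training pairs; its companion decoding rule at inference time collapses repeated symbols and removes blanks, which is what lets the network recognize plates with varying numbers of characters. A minimal greedy (best-path) decoder, where index 0 standing for the blank is an assumption of this sketch:

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse per-frame argmax labels: merge repeats, then drop blanks.

    frame_labels: one class index per time step of the recognition
    network's output sequence; blank is the CTC blank index.
    """
    out, prev = [], None
    for lab in frame_labels:
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return out
```

A genuinely repeated character must be separated by a blank frame: [5, 0, 5] decodes to two fives, while [5, 5] collapses to one.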
Compared with the prior art, the invention has the following beneficial effects:
1. The license plate recognition method without character segmentation effectively avoids recognition errors caused by character segmentation mistakes;
2. The invention uses MobileNet as the backbone network, significantly reducing the parameters and computational cost of the network while ensuring detection precision;
3. The invention uses the RFB module in the network, which integrates context information of the target region and improves detection performance;
4. The invention uses an attention mechanism in the network, increasing the network's attention to important features and suppressing attention to unimportant ones;
5. The invention optimizes with a multitask loss function to which a corner-point loss is added; this loss strengthens license plate localization accuracy, the localized corner points are used in a perspective transformation to rectify the plate, and the rectified plate is then recognized, improving the accuracy of license plate character recognition;
6. By using the CTC loss function, the invention can recognize license plates with different numbers of characters.
The invention completes license plate detection and recognition in only three stages: image acquisition, license plate localization, and character recognition. Eliminating character segmentation removes the loss that segmentation would introduce and effectively reduces recognition errors caused by segmentation mistakes. The license plate localization part is optimized with a multitask loss to which a corner-point localization loss is added, strengthening localization accuracy; the predicted corner points can then be used to rectify the license plate, improving the accuracy of character recognition. In the character recognition part, a recurrent neural network further extracts features from the feature sequence, and the license plate characters are recognized with the CTC loss. Because the lightweight model MobileNet is used as the network backbone, a low network parameter count and low computational cost are achieved simultaneously while network accuracy is ensured.
Drawings
The invention is described in further detail below with reference to the following figures and detailed description:
FIG. 1 is a schematic diagram of a structural block of an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating an effect of a CCPD partial data set downloaded in step S1 according to an embodiment of the present invention;
FIG. 3 is a diagram of a txt file for storing different path addresses of two data sets obtained in step S2 according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the license plate picture captured in step S2 in the example of the present invention;
FIG. 5 is a schematic diagram of a network frame for license plate detection constructed in step S3 in the embodiment of the present invention;
FIG. 6 is a schematic diagram of the frame of the RFB module constructed in step S31 in the embodiment of the present invention;
FIG. 7 is a schematic diagram of the CBAM module frame map constructed in step S31 according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a channel attention module framework in the CBAM module constructed in step S31 according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of a spatial attention module framework in the CBAM module constructed in step S31 according to an embodiment of the present invention;
fig. 10 is a schematic diagram of a network frame for license plate recognition constructed in step S32 in the embodiment of the present invention;
FIG. 11 is a schematic view of an uncorrected license plate;
FIG. 12 is a schematic view of a license plate corrected after perspective transformation.
Detailed Description
As shown in the figures, a light-weight license plate detection and recognition method based on a multi-scale attention mechanism adopts a license plate detection network and a license plate recognition network to recognize license plates; the construction of the license plate detection and recognition network comprises the following steps:
step S1: acquiring a picture with a license plate and a license plate label as an original data set required by training;
step S2: processing the original data set to obtain a data set A for training a model for detecting the license plate and a data set B for training a model for recognizing the license plate;
step S3: constructing a deep neural network for detecting the license plate;
step S4: inputting the original image P1 of data set A into the network constructed in step S3 to obtain a license plate detection area P2 and the four corner points of the license plate;
step S5: performing a perspective transformation on the obtained license plate detection area P2 according to the license plate corner points to obtain a corrected license plate image P3;
step S6: constructing a deep neural network for recognizing the license plate;
step S7: and inputting the corrected license plate image P3 into the network constructed in the step S6 to obtain the license plate number corresponding to the detected license plate.
The original data set used in the step S1 is a CCPD license plate data set.
In step S2, the license plate is rectified through a perspective transformation using the license plate corner points annotated in the CCPD data set; the rectified license plate images form the data set B for training the license plate recognition network, and the original CCPD data set serves as the data set A for training the license plate detection network.
S3, constructing a deep neural network for detecting the license plate, which specifically comprises the following steps:
step S31: constructing a deep neural network for detecting the license plate, wherein the network consists of five parts: a backbone, a feature pyramid network FPN (Feature Pyramid Network), a receptive field module RFB (Receptive Field Block), an attention mechanism module CBAM (Convolutional Block Attention Module), and a detection head;
step S32: constructing the loss function of the license plate detection network; the multitask loss function

L = L_cls(p_i, p_i*) + λ_1 p_i* L_box(t_i, t_i*) + λ_2 p_i* L_pts(l_i, l_i*)

is used for joint optimization, wherein:

L_cls(p_i, p_i*) is the license plate classification loss; L_cls uses the softmax loss, p_i denotes the probability that the i-th anchor contains a license plate, and p_i* is the label of the i-th anchor, with value 1 for a positive example and 0 for a negative example;

L_box(t_i, t_i*) denotes the regression loss of the license plate bounding box, L_box(t_i, t_i*) = R(t_i − t_i*), where R denotes the smooth-L1 loss function, and t_i = {t_x, t_y, t_w, t_h}_i and t_i* = {t_x*, t_y*, t_w*, t_h*}_i denote, respectively, the bounding-box coordinates predicted at the i-th anchor and the ground-truth bounding-box coordinates corresponding to that anchor;

L_pts(l_i, l_i*) denotes the regression loss of the license plate corner points and likewise uses the smooth-L1 loss, where l_i and l_i* denote, respectively, the 4 license plate corner points predicted at the i-th anchor and the ground-truth four corner points corresponding to that anchor; λ_1 and λ_2 denote the weights of bounding-box prediction and corner-point prediction in the license plate detection task.
The step S31 specifically includes the following steps:
step S311: in the license plate detection network, the lightweight network model MobileNet is used as the backbone for feature extraction. The feature pyramid network FPN upsamples the feature maps extracted by the backbone and fuses feature maps of the same size, forming several feature maps of different sizes; all of the resulting feature maps are fed into the receptive field module RFB. The RFB processes each feature map along three branches, each branch performing sliding convolution with convolution kernels and dilated convolutions of different sizes, and the feature map produced by each branch is fed into the CBAM. The feature pyramid network strengthens the ability to detect license plates of different sizes; the RFB improves the detection effect by integrating context information of the target region; and the CBAM lets the network focus on important features while suppressing unimportant ones, thereby improving the detection performance of the network. The detection head consists of three parts, corresponding to the three tasks of license plate classification, license plate bounding-box regression, and license plate four-corner-point regression; each part applies a 1 × 1 convolution kernel to adjust the channel dimension so that it matches the corresponding task, and the output is reshaped so that it aligns with the subsequent loss computation.
Step S4 specifically includes the following steps:
step S41: at the input of the license plate detection network, the input image P1 is uniformly scaled to 640 × 640; to prevent image distortion, the image is padded with black borders. To improve the generalization performance of the model, the target image is cropped; brightness, contrast, and saturation are altered; operations such as flipping and tilting are applied; and the image is normalized;
step S42: the image is fed into the backbone of the license plate detection network to obtain three feature maps of different sizes, F_1, F_2, F_3; through convolution, upsampling, and related operations these are fused into three feature maps of different scales, F′_1, F′_2, F′_3. F′_1 is convolved by three different convolution branches: each branch first applies a 1 × 1 convolution kernel to adjust the channel dimension of the feature map; the branches then convolve with 1 × 1, 3 × 3, and 5 × 5 convolution kernels respectively; after that, each branch convolves with a 3 × 3 dilated convolution kernel, using dilation rates 1, 3, and 5 respectively, yielding feature maps with receptive fields of different sizes. Finally, the three resulting feature maps are concatenated (concat) and passed through a 1 × 1 convolution to obtain a feature map F″_1 whose size is exactly the same as that of the input feature map. F′_2 and F′_3 undergo the same operations as F′_1, yielding F″_2 and F″_3.
Step S43: f' obtained in step S42 1 、F″ 2 、F″ 3 Respectively sending the information to an attention mechanism module CBAM, namely a CBAM convolution attention module, wherein the CBAM consists of two parts, a channel attention module and a space attention module; firstly, the feature map is subjected to operations such as global average pooling, maximum pooling, convolution, activation function and the like to obtain a feature map M of channel attention weight c ∈R C×1×1 And is compared with the initial characteristic diagram F ″) 1 Multiply by one to obtain a product of size
Figure BDA0002790598990000103
A characteristic diagram of (1); obtained characteristic diagram F' 1 Obtaining a feature map M of the spatial attention weight through operations of average pooling, maximum pooling, splicing, activation and the like s (F)∈R H×W And is combined with F' 1 Multiply to obtain the size
Figure BDA0002790598990000104
A characteristic diagram of (c).
The step S43 specifically includes the following steps:
step S431: compressing the input feature map through an Average pooling layer (Average pooling) and a maximum pooling layer (Max pooling) to obtain the feature map
Figure BDA0002790598990000101
And
Figure BDA0002790598990000102
step S432: inputting the obtained two channel characteristics into a multilayer perceptron to carry out convolution operation;
step S433: adding the outputs obtained after the operation element by element and activating with a Sigmoid function to obtain the channel attention weight map M_c ∈ R^(C×1×1).
The calculation formula is as follows:

M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W1(W0(F^c_avg)) + W1(W0(F^c_max)))

where σ denotes the Sigmoid activation function, and W0 and W1 denote the weights of the multilayer perceptron;
step S434: multiplying the original feature maps F1″, F2″, F3″ element by element with the channel attention weight map M_c ∈ R^(C×1×1) to obtain the new feature maps F1‴, F2‴, F3‴, according to the formula:

F‴ = M_c(F″) ⊗ F″

where ⊗ denotes element-by-element multiplication of corresponding elements.
Step S435: The weighted new feature map F‴ is passed through an average pooling layer and a maximum pooling layer to obtain the spatial descriptors F^s_avg ∈ R^(1×H×W) and F^s_max ∈ R^(1×H×W).
step S436: splicing the two spatial features together to form a new spatial feature map with 2 channels, then performing convolution through a convolution layer and activating with a Sigmoid function, thereby obtaining the spatial attention weight map M_s(F) ∈ R^(H×W); the calculation formula is as follows:

M_s(F) = σ(f^(7×7)([F^s_avg; F^s_max]))

where f^(7×7) denotes a convolution layer using a 7×7 convolution kernel.
Step S437: the feature maps F1‴, F2‴, F3‴ are each multiplied element by element with the spatial attention weight map M_s(F) ∈ R^(H×W) to obtain the new feature maps F1⁗, F2⁗, F3⁗, according to the formula:

F⁗ = M_s(F‴) ⊗ F‴
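Steps S431–S437 can be sketched in NumPy roughly as follows. This is an illustrative toy version, not the patent's implementation: the shared MLP weights W0/W1 and the reduction ratio are assumed, and the 7×7 convolution of step S436 is replaced by a fixed equal-weight mix of the two pooled maps for brevity:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F, W0, W1):
    """M_c = sigmoid(MLP(avg) + MLP(max)) with a shared two-layer perceptron."""
    C = F.shape[0]
    avg = F.mean(axis=(1, 2))                       # global average pooling -> (C,)
    mx = F.max(axis=(1, 2))                         # global max pooling     -> (C,)
    mlp = lambda v: W1 @ np.maximum(W0 @ v, 0.0)    # shared MLP, ReLU hidden layer
    return sigmoid(mlp(avg) + mlp(mx)).reshape(C, 1, 1)

def spatial_attention(F):
    """M_s over the 2-channel [avg; max] stack; the real CBAM uses a learned
    7x7 convolution here, replaced by a fixed equal-weight mix in this sketch."""
    avg = F.mean(axis=0, keepdims=True)             # (1, H, W)
    mx = F.max(axis=0, keepdims=True)               # (1, H, W)
    return sigmoid(0.5 * avg + 0.5 * mx)            # (1, H, W)

rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
F = rng.standard_normal((C, H, W))
W0 = rng.standard_normal((C // 2, C))               # reduction ratio 2 (assumed)
W1 = rng.standard_normal((C, C // 2))

F_c = channel_attention(F, W0, W1) * F              # step S434: element-wise reweighting
F_s = spatial_attention(F_c) * F_c                  # step S437: element-wise reweighting
print(F_s.shape)                                    # same C x H x W size as the input
```

Both attention maps broadcast against the feature map, so the output size stays identical to the input, as stated in step S43.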
step S44: F1⁗, F2⁗, F3⁗ obtained in step S43 are sent to the detection head, which simultaneously yields the predicted bbox coordinates of the license plate and the coordinates of the license plate corner points.
S6, constructing a deep neural network for identifying the license plate, which specifically comprises the following steps:
step S61: constructing a deep neural network for license plate recognition; the network consists of a convolutional neural network and a bidirectional gated recurrent network. The convolutional neural network extracts image features of the license plate to obtain a feature map; the feature map is converted into a feature sequence and sent into the bidirectional gated recurrent network to further extract sequence features; finally, the connectionist temporal classification (CTC) algorithm converts the extracted feature sequence into a label sequence;
s62, constructing a loss function of the license plate recognition network, and using the CTC loss function as follows:
L_CTC = − Σ_{(Z,G)∈S} ln p(G|Z)
in the formula, S represents the training data set, Z is the input data of the network, G is the label information, (Z, G) is a pair of input data and label from the training set, and p(G|Z) is the probability of obtaining the label G when the input data is Z.
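The probability p(G|Z) above sums over all frame-level alignments that collapse to the label G, and is computed with the standard CTC forward (alpha) recursion over the blank-augmented label. The following NumPy sketch (an illustration under an assumed uniform per-frame distribution, not the patent's code) evaluates −ln p(G|Z) for one sample:

```python
import numpy as np

def ctc_neg_log_likelihood(probs, label, blank=0):
    """probs: (T, K) per-frame class distributions; label: target index sequence.
    Returns -ln p(G|Z) via the CTC forward (alpha) recursion."""
    T = probs.shape[0]
    ext = [blank]
    for g in label:                  # interleave blanks: -, g1, -, g2, -, ...
        ext += [g, blank]
    S = len(ext)
    alpha = np.zeros((T, S))
    alpha[0, 0] = probs[0, ext[0]]
    if S > 1:
        alpha[0, 1] = probs[0, ext[1]]
    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1, s]                                  # stay on same symbol
            if s > 0:
                a += alpha[t - 1, s - 1]                         # advance one symbol
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                a += alpha[t - 1, s - 2]                         # skip over a blank
            alpha[t, s] = a * probs[t, ext[s]]
    p = alpha[T - 1, S - 1] + (alpha[T - 1, S - 2] if S > 1 else 0.0)
    return -np.log(p)

# Two frames, three classes (class 0 is the blank), uniform distributions:
probs = np.full((2, 3), 1.0 / 3.0)
loss = ctc_neg_log_likelihood(probs, [1])
print(round(float(loss), 4))  # -ln(1/3): the three alignments (1,-), (-,1), (1,1)
```

With uniform frames the three valid alignments each carry probability 1/9, so p(G|Z) = 1/3 and the loss is ln 3 ≈ 1.0986.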
The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.

Claims (4)

1. A light-weight license plate detection and identification method based on a multi-scale attention mechanism, in which a license plate detection network and a license plate recognition network are adopted to recognize license plates, characterized in that the construction of the license plate detection and recognition networks comprises the following steps:
step S1: acquiring a picture with a license plate and a license plate label as an original data set required by training;
step S2: processing the original data set to obtain a data set A for training a model for detecting a license plate and a data set B for training a model for recognizing the license plate;
and step S3: constructing a deep neural network for detecting a license plate;
and step S4: inputting the original image P1 of the data set A into the network constructed in the step S3 to obtain a license plate detection area P2 and four corner points of a license plate;
step S5: performing perspective transformation on the obtained license plate detection region P2 according to the corner points of the license plate to obtain a corrected license plate image P3;
step S6: constructing a deep neural network for identifying the license plate;
step S7: inputting the corrected license plate image P3 into the network constructed in the step S6 to obtain a license plate number corresponding to the detected license plate;
s3, constructing a deep neural network for detecting the license plate, and specifically comprising the following steps:
step S31: constructing a deep neural network for detecting a license plate, wherein the neural network consists of five parts, namely a backbone, a characteristic pyramid network FPN, a receptive field module RFB, an attention mechanism module CBAM and a detection head;
step S32: constructing the loss function of the license plate detection network and performing joint optimization with the multi-task loss function L according to formula I:

L = L_cls(p_i, p_i*) + λ1 p_i* L_box(t_i, t_i*) + λ2 p_i* L_pts(l_i, l_i*)    (I)

where L_cls(p_i, p_i*) is the classification loss function of the license plate, for which the softmax loss is used; p_i represents the probability that the i-th anchor point is a license plate, and p_i* is the ground-truth value of the i-th anchor point, taking the value 1 for a positive example and 0 for a negative example; L_box(t_i, t_i*) represents the regression loss function of the license plate frame, L_box(t_i, t_i*) = R(t_i − t_i*), where R represents the smooth-L1 loss function, and t_i = {t_x, t_y, t_w, t_h}_i and t_i* = {t_x*, t_y*, t_w*, t_h*}_i respectively represent the predicted license plate frame coordinates at the i-th anchor point and the ground-truth frame coordinates corresponding to that anchor point; L_pts(l_i, l_i*) represents the regression loss function of the license plate corner points, which likewise uses the smooth-L1 loss, where l_i and l_i* respectively represent the predicted values of the 4 license plate corner points at the i-th anchor point and the ground-truth values of the four corner points of the license plate corresponding to that anchor point; λ1 and λ2 represent the weights of license plate frame prediction and corner point prediction in the license plate detection task;
s6, constructing a deep neural network for identifying the license plate, which specifically comprises the following steps:
step S61: constructing a deep neural network for license plate recognition; the network consists of a convolutional neural network and a bidirectional gated recurrent network. The convolutional neural network extracts image features of the license plate to obtain a feature map; the feature map is converted into a feature sequence and sent into the bidirectional gated recurrent network to further extract sequence features; finally, the connectionist temporal classification (CTC) algorithm converts the extracted feature sequence into a label sequence;
s62, constructing a loss function of the license plate recognition network, and using the CTC loss function as follows:
Figure FDA0003955546840000024
in the formula, s represents a training data set, Z is input data of a network, G is label information, (Z, G) is a group of data input into the network in the training set, and p (G | Z) is the probability of obtaining a label G under the condition that the input data is Z;
the step S31 specifically includes the following steps;
step S311: in the license plate detection network, the lightweight network model MobileNet is used as the backbone for feature extraction; the feature pyramid network FPN up-samples the feature maps extracted by the backbone and fuses feature maps of the same size, forming several feature maps of different sizes, all of which are sent to the receptive field module RFB; the RFB splits the feature map of each scale into three branches, each branch performing sliding convolution with convolution kernels and dilated convolutions of different scales, and the feature map obtained from each branch is sent to the CBAM; the detection head consists of three parts corresponding to the three tasks of license plate detection, license plate frame regression and license plate four-corner-point regression; each part performs convolution with a 1×1 convolution kernel to adjust the dimension so that it corresponds to its task, and the dimension is transformed at the output so that the computation matches the subsequent loss function;
step S4 specifically includes the following steps;
step S41: at the input end of the license plate detection network, the input image P1 is uniformly scaled to 640 × 640 with black-edge padding; the target image is cropped; the brightness, contrast and saturation are altered; flipping and tilting operations are applied; and normalization is performed;
step S42: inputting the pictures into the backbone network of the license plate detection network to obtain three feature maps F1, F2, F3 of different sizes, and obtaining three feature maps F1′, F2′, F3′ fused at different scales through convolution and up-sampling operations on the three feature maps; F1′ is processed by three different convolution branches: each branch first applies a 1×1 convolution kernel to adjust the dimension of the feature map; the branches then apply convolution kernels of 1×1, 3×3 and 5×5 respectively; after these convolutions, dilated convolutions with 3×3 kernels and dilation rates of 1, 3 and 5 respectively are applied, yielding feature maps with receptive fields of different sizes; finally the three resulting feature maps are concatenated (concat) and passed through a 1×1 convolution to obtain a feature map F1″ whose size is fully consistent with that of the input feature map; F2′ and F3′ undergo the same operations as F1′, yielding F2″ and F3″;
Step S43: f' obtained in step S42 1 、F″ 2 、F″ 3 Respectively sending the information to an attention mechanism module CBAM, namely a CBAM convolution attention module, wherein the CBAM consists of two parts, a channel attention module and a space attention module; firstly, the feature map is subjected to global average pooling, maximum pooling, convolution and activation function operation to obtain a feature map M of channel attention weight c ∈R C×1×1 And is compared with the initial characteristic diagram F ″) 1 Multiplied by another to give a size of F' 1 H×W×C A characteristic diagram of (1); obtained byCharacteristic map F' 1 Obtaining a feature map M of the spatial attention weight through average pooling, maximum pooling, splicing and activation operations s (F)∈R H×W And is combined with F' 1 Multiply to obtain the size F "" 1 H×W×C A characteristic diagram of (c).
2. The method for detecting and identifying the light-weight license plate based on the multi-scale attention mechanism as claimed in claim 1, wherein: the original data set used in the step S1 is a CCPD license plate data set.
3. The method for detecting and identifying the light-weight license plate based on the multi-scale attention mechanism as claimed in claim 1, wherein: in step S2, the license plate is corrected through perspective transformation using the license plate corner points annotated in the CCPD data set; the license plate image data set obtained after correction is used as the data set B for training the license plate recognition network, and the original CCPD data set is used as the data set A for training the license plate detection network.
4. The method for detecting and identifying the light-weight license plate based on the multi-scale attention mechanism as claimed in claim 1, wherein: the step S43 specifically includes the following steps:
step S431: compressing the input feature map through an average pooling layer and a maximum pooling layer to obtain the channel descriptors F^c_avg ∈ R^(C×1×1) and F^c_max ∈ R^(C×1×1);
step S432: inputting the obtained two channel characteristics into a multilayer perceptron to carry out convolution operation;
step S433: adding the outputs obtained after the operation element by element and activating with a Sigmoid function to obtain the channel attention weight map M_c ∈ R^(C×1×1).
The calculation formula is as follows:

M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W1(W0(F^c_avg)) + W1(W0(F^c_max)))

where σ denotes the Sigmoid activation function, and W0 and W1 denote the weights of the multilayer perceptron;
step S434: multiplying the original feature maps F1″, F2″, F3″ element by element with the channel attention weight map M_c ∈ R^(C×1×1) to obtain the new feature maps F1‴, F2‴, F3‴, according to the formula:

F‴ = M_c(F″) ⊗ F″

where ⊗ denotes element-by-element multiplication of corresponding elements;
step S435: obtaining the spatial descriptors F^s_avg ∈ R^(1×H×W) and F^s_max ∈ R^(1×H×W) from the weighted new feature map F‴ through an average pooling layer and a maximum pooling layer;
step S436: splicing the two spatial features together to form a new spatial feature map with 2 channels, then performing convolution through a convolution layer and activating with a Sigmoid function, thereby obtaining the spatial attention weight map M_s(F) ∈ R^(H×W); the calculation formula is as follows:

M_s(F) = σ(f^(7×7)([F^s_avg; F^s_max]))

where f^(7×7) denotes a convolution layer using a 7×7 convolution kernel;
step S437: multiplying the feature maps F1‴, F2‴, F3‴ element by element with the spatial attention weight map M_s(F) ∈ R^(H×W) to obtain the new feature maps F1⁗, F2⁗, F3⁗, according to the formula:

F⁗ = M_s(F‴) ⊗ F‴
step S44: F1⁗, F2⁗, F3⁗ obtained in step S43 are sent to the detection head, which simultaneously yields the predicted bbox coordinates of the license plate and the coordinates of the license plate corner points.
CN202011316603.4A 2020-11-20 2020-11-20 Light-weight license plate detection and identification method based on multi-scale attention mechanism Active CN112308092B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011316603.4A CN112308092B (en) 2020-11-20 2020-11-20 Light-weight license plate detection and identification method based on multi-scale attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011316603.4A CN112308092B (en) 2020-11-20 2020-11-20 Light-weight license plate detection and identification method based on multi-scale attention mechanism

Publications (2)

Publication Number Publication Date
CN112308092A CN112308092A (en) 2021-02-02
CN112308092B true CN112308092B (en) 2023-02-28

Family

ID=74335448

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011316603.4A Active CN112308092B (en) 2020-11-20 2020-11-20 Light-weight license plate detection and identification method based on multi-scale attention mechanism

Country Status (1)

Country Link
CN (1) CN112308092B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112861978B (en) * 2021-02-20 2022-09-02 齐齐哈尔大学 Multi-branch feature fusion remote sensing scene image classification method based on attention mechanism
CN112926588B (en) * 2021-02-24 2022-07-22 南京邮电大学 Large-angle license plate detection method based on convolutional network
CN113033321A (en) * 2021-03-02 2021-06-25 深圳市安软科技股份有限公司 Training method of target pedestrian attribute identification model and pedestrian attribute identification method
CN112967253A (en) * 2021-03-08 2021-06-15 中国计量大学 Cervical cancer cell detection method based on deep learning
CN112634273B (en) * 2021-03-10 2021-08-13 四川大学 Brain metastasis segmentation system based on deep neural network and construction method thereof
CN112966631A (en) * 2021-03-19 2021-06-15 浪潮云信息技术股份公司 License plate detection and identification system and method under unlimited security scene
CN113255443B (en) * 2021-04-16 2024-02-09 杭州电子科技大学 Graph annotation meaning network time sequence action positioning method based on pyramid structure
CN113221988A (en) * 2021-04-30 2021-08-06 佛山市南海区广工大数控装备协同创新研究院 Method for constructing lightweight network based on attention mechanism
CN114938425A (en) * 2021-06-15 2022-08-23 义隆电子股份有限公司 Photographing apparatus and object recognition method using artificial intelligence
CN113486886B (en) * 2021-06-21 2023-06-23 华侨大学 License plate recognition method and device in natural scene
CN113554030B (en) * 2021-07-27 2022-08-16 上海大学 Multi-type license plate recognition method and system based on single character attention
CN113823292B (en) * 2021-08-19 2023-07-21 华南理工大学 Small sample speaker recognition method based on channel attention depth separable convolution network
CN114092926A (en) * 2021-10-20 2022-02-25 杭州电子科技大学 License plate positioning and identifying method in complex environment
CN114821289B (en) * 2022-01-17 2023-10-17 电子科技大学 Forest fire picture real-time segmentation and fire edge point monitoring algorithm
CN114677502B (en) * 2022-05-30 2022-08-12 松立控股集团股份有限公司 License plate detection method with any inclination angle
CN115410189B (en) * 2022-10-31 2023-01-24 松立控股集团股份有限公司 Complex scene license plate detection method
CN115909316B (en) * 2023-02-21 2023-05-19 昆明理工大学 Light end-to-end license plate identification method for data non-uniform scene
CN116664918A (en) * 2023-05-12 2023-08-29 杭州像素元科技有限公司 Method for detecting traffic state of each lane of toll station based on deep learning
CN116704487A (en) * 2023-06-12 2023-09-05 三峡大学 License plate detection and recognition method based on Yolov5s network and CRNN

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109508715A (en) * 2018-10-30 2019-03-22 南昌大学 A kind of License Plate and recognition methods based on deep learning
CN109740653A (en) * 2018-12-25 2019-05-10 北京航空航天大学 A kind of vehicle recognition methods again for merging visual appearance and space-time restriction
CN111325203A (en) * 2020-01-21 2020-06-23 福州大学 American license plate recognition method and system based on image correction
CN111553205A (en) * 2020-04-12 2020-08-18 西安电子科技大学 Vehicle weight recognition method, system, medium and video monitoring system without license plate information

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10140553B1 (en) * 2018-03-08 2018-11-27 Capital One Services, Llc Machine learning artificial intelligence system for identifying vehicles


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Madhusree Mondal et al. Automatic number plate recognition using CNN based self synthesized feature learning. 2017 IEEE Calcutta Conference (CALCON). 2018, pp. 378-381. *
Zhang Shihao et al. A lightweight license plate detection algorithm with multi-scale attention fusion. Computer Engineering and Applications. 2021, Vol. 57, No. 22, pp. 208-214. *
Xu Fuyong et al. Arbitrary-shaped scene text recognition based on deep learning. Journal of Sichuan University (Natural Science Edition). 2020, Vol. 57, No. 2, pp. 255-263. *

Also Published As

Publication number Publication date
CN112308092A (en) 2021-02-02

Similar Documents

Publication Publication Date Title
CN112308092B (en) Light-weight license plate detection and identification method based on multi-scale attention mechanism
CN113065558B (en) Lightweight small target detection method combined with attention mechanism
CN110427937B (en) Inclined license plate correction and indefinite-length license plate identification method based on deep learning
CN110033002B (en) License plate detection method based on multitask cascade convolution neural network
CN109766805B (en) Deep learning-based double-layer license plate character recognition method
CN114677502B (en) License plate detection method with any inclination angle
CN114187450A (en) Remote sensing image semantic segmentation method based on deep learning
CN109784171A (en) Car damage identification method for screening images, device, readable storage medium storing program for executing and server
CN114972208B (en) YOLOv 4-based lightweight wheat scab detection method
CN111401145A (en) Visible light iris recognition method based on deep learning and DS evidence theory
CN112785480B (en) Image splicing tampering detection method based on frequency domain transformation and residual error feedback module
CN113160062A (en) Infrared image target detection method, device, equipment and storage medium
CN113947766A (en) Real-time license plate detection method based on convolutional neural network
CN113095152A (en) Lane line detection method and system based on regression
CN113850136A (en) Yolov5 and BCNN-based vehicle orientation identification method and system
CN114038004A (en) Certificate information extraction method, device, equipment and storage medium
CN110084743A (en) Image mosaic and localization method based on more air strips starting track constraint
CN110751226A (en) Crowd counting model training method and device and storage medium
CN111626241A (en) Face detection method and device
CN111444916A (en) License plate positioning and identifying method and system under unconstrained condition
CN116681742A (en) Visible light and infrared thermal imaging image registration method based on graph neural network
CN114998630A (en) Ground-to-air image registration method from coarse to fine
CN111241986B (en) Visual SLAM closed loop detection method based on end-to-end relationship network
CN111967579A (en) Method and apparatus for performing convolution calculation on image using convolution neural network
CN113361375B (en) Vehicle target identification method based on improved BiFPN

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230725

Address after: Room 203, No. 397, Xihong, Hongshan Town, Gulou District, Fuzhou City, Fujian Province 350025

Patentee after: FUZHOU IVISIONIC TECHNOLOGY Co.,Ltd.

Address before: Fuzhou University, No.2, wulongjiang North Avenue, Fuzhou University Town, Minhou County, Fuzhou City, Fujian Province

Patentee before: FUZHOU University

TR01 Transfer of patent right