CN112308092B - Light-weight license plate detection and identification method based on multi-scale attention mechanism - Google Patents


Info

Publication number
CN112308092B
CN112308092B (application number CN202011316603.4A)
Authority
CN
China
Prior art keywords
license plate
network
convolution
data set
feature map
Prior art date
Legal status
Active
Application number
CN202011316603.4A
Other languages
Chinese (zh)
Other versions
CN112308092A (en)
Inventor
吴林煌
张世豪
杨绣郡
陈志峰
Current Assignee
Fuzhou Ivisionic Technology Co ltd
Original Assignee
Fuzhou University
Priority date
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN202011316603.4A
Publication of CN112308092A
Application granted
Publication of CN112308092B


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/60 - Type of objects
    • G06V20/62 - Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/625 - License plates
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition

Abstract

The invention provides a light-weight license plate detection and identification method based on a multi-scale attention mechanism, wherein the license plate detection and identification network is constructed by the following steps. Step S1: acquiring pictures as an original data set. Step S2: processing the original data set to obtain a data set A for training the license plate detection model and a data set B for training the license plate recognition model. Step S3: constructing a deep neural network for detecting the license plate. Step S4: inputting an original image P1 of data set A into the network constructed in step S3 to obtain a license plate detection area P2 and the four corner points of the license plate. Step S5: performing a perspective transformation on P2 according to the license plate corner points to obtain a corrected image P3. Step S6: constructing a deep neural network for recognizing the license plate. Step S7: inputting P3 into the network constructed in step S6 to obtain the license plate number of the detected plate. Under the condition of ensuring network accuracy, the invention simultaneously achieves a low network parameter count and low computational cost.

Description

Light-weight license plate detection and identification method based on multi-scale attention mechanism
Technical Field
The invention relates to the technical field of machine vision, in particular to a light-weight license plate detection and identification method based on a multi-scale attention mechanism.
Background
With the gradual development of the economy and the year-by-year increase in the number of automobiles, urban traffic pressure keeps growing, and efficient traffic management has become a problem that urgently needs to be solved. License plate detection and recognition technology plays an important role in traffic management: the ability to automatically detect and recognize license plates, from traffic violations to accident monitoring, makes it one of the key tools used by law enforcement agencies in various regions. License plate detection and recognition is not only widely applied in road traffic management, but also increasingly applied in parking lots, community security, tracking robbed and wanted vehicles, and other areas.
The traditional license plate detection and recognition method comprises four stages: image acquisition, license plate localization, character segmentation, and character recognition. Its drawbacks are that too many steps are required and the error of each step accumulates; an inaccurate prediction at any step causes the license plate recognition to fail. Moreover, the images used in traditional license plate detection and recognition are all captured at fixed angles, so accuracy is poor for large-angle license plates in natural scenes. Deep-learning-based license plate detection and recognition methods perform well, but because they use deeper networks, the parameter count and computational cost of the models are excessive, making deployment and operation on mobile devices difficult.
Disclosure of Invention
The invention provides a light-weight license plate detection and recognition method based on a multi-scale attention mechanism, which completes the detection and recognition of a license plate in only three stages, effectively reducing recognition errors caused by character segmentation mistakes and improving the accuracy of character recognition, while simultaneously achieving a low network parameter count and low computational cost under the condition of ensuring network accuracy.
The invention adopts the following technical scheme.
A light-weight license plate detection and identification method based on a multi-scale attention mechanism, characterized in that a license plate detection network and a license plate recognition network are adopted to recognize license plates; the construction of the license plate detection and recognition network comprises the following steps:
step S1: acquiring a picture with a license plate and a license plate label as an original data set required by training;
step S2: processing the original data set to obtain a data set A for training a model for detecting the license plate and a data set B for training a model for recognizing the license plate;
step S3: constructing a deep neural network for detecting the license plate;
step S4: inputting the original image P1 of data set A into the network constructed in step S3 to obtain a license plate detection area P2 and the four corner points of the license plate;
step S5: performing a perspective transformation on the obtained license plate detection area P2 according to the license plate corner points to obtain a corrected license plate image P3;
step S6: constructing a deep neural network for recognizing the license plate;
step S7: and inputting the corrected license plate image P3 into the network constructed in the step S6 to obtain the license plate number corresponding to the detected license plate.
The original data set used in the step S1 is a CCPD license plate data set.
In step S2, the license plate is rectified through a perspective transformation using the license plate corner points annotated in the CCPD data set; the rectified license plate images form the data set B for training the license plate recognition network, and the original CCPD data set serves as the data set A for training the license plate detection network.
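The rectification in step S2 (and later in step S5) amounts to estimating a perspective transform that maps the four annotated corner points onto an axis-aligned rectangle and warping the plate accordingly. As a minimal sketch, assuming nothing about the patent's actual implementation (a production pipeline would typically call OpenCV's getPerspectiveTransform and warpPerspective), the homography can be solved directly from the four point correspondences; the corner coordinates and the 94 × 24 target rectangle below are illustrative values only:

```python
import numpy as np

def homography_from_corners(src, dst):
    """Solve the 3x3 perspective transform H mapping each src corner to dst.

    src, dst: four (x, y) pairs. Builds the standard 8x8 linear system of
    the direct linear transform with the last entry of H fixed to 1.
    """
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, pt):
    """Apply homography H to a single (x, y) point."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return np.array([x / w, y / w])

# Hypothetical corner annotations of a tilted plate, mapped onto a
# 94 x 24 upright rectangle (a common Chinese plate aspect ratio).
plate = [(120, 80), (300, 110), (295, 170), (115, 135)]
rect = [(0, 0), (94, 0), (94, 24), (0, 24)]
H = homography_from_corners(plate, rect)
```

Warping the full crop then amounts to sampling every destination pixel through the inverse of H.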
Step S3, constructing a deep neural network for detecting the license plate, specifically comprises the following steps:
step S31: constructing a deep neural network for detecting the license plate, wherein the network consists of five parts: a backbone, a feature pyramid network FPN, a receptive field module RFB, an attention mechanism module CBAM, and a detection head;
step S32: constructing a loss function for the license plate detection network; the multitask loss function

L = L_cls(p_i, p_i*) + λ_1 p_i* L_box(t_i, t_i*) + λ_2 p_i* L_pts(l_i, l_i*)

is used for joint optimization, wherein:

L_cls(p_i, p_i*) is the license plate classification loss; L_cls uses the softmax loss, p_i denotes the probability that the i-th anchor contains a license plate, and p_i* is the label of the i-th anchor, with value 1 for a positive example and 0 for a negative example;

L_box(t_i, t_i*) denotes the regression loss of the license plate bounding box, L_box(t_i, t_i*) = R(t_i − t_i*), where R denotes the smooth-L1 loss function, and t_i = {t_x, t_y, t_w, t_h}_i and t_i* = {t_x*, t_y*, t_w*, t_h*}_i denote, respectively, the bounding-box coordinates predicted at the i-th anchor and the ground-truth bounding-box coordinates corresponding to that anchor;

L_pts(l_i, l_i*) denotes the regression loss of the license plate corner points and likewise uses the smooth-L1 loss, where l_i and l_i* denote, respectively, the 4 license plate corner points predicted at the i-th anchor and the ground-truth four corner points corresponding to that anchor; λ_1 and λ_2 denote the weights of bounding-box prediction and corner-point prediction in the license plate detection task.
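The multitask loss described above can be sketched numerically. The smooth-L1 definition and the weighted combination follow the text, while the anchor layout, the normalization, and the default weights λ_1 = λ_2 = 1 are placeholder assumptions, not values from the patent:

```python
import numpy as np

def smooth_l1(x):
    """Elementwise smooth-L1: 0.5*x^2 for |x| < 1, |x| - 0.5 otherwise."""
    ax = np.abs(x)
    return np.where(ax < 1.0, 0.5 * ax ** 2, ax - 0.5)

def detection_loss(p, p_star, t, t_star, l, l_star, lam1=1.0, lam2=1.0):
    """Classification + bbox regression + corner regression.

    p: predicted plate probability per anchor; p_star: 0/1 anchor labels.
    t, t_star: (N, 4) box offsets; l, l_star: (N, 8) corner offsets.
    Regression terms are counted only on positive anchors.
    """
    eps = 1e-9
    cls = -np.mean(p_star * np.log(p + eps) + (1 - p_star) * np.log(1 - p + eps))
    pos = p_star.astype(bool)
    box = smooth_l1(t[pos] - t_star[pos]).sum(axis=1).mean() if pos.any() else 0.0
    pts = smooth_l1(l[pos] - l_star[pos]).sum(axis=1).mean() if pos.any() else 0.0
    return cls + lam1 * box + lam2 * pts

# One positive and one negative anchor with perfect regression outputs:
p = np.array([0.9, 0.1])
p_star = np.array([1.0, 0.0])
t = np.zeros((2, 4)); l = np.zeros((2, 8))
loss = detection_loss(p, p_star, t, t, l, l)   # only the cls term remains
```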
The step S31 specifically includes the following steps:
step S311: in the license plate detection network, the lightweight network model MobileNet is used as the backbone for feature extraction. The feature pyramid network FPN upsamples the feature maps extracted by the backbone and fuses feature maps of the same size, forming several feature maps of different sizes; all of the resulting feature maps are fed into the receptive field module RFB. The RFB processes each feature map along three branches, each branch performing sliding convolution with convolution kernels and dilated convolutions of different sizes, and the feature map produced by each branch is fed into the CBAM. The feature pyramid network strengthens the ability to detect license plates of different sizes; the RFB improves the detection effect by integrating context information of the target region; and the CBAM lets the network focus on important features while suppressing unimportant ones, thereby improving the detection performance of the network. The detection head consists of three parts, corresponding to the three tasks of license plate classification, license plate bounding-box regression, and license plate four-corner-point regression; each part applies a 1 × 1 convolution kernel to adjust the channel dimension so that it matches the corresponding task, and the output is reshaped so that it aligns with the subsequent loss computation.
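For the detection head described in step S311, the 1 × 1 convolutions leave each task with A·k output channels per spatial cell (k = 2 for classification, 4 for the bounding box, 8 for the four corner points, with A anchors per cell), and the output is then reshaped so that each row holds one anchor's prediction for the loss computation. A small sketch of that reshape; the anchor count A = 2 and the 20 × 20 map size are illustrative assumptions:

```python
import numpy as np

def flatten_head(feat, num_anchors, k):
    """Reshape a (A*k, H, W) head output into (H*W*A, k) per-anchor rows."""
    c, h, w = feat.shape
    assert c == num_anchors * k
    # channel-last layout, then one row per (cell, anchor) pair
    return feat.transpose(1, 2, 0).reshape(h * w * num_anchors, k)

A = 2                                  # anchors per cell (illustrative)
cls_map = np.zeros((A * 2, 20, 20))    # classification head, k = 2
box_map = np.zeros((A * 4, 20, 20))    # bbox regression head, k = 4
pts_map = np.zeros((A * 8, 20, 20))    # corner regression head, k = 8
```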
Step S4 specifically includes the following steps:
step S41: at the input of the license plate detection network, the input image P1 is uniformly scaled to 640 × 640; to prevent image distortion, the image is padded with black borders. To improve the generalization performance of the model, the target image is cropped; brightness, contrast, and saturation are altered; operations such as flipping and tilting are applied; and the image is normalized;
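The "uniform scaling plus black borders" of step S41 is a letterbox transform. A dependency-free sketch with nearest-neighbour resizing (a real pipeline would normally use cv2.resize; only the 640 input size comes from the patent):

```python
import numpy as np

def letterbox(img, size=640):
    """Scale img into a size x size canvas, preserving aspect, padding black."""
    h, w = img.shape[:2]
    scale = size / max(h, w)
    nh, nw = int(round(h * scale)), int(round(w * scale))
    ys = np.minimum((np.arange(nh) / scale).astype(int), h - 1)
    xs = np.minimum((np.arange(nw) / scale).astype(int), w - 1)
    resized = img[ys][:, xs]                     # nearest-neighbour resize
    canvas = np.zeros((size, size) + img.shape[2:], dtype=img.dtype)
    top, left = (size - nh) // 2, (size - nw) // 2
    canvas[top:top + nh, left:left + nw] = resized
    return canvas, scale, (left, top)

# A tall 320 x 160 white test image lands centred with black side borders:
img = np.full((320, 160, 3), 255, dtype=np.uint8)
canvas, scale, (left, top) = letterbox(img)
```

The returned scale and (left, top) offset allow predicted boxes and corner points to be mapped back to the original image coordinates.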
step S42: the image is fed into the backbone of the license plate detection network to obtain three feature maps of different sizes, F_1, F_2, F_3; through convolution, upsampling, and related operations these are fused into three feature maps of different scales, F′_1, F′_2, F′_3. F′_1 is convolved by three different convolution branches: each branch first applies a 1 × 1 convolution kernel to adjust the channel dimension of the feature map; the branches then convolve with 1 × 1, 3 × 3, and 5 × 5 convolution kernels respectively; after that, each branch convolves with a 3 × 3 dilated convolution kernel, using dilation rates 1, 3, and 5 respectively, yielding feature maps with receptive fields of different sizes. Finally, the three resulting feature maps are concatenated (concat) and passed through a 1 × 1 convolution to obtain a feature map F″_1 whose size is exactly the same as that of the input feature map. F′_2 and F′_3 undergo the same operations as F′_1, yielding F″_2 and F″_3.
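For the RFB branches of step S42, under stride 1 a k × k convolution followed by a 3 × 3 convolution with dilation d (effective kernel size 1 + 2d) yields a composed receptive field of k + 2d. A quick check that the three branch configurations produce receptive fields of growing size (the initial 1 × 1 channel-adjustment convolution adds nothing):

```python
def branch_receptive_field(k, d):
    """Receptive field of a k x k conv followed by a 3 x 3 conv, dilation d.

    With stride 1 throughout, receptive fields compose additively:
    k + (effective_kernel - 1), where effective_kernel = 1 + 2 * d.
    """
    return k + 2 * d

# (kernel size, dilation rate) of the three RFB branches from step S42
branches = [(1, 1), (3, 3), (5, 5)]
fields = [branch_receptive_field(k, d) for k, d in branches]
```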
Step S43: f' obtained in step S42 1 、F″ 2 、F″ 3 Respectively sending the information to an attention mechanism module CBAM, namely a CBAM convolution attention module, wherein the CBAM consists of two parts, a channel attention module and a space attention module; firstly, the feature map is subjected to operations such as global average pooling, maximum pooling, convolution, activation function and the like to obtain a feature map M of channel attention weight c ∈R C×1×1 And is combined with the initial characteristic diagram F ″) 1 Multiply by one to obtain a product of size
Figure BDA0002790598990000043
A characteristic diagram of (2); obtained characteristic diagram F' 1 Obtaining a feature map M of the spatial attention weight through operations of average pooling, maximum pooling, splicing, activation and the like s (F)∈R H×W And is combined with F' 1 Multiply to obtain the size
Figure BDA0002790598990000044
The characteristic diagram of (1).
The step S43 specifically includes the following steps:
step S431: the input feature map is compressed through an average pooling layer and a max pooling layer to obtain F^c_avg ∈ R^{C×1×1} and F^c_max ∈ R^{C×1×1};
step S432: the two channel features thus obtained are fed into a multilayer perceptron;
step S433: the two outputs are added element by element and activated with a Sigmoid function to obtain the channel attention weight map M_c ∈ R^{C×1×1}. The calculation formula is as follows:

M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W_1(W_0(F^c_avg)) + W_1(W_0(F^c_max)))

where σ denotes the Sigmoid activation function, and W_0 and W_1 denote the weights of the multilayer perceptron;
step S434: the original feature maps F″_1, F″_2, F″_3 are each multiplied element by element with the channel attention weight map M_c ∈ R^{C×1×1} to obtain new feature maps F‴_1, F‴_2, F‴_3, with the formula:

F‴ = M_c(F″) ⊗ F″

where ⊗ denotes element-by-element multiplication of corresponding elements;
step S435: the weighted new feature map F‴ is passed through an average pooling layer and a max pooling layer along the channel dimension to obtain F^s_avg ∈ R^{1×H×W} and F^s_max ∈ R^{1×H×W};
step S436: the two spatial features are spliced together into a new spatial feature map with 2 channels, convolved by a convolution layer, and activated with a Sigmoid function to obtain the spatial attention weight map M_s(F) ∈ R^{H×W}. The calculation formula is as follows:

M_s(F) = σ(f^{7×7}([F^s_avg; F^s_max]))

where f^{7×7} denotes a convolution layer with a 7 × 7 convolution kernel;
step S437: the feature maps F‴_1, F‴_2, F‴_3 are each multiplied element by element with the spatial attention weight map M_s(F) ∈ R^{H×W} to obtain new feature maps F″″_1, F″″_2, F″″_3, with the formula:

F″″ = M_s(F‴) ⊗ F‴
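Steps S431 to S437 can be sketched with NumPy. The shared MLP weights below are random stand-ins, and the learned 7 × 7 spatial convolution is replaced by a plain average of the two pooled maps, so this illustrates only the data flow of the module, not the trained behaviour:

```python
import numpy as np

rng = np.random.default_rng(0)

def channel_attention(F, W0, W1):
    """CBAM channel attention: Mc = sigmoid(MLP(avg) + MLP(max)); F' = Mc * F."""
    avg, mx = F.mean(axis=(1, 2)), F.max(axis=(1, 2))   # (C,) vectors
    mlp = lambda v: W1 @ np.maximum(W0 @ v, 0.0)        # shared 2-layer MLP
    Mc = 1.0 / (1.0 + np.exp(-(mlp(avg) + mlp(mx))))    # sigmoid, in (0, 1)
    return Mc[:, None, None] * F

def spatial_attention(F):
    """Spatial attention; the 7x7 conv is replaced by a pooled average here."""
    avg, mx = F.mean(axis=0), F.max(axis=0)             # (H, W) maps
    Ms = 1.0 / (1.0 + np.exp(-0.5 * (avg + mx)))        # stand-in for f7x7
    return Ms[None, :, :] * F

C, H, W, r = 8, 5, 5, 2
F2 = rng.standard_normal((C, H, W))        # plays the role of F''
W0 = rng.standard_normal((C // r, C))      # channel-reduction weights
W1 = rng.standard_normal((C, C // r))      # channel-restoration weights
F3 = channel_attention(F2, W0, W1)         # F'''
F4 = spatial_attention(F3)                 # F''''
```

Because both attention maps lie strictly in (0, 1), the module can only rescale activations downward, which is how it suppresses unimportant features.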
step S44: f "", obtained in step S43 1 、F″″ 2 、F″″ 3 And (4) sending the license plate to a detection head, and obtaining the predicted bbox coordinate of the license plate and the coordinates of the corner points of the license plate at the same time.
Step S6, constructing a deep neural network for recognizing the license plate, specifically comprises the following steps:
step S61: constructing a deep neural network for license plate recognition, consisting of a convolutional neural network and a bidirectional gated recurrent network. The convolutional neural network extracts image features of the license plate to obtain a feature map; the feature map is converted into a feature sequence and fed into the bidirectional gated recurrent network, which further extracts sequence features; finally, the connectionist temporal classification (CTC) algorithm converts the extracted feature sequence into a label sequence;
step S62: constructing the loss function of the license plate recognition network, using the CTC loss function:

L_CTC = − Σ_{(Z,G)∈S} ln p(G | Z)

where S denotes the training data set, Z is the input data of the network, G is the label information, (Z, G) is a data-label pair from the training set, and p(G | Z) is the probability of obtaining the label G when the input data is Z.
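The CTC loss above sums −ln p(G | Z) over the training pairs; its companion decoding rule at inference time collapses repeated symbols and removes blanks, which is what lets the network recognize plates with varying numbers of characters. A minimal greedy (best-path) decoder, where index 0 standing for the blank is an assumption of this sketch:

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse per-frame argmax labels: merge repeats, then drop blanks.

    frame_labels: one class index per time step of the recognition
    network's output sequence; blank is the CTC blank index.
    """
    out, prev = [], None
    for lab in frame_labels:
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return out
```

A genuinely repeated character must be separated by a blank frame: [5, 0, 5] decodes to two fives, while [5, 5] collapses to one.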
Compared with the prior art, the invention has the following beneficial effects:
1. The license plate recognition method without character segmentation effectively avoids recognition errors caused by character segmentation mistakes;
2. The invention uses MobileNet as the backbone network, significantly reducing the parameters and computational cost of the network while ensuring detection precision;
3. The invention uses the RFB module in the network, which integrates context information of the target region and improves detection performance;
4. The invention uses an attention mechanism in the network, increasing the network's attention to important features and suppressing attention to unimportant ones;
5. The invention optimizes with a multitask loss function to which a corner-point loss is added; this loss strengthens license plate localization accuracy, the localized corner points are used in a perspective transformation to rectify the plate, and the rectified plate is then recognized, improving the accuracy of license plate character recognition;
6. By using the CTC loss function, the invention can recognize license plates with different numbers of characters.
The invention completes license plate detection and recognition in only three stages: image acquisition, license plate localization, and character recognition. Eliminating character segmentation removes the loss that segmentation would introduce and effectively reduces recognition errors caused by segmentation mistakes. The license plate localization part is optimized with a multitask loss to which a corner-point localization loss is added, strengthening localization accuracy; the predicted corner points can then be used to rectify the license plate, improving the accuracy of character recognition. In the character recognition part, a recurrent neural network further extracts features from the feature sequence, and the license plate characters are recognized with the CTC loss. Because the lightweight model MobileNet is used as the network backbone, a low network parameter count and low computational cost are achieved simultaneously while network accuracy is ensured.
Drawings
The invention is described in further detail below with reference to the following figures and detailed description:
FIG. 1 is a schematic diagram of a structural block of an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating an effect of a CCPD partial data set downloaded in step S1 according to an embodiment of the present invention;
FIG. 3 is a diagram of a txt file for storing different path addresses of two data sets obtained in step S2 according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the license plate picture captured in step S2 in the example of the present invention;
FIG. 5 is a schematic diagram of a network frame for license plate detection constructed in step S3 in the embodiment of the present invention;
FIG. 6 is a schematic diagram of the frame of the RFB module constructed in step S31 in the embodiment of the present invention;
FIG. 7 is a schematic diagram of the CBAM module frame map constructed in step S31 according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a channel attention module framework in the CBAM module constructed in step S31 according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of a spatial attention module framework in the CBAM module constructed in step S31 according to an embodiment of the present invention;
fig. 10 is a schematic diagram of a network frame for license plate recognition constructed in step S32 in the embodiment of the present invention;
FIG. 11 is a schematic view of an uncorrected license plate;
FIG. 12 is a schematic view of a license plate corrected after perspective transformation.
Detailed Description
As shown in the figures, a light-weight license plate detection and recognition method based on a multi-scale attention mechanism adopts a license plate detection network and a license plate recognition network to recognize license plates; the construction of the license plate detection and recognition network comprises the following steps:
step S1: acquiring a picture with a license plate and a license plate label as an original data set required by training;
step S2: processing the original data set to obtain a data set A for training a model for detecting the license plate and a data set B for training a model for recognizing the license plate;
step S3: constructing a deep neural network for detecting the license plate;
step S4: inputting the original image P1 of data set A into the network constructed in step S3 to obtain a license plate detection area P2 and the four corner points of the license plate;
step S5: performing a perspective transformation on the obtained license plate detection area P2 according to the license plate corner points to obtain a corrected license plate image P3;
step S6: constructing a deep neural network for recognizing the license plate;
step S7: and inputting the corrected license plate image P3 into the network constructed in the step S6 to obtain the license plate number corresponding to the detected license plate.
The original data set used in the step S1 is a CCPD license plate data set.
In step S2, the license plate is rectified through a perspective transformation using the license plate corner points annotated in the CCPD data set; the rectified license plate images form the data set B for training the license plate recognition network, and the original CCPD data set serves as the data set A for training the license plate detection network.
S3, constructing a deep neural network for detecting the license plate, which specifically comprises the following steps:
step S31: constructing a deep neural network for detecting the license plate, wherein the network consists of five parts: a backbone, a feature pyramid network FPN (Feature Pyramid Network), a receptive field module RFB (Receptive Field Block), an attention mechanism module CBAM (Convolutional Block Attention Module), and a detection head;
step S32: constructing the loss function of the license plate detection network; the multitask loss function

L = L_cls(p_i, p_i*) + λ_1 p_i* L_box(t_i, t_i*) + λ_2 p_i* L_pts(l_i, l_i*)

is used for joint optimization, wherein:

L_cls(p_i, p_i*) is the license plate classification loss; L_cls uses the softmax loss, p_i denotes the probability that the i-th anchor contains a license plate, and p_i* is the label of the i-th anchor, with value 1 for a positive example and 0 for a negative example;

L_box(t_i, t_i*) denotes the regression loss of the license plate bounding box, L_box(t_i, t_i*) = R(t_i − t_i*), where R denotes the smooth-L1 loss function, and t_i = {t_x, t_y, t_w, t_h}_i and t_i* = {t_x*, t_y*, t_w*, t_h*}_i denote, respectively, the bounding-box coordinates predicted at the i-th anchor and the ground-truth bounding-box coordinates corresponding to that anchor;

L_pts(l_i, l_i*) denotes the regression loss of the license plate corner points and likewise uses the smooth-L1 loss, where l_i and l_i* denote, respectively, the 4 license plate corner points predicted at the i-th anchor and the ground-truth four corner points corresponding to that anchor; λ_1 and λ_2 denote the weights of bounding-box prediction and corner-point prediction in the license plate detection task.
The step S31 specifically includes the following steps:
step S311: in the license plate detection network, the lightweight network model MobileNet is used as the backbone for feature extraction. The feature pyramid network FPN upsamples the feature maps extracted by the backbone and fuses feature maps of the same size, forming several feature maps of different sizes; all of the resulting feature maps are fed into the receptive field module RFB. The RFB processes each feature map along three branches, each branch performing sliding convolution with convolution kernels and dilated convolutions of different sizes, and the feature map produced by each branch is fed into the CBAM. The feature pyramid network strengthens the ability to detect license plates of different sizes; the RFB improves the detection effect by integrating context information of the target region; and the CBAM lets the network focus on important features while suppressing unimportant ones, thereby improving the detection performance of the network. The detection head consists of three parts, corresponding to the three tasks of license plate classification, license plate bounding-box regression, and license plate four-corner-point regression; each part applies a 1 × 1 convolution kernel to adjust the channel dimension so that it matches the corresponding task, and the output is reshaped so that it aligns with the subsequent loss computation.
Step S4 specifically includes the following steps:
step S41: at the input of the license plate detection network, the input image P1 is uniformly scaled to 640 × 640; to prevent image distortion, the image is padded with black borders. To improve the generalization performance of the model, the target image is cropped; brightness, contrast, and saturation are altered; operations such as flipping and tilting are applied; and the image is normalized;
step S42: the image is fed into the backbone of the license plate detection network to obtain three feature maps of different sizes, F_1, F_2, F_3; through convolution, upsampling, and related operations these are fused into three feature maps of different scales, F′_1, F′_2, F′_3. F′_1 is convolved by three different convolution branches: each branch first applies a 1 × 1 convolution kernel to adjust the channel dimension of the feature map; the branches then convolve with 1 × 1, 3 × 3, and 5 × 5 convolution kernels respectively; after that, each branch convolves with a 3 × 3 dilated convolution kernel, using dilation rates 1, 3, and 5 respectively, yielding feature maps with receptive fields of different sizes. Finally, the three resulting feature maps are concatenated (concat) and passed through a 1 × 1 convolution to obtain a feature map F″_1 whose size is exactly the same as that of the input feature map. F′_2 and F′_3 undergo the same operations as F′_1, yielding F″_2 and F″_3.
Step S43: f' obtained in step S42 1 、F″ 2 、F″ 3 Respectively sending the information to an attention mechanism module CBAM, namely a CBAM convolution attention module, wherein the CBAM consists of two parts, a channel attention module and a space attention module; firstly, the feature map is subjected to operations such as global average pooling, maximum pooling, convolution, activation function and the like to obtain a feature map M of channel attention weight c ∈R C×1×1 And is compared with the initial characteristic diagram F ″) 1 Multiply by one to obtain a product of size
Figure BDA0002790598990000103
A characteristic diagram of (1); obtained characteristic diagram F' 1 Obtaining a feature map M of the spatial attention weight through operations of average pooling, maximum pooling, splicing, activation and the like s (F)∈R H×W And is combined with F' 1 Multiply to obtain the size
Figure BDA0002790598990000104
A characteristic diagram of (c).
The step S43 specifically includes the following steps:
step S431: compressing the input feature map through an Average pooling layer (Average pooling) and a maximum pooling layer (Max pooling) to obtain the feature map
Figure BDA0002790598990000101
And
Figure BDA0002790598990000102
step S432: inputting the obtained two channel characteristics into a multilayer perceptron to carry out convolution operation;
step S433: adding the outputs obtained after the operation element by element and activating with a Sigmoid function to obtain the channel attention weight map M_c ∈ R^(C×1×1).
The calculation formula is as follows:

M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W1(W0(F^c_avg)) + W1(W0(F^c_max)))

where σ denotes the Sigmoid activation function, and W0 and W1 denote the weights of the multilayer perceptron;
step S434: multiplying the original feature maps F1″, F2″, F3″ element by element with the channel attention weight map M_c ∈ R^(C×1×1) to obtain the new feature maps F1‴, F2‴, F3‴, according to the formula:

F‴ = M_c(F″) ⊗ F″

where ⊗ denotes element-by-element multiplication of corresponding elements.
Step S435: The weighted new feature map F‴ is passed through an average pooling layer and a maximum pooling layer to obtain the spatial descriptors F^s_avg ∈ R^(1×H×W) and F^s_max ∈ R^(1×H×W).
step S436: splicing the two spatial features together to form a new spatial feature map with 2 channels, then performing convolution through a convolution layer and activating with a Sigmoid function, thereby obtaining the spatial attention weight map M_s(F) ∈ R^(H×W); the calculation formula is as follows:

M_s(F) = σ(f^(7×7)([F^s_avg; F^s_max]))

where f^(7×7) denotes a convolution layer using a 7×7 convolution kernel.
Step S437: the feature maps F1‴, F2‴, F3‴ are each multiplied element by element with the spatial attention weight map M_s(F) ∈ R^(H×W) to obtain the new feature maps F1⁗, F2⁗, F3⁗, according to the formula:

F⁗ = M_s(F‴) ⊗ F‴
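Steps S431–S437 can be sketched in NumPy roughly as follows. This is an illustrative toy version, not the patent's implementation: the shared MLP weights W0/W1 and the reduction ratio are assumed, and the 7×7 convolution of step S436 is replaced by a fixed equal-weight mix of the two pooled maps for brevity:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F, W0, W1):
    """M_c = sigmoid(MLP(avg) + MLP(max)) with a shared two-layer perceptron."""
    C = F.shape[0]
    avg = F.mean(axis=(1, 2))                       # global average pooling -> (C,)
    mx = F.max(axis=(1, 2))                         # global max pooling     -> (C,)
    mlp = lambda v: W1 @ np.maximum(W0 @ v, 0.0)    # shared MLP, ReLU hidden layer
    return sigmoid(mlp(avg) + mlp(mx)).reshape(C, 1, 1)

def spatial_attention(F):
    """M_s over the 2-channel [avg; max] stack; the real CBAM uses a learned
    7x7 convolution here, replaced by a fixed equal-weight mix in this sketch."""
    avg = F.mean(axis=0, keepdims=True)             # (1, H, W)
    mx = F.max(axis=0, keepdims=True)               # (1, H, W)
    return sigmoid(0.5 * avg + 0.5 * mx)            # (1, H, W)

rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
F = rng.standard_normal((C, H, W))
W0 = rng.standard_normal((C // 2, C))               # reduction ratio 2 (assumed)
W1 = rng.standard_normal((C, C // 2))

F_c = channel_attention(F, W0, W1) * F              # step S434: element-wise reweighting
F_s = spatial_attention(F_c) * F_c                  # step S437: element-wise reweighting
print(F_s.shape)                                    # same C x H x W size as the input
```

Both attention maps broadcast against the feature map, so the output size stays identical to the input, as stated in step S43.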
step S44: F1⁗, F2⁗, F3⁗ obtained in step S43 are sent to the detection head, which simultaneously yields the predicted bbox coordinates of the license plate and the coordinates of the license plate corner points.
S6, constructing a deep neural network for identifying the license plate, which specifically comprises the following steps:
step S61: constructing a deep neural network for license plate recognition; the network consists of a convolutional neural network and a bidirectional gated recurrent network. The convolutional neural network extracts image features of the license plate to obtain a feature map; the feature map is converted into a feature sequence and sent into the bidirectional gated recurrent network to further extract sequence features; finally, the connectionist temporal classification (CTC) algorithm converts the extracted feature sequence into a label sequence;
s62, constructing a loss function of the license plate recognition network, and using the CTC loss function as follows:
L_CTC = − Σ_{(Z,G)∈S} ln p(G|Z)
in the formula, S represents the training data set, Z is the input data of the network, G is the label information, (Z, G) is a pair of input data and label from the training set, and p(G|Z) is the probability of obtaining the label G when the input data is Z.
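The probability p(G|Z) above sums over all frame-level alignments that collapse to the label G, and is computed with the standard CTC forward (alpha) recursion over the blank-augmented label. The following NumPy sketch (an illustration under an assumed uniform per-frame distribution, not the patent's code) evaluates −ln p(G|Z) for one sample:

```python
import numpy as np

def ctc_neg_log_likelihood(probs, label, blank=0):
    """probs: (T, K) per-frame class distributions; label: target index sequence.
    Returns -ln p(G|Z) via the CTC forward (alpha) recursion."""
    T = probs.shape[0]
    ext = [blank]
    for g in label:                  # interleave blanks: -, g1, -, g2, -, ...
        ext += [g, blank]
    S = len(ext)
    alpha = np.zeros((T, S))
    alpha[0, 0] = probs[0, ext[0]]
    if S > 1:
        alpha[0, 1] = probs[0, ext[1]]
    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1, s]                                  # stay on same symbol
            if s > 0:
                a += alpha[t - 1, s - 1]                         # advance one symbol
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                a += alpha[t - 1, s - 2]                         # skip over a blank
            alpha[t, s] = a * probs[t, ext[s]]
    p = alpha[T - 1, S - 1] + (alpha[T - 1, S - 2] if S > 1 else 0.0)
    return -np.log(p)

# Two frames, three classes (class 0 is the blank), uniform distributions:
probs = np.full((2, 3), 1.0 / 3.0)
loss = ctc_neg_log_likelihood(probs, [1])
print(round(float(loss), 4))  # -ln(1/3): the three alignments (1,-), (-,1), (1,1)
```

With uniform frames the three valid alignments each carry probability 1/9, so p(G|Z) = 1/3 and the loss is ln 3 ≈ 1.0986.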
The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.

Claims (4)

1. A light-weight license plate detection and identification method based on a multi-scale attention mechanism, in which a license plate detection network and a license plate recognition network are adopted to recognize license plates, characterized in that the construction of the license plate detection and recognition networks comprises the following steps:
step S1: acquiring a picture with a license plate and a license plate label as an original data set required by training;
step S2: processing the original data set to obtain a data set A for training a model for detecting a license plate and a data set B for training a model for recognizing the license plate;
and step S3: constructing a deep neural network for detecting a license plate;
and step S4: inputting the original image P1 of the data set A into the network constructed in the step S3 to obtain a license plate detection area P2 and four corner points of a license plate;
step S5: performing perspective transformation on the obtained license plate detection region P2 according to the corner points of the license plate to obtain a corrected license plate image P3;
step S6: constructing a deep neural network for identifying the license plate;
step S7: inputting the corrected license plate image P3 into the network constructed in the step S6 to obtain a license plate number corresponding to the detected license plate;
s3, constructing a deep neural network for detecting the license plate, and specifically comprising the following steps:
step S31: constructing a deep neural network for detecting a license plate, wherein the neural network consists of five parts, namely a backbone, a characteristic pyramid network FPN, a receptive field module RFB, an attention mechanism module CBAM and a detection head;
step S32: constructing the loss function of the license plate detection network and performing joint optimization with the multi-task loss function L according to formula I:

L = L_cls(p_i, p_i*) + λ1 p_i* L_box(t_i, t_i*) + λ2 p_i* L_pts(l_i, l_i*)    (I)

where L_cls(p_i, p_i*) is the classification loss function of the license plate, for which the softmax loss is used; p_i represents the probability that the i-th anchor point is a license plate, and p_i* is the ground-truth value of the i-th anchor point, taking the value 1 for a positive example and 0 for a negative example; L_box(t_i, t_i*) represents the regression loss function of the license plate frame, L_box(t_i, t_i*) = R(t_i − t_i*), where R represents the smooth-L1 loss function, and t_i = {t_x, t_y, t_w, t_h}_i and t_i* = {t_x*, t_y*, t_w*, t_h*}_i respectively represent the predicted license plate frame coordinates at the i-th anchor point and the ground-truth frame coordinates corresponding to that anchor point; L_pts(l_i, l_i*) represents the regression loss function of the license plate corner points, which likewise uses the smooth-L1 loss, where l_i and l_i* respectively represent the predicted values of the 4 license plate corner points at the i-th anchor point and the ground-truth values of the four corner points of the license plate corresponding to that anchor point; λ1 and λ2 represent the weights of license plate frame prediction and corner point prediction in the license plate detection task;
s6, constructing a deep neural network for identifying the license plate, which specifically comprises the following steps:
step S61: constructing a deep neural network for license plate recognition; the network consists of a convolutional neural network and a bidirectional gated recurrent network. The convolutional neural network extracts image features of the license plate to obtain a feature map; the feature map is converted into a feature sequence and sent into the bidirectional gated recurrent network to further extract sequence features; finally, the connectionist temporal classification (CTC) algorithm converts the extracted feature sequence into a label sequence;
s62, constructing a loss function of the license plate recognition network, and using the CTC loss function as follows:
Figure FDA0003955546840000024
in the formula, s represents a training data set, Z is input data of a network, G is label information, (Z, G) is a group of data input into the network in the training set, and p (G | Z) is the probability of obtaining a label G under the condition that the input data is Z;
the step S31 specifically includes the following steps;
step S311: in the license plate detection network, the lightweight network model MobileNet is used as the backbone for feature extraction; the feature pyramid network FPN up-samples the feature maps extracted by the backbone and fuses feature maps of the same size, forming several feature maps of different sizes, all of which are sent to the receptive field module RFB; the RFB splits the feature map of each scale into three branches, each branch performing sliding convolution with convolution kernels and dilated convolutions of different scales, and the feature map obtained from each branch is sent to the CBAM; the detection head consists of three parts corresponding to the three tasks of license plate detection, license plate frame regression and license plate four-corner-point regression; each part performs convolution with a 1×1 convolution kernel to adjust the dimension so that it corresponds to its task, and the dimension is transformed at the output so that the computation matches the subsequent loss function;
step S4 specifically includes the following steps;
step S41: at the input end of the license plate detection network, the input image P1 is uniformly scaled to 640 × 640 with black-edge padding; the target image is cropped; the brightness, contrast and saturation are altered; flipping and tilting operations are applied; and normalization is performed;
step S42: inputting the pictures into the backbone network of the license plate detection network to obtain three feature maps F1, F2, F3 of different sizes, and obtaining three feature maps F1′, F2′, F3′ fused at different scales through convolution and up-sampling operations on the three feature maps; F1′ is processed by three different convolution branches: each branch first applies a 1×1 convolution kernel to adjust the dimension of the feature map; the branches then apply convolution kernels of 1×1, 3×3 and 5×5 respectively; after these convolutions, dilated convolutions with 3×3 kernels and dilation rates of 1, 3 and 5 respectively are applied, yielding feature maps with receptive fields of different sizes; finally the three resulting feature maps are concatenated (concat) and passed through a 1×1 convolution to obtain a feature map F1″ whose size is fully consistent with that of the input feature map; F2′ and F3′ undergo the same operations as F1′, yielding F2″ and F3″;
Step S43: f' obtained in step S42 1 、F″ 2 、F″ 3 Respectively sending the information to an attention mechanism module CBAM, namely a CBAM convolution attention module, wherein the CBAM consists of two parts, a channel attention module and a space attention module; firstly, the feature map is subjected to global average pooling, maximum pooling, convolution and activation function operation to obtain a feature map M of channel attention weight c ∈R C×1×1 And is compared with the initial characteristic diagram F ″) 1 Multiplied by another to give a size of F' 1 H×W×C A characteristic diagram of (1); obtained byCharacteristic map F' 1 Obtaining a feature map M of the spatial attention weight through average pooling, maximum pooling, splicing and activation operations s (F)∈R H×W And is combined with F' 1 Multiply to obtain the size F "" 1 H×W×C A characteristic diagram of (c).
2. The method for detecting and identifying the light-weight license plate based on the multi-scale attention mechanism as claimed in claim 1, wherein: the original data set used in the step S1 is a CCPD license plate data set.
3. The method for detecting and identifying the light-weight license plate based on the multi-scale attention mechanism as claimed in claim 1, wherein: in step S2, the license plate is corrected through perspective transformation using the license plate corner points annotated in the CCPD data set; the license plate image data set obtained after correction is used as the data set B for training the license plate recognition network, and the original CCPD data set is used as the data set A for training the license plate detection network.
4. The method for detecting and identifying the light-weight license plate based on the multi-scale attention mechanism as claimed in claim 1, wherein: the step S43 specifically includes the following steps:
step S431: compressing the input feature map through an average pooling layer and a maximum pooling layer to obtain the channel descriptors F^c_avg ∈ R^(C×1×1) and F^c_max ∈ R^(C×1×1);
step S432: inputting the obtained two channel characteristics into a multilayer perceptron to carry out convolution operation;
step S433: adding the outputs obtained after the operation element by element and activating with a Sigmoid function to obtain the channel attention weight map M_c ∈ R^(C×1×1).
The calculation formula is as follows:

M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W1(W0(F^c_avg)) + W1(W0(F^c_max)))

where σ denotes the Sigmoid activation function, and W0 and W1 denote the weights of the multilayer perceptron;
step S434: multiplying the original feature maps F1″, F2″, F3″ element by element with the channel attention weight map M_c ∈ R^(C×1×1) to obtain the new feature maps F1‴, F2‴, F3‴, according to the formula:

F‴ = M_c(F″) ⊗ F″

where ⊗ denotes element-by-element multiplication of corresponding elements;
step S435: obtaining the spatial descriptors F^s_avg ∈ R^(1×H×W) and F^s_max ∈ R^(1×H×W) from the weighted new feature map F‴ through an average pooling layer and a maximum pooling layer;
step S436: splicing the two spatial features together to form a new spatial feature map with 2 channels, then performing convolution through a convolution layer and activating with a Sigmoid function, thereby obtaining the spatial attention weight map M_s(F) ∈ R^(H×W); the calculation formula is as follows:

M_s(F) = σ(f^(7×7)([F^s_avg; F^s_max]))

where f^(7×7) denotes a convolution layer using a 7×7 convolution kernel;
step S437: multiplying the feature maps F1‴, F2‴, F3‴ element by element with the spatial attention weight map M_s(F) ∈ R^(H×W) to obtain the new feature maps F1⁗, F2⁗, F3⁗, according to the formula:

F⁗ = M_s(F‴) ⊗ F‴
step S44: F1⁗, F2⁗, F3⁗ obtained in step S43 are sent to the detection head, which simultaneously yields the predicted bbox coordinates of the license plate and the coordinates of the license plate corner points.
CN202011316603.4A 2020-11-20 2020-11-20 Light-weight license plate detection and identification method based on multi-scale attention mechanism Active CN112308092B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011316603.4A CN112308092B (en) 2020-11-20 2020-11-20 Light-weight license plate detection and identification method based on multi-scale attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011316603.4A CN112308092B (en) 2020-11-20 2020-11-20 Light-weight license plate detection and identification method based on multi-scale attention mechanism

Publications (2)

Publication Number Publication Date
CN112308092A CN112308092A (en) 2021-02-02
CN112308092B true CN112308092B (en) 2023-02-28

Family

ID=74335448

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011316603.4A Active CN112308092B (en) 2020-11-20 2020-11-20 Light-weight license plate detection and identification method based on multi-scale attention mechanism

Country Status (1)

Country Link
CN (1) CN112308092B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112861978B (en) * 2021-02-20 2022-09-02 齐齐哈尔大学 Multi-branch feature fusion remote sensing scene image classification method based on attention mechanism
CN112926588B (en) * 2021-02-24 2022-07-22 南京邮电大学 Large-angle license plate detection method based on convolutional network
CN113033321A (en) * 2021-03-02 2021-06-25 深圳市安软科技股份有限公司 Training method of target pedestrian attribute identification model and pedestrian attribute identification method
CN112967253A (en) * 2021-03-08 2021-06-15 中国计量大学 Cervical cancer cell detection method based on deep learning
CN112634273B (en) * 2021-03-10 2021-08-13 四川大学 Brain metastasis segmentation system based on deep neural network and construction method thereof
CN112966631A (en) * 2021-03-19 2021-06-15 浪潮云信息技术股份公司 License plate detection and identification system and method under unlimited security scene
CN113255443B (en) * 2021-04-16 2024-02-09 杭州电子科技大学 Graph annotation meaning network time sequence action positioning method based on pyramid structure
CN113221988A (en) * 2021-04-30 2021-08-06 佛山市南海区广工大数控装备协同创新研究院 Method for constructing lightweight network based on attention mechanism
CN114938425A (en) * 2021-06-15 2022-08-23 义隆电子股份有限公司 Photographing apparatus and object recognition method using artificial intelligence
CN113486886B (en) * 2021-06-21 2023-06-23 华侨大学 License plate recognition method and device in natural scene
CN113554030B (en) * 2021-07-27 2022-08-16 上海大学 Multi-type license plate recognition method and system based on single character attention
CN113823292B (en) * 2021-08-19 2023-07-21 华南理工大学 Small sample speaker recognition method based on channel attention depth separable convolution network
CN114092926A (en) * 2021-10-20 2022-02-25 杭州电子科技大学 License plate positioning and identifying method in complex environment
CN114821289B (en) * 2022-01-17 2023-10-17 电子科技大学 Forest fire picture real-time segmentation and fire edge point monitoring algorithm
CN114677502B (en) * 2022-05-30 2022-08-12 松立控股集团股份有限公司 License plate detection method with any inclination angle
CN115410189B (en) * 2022-10-31 2023-01-24 松立控股集团股份有限公司 Complex scene license plate detection method
CN115909316B (en) * 2023-02-21 2023-05-19 昆明理工大学 Light end-to-end license plate identification method for data non-uniform scene
CN116664918A (en) * 2023-05-12 2023-08-29 杭州像素元科技有限公司 Method for detecting traffic state of each lane of toll station based on deep learning
CN116704487A (en) * 2023-06-12 2023-09-05 三峡大学 License plate detection and recognition method based on Yolov5s network and CRNN

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109508715A (en) * 2018-10-30 2019-03-22 南昌大学 A kind of License Plate and recognition methods based on deep learning
CN109740653A (en) * 2018-12-25 2019-05-10 北京航空航天大学 A kind of vehicle recognition methods again for merging visual appearance and space-time restriction
CN111325203A (en) * 2020-01-21 2020-06-23 福州大学 American license plate recognition method and system based on image correction
CN111553205A (en) * 2020-04-12 2020-08-18 西安电子科技大学 Vehicle weight recognition method, system, medium and video monitoring system without license plate information

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10140553B1 (en) * 2018-03-08 2018-11-27 Capital One Services, Llc Machine learning artificial intelligence system for identifying vehicles


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Madhusree Mondal et al. Automatic number plate recognition using CNN based self synthesized feature learning. 2017 IEEE Calcutta Conference (CALCON). 2018, pp. 378-381. *
Zhang Shihao et al. A lightweight license plate detection algorithm with multi-scale attention fusion. Computer Engineering and Applications. 2021, Vol. 57, No. 22, pp. 208-214. *
Xu Fuyong et al. Arbitrary-shaped scene text recognition based on deep learning. Journal of Sichuan University (Natural Science Edition). 2020, Vol. 57, No. 2, pp. 255-263. *

Also Published As

Publication number Publication date
CN112308092A (en) 2021-02-02

Similar Documents

Publication Publication Date Title
CN112308092B (en) Light-weight license plate detection and identification method based on multi-scale attention mechanism
CN113065558B (en) Lightweight small target detection method combined with attention mechanism
CN110427937B (en) Inclined license plate correction and indefinite-length license plate identification method based on deep learning
CN110033002B (en) License plate detection method based on multitask cascade convolution neural network
CN109766805B (en) Deep learning-based double-layer license plate character recognition method
CN114677502B (en) License plate detection method with any inclination angle
CN114187450A (en) Remote sensing image semantic segmentation method based on deep learning
CN109784171A (en) Car damage identification method for screening images, device, readable storage medium storing program for executing and server
CN114972208B (en) YOLOv 4-based lightweight wheat scab detection method
CN111401145A (en) Visible light iris recognition method based on deep learning and DS evidence theory
CN112785480B (en) Image splicing tampering detection method based on frequency domain transformation and residual error feedback module
CN113160062A (en) Infrared image target detection method, device, equipment and storage medium
CN113947766A (en) Real-time license plate detection method based on convolutional neural network
CN113095152A (en) Lane line detection method and system based on regression
CN113850136A (en) Yolov5 and BCNN-based vehicle orientation identification method and system
CN114038004A (en) Certificate information extraction method, device, equipment and storage medium
CN110084743A (en) Image mosaic and localization method based on more air strips starting track constraint
CN110751226A (en) Crowd counting model training method and device and storage medium
CN111626241A (en) Face detection method and device
CN111444916A (en) License plate positioning and identifying method and system under unconstrained condition
CN116681742A (en) Visible light and infrared thermal imaging image registration method based on graph neural network
CN114998630A (en) Ground-to-air image registration method from coarse to fine
CN111241986B (en) Visual SLAM closed loop detection method based on end-to-end relationship network
CN111967579A (en) Method and apparatus for performing convolution calculation on image using convolution neural network
CN113361375B (en) Vehicle target identification method based on improved BiFPN

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230725

Address after: Room 203, No. 397, Xihong, Hongshan Town, Gulou District, Fuzhou City, Fujian Province 350025

Patentee after: FUZHOU IVISIONIC TECHNOLOGY Co.,Ltd.

Address before: Fuzhou University, No.2, wulongjiang North Avenue, Fuzhou University Town, Minhou County, Fuzhou City, Fujian Province

Patentee before: FUZHOU University

TR01 Transfer of patent right