CN115546779A - Logistics truck license plate recognition method and device

Info

Publication number: CN115546779A
Application number: CN202211494713.9A
Authority: CN
Inventors: 孙晓宇; 黄博; 江培荣; 麻亮; 李攀; 何永霞; 吴农中; 甄克; 王帅; 杨营; 贺定雄
Original assignee: Chengdu Yunlitchi Technology Co ltd
Current assignee: Chengdu Yunlitchi Technology Co ltd
Priority date: 2022-11-26
Filing date: 2022-11-26
Publication date: 2022-12-30
Anticipated expiration: 2042-11-26
Also published as: CN115546779B

Abstract

The invention discloses a logistics truck license plate recognition method and device, and belongs to the technical field of image recognition. The recognition method comprises: obtaining a freight vehicle license plate image collected in an unmanned weighbridge (truck scale) system; obtaining a pre-trained image recognition convolutional network; inputting the freight vehicle license plate image into the image recognition convolutional network so that it passes through each feature extraction module in sequence to produce a preliminary feature map; performing a global pooling operation on each layer of the preliminary feature map with a main body pooling layer to produce a preliminary feature vector; and passing the preliminary feature vector through a feed-forward layer and a main body classifier in sequence to generate a recognition result for the freight vehicle license plate image. The feature extraction module adopts a parallel multi-convolution structure in which each branch has its own modulation module, so that different types of noise can be suppressed in a more targeted and effective way; the model makes high use of the effective information and has strong anti-interference capability.

Description

Logistics truck license plate recognition method and device

Technical Field

The invention belongs to the technical field of image recognition, and in particular relates to a logistics truck license plate recognition method and device.

Background

An existing unmanned weighbridge system can automatically weigh a logistics truck and record relevant data such as its weight and license plate. Compared with a traditional weighbridge system, it operates more efficiently and avoids many of the errors caused by manual operation. In an unmanned weighbridge system, a camera installed beside the weighbridge captures a license plate image of the vehicle, and an image recognition algorithm then recognizes the license plate information. In real scenes, however, the license plates of logistics trucks are often dirty; the trucks are heavy and can shake the camera as they drive onto the scale; and truck exhaust stirs up large amounts of dust in the surrounding air, forming haze. These factors introduce a large amount of noise into the captured license plate images. Existing license plate recognition algorithms do not take these interference factors into account, so they have a high error rate when used for license plate recognition in an unmanned weighbridge system.

Disclosure of Invention

To address the deficiencies of the prior art, the invention provides a logistics truck license plate recognition method and device that improve the recognition accuracy for license plate images collected in an unmanned weighbridge system.

To achieve the above purpose, the invention adopts the following solution: a logistics truck license plate recognition method comprising the following steps:

S100, acquiring a freight vehicle license plate image collected in an unmanned weighbridge system (the freight vehicle license plate image is a segmented license plate image, each image containing only one Chinese character, letter or digit), and acquiring a pre-trained image recognition convolutional network; the image recognition convolutional network is provided with a main body pooling layer, a feed-forward layer, a main body classifier and a plurality of feature extraction modules, the feature extraction modules being used for extracting feature information from the freight vehicle license plate image;

S200, inputting the freight vehicle license plate image into the image recognition convolutional network, and generating a preliminary feature map after the freight vehicle license plate image passes through each feature extraction module in sequence;

S300, performing a global pooling operation on each layer of the preliminary feature map by using the main body pooling layer to generate a preliminary feature vector;

S400, passing the preliminary feature vector through the feed-forward layer and the main body classifier in sequence to generate a recognition result for the freight vehicle license plate image;

the process by which the feature extraction module extracts image feature information is represented as the following mathematical model:

wherein X_{L-1} denotes the feature map input to the feature extraction module, and X_L denotes the feature map output by the feature extraction module after the feature extraction operation; f1, f2, f3, f4 and f5 all denote ordinary convolution operations, the convolution kernels of f1, f2 and f3 being of different sizes; f_R1, f_R2, f_R3, f_R4, f_R5 and f_R6 all denote ReLU activation functions; | denotes the concatenation (splicing) of the feature maps it separates; SP1 denotes the first modulation module, SP2 denotes the second modulation module, and SP3 denotes the third modulation module; × denotes element-wise multiplication; YW denotes the bypass fusion module; CP denotes the bypass modulation module; fK denotes the strided convolution operation; SCP denotes the cross-dimension modulation module; sv1 denotes the first modulation information output by the first modulation module, sv2 denotes the second modulation information output by the second modulation module, sv3 denotes the third modulation information output by the third modulation module, and cv denotes the bypass modulation information output by the bypass modulation module; PL denotes the internal pooling operation; M1, M2, M3, M5 and M7 denote the feature maps output by the functions f_R1, f_R2, f_R3, f_R4 and f_R5, respectively; M4 denotes the feature map generated by modulating and concatenating the feature maps M1, M2 and M3; M6 denotes the feature map generated by the bypass fusion module; and M8 denotes the feature map generated by adding the feature map produced by the internal pooling operation to the feature map M7 derived from the output of the bypass fusion module.
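The equation that these symbol definitions refer to did not survive in this text. One reading that is consistent with the definitions and with the feature map sizes given in the detailed description can be written as follows. It is a reconstruction rather than the original formula. In particular, three placements are assumptions inferred from the tensor sizes: applying CP to M6 before the strided convolution, applying the SCP output to M5 before the internal pooling, and placing f_R6 after f5.

```latex
\begin{aligned}
M_1 &= f_{R1}\bigl(f_1(X_{L-1})\bigr), \quad
M_2 = f_{R2}\bigl(f_2(X_{L-1})\bigr), \quad
M_3 = f_{R3}\bigl(f_3(X_{L-1})\bigr) \\
M_4 &= \bigl[\, SP_1(M_1) \times M_1 \;\big|\; SP_2(M_2) \times M_2 \;\big|\; SP_3(M_3) \times M_3 \,\bigr] \\
M_5 &= f_{R4}\bigl(f_4(M_4)\bigr) \\
M_6 &= YW(M_1, M_2, M_3) \\
M_7 &= f_{R5}\bigl(f_K\bigl(CP(M_6) \times M_6\bigr)\bigr) \\
M_8 &= PL\bigl(SCP(sv_1, sv_2, sv_3, cv) \times M_5\bigr) + M_7 \\
X_L &= f_{R6}\bigl(f_5(M_8)\bigr)
\end{aligned}
```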

Further, f1 denotes a convolution operation with a stride of 1 and a convolution kernel size of 1×1; f2, f4 and f5 each denote a convolution operation with a stride of 1 and a convolution kernel size of 3×3; and f3 denotes a convolution operation with a stride of 1 and a convolution kernel size of 5×5.

Further, the strided convolution operation has a stride of 2 and a convolution kernel size of 3×3.
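The effect of these kernel sizes and strides on the spatial size of a feature map can be checked with the standard convolution output-size formula. The sketch below assumes zero padding of (k - 1) / 2 on each side, which the text does not state but which is the usual choice when stride-1 convolutions are meant to preserve the feature map size; the input size of 32 is a hypothetical value.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def same_padding(kernel):
    """Padding that keeps the size unchanged at stride 1 (odd kernels only)."""
    return (kernel - 1) // 2


# (name, kernel size, stride) as stated in the text; the padding is assumed.
operations = [("f1", 1, 1), ("f2", 3, 1), ("f3", 5, 1), ("fK", 3, 2)]

input_size = 32  # hypothetical width (or height) of the input feature map
output_sizes = {
    name: conv_output_size(input_size, kernel, stride, same_padding(kernel))
    for name, kernel, stride in operations
}
print(output_sizes)
```

Under this padding assumption the three parallel branches keep the input size, while the strided convolution fK halves it.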

Further, the first, second and third modulation modules have the same internal operation process: each contains a modulation global pooling layer followed by a modulation activation function. The modulation global pooling layer performs a global max pooling operation on the feature map along the channel direction, and the modulation activation function is a sigmoid function. The first modulation information, the second modulation information and the third modulation information are the matrices output by the modulation global pooling layers of the first, second and third modulation modules, respectively.
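A minimal pure-Python sketch of one such modulation module is given below. The feature map is stored as nested lists indexed `[row][column][channel]`, which is a representation chosen here for illustration. The function and variable names are hypothetical. The final step, multiplying the branch feature map by its modulation map, follows the element-wise product described for the feature extraction module.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spatial_modulation(feature_map):
    """Channel-direction global max pooling followed by a sigmoid.

    Returns (sv, modulation_map): sv is the pooled A x B matrix (the
    "modulation information"), modulation_map is its sigmoid activation.
    """
    sv = [[max(pixel) for pixel in row] for row in feature_map]
    modulation_map = [[sigmoid(value) for value in row] for row in sv]
    return sv, modulation_map


def apply_modulation(feature_map, modulation_map):
    """Multiply every channel of each pixel by that pixel's modulation weight."""
    return [
        [[value * modulation_map[r][c] for value in pixel]
         for c, pixel in enumerate(row)]
        for r, row in enumerate(feature_map)
    ]


# Hypothetical 2 x 2 feature map with 3 channels.
feature_map = [
    [[0.5, -1.0, 2.0], [0.0, 0.0, 0.0]],
    [[-3.0, -2.0, -1.0], [4.0, 1.0, 1.5]],
]
sv, modulation_map = spatial_modulation(feature_map)
modulated = apply_modulation(feature_map, modulation_map)
print(sv)
```

The pooled matrix `sv` is the intermediate result that is later reused by the cross-dimension modulation module, while `modulation_map` is what weights the branch feature map.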

Further, the internal operation process of the bypass fusion module is represented as the following mathematical model:

wherein the feature maps M1, M2 and M3 are the inputs of the bypass fusion module, and the feature map M6 is its output; × denotes element-wise multiplication; | denotes the concatenation of the feature maps it separates; fP denotes an ordinary convolution operation with a stride of 1 and a convolution kernel size of 1×1; f_RP denotes a ReLU activation function; Y1 denotes the feature map generated by adding the feature maps M1, M2 and M3; and Y2 denotes the feature map generated by multiplying the feature maps M1, M2 and M3 element-wise.
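The formula itself is not present in this text. Based only on the symbols defined above, one plausible reading is M6 = f_RP(fP([Y1 | Y2])), with Y1 = M1 + M2 + M3 and Y2 = M1 × M2 × M3. The sketch below implements that reading; the way Y1 and Y2 are combined is an assumption, and the weights are hypothetical values. A 1×1 convolution is written as a per-pixel linear map over the channels.

```python
def bypass_fusion(m1, m2, m3, weights, bias):
    """Assumed reading: ReLU(1x1 conv(concat(M1+M2+M3, M1*M2*M3))).

    Feature maps are nested lists indexed [row][column][channel].
    weights has one row per output channel and 2C entries per row.
    """
    fused = []
    for row1, row2, row3 in zip(m1, m2, m3):
        fused_row = []
        for p1, p2, p3 in zip(row1, row2, row3):
            y1 = [a + b + c for a, b, c in zip(p1, p2, p3)]  # element-wise sum
            y2 = [a * b * c for a, b, c in zip(p1, p2, p3)]  # element-wise product
            concat = y1 + y2  # channel concatenation, length 2C
            pixel = [
                max(0.0, sum(w * v for w, v in zip(w_row, concat)) + b)
                for w_row, b in zip(weights, bias)
            ]
            fused_row.append(pixel)
        fused.append(fused_row)
    return fused


# Hypothetical 1 x 1 feature maps with C = 2 channels.
m1 = [[[1.0, 2.0]]]
m2 = [[[3.0, 4.0]]]
m3 = [[[5.0, -6.0]]]
# Hypothetical 1x1 convolution weights mapping 2C = 4 channels back to C = 2.
weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
bias = [0.0, 0.0]
m6 = bypass_fusion(m1, m2, m3, weights, bias)
print(m6)
```

With these weights the first output channel copies the sum 1 + 3 + 5 = 9, and the second copies the product 2 × 4 × (-6) = -48, which the ReLU clips to 0.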

Furthermore, the bypass modulation module contains, connected in sequence, a bypass global pooling layer, a first fully connected layer, a first bypass activation function, a second fully connected layer and a second bypass activation function. The bypass global pooling layer performs a global max pooling operation on the feature map along the spatial directions; the first bypass activation function is a ReLU function, and the second bypass activation function is a sigmoid function. The bypass modulation information is the vector output by the bypass global pooling layer of the bypass modulation module.
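The sequence of layers in the bypass modulation module can be sketched in pure Python as follows. The hidden width C/8 is taken from the embodiment described later in the document; all weights, the helper names and the example feature map are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(vector, weights, bias):
    """Fully connected layer: one weight row per output node."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def bypass_modulation(feature_map, w1, b1, w2, b2):
    """Spatial global max pooling -> FC -> ReLU -> FC -> sigmoid.

    feature_map is indexed [row][column][channel]. Returns (cv, channel_map):
    cv is the pooled length-C vector (the "bypass modulation information"),
    channel_map is the final length-C channel modulation vector.
    """
    channels = len(feature_map[0][0])
    cv = [max(pixel[ch] for row in feature_map for pixel in row)
          for ch in range(channels)]
    hidden = [max(0.0, h) for h in dense(cv, w1, b1)]
    channel_map = [sigmoid(o) for o in dense(hidden, w2, b2)]
    return cv, channel_map


# Hypothetical 2 x 2 feature map with C = 8 channels.
C = 8
feature_map = [[[float(r + c + ch) for ch in range(C)] for c in range(2)]
               for r in range(2)]
w1 = [[0.1] * C for _ in range(C // 8)]   # C -> C/8
b1 = [0.0] * (C // 8)
w2 = [[0.5] * (C // 8) for _ in range(C)]  # C/8 -> C
b2 = [0.0] * C
cv, channel_map = bypass_modulation(feature_map, w1, b1, w2, b2)
print(cv)
```

As with the branch modulation modules, the pooled vector `cv` is kept as intermediate information in addition to the final sigmoid output.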

Further, the internal operation process of the cross-dimension modulation module is represented as the following mathematical model:

wherein sv1 denotes the first modulation information, sv2 denotes the second modulation information, sv3 denotes the third modulation information, and cv denotes the bypass modulation information; sv1, sv2, sv3 and cv together serve as the inputs of the cross-dimension modulation module; × denotes element-wise multiplication; | denotes the concatenation of the feature maps it separates; GFc denotes the third fully connected layer; gpl denotes the cross-dimension global pooling layer, which performs a global max pooling operation on the feature map along the channel direction; δ_G1 and δ_G2 both denote sigmoid activation functions; SC1 denotes the feature map generated by concatenating sv1, sv2 and sv3; SC2 denotes the output of the function δ_G1; and SC3 denotes the cross-dimension modulation map output by the cross-dimension modulation module.
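Read together with the sizes given in the detailed description, these definitions suggest SC1 = [sv1 | sv2 | sv3], SC2 = δ_G1(GFc(cv)) and SC3 = δ_G2(gpl(SC1 × SC2)). The sketch below follows that reading in pure Python; the weights, helper names and example inputs are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(vector, weights, bias):
    """Fully connected layer: one weight row per output node."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def cross_dimension_modulation(sv1, sv2, sv3, cv, weights, bias):
    """Combine three A x B spatial matrices with a length-C channel vector.

    SC2 has three entries, one weight per spatial matrix. Each matrix is
    scaled by its weight, the maximum over the three is taken per pixel
    (channel-direction global max pooling), and a sigmoid is applied.
    """
    sc2 = [sigmoid(z) for z in dense(cv, weights, bias)]
    sc3 = [
        [sigmoid(max(a * sc2[0], b * sc2[1], c * sc2[2]))
         for a, b, c in zip(row1, row2, row3)]
        for row1, row2, row3 in zip(sv1, sv2, sv3)
    ]
    return sc2, sc3


# Hypothetical 1 x 2 spatial matrices and a C = 4 channel vector.
sv1 = [[1.0, 2.0]]
sv2 = [[3.0, 0.0]]
sv3 = [[0.0, 4.0]]
cv = [0.5, 1.0, 1.5, 2.0]
weights = [[0.0] * 4 for _ in range(3)]  # zero weights give SC2 = [0.5, 0.5, 0.5]
bias = [0.0, 0.0, 0.0]
sc2, sc3 = cross_dimension_modulation(sv1, sv2, sv3, cv, weights, bias)
print(sc2, sc3)
```

With the zero weights used here every spatial matrix is scaled by 0.5, so the per-pixel maxima before the final sigmoid are 1.5 and 2.0.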

Further, the internal pooling operation has a pooling window size of 2×2 and a stride of 2.

Further, the main body pooling layer performs a global average pooling operation on each layer of the feature map along the spatial directions; the feed-forward layer is a fully connected layer, and the main body classifier is a softmax classifier.
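The classification head described here (global average pooling, a fully connected layer and a softmax classifier) can be illustrated with the following pure-Python sketch. The feature map, weights and the three-class setup are hypothetical values chosen to keep the example small.

```python
import math


def global_average_pool(feature_map):
    """Average each channel over all spatial positions ([row][column][channel])."""
    channels = len(feature_map[0][0])
    count = len(feature_map) * len(feature_map[0])
    return [sum(pixel[ch] for row in feature_map for pixel in row) / count
            for ch in range(channels)]


def dense(vector, weights, bias):
    """Fully connected layer: one weight row per output node."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(feature_map, weights, bias):
    vector = global_average_pool(feature_map)
    probabilities = softmax(dense(vector, weights, bias))
    return probabilities.index(max(probabilities)), probabilities


# Hypothetical 2 x 2 preliminary feature map with 2 channels and 3 classes.
feature_map = [
    [[1.0, 0.0], [3.0, 0.0]],
    [[1.0, 4.0], [3.0, 0.0]],
]
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
bias = [0.0, 0.0, 0.0]
pooled = global_average_pool(feature_map)
label, probabilities = classify(feature_map, weights, bias)
print(pooled, label)
```

In the network described in the document, the pooled vector would have one entry per channel of the preliminary feature map, and the number of output nodes would equal the number of character classes.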

The invention also provides a logistics truck license plate recognition device comprising a processor and a memory, wherein the memory stores a computer program and the processor executes the above logistics truck license plate recognition method by loading the computer program.

The beneficial effects of the invention are as follows:

(1) The feature extraction module of the invention adopts a parallel multi-convolution structure (f_R1, f_R2, f_R3), and each branch is provided with a modulation module (SP1, SP2 and SP3) that modulates the feature information from a different viewing angle. Different types of noise can therefore be suppressed in a more targeted and effective way, more useful information is retained in the feature map M5 obtained after fusion and filtering, and the model has strong anti-interference capability;

(2) A bypass fusion module is arranged in the feature extraction module and extracts feature information from another angle; after operations such as the strided convolution, the feature map M7 is finally generated. The feature map M7 and the feature map output by the internal pooling operation (PL) verify and supplement each other, which effectively avoids the high recognition error rate that a network with a single feature extraction path shows on images captured while the camera is shaking;

(3) In the prior art, an attention mechanism (or modulation module) has a single output: what is used is mainly the modulation information output at the end of the attention mechanism, while the large amount of intermediate information generated during its internal computation is not fully used, which wastes computing resources. The invention uses the cross-dimension modulation module to combine the intermediate information of several modulation modules, and the cross-dimension modulation module works together with the first, second and third modulation modules and the bypass modulation module to modulate the feature information in three dimensions and at multiple levels. The utilization of effective information is therefore high, which is important for correctly recognizing stained license plates with partially missing information.

Drawings

FIG. 1 is a schematic structural diagram of the image recognition convolutional network of Embodiment 1;

FIG. 2 is a schematic structural diagram of a feature extraction module in the image recognition convolutional network shown in FIG. 1;

FIG. 3 is a schematic diagram of the internal structure of a first modulation module in the feature extraction module shown in FIG. 2;

FIG. 4 is a schematic diagram of the internal structure of a bypass fusion module in the feature extraction module shown in FIG. 2;

FIG. 5 is a schematic diagram of the internal structure of a bypass modulation module in the feature extraction module shown in FIG. 2;

FIG. 6 is a schematic diagram of the internal structure of the cross-dimension modulation module in the feature extraction module shown in FIG. 2;

FIG. 7 is a schematic structural diagram of the feature extraction module in Embodiment 2;

FIG. 8 is a schematic structural diagram of the feature extraction module in Embodiment 3.

In the drawings:

1 - freight vehicle license plate image; 2 - head convolutional layer; 3 - feature extraction module; 31 - first modulation module; 32 - second modulation module; 33 - third modulation module; 34 - bypass fusion module; 35 - bypass modulation module; 36 - cross-dimension modulation module; 4 - main body pooling layer; 5 - feed-forward layer; 6 - main body classifier; 7 - recognition result.

Detailed Description

The invention is further described below with reference to the accompanying drawings:

Embodiment 1:

Before the network is trained, a suitable data set must be prepared. License plate images captured in an unmanned weighbridge system are collected, and existing techniques are used to perform target detection and segmentation on them so that each image contains only one Chinese character, letter or digit. The images are then manually labeled to obtain the corresponding training set and test set. Because these images were captured in real scenes, a considerable portion of them contain different types of noise. The training set contains 3728 images and the test set contains 1548 images; both sets include images of all the Chinese characters, letters and digits that may appear on a license plate.

The logistics truck license plate recognition method provided by the invention is explained in more detail below by way of example. As shown in FIG. 1, a head convolutional layer 2 (convolution kernel size 3×3, stride 1) is provided at the very front of the image recognition convolutional network; it performs a convolution operation on the freight vehicle license plate image 1 input to the network to generate a shallow feature map. When the size of the freight vehicle license plate image 1 is W × H × 3, the size of the shallow feature map output by the head convolutional layer 2 is W × H × 32 (width × height × channels, the same below). Each time the feature map passes through a feature extraction module 3, its width and height are halved and its number of channels is doubled. Five feature extraction modules 3 connected in sequence are arranged in the middle of the image recognition convolutional network, so the preliminary feature map output by the last feature extraction module 3 has a size of (W/32) × (H/32) × 1024. The main body pooling layer 4 performs a global average pooling operation on each layer of the feature map along the spatial directions, so its output is a preliminary feature vector of length 1024. The feed-forward layer 5 is a fully connected layer with 1024 input nodes, and its number of output nodes is set according to the total number of categories to be classified. The main body classifier 6 is implemented with an existing softmax classifier and outputs the recognition result 7.
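The size bookkeeping in this paragraph can be verified with a few lines of Python. The 64 × 64 input size is a hypothetical example, and integer halving assumes the width and height are divisible by 32.

```python
def trace_shapes(width, height, head_channels=32, num_modules=5):
    """Feature map sizes (width, height, channels) after the head layer
    and after each feature extraction module."""
    shapes = [(width, height, head_channels)]
    for _ in range(num_modules):
        w, h, c = shapes[-1]
        shapes.append((w // 2, h // 2, c * 2))  # halve space, double channels
    return shapes


shapes = trace_shapes(64, 64)  # hypothetical 64 x 64 input image
for index, shape in enumerate(shapes):
    print(index, shape)
```

For the hypothetical 64 × 64 input the last entry is (2, 2, 1024), that is (W/32) × (H/32) × 1024, so the global average pooling yields a vector of length 1024.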

With reference to the mathematical model of the feature extraction module 3 and FIG. 2, let the feature map X_{L-1} input to the feature extraction module 3 have a size of A × B × C (width × height × channels, the same below); the feature maps M1, M2, M3, M5 and M6 generated inside the module then all have a size of A × B × C. The feature map M7 generated by the strided convolution operation has a size of (A/2) × (B/2) × C, the feature map generated by the internal pooling operation also has a size of (A/2) × (B/2) × C, and the feature map M8 obtained by adding and fusing these two feature maps has a size of (A/2) × (B/2) × C. Finally, after the f5 convolution operation, the feature map X_L with a size of (A/2) × (B/2) × 2C is output.

In the feature extraction module 3, as shown in FIG. 3, the first modulation module 31, the second modulation module 32 and the third modulation module 33 each contain a modulation global pooling layer followed by a modulation activation function. The modulation global pooling layer performs a global max pooling operation on the feature map along the channel direction and therefore outputs a matrix of size A × B × 1; after activation by the modulation activation function, the spatial modulation map of each branch, of size A × B × 1, is obtained. Because the first modulation information sv1 is the two-dimensional matrix output by the modulation global pooling layer inside the first modulation module 31, the second modulation information sv2 is the two-dimensional matrix output by the modulation global pooling layer inside the second modulation module 32, and the third modulation information sv3 is the two-dimensional matrix output by the modulation global pooling layer inside the third modulation module 33, the first, second and third modulation information in the feature extraction module 3 all have a size of A × B × 1.

For the bypass fusion module 34 in the feature extraction module 3, as shown in FIG. 4, the feature maps Y1 and Y2 generated inside it both have a size of A × B × C. Inside the bypass modulation module 35, as shown in FIG. 5, the bypass global pooling layer performs a global max pooling operation on the feature map along the spatial directions and therefore outputs a vector of size 1 × 1 × C. Correspondingly, the first fully connected layer in the bypass modulation module 35 has C input nodes and C/8 output nodes, and the second fully connected layer has C/8 input nodes and C output nodes. Finally, after the second bypass activation function, the bypass channel modulation map of size 1 × 1 × C is output.

For the cross-dimension modulation module 36 in the feature extraction module 3, as shown in FIG. 6, concatenating the first, second and third modulation information produces the feature map SC1 of size A × B × 3. Meanwhile, the third fully connected layer has C input nodes and 3 output nodes, and after activation by the δ_G1 function the resulting SC2 has a size of 1 × 1 × 3. SC1 and SC2 are multiplied element-wise, so that each layer of SC1 is multiplied by the corresponding element value in SC2. After the cross-dimension global pooling and activation by the δ_G2 function, a cross-dimension modulation map of size A × B × 1 is obtained.

In this embodiment, the cross-entropy function is used as the loss function. ResNet152 and the image recognition convolutional network provided by the invention were each trained on the training set, and the two trained models were then tested on the test set. The test results show a recognition accuracy of 91.93% for ResNet152 and 98.42% for the image recognition convolutional network provided by the invention, which is clearly higher than that of ResNet152.
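For reference, the cross-entropy loss over softmax outputs can be written in a few lines. This is a generic sketch of the standard definition, not the training code of the embodiment; the probabilities and labels are hypothetical.

```python
import math


def cross_entropy(probabilities, true_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probabilities[true_index])


def mean_cross_entropy(batch_probabilities, true_indices):
    """Average cross-entropy over a batch of predictions."""
    losses = [cross_entropy(p, t)
              for p, t in zip(batch_probabilities, true_indices)]
    return sum(losses) / len(losses)


# Hypothetical softmax outputs for two character images and their labels.
batch = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
labels = [0, 1]
loss = mean_cross_entropy(batch, labels)
print(round(loss, 4))
```

The loss shrinks toward zero as the probability assigned to the correct character class approaches 1.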

Embodiment 2:

This embodiment is a comparative experiment in which only the cross-dimension modulation modules 36 of Embodiment 1 are all removed (the structure of the feature extraction module 3 after removal is shown in FIG. 7); the rest of the network is identical to Embodiment 1. Using exactly the same training and testing procedures as in Embodiment 1, the image recognition convolutional network of Embodiment 2 achieves a recognition accuracy of 94.88% on the test set. Comparing the test results shows that the recognition performance is clearly improved when the cross-dimension modulation module 36 is provided.

Embodiment 3:

This embodiment is a comparative experiment in which the cross-dimension modulation modules 36, the bypass fusion modules 34, the bypass modulation modules 35 and operations such as the strided convolution of Embodiment 1 are all removed (the structure of the feature extraction module 3 after removal is shown in FIG. 8); the rest of the network is identical to Embodiment 1. Using exactly the same training and testing procedures as in Embodiment 1, the image recognition convolutional network of Embodiment 3 achieves a recognition accuracy of 81.54% on the test set. Comparing the test results of Embodiment 2 and Embodiment 3 shows that the license plate recognition accuracy of the network is clearly improved when the bypass fusion module 34, the bypass modulation module 35, the strided convolution and the related operations are provided.
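The reported accuracies can be put side by side to quantify each component's contribution. The short script below only rearranges the four figures stated in the text; the dictionary keys are descriptive labels chosen here.

```python
# Test-set recognition accuracy in percent, as reported in the text.
accuracy = {
    "ResNet152": 91.93,
    "Embodiment 1 (full network)": 98.42,
    "Embodiment 2 (no cross-dimension modulation)": 94.88,
    "Embodiment 3 (no cross-dimension modulation, no bypass branch)": 81.54,
}

error_rate = {name: round(100.0 - value, 2) for name, value in accuracy.items()}

full = accuracy["Embodiment 1 (full network)"]
no_scp = accuracy["Embodiment 2 (no cross-dimension modulation)"]
no_bypass = accuracy["Embodiment 3 (no cross-dimension modulation, no bypass branch)"]

# Percentage-point differences between configurations.
gain_cross_dimension = round(full - no_scp, 2)
gain_bypass_branch = round(no_scp - no_bypass, 2)
gain_over_resnet = round(full - accuracy["ResNet152"], 2)

print(error_rate)
print(gain_cross_dimension, gain_bypass_branch, gain_over_resnet)
```

On these figures, the cross-dimension modulation module accounts for 3.54 percentage points and the bypass branch for 13.34 percentage points, while the full network is 6.49 percentage points ahead of ResNet152.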

The above embodiments describe only specific implementations of the invention; although the description is relatively specific and detailed, it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of protection of the invention.

Claims

1. A logistics truck license plate recognition method, characterized by comprising the following steps:

S100, acquiring a freight vehicle license plate image collected in an unmanned weighbridge system, and acquiring a pre-trained image recognition convolutional network; the image recognition convolutional network is provided with a main body pooling layer, a feed-forward layer, a main body classifier and a plurality of feature extraction modules, the feature extraction modules being used for extracting feature information from the freight vehicle license plate image;

wherein X_{L-1} denotes the feature map input to the feature extraction module, and X_L denotes the feature map output by the feature extraction module after the feature extraction operation; f1, f2, f3, f4 and f5 all denote ordinary convolution operations, the convolution kernels of f1, f2 and f3 being of different sizes; f_R1, f_R2, f_R3, f_R4, f_R5 and f_R6 all denote ReLU activation functions; | denotes the concatenation (splicing) of the feature maps it separates; SP1 denotes the first modulation module, SP2 denotes the second modulation module, and SP3 denotes the third modulation module; × denotes element-wise multiplication; YW denotes the bypass fusion module; CP denotes the bypass modulation module; fK denotes the strided convolution operation; SCP denotes the cross-dimension modulation module; sv1 denotes the first modulation information output by the first modulation module, sv2 denotes the second modulation information output by the second modulation module, sv3 denotes the third modulation information output by the third modulation module, and cv denotes the bypass modulation information output by the bypass modulation module; PL denotes the internal pooling operation; M1, M2, M3, M5 and M7 denote the feature maps output by the functions f_R1, f_R2, f_R3, f_R4 and f_R5, respectively; M4 denotes the feature map generated by modulating and concatenating the feature maps M1, M2 and M3; M6 denotes the feature map generated by the bypass fusion module; and M8 denotes the feature map generated by adding the feature map produced by the internal pooling operation to the feature map M7 derived from the output of the bypass fusion module.

2. The logistics truck license plate recognition method as claimed in claim 1, wherein: f1 denotes a convolution operation with a stride of 1 and a convolution kernel size of 1×1; f2, f4 and f5 each denote a convolution operation with a stride of 1 and a convolution kernel size of 3×3; and f3 denotes a convolution operation with a stride of 1 and a convolution kernel size of 5×5.

3. The logistics truck license plate recognition method as claimed in claim 1, wherein: the strided convolution operation has a stride of 2 and a convolution kernel size of 3×3.

4. The logistics truck license plate recognition method as claimed in claim 1, wherein: the first modulation module, the second modulation module and the third modulation module have the same internal operation process; each contains a modulation global pooling layer and a modulation activation function connected in sequence; the modulation global pooling layer performs a global max pooling operation on the feature map along the channel direction, and the modulation activation function is a sigmoid function;

the first modulation information, the second modulation information and the third modulation information are the matrices output by the modulation global pooling layers of the first modulation module, the second modulation module and the third modulation module, respectively.

5. The logistics truck license plate recognition method as claimed in claim 4, wherein: the internal operation process of the bypass fusion module is represented as the following mathematical model:

6. The logistics truck license plate recognition method as claimed in claim 5, wherein: the bypass modulation module contains, connected in sequence, a bypass global pooling layer, a first fully connected layer, a first bypass activation function, a second fully connected layer and a second bypass activation function; the bypass global pooling layer performs a global max pooling operation on the feature map along the spatial directions, the first bypass activation function is a ReLU function, and the second bypass activation function is a sigmoid function;

the bypass modulation information is the vector output by the bypass global pooling layer of the bypass modulation module.

7. The logistics truck license plate recognition method as claimed in claim 6, wherein: the internal operation process of the cross-dimension modulation module is represented as the following mathematical model:

wherein sv1 denotes the first modulation information, sv2 denotes the second modulation information, sv3 denotes the third modulation information, and cv denotes the bypass modulation information; sv1, sv2, sv3 and cv together serve as the inputs of the cross-dimension modulation module; × denotes element-wise multiplication; | denotes the concatenation of the feature maps it separates; GFc denotes the third fully connected layer; gpl denotes the cross-dimension global pooling layer, which performs a global max pooling operation on the feature map along the channel direction; δ_G1 and δ_G2 both denote sigmoid activation functions; SC1 denotes the feature map generated by concatenating sv1, sv2 and sv3; SC2 denotes the output of the function δ_G1; and SC3 denotes the cross-dimension modulation map output by the cross-dimension modulation module.

8. The logistics truck license plate recognition method as claimed in claim 1, wherein: the internal pooling operation has a pooling window size of 2×2 and a stride of 2.

9. The logistics truck license plate recognition method as claimed in claim 1, wherein: the main body pooling layer performs a global average pooling operation on each layer of the feature map along the spatial directions;

the feed-forward layer is a fully connected layer, and the main body classifier is a softmax classifier.

10. A logistics truck license plate recognition device comprising a processor and a memory, the memory storing a computer program, characterized in that: the processor executes the logistics truck license plate recognition method according to any one of claims 1 to 9 by loading the computer program.