CN111079753B

CN111079753B - License plate recognition method and device based on combination of deep learning and big data

Info

Publication number: CN111079753B
Application number: CN201911325648.5A
Authority: CN
Inventors: 罗茜; 张斯尧; 王思远; 蒋杰; 张�诚; 李乾; 谢喜林; 黄晋
Original assignee: Changsha Qianshitong Intelligent Technology Co ltd
Current assignee: Changsha Qianshitong Intelligent Technology Co ltd
Priority date: 2019-12-20
Filing date: 2019-12-20
Publication date: 2023-08-22
Anticipated expiration: 2039-12-20
Also published as: CN111079753A

Abstract

The invention provides a license plate recognition method and device based on the combination of deep learning and big data. The method comprises the following steps: iteratively training a deep learning model of big data vehicle images through an improved stochastic gradient descent (SGD) iterative algorithm; training the additionally loaded vehicle image data in batches through an improved linear scaling and warm-up strategy, so as to improve the accuracy of training the deep learning model of the big data vehicle images; training the large-batch training layers in the batch training through an improved adaptive rate scaling algorithm to obtain a rapid training model; inputting the vehicle images trained by the deep learning model and the rapid training model into an RNN, and detecting whether the RoI of a vehicle image is a license plate; and if so, performing license plate recognition on the RoI through an integrated deep network model. The efficiency and accuracy of big data license plate recognition can thereby be improved.

Description

License plate recognition method and device based on combination of deep learning and big data

Technical Field

The invention belongs to the technical field of computer vision and intelligent traffic, and particularly relates to a license plate recognition method, device, terminal equipment and computer readable medium based on combination of deep learning and big data.

Background

At present, distributed training of models based on big data and deep learning is an important research foundation for deep learning networks in the field of computer vision. In general, for deep learning applications, larger data sets and larger models bring a significant improvement in accuracy, but at the cost of longer training time. With the rise of deep learning in recent years, many researchers have begun to build deep learning network training models on this basis while also taking accuracy and effectiveness into account. The aim is to train on real-world vehicle images, pedestrian images and the like, so distributed training methods have wide application value in real scenes. Existing methods for training a deep learning model of big data vehicle images suffer from low training speed and high training cost. For example, training the residual network ResNet-50 on millions of vehicle images with a current NVIDIA M40 GPU (graphics processor) takes about 14 days and requires a total of about 10^18 single-precision operations. This is a clear drawback in terms of both time and expense. Moreover, existing license plate recognition technology requires a large amount of calculation for license plate character segmentation, which makes license plate recognition slow, inaccurate and poor in real-time performance.

Disclosure of Invention

In view of the above, embodiments of the present invention provide a license plate recognition method, apparatus, terminal device and computer readable medium based on combination of deep learning and big data, which can improve efficiency and accuracy of big data license plate recognition.

The first aspect of the embodiment of the invention provides a license plate recognition method based on combination of deep learning and big data, which comprises the following steps:

iteratively training a deep learning model of big data vehicle images through an improved stochastic gradient descent iterative algorithm; wherein, each time the deep learning model of the big data vehicle images is iteratively trained, more processors are used to load more vehicle image data than in the previous iteration of training;

training the additionally loaded vehicle image data in batches through an improved linear scaling and warm-up strategy, so as to improve the accuracy of training the deep learning model of the big data vehicle images; the improved linear scaling includes: when the batch is increased from B to kB, simultaneously increasing the learning rate from η to kη; the improved warm-up strategy includes: if a relatively large learning rate kη is used, starting from the relatively small learning rate η and increasing it to the relatively large learning rate kη over the first few epochs;

training the large-batch training layers in the batch training through an improved adaptive rate scaling algorithm to obtain a rapid training model;

inputting the vehicle images trained by the deep learning model and the rapid training model into an RNN to detect whether the RoI of a vehicle image is a license plate;

if so, performing license plate recognition on the RoI through an integrated deep network model; the integrated deep network model comprises a convolution layer, a BRNN layer, a linear transformation layer and a CTC layer.

A second aspect of the embodiment of the present invention provides a license plate recognition device based on combination of deep learning and big data, including:

an iterative training module, configured to iteratively train a deep learning model of big data vehicle images through an improved stochastic gradient descent iterative algorithm; wherein, each time the deep learning model of the big data vehicle images is iteratively trained, more processors are used to load more vehicle image data than in the previous iteration of training;

an accuracy training module, configured to train the additionally loaded vehicle image data in batches through an improved linear scaling and warm-up strategy, so as to improve the accuracy of training the deep learning model of the big data vehicle images; the improved linear scaling includes: when the batch is increased from B to kB, simultaneously increasing the learning rate from η to kη; the improved warm-up strategy includes: if a relatively large learning rate kη is used, starting from the relatively small learning rate η and increasing it to the relatively large learning rate kη over the first few epochs;

a scaling improvement module, configured to train the large-batch training layers in the batch training through an improved adaptive rate scaling algorithm to obtain a rapid training model;

a license plate detection module, configured to input the vehicle images trained by the deep learning model and the rapid training model into an RNN, so as to detect whether the RoI of a vehicle image is a license plate;

a license plate recognition module, configured to perform license plate recognition on the RoI through an integrated deep network model when a license plate is detected; the integrated deep network model comprises a convolution layer, a BRNN layer, a linear transformation layer and a CTC layer.

A third aspect of the embodiment of the present invention provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the license plate recognition method based on deep learning and big data combination described above when the processor executes the computer program.

A fourth aspect of the embodiments of the present invention provides a computer readable medium storing a computer program which, when executed by a processor, implements the steps of the above license plate recognition method based on combination of deep learning and big data.

In the license plate recognition method based on combination of deep learning and big data provided by the embodiment of the invention, big data vehicle images can be trained with the relevant models through an improved stochastic gradient descent iterative algorithm, a linear scaling and warm-up strategy, and an adaptive rate scaling algorithm. The trained vehicle data is input into an RNN to detect whether the RoI of a vehicle image is a license plate, and when the RoI is detected to be a license plate, license plate recognition is performed on the RoI through an integrated deep network model, so that the efficiency and accuracy of big data license plate recognition can be improved.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a license plate recognition method based on combination of deep learning and big data provided by an embodiment of the invention;

FIG. 2 is a schematic diagram of a process for recognizing the RoI of a license plate image through an integrated deep network model according to an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a license plate recognition device based on combination of deep learning and big data according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a detailed structure of the license plate detection module in FIG. 3;

FIG. 5 is a schematic diagram of a detailed structure of the license plate recognition module in FIG. 3;

FIG. 6 is a schematic diagram of a terminal device according to an embodiment of the present invention.

Detailed Description

In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.

In order to illustrate the technical scheme of the invention, the following description is made by specific examples.

Referring to fig. 1, fig. 1 is a schematic diagram of a license plate recognition method based on combination of deep learning and big data according to an embodiment of the present invention. As shown in fig. 1, the license plate recognition method based on combination of deep learning and big data of the present embodiment includes the following steps:

S101: Iteratively train the deep learning model of the big data vehicle images through an improved stochastic gradient descent iterative algorithm.

In the embodiment of the invention, an asynchronous method using a parameter server generally cannot guarantee stability on a large-scale system. For very large deep neural network (DNN) training, the data-parallel synchronous method is more stable. The idea is simple: by using a large batch size for stochastic gradient descent (SGD), the work of each iteration can easily be distributed to multiple processors. In the ideal vehicle image training case, ResNet-50 requires 7.72 billion single-precision operations to process one 225x225 vehicle image. If 90 epochs are run on the ImageNet dataset, the number of operations is 90 × 1.28 million × 7.72 billion (about 10^18). Currently, the most powerful supercomputer can perform 200 × 10^15 single-precision operations per second. If an algorithm could make full use of such a supercomputer, the training of ResNet-50 could theoretically be completed within 5 seconds. To this end, the algorithm must use more processors and load more vehicle image data at each iteration, thereby reducing the total training time. Generally, within a certain range, a larger batch gives a single GPU a higher speed (as shown in fig. 2), because the low-level matrix computation library is more efficient. For training the ResNet-50 model on ImageNet, the optimal batch size for each GPU is 512. If many GPUs are to be used and each GPU kept busy, a larger batch size is required. For example, if there are 16 GPUs, the batch size should be set to 16 × 512 = 8192. Ideally, if the total number of data accesses is fixed and the batch size grows linearly with the number of processors, the number of SGD iterations decreases linearly while the time cost per iteration remains unchanged, so the total time decreases linearly with the number of processors.
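The arithmetic in the paragraph above can be checked with a short sketch. The constants are the figures quoted in the text (7.72 billion operations per image, 1.28 million images, 90 epochs, 200 × 10^15 operations per second); the function names are hypothetical and chosen only for illustration.

```python
# Back-of-the-envelope check of the figures quoted in the text.
OPS_PER_IMAGE = 7.72e9        # single-precision operations per 225x225 image
IMAGES_PER_EPOCH = 1.28e6     # ImageNet training images
EPOCHS = 90
PEAK_OPS_PER_SECOND = 200e15  # quoted supercomputer peak

total_ops = EPOCHS * IMAGES_PER_EPOCH * OPS_PER_IMAGE
ideal_seconds = total_ops / PEAK_OPS_PER_SECOND


def global_batch_size(num_gpus, per_gpu_batch=512):
    """Batch size needed to keep every GPU at its per-GPU optimum."""
    return num_gpus * per_gpu_batch


def iterations_per_epoch(num_samples, batch_size):
    """Number of SGD iterations per epoch (ceiling division)."""
    return -(-num_samples // batch_size)


print(f"total operations: {total_ops:.3e}")
print(f"ideal time at peak: {ideal_seconds:.2f} s")
print("batch size for 16 GPUs:", global_batch_size(16))
print("iterations/epoch, batch 512:", iterations_per_epoch(1_280_000, 512))
print("iterations/epoch, batch 8192:", iterations_per_epoch(1_280_000, 8192))
```

With a fixed number of epochs, multiplying the batch size by 16 divides the iteration count by roughly 16, which is the linear reduction the paragraph describes.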
The specific improved SGD iterative algorithm is as follows. Let w represent the weights of the deep neural network (DNN), X the training data, n the number of samples in X, and Y the labels of the training data X. Let x_i be a sample of X, and let l(x_i, y_i, w) be the loss computed from x_i and its label y_i (i ∈ {1, 2, ..., n}). The embodiment of the present invention uses a loss function such as the cross-entropy function, and the objective of DNN training is to minimize the loss function in equation (1), as follows:

$$\min_{w} L(w) = \frac{1}{n} \sum_{i=1}^{n} l(x_i, y_i, w) \tag{1}$$

wherein w represents the weights of the DNN, X is the training data, n is the number of samples in X, Y represents the labels of the training data X, and x_i is a sample in the training data X.

In the t-th iteration, embodiments of the present invention use forward and backward propagation to find the gradient of the loss function with respect to the weights. This gradient is then used to update the weights, and equation (2) for updating the weights according to the gradient is as follows:

$$w_{t+1} = w_t - \eta \nabla l(x_i, y_i, w_t) \tag{2}$$

wherein w_t is the weight after the (t-1)-th iteration, w_{t+1} is the weight after the t-th iteration, and η is the learning rate. The embodiment of the invention lets the batch of the t-th iteration be B_t, with the size of B_t being b. The weights can then be updated based on the following equation (3):

$$w_{t+1} = w_t - \frac{\eta}{b} \sum_{x \in B_t} \nabla l(x, y, w_t) \tag{3}$$

wherein w_t is the weight after the (t-1)-th iteration, w_{t+1} is the weight after the t-th iteration, η is the learning rate, the batch of the t-th iteration is B_t, and the size of B_t is b.

To simplify the notation, the update rule can be written as equation (4), which means that the weight gradient ∇w_t is used to update the weight w_t to w_{t+1}:

$$w_{t+1} = w_t - \eta \nabla w_t \tag{4}$$
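As an illustration of the batch update in equation (3), the following is a minimal pure-Python sketch, assuming the standard mini-batch form w_{t+1} = w_t - (η/b) Σ ∇l(x, y, w_t). The squared-error gradient and the toy data are hypothetical stand-ins; the text itself uses a loss such as cross entropy on a DNN.

```python
def minibatch_sgd_step(w, batch, grad_fn, lr):
    """One update: w <- w - (lr / b) * sum of per-sample gradients."""
    b = len(batch)
    grad_sum = [0.0] * len(w)
    for x, y in batch:
        g = grad_fn(x, y, w)
        grad_sum = [s + gi for s, gi in zip(grad_sum, g)]
    return [wi - lr / b * s for wi, s in zip(w, grad_sum)]


def squared_error_grad(x, y, w):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w (illustrative loss only)."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [err * xi for xi in x]


# Toy data generated from y = 2 * x; the whole set is used as one batch.
data = [([1.0], 2.0), ([2.0], 4.0), ([3.0], 6.0), ([4.0], 8.0)]
w = [0.0]
for _ in range(200):
    w = minibatch_sgd_step(w, data, squared_error_grad, lr=0.05)
print(w)  # approaches [2.0]
```

Because the per-sample gradients in a batch are independent of one another, the inner loop is the part that data-parallel training spreads across processors before the summed gradient is applied.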

By iterating in this way while using as many processors as possible to load more image data, the training time can be reduced substantially, in proportion to the number of processors. In addition, before the deep learning model is iteratively trained by the improved stochastic gradient descent iterative algorithm, the method further comprises a step of establishing the deep learning model. The method for establishing the deep learning model is the same as in the prior art and is therefore not described here.

S102: Train the additionally loaded image data in batches through an improved linear scaling and warm-up strategy, so as to improve the accuracy of training the deep learning model.

In embodiments of the present invention, when training with large batches, it is necessary to ensure that, for the same number of epochs, the test accuracy achieved is comparable to that of small batches. The number of epochs is fixed here because, statistically, one epoch means that the algorithm touches the entire data set once, while computationally, fixing the number of epochs means fixing the number of floating-point operations. The embodiments of the present invention train large batches of data using an improved linear scaling and warm-up strategy: 1. linear scaling: when the batch is increased from B to kB, the learning rate is also increased from η to kη; 2. warm-up strategy: if a relatively large learning rate kη is used, training starts from the relatively small learning rate η, which is increased to the relatively large learning rate kη over the first few epochs. With these techniques, relatively large batches of image data can be used within a certain range. Further, in order to adjust the weights more accurately, the embodiment of the invention can finally apply improved layer-wise adaptive rate scaling (LARS) to train the large-batch training layers in the batch training, obtaining the final rapid training model.

Specifically, to improve the accuracy of large-batch training, the method of the embodiment of the present invention uses a new rule for updating the learning rate (LR). Here the single-machine case is considered, in which the weights are updated using w = w - η∇w. With the data-parallel approach, the multi-machine version can be handled in the same way. Each layer of the deep learning model has its own weight w and gradient ∇w. The standard SGD algorithm uses the same LR (η) for all layers; however, routine experiments show that different layers may require different LRs, because the ratio between the weight norm ||w||_2 and the weight-gradient norm ||∇w||_2 varies considerably across layers. The embodiment of the present invention uses an improved LARS algorithm (the new learning-rate update rule) to solve this problem. The basic LR rule is defined in equation (1):

$$\eta = l \times \gamma \times \frac{\lVert w \rVert_2}{\lVert \nabla w \rVert_2} \tag{1}$$

In equation (1), l is a scaling factor; in embodiments of the present invention, l may be set to 0.001 in AlexNet and ResNet training. γ is a tuning parameter set by the user; a good value of γ usually lies in [1, 50]. With this equation, different layers may have different LRs. Momentum (denoted by μ) and weight decay (denoted by β) may be added to SGD, and the following steps are used for LARS: acquire the local learning rate η of each learnable parameter in the large-batch training layers in the batch training; acquire the true learning rate η' of each layer in the large-batch training layers in the batch training, where the true learning rate is η' = γ × α × η, γ is the tuning parameter set by the user with a value range of [1, 50], and α is the acceleration term; update the weight gradient by the formula ∇w = ∇w + βw, where ∇w is the weight gradient, w is the weight, and β is the weight decay; update the acceleration term α by the formula α = μα + η'∇w, where μ is the momentum; and update the weights using the formula w = w - α.

By using this method together with warm-up, large-batch SGD can achieve the same accuracy as the baseline, yielding the final trained rapid training model. To scale to larger batch sizes (e.g., 32k), local response normalization (LRN) needs to be replaced with batch normalization (BN). The method of the present invention adds BN after each convolutional layer of the deep neural network. The improved LARS provided by the embodiment of the invention can help ResNet-50 maintain high test accuracy. The existing method (unmodified linear scaling and warm-up) is much less accurate for batch sizes of 16k and 32k. It will be appreciated that the method proposed in the embodiments of the present invention can be used in practice for the distributed training of deep learning models of big data vehicle images.
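The linear scaling and warm-up strategy described in S102 can be sketched as a learning-rate schedule. The text only states that the rate rises from η to kη over the first few epochs; the linear ramp, the function names and the parameter `warmup_epochs` are assumptions made for this illustration.

```python
def scaled_learning_rate(base_lr, k):
    """Linear scaling: when the batch grows from B to k*B, use k * base_lr."""
    return k * base_lr


def warmup_learning_rate(epoch, base_lr, k, warmup_epochs):
    """Ramp from base_lr to k * base_lr over the first warmup_epochs epochs.

    The linear shape of the ramp is an assumption; after the warm-up
    the scaled rate k * base_lr is held constant.
    """
    target = scaled_learning_rate(base_lr, k)
    if warmup_epochs <= 0 or epoch >= warmup_epochs:
        return target
    return base_lr + (target - base_lr) * epoch / warmup_epochs


# Example: batch grown 16x (e.g. 512 -> 8192), warm-up over 5 epochs.
schedule = [warmup_learning_rate(e, base_lr=0.1, k=16, warmup_epochs=5)
            for e in range(8)]
print(schedule)
```

Starting at the small rate avoids applying kη to weights that are still far from a stable region, which is the motivation the text gives for the warm-up.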

S103: Train the large-batch training layers in the batch training through an improved adaptive rate scaling algorithm to obtain a rapid training model.

Specifically, to improve the accuracy of large-batch training, the method of the embodiment of the present invention uses a new rule for updating the learning rate (LR). Here the single-machine case is considered, in which the weights are updated using w = w - η∇w. With the data-parallel approach, the multi-machine version can be handled in the same way. Each layer of the deep learning model has its own weight w and gradient ∇w. The standard SGD algorithm uses the same LR (η) for all layers; however, routine experiments show that different layers may require different LRs, because the ratio between the weight norm ||w||_2 and the weight-gradient norm ||∇w||_2 varies considerably across layers. The embodiment of the present invention uses an improved LARS algorithm (the new learning-rate update rule) to solve this problem. The basic LR rule is defined in equation (1):

$$\eta = l \times \gamma \times \frac{\lVert w \rVert_2}{\lVert \nabla w \rVert_2} \tag{1}$$

In equation (1), l is a scaling factor; in embodiments of the present invention, l may be set to 0.001 in AlexNet and ResNet training. γ is a tuning parameter set by the user; a good value of γ usually lies in [1, 50]. With this equation, different layers may have different LRs. Momentum (denoted by μ) and weight decay (denoted by β) may be added to SGD, and the following steps are used for LARS: acquire the local learning rate η of each learnable parameter in the large-batch training layers in the batch training; acquire the true learning rate η' of each layer in the large-batch training layers in the batch training, where the true learning rate is η' = γ × α × η, γ is the tuning parameter set by the user with a value range of [1, 50], and α is the acceleration term; update the weight gradient by the formula ∇w = ∇w + βw, where ∇w is the weight gradient, w is the weight, and β is the weight decay; update the acceleration term α by the formula α = μα + η'∇w, where μ is the momentum; and update the weights using the formula w = w - α.

By using this method together with warm-up, large-batch SGD can achieve the same accuracy as the baseline, yielding the final trained rapid training model. To scale to larger batch sizes (e.g., 32k), local response normalization (LRN) needs to be replaced with batch normalization (BN). The method of the present invention adds BN after each convolutional layer of the deep neural network. The improved LARS provided by the embodiment of the invention can help ResNet-50 maintain high test accuracy. The existing method (unmodified linear scaling and warm-up) is much less accurate for batch sizes of 16k and 32k. It will be appreciated that the method presented in the embodiments of the present invention can be used in practice for the distributed training of deep learning models of big data vehicle images.
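The per-layer update steps listed in S103 can be sketched for a single layer as follows. This is a hedged illustration rather than the patented rule: the code takes the per-layer rate as γ times the local rate l·||w||_2 / ||∇w||_2, which is a simplifying assumption and not the exact η' expression stated in the text. The function names, the default values of `momentum` and `weight_decay`, and the small `eps` guard against division by zero are all hypothetical.

```python
import math


def l2_norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def lars_layer_step(w, grad, accel, gamma, trust=0.001,
                    momentum=0.9, weight_decay=0.0005, eps=1e-12):
    """One layer-wise update for a single layer; returns (new_w, new_accel)."""
    # Local rate from the ratio of weight norm to gradient norm.
    local_lr = trust * l2_norm(w) / (l2_norm(grad) + eps)
    # Per-layer rate (assumption: gamma times the local rate).
    true_lr = gamma * local_lr
    # Weight decay folded into the gradient: grad <- grad + beta * w.
    grad = [g + weight_decay * wi for g, wi in zip(grad, w)]
    # Momentum (acceleration) term: a <- mu * a + lr * grad.
    accel = [momentum * a + true_lr * g for a, g in zip(accel, grad)]
    # Weight update: w <- w - a.
    w = [wi - a for wi, a in zip(w, accel)]
    return w, accel


w, accel = lars_layer_step([3.0, 4.0], [0.6, 0.8], [0.0, 0.0],
                           gamma=10, weight_decay=0.0)
print(w, accel)
```

Because the rate is normalized by the gradient norm, scaling a layer's gradient by a constant leaves the size of its step essentially unchanged, which is what allows layers with very different norm ratios to share one user-tuned γ.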

S104: Input the vehicle images trained by the deep learning model and the rapid training model into a recurrent neural network (RNN) to detect whether a region of interest (RoI) of a vehicle image is a license plate.

In the embodiment of the invention, the vehicle images trained by the deep learning model and the rapid training model can be input into an RNN, and the RNN is used to perform RoI pooling on the trained vehicle images. An extraction layer is then added to the two fully connected (FC) layers in the RNN to convert the pooled features (also called regional features) into feature vectors. The RoI can then be scored and subjected to bounding-box regression through the feature vectors, and whether the RoI is a license plate is judged according to the score and the bounding-box regression. If the RoI is detected to be a license plate, the flow goes to S105. It can be understood that, since an extraction layer is added to the two FC layers and the license plate is detected by means of scoring and bounding-box regression, the embodiment of the present invention constructs a new RNN different from the prior art.

S105: Perform license plate recognition on the RoI through an integrated deep network model.

In the embodiment of the invention, since character segmentation of license plate images requires a large amount of calculation, the invention provides an integrated deep network model for license plate recognition in order to improve recognition efficiency. The integrated deep network model comprises a convolution layer, a bidirectional recurrent neural network (BRNN) layer, a linear transformation layer and a connectionist temporal classification (CTC) layer. The specific method for recognizing a region of interest of the vehicle image through the integrated deep network model can be understood in connection with fig. 2 as follows:

First, RoI-pooled feature extraction is performed on the region of interest of the vehicle image (such as Gui A.02U10), and the extracted features (such as regional features of size C × X × Y) are processed through two convolution layers and a rectangular pooling layer between the two convolution layers, so as to transform the extracted features into a feature sequence of size D × L; wherein D = 512, L = 19, and the feature sequence is denoted V = (v_1, v_2, ..., v_L).

Second, the feature sequence V is applied to the BRNN layer, which forms two mutually separate RNNs: one RNN processes the feature sequence V in the forward direction, and the other processes it in the backward direction. The two hidden states are concatenated and input into a linear transformation layer with 37 outputs, which are passed to a Softmax layer that converts the 37 outputs into probabilities, corresponding to 26 letters, 10 digits and one non-character class. Through the encoding of the BRNN layer, the feature sequence V is thus converted into a probability estimate q = (q_1, q_2, ..., q_L) with the same length L. Meanwhile, a long short-term memory network (LSTM) is used to define memory cells containing three multiplicative gates, so as to selectively retain relevant information and solve the vanishing gradient problem in RNN training.
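The conversion of the 37 linear-layer outputs into per-frame probabilities can be sketched as below. The ordering of the classes and the use of "-" for the non-character class are assumptions for illustration; the text only fixes the counts (26 letters, 10 digits, one non-character class) and the sequence length L = 19.

```python
import math
import string

# 26 letters + 10 digits + 1 non-character class = 37 outputs.
BLANK = "-"
CLASSES = list(string.ascii_uppercase) + list(string.digits) + [BLANK]


def softmax(logits):
    """Numerically stable softmax over one frame of linear-layer outputs."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def frame_probabilities(linear_outputs):
    """Map L frames of 37 outputs to the probability estimate q = (q_1, ..., q_L)."""
    return [softmax(frame) for frame in linear_outputs]


# Placeholder input: L = 19 frames of all-zero outputs give uniform probabilities.
q = frame_probabilities([[0.0] * len(CLASSES) for _ in range(19)])
print(len(q), len(q[0]), round(sum(q[0]), 6))
```

Each q_t is a distribution over the 37 classes for one position of the feature sequence, which is the input expected by the CTC decoding step.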

Third, sequence decoding is performed on the probability estimate q through the CTC layer, and an approximately optimal path with the maximum probability is found from the decoded probability estimate q:

$$\pi^{*} \approx B\left(\arg\max_{\pi} P(\pi \mid q)\right)$$

wherein π* is the approximately optimal path with the highest probability (e.g., A02U10), the operator B removes repeated labels and non-character labels, and P is the probability operation. For example: B(a-a-b-) = B(-a--a-bb) = (aab). The CTC layer has the same structure as existing CTC, and its details are therefore not described here.
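The B operator and the search for an approximately optimal path can be illustrated with a short sketch. Taking the most probable class in every frame (greedy best-path decoding) is one common way to approximate the maximum-probability path and is an assumption here, since the text does not specify the search procedure. The "-" symbol for the non-character label and the function names are likewise hypothetical.

```python
from itertools import groupby

BLANK = "-"


def collapse(path, blank=BLANK):
    """Operator B: merge consecutive repeated labels, then drop the blank label."""
    merged = [label for label, _ in groupby(path)]
    return "".join(label for label in merged if label != blank)


def greedy_decode(q, classes, blank=BLANK):
    """Pick the most probable class per frame, then apply B to the resulting path."""
    best_path = [classes[max(range(len(frame)), key=frame.__getitem__)]
                 for frame in q]
    return collapse(best_path, blank)


print(collapse("a-a-b-"))
print(collapse("-a--a-bb"))

# Toy example with three classes and four frames.
toy_classes = [BLANK, "A", "B"]
toy_q = [[0.1, 0.8, 0.1], [0.1, 0.8, 0.1], [0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
print(greedy_decode(toy_q, toy_classes))
```

The non-character label is what lets a genuinely doubled character survive the merge: two identical labels separated by a blank are kept as two characters, whereas adjacent identical labels are treated as one.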

Fourth, a loss function of the integrated deep network model is determined from the approximately optimal path, and license plate recognition is performed on the RoI through the loss function (for example, the license plate is recognized as Gui A.02U10). The method for recognizing the RoI through the loss function of the model is the same as in the prior art and is therefore not described in detail here. It should be noted that, in addition to the main layers (the two convolution layers, the BRNN layer, the linear transformation layer and the CTC layer), the integrated deep network model may also include a Softmax layer and a rectangular pooling layer between the two convolution layers, and the convolution layers may also be regarded as a convolutional neural network.

In the license plate recognition method based on combination of deep learning and big data provided in fig. 1, the deep learning model of the big data vehicle images can be iteratively trained through an improved stochastic gradient descent iterative algorithm, where each iteration of training uses more processors to load more image data than the previous one. The additionally loaded image data is trained in batches through an improved linear scaling and warm-up strategy to adjust the accuracy of training, and the large-batch training layers in the batch training are trained through an improved adaptive rate scaling algorithm to obtain a rapid training model. License plate detection and recognition are then performed on the vehicle images trained by the deep learning model and the rapid training model through an improved RNN and an integrated deep network model. In this way, the training cost of license plate recognition can be reduced, computationally expensive character segmentation is avoided, and the efficiency and effectiveness of big data license plate recognition are improved.

Referring to fig. 3, fig. 3 is a block diagram of a license plate recognition device based on combination of deep learning and big data according to an embodiment of the present invention. As shown in fig. 3, the license plate recognition device 30 based on the combination of deep learning and big data in the present embodiment includes an iterative training module 301, an accuracy training module 302, a scaling improvement module 303, a license plate detection module 304, and a license plate recognition module 305. The iterative training module 301, the accuracy training module 302, the scaling improvement module 303, the license plate detection module 304 and the license plate recognition module 305 are respectively used for executing the specific methods in S101 to S105 in fig. 1. For details, refer to the relevant description of fig. 1; only a brief description is given here:

The iterative training module 301 is configured to iteratively train a deep learning model of the big data vehicle images through an improved stochastic gradient descent iterative algorithm; wherein, each time the deep learning model of the big data vehicle images is iteratively trained, more processors are used to load more vehicle image data than in the previous iteration of training.

The accuracy training module 302 is configured to train the additionally loaded vehicle image data in batches through an improved linear scaling and warm-up strategy, so as to improve the accuracy of training the deep learning model of the big data vehicle images; the improved linear scaling includes: when the batch is increased from B to kB, simultaneously increasing the learning rate from η to kη; the improved warm-up strategy includes: if a relatively large learning rate kη is used, starting from the relatively small learning rate η and increasing it to the relatively large learning rate kη over the first few epochs.

The scaling improvement module 303 is configured to train the large-batch training layers in the batch training through an improved adaptive rate scaling algorithm, so as to obtain a rapid training model.

The license plate detection module 304 is configured to input the vehicle images trained by the deep learning model and the rapid training model into an RNN, so as to detect whether the RoI of a vehicle image is a license plate.

The license plate recognition module 305 is configured to perform license plate recognition on the RoI through an integrated deep network model when a license plate is detected; the integrated deep network model comprises a convolution layer, a BRNN layer, a linear transformation layer and a CTC layer.

Further, referring to fig. 4, the license plate detection module 304 may specifically include a pooling unit 3041, a conversion unit 3042, and a determination unit 3043:

a pooling unit 3041, configured to input the vehicle image trained by the deep learning model and the rapid training model into an RNN, and perform RoI pooling on the trained vehicle image by using the RNN.

And a conversion unit 3042, configured to convert the pooled feature into a feature vector by adding an extraction layer to two fully connected layers in the RNN.

A determination unit 3043, configured to score the RoI and perform bounding-box regression on it using the feature vector, and to determine whether the RoI is a license plate according to the score and the bounding-box regression.

Further, referring to fig. 5, the license plate recognition module 305 may specifically include a feature extraction unit 3051, a probability estimation unit 3052, an optimal path unit 3053, and a recognition unit 3054:

the feature extraction unit 3051 is configured to extract the RoI-pooled features of the region of interest, and to process the extracted features through two convolution layers and a rectangular pooling layer between them, so as to transform the extracted features into a feature sequence of size D×L, wherein D = 512 and L = 19, and the feature sequence is denoted V = (V1, V2, ..., VL).

The probability estimation unit 3052 is configured to apply the feature sequence V to the BRNN layer, which forms two separate recurrent neural networks (RNNs): one RNN processes the feature sequence V in the forward direction and the other processes it in the backward direction. The two hidden states are concatenated and input into a linear transformation layer with 37 outputs, which are passed to a Softmax layer that converts them into probabilities corresponding to 26 letters, 10 digits and one non-character class. Through this encoding by the BRNN layer, the feature sequence V is converted into a probability estimate q = (q1, q2, ..., qL) of the same length L. In addition, long short-term memory (LSTM) is used to define memory cells containing three gates that selectively retain relevant information, which addresses the vanishing gradient problem in RNN training.

An optimal path unit 3053, configured to perform sequence decoding on the probability estimate q through the CTC layer, and to find the approximately optimal path with the maximum probability from the decoded probability estimate q:

$$\pi^{*} \approx B\left(\arg\max_{\pi} P(\pi \mid q)\right)$$

wherein π* is the approximately optimal path with the maximum probability, the operator B removes repeated labels and non-character labels, and P denotes the probability operation.

A recognition unit 3054, configured to determine a loss function of the integrated deep network model through the approximately optimal path, and to perform license plate recognition on the RoI through the loss function.
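The decoding performed by the optimal path unit can be illustrated with a short Python sketch of greedy (best-path) CTC decoding: the most probable class is taken at each of the L positions, and the resulting path is then collapsed by merging consecutive repeats and dropping the non-character label. The ordering of the 37 classes (non-character label at index 0, then letters, then digits) and the names `BLANK`, `ALPHABET`, `softmax` and `best_path_decode` are assumptions made for illustration; the source does not specify them.

```python
import math

# Assumed label layout: index 0 is the non-character (blank) class,
# indices 1-26 are letters and 27-36 are digits, giving 37 outputs.
BLANK = 0
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"


def softmax(scores):
    """Convert the raw outputs of one sequence position into probabilities."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def best_path_decode(q):
    """Greedy CTC decoding of a probability estimate q = (q1, ..., qL).

    Picks the most probable class at each position, then merges
    consecutive repeated labels and removes the non-character label.
    """
    path = [max(range(len(step)), key=step.__getitem__) for step in q]
    decoded = []
    previous = None
    for label in path:
        if label != previous and label != BLANK:
            decoded.append(ALPHABET[label - 1])
        previous = label
    return "".join(decoded)


if __name__ == "__main__":
    # Build a toy estimate whose per-position argmax follows a known path.
    toy_path = [1, 1, 0, 1, 2, 2, 0, 27]
    q = [softmax([5.0 if i == p else 0.0 for i in range(37)])
         for p in toy_path]
    print(best_path_decode(q))
```

A non-character label between two identical characters keeps them distinct after collapsing, which is why plates containing doubled characters can still be read without explicit character segmentation.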

The license plate recognition device based on combination of deep learning and big data shown in fig. 3 iteratively trains a deep learning model of big data vehicle images through an improved stochastic gradient descent iterative algorithm; in each training iteration, more processors are used to load more image data than in the previous iteration. The additionally loaded image data is trained in batches through an improved linear scaling and warm-up strategy, which improves the training accuracy. License plate detection and recognition are then performed on the vehicle images through an improved RNN and an integrated deep network model. This reduces the training cost of license plate recognition, avoids computationally expensive character segmentation, and improves the efficiency and effectiveness of big data license plate recognition.
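The linear scaling and warm-up strategy mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ramp from η to kη is taken to be linear in the training step, and the names `scaled_learning_rate`, `warmup_learning_rate` and `warmup_steps` are hypothetical; the source states only that the rate rises from η to kη during the first part of training.

```python
def scaled_learning_rate(base_lr: float, k: float) -> float:
    """Linear scaling: a batch grown from B to k*B uses rate k*eta."""
    return k * base_lr


def warmup_learning_rate(step: int, base_lr: float, k: float,
                         warmup_steps: int) -> float:
    """Learning rate at a given step under warm-up.

    Starts at base_lr (eta), rises to k*base_lr (k*eta) over
    warmup_steps, then stays at k*base_lr. The linear ramp shape
    is an assumption made for this sketch.
    """
    target_lr = scaled_learning_rate(base_lr, k)
    if warmup_steps <= 0 or step >= warmup_steps:
        return target_lr
    fraction = step / warmup_steps
    return base_lr + fraction * (target_lr - base_lr)


if __name__ == "__main__":
    # Batch size multiplied by k = 8, warm-up lasting 4 steps.
    for step in range(6):
        print(step, warmup_learning_rate(step, base_lr=0.1, k=8,
                                         warmup_steps=4))
```

Starting at the small rate avoids applying the full kη to freshly initialised weights, while the scaled rate after warm-up keeps the per-sample step size comparable to small-batch training.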

Fig. 6 is a schematic diagram of a terminal device according to an embodiment of the present invention. As shown in fig. 6, the terminal device 6 of this embodiment includes: a processor 60, a memory 61, and a computer program 62 stored in the memory 61 and executable on the processor 60, such as a program for license plate recognition based on combination of deep learning and big data. When executing the computer program 62, the processor 60 implements the steps in the above method embodiments, e.g. S101 to S105 shown in fig. 1. Alternatively, when executing the computer program 62, the processor 60 performs the functions of the modules/units of the above apparatus embodiments, e.g. the functions of the modules 301 to 305 shown in fig. 3.

Illustratively, the computer program 62 may be partitioned into one or more modules/units that are stored in the memory 61 and executed by the processor 60 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specified functions, the instruction segments describing the execution of the computer program 62 in the terminal device 6. For example, the computer program 62 may be partitioned into an iterative training module 301, an accuracy training module 302, a scaling improvement module 303, a license plate detection module 304, and a license plate recognition module 305 (modules in the virtual device), whose specific functions are as described above.

The terminal device 6 may be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server. The terminal device 6 may include, but is not limited to, a processor 60 and a memory 61. It will be appreciated by those skilled in the art that fig. 6 is merely an example of the terminal device 6 and does not constitute a limitation of the terminal device 6, which may include more or fewer components than illustrated, may combine certain components, or may use different components; e.g., the terminal device may further include an input-output device, a network access device, a bus, etc.

The processor 60 may be a central processing unit (CPU), or may be another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.

The memory 61 may be an internal storage unit of the terminal device 6, such as a hard disk or a memory of the terminal device 6. The memory 61 may also be an external storage device of the terminal device 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the terminal device 6. Further, the memory 61 may also include both an internal storage unit and an external storage device of the terminal device 6. The memory 61 is used for storing the computer program as well as other programs and data required by the terminal device 6. The memory 61 may also be used for temporarily storing data that has been output or is to be output.

It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.

In the foregoing embodiments, each embodiment is described with its own emphasis; for parts that are not described or illustrated in detail in a particular embodiment, reference may be made to the related descriptions of other embodiments.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other manners. For example, the apparatus/terminal device embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical function division, and there may be additional divisions in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.

The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.

The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the method of the above embodiments may be implemented by a computer program instructing related hardware; the computer program may be stored in a computer readable storage medium, and when executed by a processor, implements the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer readable medium may be appropriately expanded or reduced according to the requirements of legislation and patent practice in a given jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.

The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims

1. The license plate recognition method based on the combination of deep learning and big data is characterized by comprising the following steps of:

inputting the vehicle images trained by the deep learning model and the fast training model into a recurrent neural network (RNN) to detect whether a region of interest (ROI) of the vehicle images is a license plate;

if the ROI is a license plate, performing license plate recognition on the ROI through an integrated deep network model; the integrated deep network model comprises a convolution layer, a bidirectional recurrent neural network (BRNN) layer, a linear transformation layer and a connectionist temporal classification (CTC) layer;

the iterative training of the deep learning model of the big data vehicle images by the improved stochastic gradient descent iterative algorithm comprises the following steps:

constructing a loss function L(w) of the deep learning model of the big data vehicle images:

$$L(w) = \frac{1}{n} \sum_{i=1}^{n} l(x_i, y_i, w)$$

wherein w represents the weights of the deep neural network (DNN), X is the training data, n is the number of samples in X, Y represents the labels of the training data X, x_i is a sample in the training data X, and l(x_i, y_i, w) is the loss computed for x_i and its label y_i (i ∈ {1, 2, ..., n});

updating the weights of the DNN according to the gradient of the loss function with respect to the weights each time the deep learning model of the big data vehicle images is iteratively trained:

$$w_{t+1} = w_t - \frac{\eta}{b} \sum_{x \in B_t} \nabla l(x, y, w_t)$$

wherein w_t is the weight after the (t-1)-th iteration, w_{t+1} is the weight after the t-th iteration, η is the learning rate, the batch of the t-th iteration is B_t, and the size of B_t is b; at each training iteration, more processors are used to load more image data than in the previous iteration;

training the large-batch training layers in the batch training through an improved adaptive rate scaling algorithm to obtain a fast training model comprises the following steps:

acquiring a local learning rate η of each learnable parameter in the large-batch training layers in the batch training;

acquiring a true learning rate η' of each layer in the large-batch training layers in the batch training; the true learning rate is η' = γ × α × η, wherein γ is a user-tuned adjustment parameter with a value range of [1, 50], and α is the acceleration term;

updating the weight gradient by the formula ∇w = ∇w + βw, wherein ∇w is the weight gradient, w is the weight, and β is the weight decay;

updating the acceleration term α by the formula α = μα + η'∇w, wherein μ is the momentum;

updating the weights using the formula w = w - α to obtain the fast training model;

the license plate recognition of the ROI through the integrated depth network model comprises the following steps:

extracting the ROI-pooled features of the region of interest, and processing the extracted features through two convolution layers and a rectangular pooling layer between the two convolution layers to transform the extracted features into a feature sequence of size D×L; wherein D = 512 and L = 19, and the feature sequence is denoted V = (V1, V2, ..., VL);

applying the feature sequence V to the BRNN layer to form two separate recurrent neural networks (RNNs), wherein one RNN processes the feature sequence V in the forward direction and the other RNN processes the feature sequence V in the backward direction; concatenating the two hidden states and inputting them into a linear transformation layer with 37 outputs, which are passed to a Softmax layer that converts the 37 outputs into probabilities corresponding to 26 letters, 10 digits and one non-character class; through this encoding by the BRNN layer, the feature sequence V is converted into a probability estimate q = (q1, q2, ..., qL) of the same length L; meanwhile, a long short-term memory (LSTM) network is used to define memory cells containing three multiplicative gates that selectively retain relevant information, so as to solve the vanishing gradient problem in RNN training;

performing sequence decoding on the probability estimate q through the CTC layer, and finding the approximately optimal path with the maximum probability from the decoded probability estimate q:

$$\pi^{*} \approx B\left(\arg\max_{\pi} P(\pi \mid q)\right)$$

wherein π* is the approximately optimal path with the maximum probability, the operator B removes repeated labels and non-character labels, and P denotes the probability operation;

and determining a loss function of the integrated deep network model through the approximately optimal path, and performing license plate recognition on the ROI through the loss function.

2. The license plate recognition method based on combination of deep learning and big data according to claim 1, wherein the inputting the vehicle image trained by the deep learning model and the fast training model into the integrated deep network model to detect whether the ROI of the vehicle image is a license plate comprises:

inputting the vehicle images trained by the deep learning model and the fast training model into an RNN, and performing ROI pooling on the trained vehicle images using the RNN;

adding an extraction layer to the two fully connected layers in the RNN to convert the pooled features into a feature vector;

and scoring the ROI and performing bounding-box regression on it using the feature vector, and determining whether the ROI is a license plate according to the score and the bounding-box regression.

3. A license plate recognition device based on combination of deep learning and big data, characterized by comprising:

the license plate recognition module is used for performing license plate recognition on the ROI through an integrated deep network model when a license plate is detected; the integrated deep network model comprises a convolution layer, a BRNN layer, a linear transformation layer and a CTC layer;

wherein w represents the weights of the deep neural network (DNN), X is the training data, n is the number of samples in X, Y represents the labels of the training data X, x_i is a sample in the training data X, and l(x_i, y_i, w) is the loss computed for x_i and its label y_i (i ∈ {1, 2, ..., n});

updating the acceleration term α by the formula α = μα + η'∇w, wherein μ is the momentum;

the license plate recognition module comprises:

the feature extraction unit is used for extracting the ROI-pooled features of the region of interest, and processing the extracted features through two convolution layers and a rectangular pooling layer between the two convolution layers so as to transform the extracted features into a feature sequence of size D×L; wherein D = 512 and L = 19, and the feature sequence is denoted V = (V1, V2, ..., VL);

the probability estimation unit is used for applying the feature sequence V to the BRNN layer to form two separate recurrent neural networks (RNNs), wherein one RNN processes the feature sequence V in the forward direction and the other RNN processes the feature sequence V in the backward direction; the two hidden states are concatenated and input into a linear transformation layer with 37 outputs, which are passed to a Softmax layer that converts the 37 outputs into probabilities corresponding to 26 letters, 10 digits and one non-character class; through this encoding by the BRNN layer, the feature sequence V is converted into a probability estimate q = (q1, q2, ..., qL) of the same length L; meanwhile, LSTM is used to define memory cells containing three multiplicative gates that selectively retain relevant information, so as to solve the vanishing gradient problem in RNN training;

an optimal path unit, configured to perform sequence decoding on the probability estimate q through the CTC layer, and to find the approximately optimal path with the maximum probability from the decoded probability estimate q:

$$\pi^{*} \approx B\left(\arg\max_{\pi} P(\pi \mid q)\right)$$

and the recognition unit is used for determining a loss function of the integrated deep network model through the approximately optimal path, and performing license plate recognition on the ROI through the loss function.

4. The license plate recognition device based on combination of deep learning and big data according to claim 3, wherein the license plate detection module comprises:

a pooling unit for inputting the vehicle images trained by the deep learning model and the fast training model into an RNN, and performing ROI pooling on the trained vehicle images using the RNN;

the conversion unit is used for converting the pooled features into a feature vector by adding an extraction layer to the two fully connected layers in the RNN;

and the determination unit is used for scoring the ROI and performing bounding-box regression on it using the feature vector, and determining whether the ROI is a license plate according to the score and the bounding-box regression.

5. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1-2.

6. A computer readable medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1-2.