CN108108746B - License plate character recognition method based on Caffe deep learning framework - Google Patents
- Publication number: CN108108746B
- Application number: CN201710823771.4A
- Authority
- CN
- China
- Prior art keywords
- image
- layer
- license plate
- output
- deep learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
- G06F18/24 — Pattern recognition; analysing; classification techniques
- G06F18/214 — Generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06V10/267 — Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
- G06V20/54 — Surveillance or monitoring of activities of traffic, e.g. cars on the road, trains or boats
- G06V20/63 — Scene text, e.g. street names
- G06V20/625 — License plates
- G06V30/10 — Character recognition
Abstract
In some license plate character recognition settings, the captured characters often suffer strong noise pollution such as uneven segmentation, strong illumination contrast, and severe occlusion. For such noise-polluted characters, the invention provides a license plate character recognition method based on the Caffe deep learning framework. A convolutional neural network is built under the Caffe framework, and a network structure with strong robustness and high recognition precision is obtained by training the network parameters. Meanwhile, by applying scaling, tilt correction, and normalization to both the training samples and the images to be recognized, the method addresses the low recognition precision of existing license plate character recognition methods on tilted, noisy, and similar-looking characters, thereby greatly improving the recognition precision of license plate characters.
Description
The technical field is as follows:
the invention belongs to the technical field of license plate recognition, and particularly relates to a license plate character recognition method of a Caffe deep learning framework.
Background art:
license plate character recognition is an important technology in fields such as public transportation and security, with substantial practical value. Current license plate recognition methods fall into two types. The first type is non-machine-learning recognition, such as template matching; it requires no training and has a relatively simple processing pipeline, but it is easily disturbed by noise and other factors in complex environments and is slow at recognizing single characters. The second type is machine-learning-based recognition, for example recognition based on a BP neural network, or license plate character recognition combining an SVM with orthogonal Gaussian moments; these achieve good results on specific license plate characters, but are sensitive to the number of training samples and have low robustness.
The present application provides a license plate character recognition method based on the Caffe deep learning framework. Experimental results show that the proposed method achieves a good recognition effect; in particular, in complex environments with noise interference it remains highly robust, and the recognition rate of license plate characters still exceeds 95%. These advantages give the method good potential value in practical applications.
The invention content is as follows:
aiming at the typical problems existing in the current license plate character recognition, the application provides a license plate character recognition method based on a Caffe deep learning framework, and the license plate character recognition is realized by introducing the Caffe deep learning framework and building a convolutional neural network on the basis.
In order to achieve the purpose, the technical scheme provided by the invention is as follows:
a license plate character recognition method based on a Caffe deep learning framework comprises the following steps:
step 1: multiple color license plate sample images shot under different illumination intensity, inclination angle, shielding degree and noise pollution conditions;
step 2: preprocessing the acquired license plate sample image to obtain a segmented sub-image sample, selecting the sub-image sample containing characters, and forming a character sub-image sample set S;
step 3: further processing each sample s in the character sub-image sample set S to obtain a processed character sub-image sample set S';
step 4: storing the samples in the character sub-image sample set S' in a list, and applying a classification mark to each item in the list;
step 5: constructing a six-layer Caffe deep learning network structure, the six layers being, in order, convolutional layer 1, pooling layer 1, convolutional layer 2, pooling layer 2, fully connected layer 1, and fully connected layer 2; after the network structure is built, its parameters are initialized, the initial parameters being generated randomly by the system;
step 6: inputting each sample in the character sub-image sample set S' and the corresponding classification mark thereof into a Caffe deep learning network structure for character recognition, comparing the recognition result with the classification mark of the image sample, and adjusting the parameters of the network structure according to the comparison result to obtain a trained Caffe deep learning network structure;
step 7: capturing a license plate image with an electronic police camera for license plate character recognition;
and 8: preprocessing the acquired license plate image to obtain a segmented sub-image, removing the sub-image containing Chinese characters according to the position of the sub-image in the license plate image, and reserving other sub-images containing characters;
step 9: further processing each character sub-image to obtain processed character sub-images;
step 10: performing character recognition on each processed character sub-image by using the Caffe deep learning network structure obtained by training in the step 6 to obtain a recognition result;
step 11: combining the recognition results of the character sub-images in sequence to obtain the recognition result of the license plate;
the further processing in step 3 and step 9 comprises:
1) normalizing the image to n × m;
2) performing inclination correction on the normalized image;
3) carrying out the scaling operation on the characters: first, binarize the tilt-corrected image to obtain a black-and-white image. Scan downwards from the first row of the black-and-white image to find the first pixel with value 255, denoted point f; check whether a pixel with value 255 also exists among the adjacent coordinates below it, and if so, take f as the starting point of the character in the vertical direction and record its row coordinate fx. Scan upwards from the last row of the black-and-white image to find the first pixel with value 255, denoted point l; check whether a pixel with value 255 also exists among the adjacent coordinates above it, and if so, take l as the end point of the character in the vertical direction and record its row coordinate lx. Compute the character height h = |fx − lx| and compare it with a set standard height H. If h/H is smaller than the set first threshold, enlarge the tilt-corrected n × m image by bilinear interpolation to size (nH/h) × (mH/h), then crop the enlarged image, keeping the central n × m part. If h/H is larger than the set second threshold, shrink the n × m image by equal-interval sampling to size (nH/h) × (mH/h), then pad its border to restore size n × m. If h/H is greater than or equal to the set first threshold and less than or equal to the set second threshold, apply no scaling to the tilt-corrected image.
Furthermore, the computation of convolutional layer 1 and convolutional layer 2 in the Caffe deep learning network structure is:

x_j = f( Σ_{i∈M_j} x_i * k_{ij} + b_j )  (1)

In formula (1), M_j denotes the set of input feature maps acted on by convolution kernel k_{ij}, x_i is the i-th input feature map in the set M_j, b_j is the additive bias of convolution kernel k_{ij}, f(·) is the ReLU activation function, and x_j is the convolution output feature map;
the calculation formulas of the pooling layers 1 and 2 are as follows:
in the formula (2), down (·) represents a pooling kernel function, f' (·) is an activation function, a ReLU function is adopted,andrespectively for each output bitThe corresponding pooling weights and additive biases of the token map,the feature map is input for pooling,outputting a characteristic diagram for pooling;
the calculation formula of the full connection layer 1 and the full connection layer 2 is as follows:
in the formula (3)In order to fully connect the weight value,in order to be fully connected with the bias,the characteristic diagram is input for the full connection,for fully connected output feature vectors, f "(. cndot.) is the ReLU activation function.
Further, after each sample and the corresponding classification label are input into the Caffe deep learning network structure in step 6, the following processing is further included:
the image output from the full connection layer 2 is subjected to the processing of formula (4):
yi=xi-max(x1,...,xn) (4)
wherein y_i is a pixel of the processed output image and x_i is a pixel of the output image of fully connected layer 2; then y_i is normalized to obtain the probability value p_i of each output node:

p_i = e^{y_i} / Σ_{k=1}^{n} e^{y_k}  (5)
Finally, the loss function is computed by the loss layer; the loss value L(θ) is given by formula (6):

L(θ) = −(1/m) Σ_{i=1}^{m} Σ_j I{y^{(i)} = j} · log p_j  (6)

In formula (6), j is the class label corresponding to y^{(i)}, I{·} is the indicator function (it outputs 1 when y^{(i)} = j and 0 otherwise), m is the batch size over which the average loss is computed, and θ denotes the parameters to be optimized, namely the weights and biases of all layers.
Further, the step 6 further comprises:
the step 6 further comprises:
calculating the Caffe deep learning network structure parameters by back propagation. First, the gradient of the loss layer is given by formula (7):

∂L(θ)/∂y_i = p_i − I{y^{(i)} = i}  (7)

In formula (7), L(θ) is the output of the loss layer, y_i is a pixel of the output image after processing by said formula (4), and p_i is the probability value of each output node computed by formula (5);
then, the fully connected layer parameter gradients ∇W and ∇b are computed as in formula (8):

∂L/∂W = δ · xᵀ,  ∂L/∂b = δ  (8)

where δ is the gradient of the fully connected layer output, y is the fully connected layer output feature map, and x is the fully connected layer input feature map;
in equation (10) up (-) is the inverse of the down-sampling,is the gradient output by the pooling layer,is an output characteristic diagram of the pooling layer;
finally, the convolutional layer kernel gradients ∇k_{ij} are computed as in formula (11):

∂L/∂k_{ij} = Σ_{u,v} (δ_j)_{uv} · (p_i)_{uv}  (11)

In formula (11), p_i denotes the patch of the input feature map acted on by convolution kernel k_{ij}, δ_j is the gradient of the feature map generated after convolution with kernel k_{ij}, and u, v index the set of input and output feature map positions related to kernel k_{ij};
through the direction propagation process, the gradient of each layer of network parameters can be obtained, and then the parameters are updated layer by layer to complete the learning process of the network.
Furthermore, in step 4, the marks 0–9 are used to label samples whose character in the sample image is a digit 0–9, and the marks 10–35 are used to label samples whose character is a letter A–Z.
Furthermore, the preprocessing comprises license plate positioning, license plate image extraction and character segmentation operation of the image.
The invention has the following beneficial effects: the license plate character recognition method based on the Caffe deep learning framework overcomes the prior-art problem that pictures shot in different environments are processed with inconsistent accuracy; the method requires no hand-crafted features, the algorithm is highly robust, and high recognition precision is obtained even in complex environments with noise interference.
Description of the drawings:
FIG. 1 is a flow chart of a license plate character recognition method based on a Caffe deep learning framework in the invention.
FIG. 2 is a license plate image preprocessing process of the present invention.
FIG. 3 is a Caffe deep learning network structure of the present invention.
The specific implementation mode is as follows:
for a further understanding of the invention, reference will now be made in detail to the embodiments illustrated in the drawings and examples, but the invention is not limited to the embodiments.
The invention discloses a license plate character recognition method based on a Caffe deep learning framework, which has a flow diagram shown in figure 1 and comprises the following steps.
Step 101: multiple color license plate sample images taken under different conditions. The different conditions are shooting by selecting different conditions of illumination intensity, inclination angle, shielding degree and noise pollution. And comprehensively considering the training speed and the reliability of the training result, and selecting to shoot 2000 license plate sample images.
Step 102: and preprocessing the acquired license plate sample image to obtain a segmented sub-image, and selecting the sub-image containing characters to form a character sub-image sample set S. The preprocessing process is shown in fig. 2 and includes license plate positioning, license plate image extraction, and character segmentation operation.
Step 103: and further processing each sample S in the character sub-image sample set S to obtain a processed character sub-image sample set S'.
Wherein, the further processing of the sample specifically comprises the following substeps:
step A: carrying out normalization processing on the sample image s, and processing the size of the sample image s into n multiplied by m; the sample image size is set to 24 × 40 in this application.
Step B: perform tilt correction on the normalized sample image to obtain a sample image s'. Tilt correction may use the widely adopted Hough-transform algorithm.
Step C: carry out the scaling operation on the character. Although every sample has the same size after normalization, the proportion occupied by the character differs between samples, and this variation degrades model accuracy, so the character size must be unified. First, binarize the sample image s' to obtain a black-and-white image. Scan downwards from the first row to find the first pixel with value 255, denoted point f; check whether a pixel with value 255 also exists among the adjacent coordinates below it, and if so, take f as the starting point of the character in the vertical direction and record its row coordinate fx. Scan upwards from the last row to find the first pixel with value 255, denoted point l; check whether a pixel with value 255 also exists among the adjacent coordinates above it, and if so, take l as the end point of the character in the vertical direction and record its row coordinate lx. Compute the height h = |fx − lx| and compare it with the set standard height H. If h/H is smaller than the set first threshold, enlarge the sample image s' by bilinear interpolation from n × m to (nH/h) × (mH/h), then crop the enlarged image, keeping the central n × m part. If h/H is larger than the set second threshold, shrink s' by equal-interval sampling from n × m to (nH/h) × (mH/h), then pad the image border to restore size n × m. If h/H is greater than or equal to the set first threshold and less than or equal to the set second threshold, do not scale s'. In this step, the first threshold and the second threshold may be set to 0.9 and 1.1, respectively.
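The threshold test of step C can be sketched as a small helper; the function name is illustrative, and the standard height H is an assumed value (the embodiment fixes only the thresholds 0.9 and 1.1):

```python
def scale_decision(h, H=32, t1=0.9, t2=1.1):
    """Decide the scaling action for a character of measured height h
    against an assumed standard height H, using thresholds t1=0.9, t2=1.1."""
    r = h / H
    if r < t1:
        return "enlarge"   # bilinear interpolation, then centre-crop to n x m
    if r > t2:
        return "shrink"    # equal-interval subsampling, then pad border to n x m
    return "keep"          # height already close enough to the standard
```

For example, a character of height 20 against H = 32 gives a ratio of 0.625, so the image is enlarged.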
Step 104: and storing the samples in the character sub-image sample set S' in a list mode, and carrying out classification marking on each item in the list. For example, the marks 0-9 can be used for marking samples with characters of 0-9 in the sample image respectively, and the marks 10-35 can be used for marking samples with characters of A-Z in the sample image respectively;
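The marking scheme of step 104 — digits 0–9 map to marks 0–9 and letters A–Z to marks 10–35 — can be sketched as a small helper (the function name is illustrative):

```python
def char_label(c):
    """Map a character to its class mark: '0'-'9' -> 0-9, 'A'-'Z' -> 10-35."""
    if c.isdigit():
        return ord(c) - ord('0')
    return 10 + ord(c) - ord('A')
```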
step 105: constructing a six-layer Caffe deep learning network structure, and then initializing parameters of the network structure. The six layers are sequentially a convolution layer 1, a pooling layer 1, a convolution layer 2, a pooling layer 2, a full-connection layer 1 and a full-connection layer 2, and the structure of the six layers is shown in fig. 3. The initial parameters are randomly generated by the system.
Step 106: and (3) training the Caffe deep learning network structure by using each sample in the character sub-image sample set S' to obtain the trained Caffe deep learning network structure.
Inputting each sample and the corresponding classification mark in the character sub-image sample set S' into a Caffe deep learning network structure for character recognition, comparing the recognition result with the classification mark of the image sample, and adjusting the parameters of the network structure according to the comparison result to obtain the trained Caffe deep learning network structure.
First, a sample image is input to convolutional layer 1 for convolution; the convolution computation is shown in formula (1):

x_j = f( Σ_{i∈M_j} x_i * k_{ij} + b_j )  (1)

In formula (1), M_j denotes the set of input feature maps acted on by convolution kernel k_{ij}, x_i is the i-th input feature map in the set, b_j is the additive bias of kernel k_{ij}, f(·) is the activation function (this application uses ReLU), and x_j is the convolution output feature map. The input to convolutional layer 1 is a 24 × 40 sample image; the convolution kernel size is 5 × 5 and the stride is 1, so the output image size after convolutional layer 1 is 20 × 36.
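As a minimal sketch of formula (1), the following implements a single-channel "valid" convolution with stride 1 followed by ReLU, assuming one input map and one kernel (real Caffe layers sum over many maps and kernels):

```python
import numpy as np

def conv2d_valid(img, kernel, bias=0.0):
    """'Valid' convolution of one input map with one kernel, stride 1,
    followed by the ReLU activation f(.) of formula (1)."""
    H, W = img.shape
    kh, kw = kernel.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = (img[r:r + kh, c:c + kw] * kernel).sum() + bias
    return np.maximum(out, 0.0)  # ReLU
```

With a 24 × 40 input and a 5 × 5 kernel this yields a 20 × 36 output, matching the size quoted above.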
Then, the output image of convolutional layer 1 is input to pooling layer 1. The purpose of pooling is to reduce the spatial resolution of the convolutional layer by downsampling. A pooling layer can be regarded as a special convolutional layer whose pooling kernel is fixed by the algorithm rather than learned; each input feature map is acted on one-to-one by the pooling kernel to produce a corresponding output feature map. The number of output feature maps therefore equals the number of input feature maps, but the data is compressed and robustness is enhanced. Pooling can be divided into mean pooling and maximum pooling; both follow the formula:

x_j^out = f'( β_j · down(x_j^in) + b_j )  (2)

In formula (2), down(·) denotes the pooling kernel function, f'(·) is the activation function (ReLU), and β_j and b_j are the pooling weight and additive bias corresponding to each output feature map, learned during training; x_j^in is the pooling input feature map and x_j^out the pooling output feature map. Pooling layer 1 uses maximum pooling with a 2 × 2 pooling kernel and stride 2, so the network output picture size after pooling is 10 × 18.
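The down(·) operation for maximum pooling with a 2 × 2 kernel and stride 2 can be sketched in NumPy (the learned weight β and bias b of formula (2) are omitted for brevity):

```python
import numpy as np

def max_pool(x, s=2):
    """Non-overlapping s x s max pooling with stride s on one feature map."""
    H, W = x.shape
    # Trim to a multiple of s, group into s x s blocks, take the max per block.
    return x[:H - H % s, :W - W % s].reshape(H // s, s, W // s, s).max(axis=(1, 3))
```

A 20 × 36 input yields a 10 × 18 output, matching the size quoted above.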
The output image of the pooling layer 1 is input to the convolutional layer 2 for processing, the convolutional parameters and the convolution calculation method of the convolutional layer 2 are the same as those of the convolutional layer 1, and the size of the output picture of the layer is 6 × 14.
The fourth layer is pooling layer 2; its parameters and pooling computation are the same as pooling layer 1, except that this layer uses mean pooling; its output picture size is 3 × 7.
The forward pass of a fully connected layer is similar to the BP neural network algorithm: before computation the input feature map must be unrolled from a two-dimensional map into a one-dimensional vector, after which the fully connected computation is carried out. The fully connected formula is:

y = f''( W x + b )  (3)

In formula (3), W is the fully connected weight, b is the fully connected bias (adjusted automatically by training), x is the fully connected input feature map, y is the fully connected output feature vector, and f''(·) is the activation function. The image processed by pooling layer 2 is input to fully connected layer 1; this layer has 400 neuron nodes, is fully connected to the output neurons of the fourth layer, and its output passes through a ReLU activation function.
The sixth layer is a fully-connected layer 2, 10 output neurons of the layer correspond to 10 types of labels, and the fully-connected calculation mode of the layer is the same as that of the fully-connected layer 1.
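The feature-map sizes quoted through the forward pass (24 × 40 → 20 × 36 → 10 × 18 → 6 × 14 → 3 × 7) follow from the standard output-size formulas; a quick check (helper names are illustrative):

```python
def conv_out(size, k=5, stride=1):
    # 'valid' convolution output size: (size - k) / stride + 1
    return (size - k) // stride + 1

def pool_out(size, k=2, stride=2):
    # non-overlapping pooling output size
    return (size - k) // stride + 1

shape = (24, 40)                            # input sample image
shape = tuple(conv_out(d) for d in shape)   # conv1, 5x5, stride 1 -> (20, 36)
shape = tuple(pool_out(d) for d in shape)   # pool1, 2x2, stride 2 -> (10, 18)
shape = tuple(conv_out(d) for d in shape)   # conv2, 5x5, stride 1 -> (6, 14)
shape = tuple(pool_out(d) for d in shape)   # pool2, 2x2, stride 2 -> (3, 7)
```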
In the training process, in order to ensure that the maximum value of the output image is 0, the feature map output by the full connection layer 2 needs to be processed as shown in formula (4):
yi=xi-max(x1,...,xn) (4)
wherein y_i is a pixel of the processed output image and x_i is a pixel of the output image of fully connected layer 2. Then y_i must be normalized to obtain the probability value p_i of each output node. The processing method is:

p_i = e^{y_i} / Σ_{k=1}^{n} e^{y_k}  (5)
finally, in order to measure the accuracy of identification, a loss function needs to be calculated through a loss layer, the invention adopts Softmax as the output of the loss function, and the calculation formula of the loss value L (theta) is shown as the formula (6):
in the formula (6), j is y(i)Corresponding class label, I {. is an illustrative function, i.e., when y(i)J is the output 1, the other is 0, m is the batch size, finally the average loss is calculated, theta represents the parameter to be optimized, i.e.And
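Formulas (4)–(6) together form the standard numerically stable softmax cross-entropy; a NumPy sketch over a batch (the function name is illustrative):

```python
import numpy as np

def softmax_loss(x, labels):
    """x: (m, n) outputs of fully connected layer 2; labels: (m,) class indices.
    Returns the mean cross-entropy loss and the per-node probabilities."""
    y = x - x.max(axis=1, keepdims=True)                  # formula (4): max becomes 0
    p = np.exp(y) / np.exp(y).sum(axis=1, keepdims=True)  # formula (5): softmax
    m = x.shape[0]
    L = -np.log(p[np.arange(m), labels]).mean()           # formula (6): mean loss
    return L, p
```

For an all-zero input over n classes the probabilities are uniform and the loss equals log n.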
after the training samples are identified through the Caffe deep learning network, the gradient obtained by back propagation of each layer of network parameters needs to be adjusted according to the identification result, and the gradient is reversely calculated layer by the convolution layer, the pooling layer and the loss layer through a chain type derivation method. Gradient of loss layerThe calculation formula is as shown in formula (7):
y in formula (7)iIs the pixel of the output image after processing of equation (4), L (θ) is the output of the loss layer, piThe probability value for each output node calculated for equation (5). Then, calculating parameters of the full connection layerAndthe calculation formula is shown as follows:
In formula (8), δ is the gradient of the fully connected layer output, y is the fully connected layer output feature map, and x is the fully connected layer input feature map. Since this application uses the ReLU activation function, the gradient must be multiplied by the ReLU derivative during reverse conduction: when the output of the fully connected layer is greater than 0 the gradient continues to propagate back, and when the output is less than or equal to 0 the gradient stops propagating.
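The ReLU gating rule just described — pass the gradient where the layer output is positive, block it otherwise — can be sketched as:

```python
import numpy as np

def relu_backward(grad_out, layer_out):
    """Gate the incoming gradient by the ReLU derivative: pass where the
    layer output is positive, block where it is zero (input was <= 0)."""
    return grad_out * (layer_out > 0)
```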
δ_j = up(δ_j^pool) ∘ f'(x_j^pool)  (10)

In formula (10), up(·) is the inverse of down-sampling, δ_j^pool is the gradient output by the pooling layer, and x_j^pool is the output feature map of the pooling layer.
Finally, the convolutional layer kernel gradients are computed as in formula (11):

∂L/∂k_{ij} = Σ_{u,v} (δ_j)_{uv} · (p_i)_{uv}  (11)

In formula (11), p_i denotes the patch of the input feature map acted on by convolution kernel k_{ij}, and δ_j is the gradient of the feature map generated after convolution with k_{ij}; u, v index the set of input and output feature map positions related to kernel k_{ij}.
Through the above back propagation process, the gradient of each layer's network parameters is obtained; the parameters are then updated layer by layer to complete the learning process of the network.
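The patent leaves the concrete update rule unspecified; as one common choice, a plain gradient-descent step with an assumed learning rate, applied layer by layer, could look like:

```python
def sgd_update(params, grads, lr=0.01):
    """One gradient-descent step over a layer's parameter list.
    The learning rate lr is an assumed hyperparameter, not from the patent."""
    return [p - lr * g for p, g in zip(params, grads)]
```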
Step 107: and (4) capturing a license plate image through an electronic police for license plate character recognition.
Step 108: and preprocessing the acquired license plate image to obtain a segmented sub-image, removing the sub-image containing the Chinese characters according to the position of the sub-image in the license plate image, and reserving other sub-images containing characters. The preprocessing in this step is the same as the preprocessing in step 102.
Since the first character of a license plate is usually a Chinese character, only the leftmost segmented sub-image needs to be removed when selecting the character-containing sub-images.
Step 109: and further processing each character sub-image to obtain a processed character sub-image.
The further processing of the character subimage t specifically comprises the following substeps:
step a: carrying out normalization processing on the character sub-image t, and processing the size of the character sub-image t into n multiplied by m; in the present application, n is set to be 24 and m is set to be 40.
Step b: perform tilt correction on the normalized image to obtain an image t'. Tilt correction may use the widely adopted Hough-transform algorithm.
Step c: carry out the scaling operation on the character. First, binarize the image t' to obtain a black-and-white image. Scan downwards from the first row to find the first pixel with value 255, denoted point f; check whether a pixel with value 255 also exists among the adjacent coordinates below it, and if so, take f as the starting point of the character in the vertical direction and record its row coordinate fx. Scan upwards from the last row to find the first pixel with value 255, denoted point l; check whether a pixel with value 255 also exists among the adjacent coordinates above it, and if so, take l as the end point of the character in the vertical direction and record its row coordinate lx. Compute the height h = |fx − lx| and compare it with the set standard height H. If h/H is smaller than the set first threshold, enlarge the image t' by bilinear interpolation from n × m to (nH/h) × (mH/h), then crop the enlarged image, keeping the central n × m part. If h/H is larger than the set second threshold, shrink t' by equal-interval sampling from n × m to (nH/h) × (mH/h), then pad the image border to restore size n × m. If h/H is greater than or equal to the set first threshold and less than or equal to the set second threshold, do not scale t'. In this step, the first threshold and the second threshold may be set to 0.9 and 1.1, respectively.
Step 110: and (4) performing character recognition on each processed character sub-image by using the Caffe deep learning network structure obtained by training in the step 106 to obtain a recognition result.
The identification process is as follows:
First, the character sub-image is input to convolutional layer 1 for convolution processing. The convolution is computed as shown in formula (12):

x_j = f( Σ_{i ∈ M_j} x_i * k_ij + b_j )    (12)

In formula (12), M_j denotes the set of input feature maps acted on by the convolution kernel k_ij, x_i is the i-th input feature map in the set M_j, b_j is the additive bias of the convolution kernel k_ij, f(·) is the activation function (this application uses ReLU as the activation function), and x_j is the convolution output feature map. The input to convolutional layer 1 is a 24 × 40 sample image; the convolution kernel size is 5 × 5 and the convolution stride is 1, so the output image size after convolutional layer 1 is 20 × 36.
Then, the output image of convolutional layer 1 is input to pooling layer 1 for pooling. The pooling is computed as shown in formula (13):

x_j^out = f'( β_j · down(x_j^in) + b_j )    (13)

In formula (13), down(·) denotes the pooling kernel function and f'(·) is the activation function, for which the ReLU function is adopted. β_j and b_j are the pooling weight and additive bias corresponding to each output feature map, both learned during training. x_j^in is the pooling input feature map and x_j^out is the pooling output feature map. Pooling layer 1 uses max pooling with a 2 × 2 pooling kernel and stride 2; the network output picture size after pooling is 10 × 18.
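A minimal sketch of the formula (13) pooling (names illustrative; `mode` selects the max pooling of pooling layer 1 or the mean pooling of pooling layer 2):

```python
import numpy as np

def pool_layer(x, beta=1.0, b=0.0, k=2, stride=2, mode="max"):
    """Formula (13): x_out = f'( beta * down(x_in) + b ), with f' = ReLU.

    down(.) is a k x k max (or mean) over non-overlapping windows.
    """
    oh, ow = x.shape[0] // stride, x.shape[1] // stride
    out = np.zeros((oh, ow))
    for r in range(oh):
        for c in range(ow):
            win = x[r * stride:r * stride + k, c * stride:c * stride + k]
            out[r, c] = win.max() if mode == "max" else win.mean()
    return np.maximum(beta * out + b, 0.0)   # ReLU activation
```

Pooling a 20 × 36 map with a 2 × 2 kernel at stride 2 yields 10 × 18, as stated above.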
The output image of pooling layer 1 is input to convolutional layer 2 for processing. The convolution parameters and calculation method of convolutional layer 2 are the same as those of convolutional layer 1; the output picture size of this layer is 6 × 14.
The output image of convolutional layer 2 is input to pooling layer 2 for further processing. Its parameters and pooling calculation are the same as those of pooling layer 1, except that this layer uses mean pooling; the output image size is 3 × 7.
The output image of pooling layer 2 is input to fully connected layer 1 for full-connection processing. The fully connected layer is computed as shown in formula (14):

x^out = f''( w · x^in + b )    (14)

In formula (14), w is the fully connected weight, b is the fully connected bias, x^in is the fully connected input feature map, x^out is the fully connected output feature vector, and f''(·) is the activation function. Fully connected layer 1 has 400 neuron nodes, is fully connected to the output neurons of pooling layer 2, and its output passes through a ReLU activation function.
The output of fully connected layer 1 is further input to fully connected layer 2. The 10 output neurons of this layer correspond to the 10 class labels, and its full-connection calculation is the same as that of fully connected layer 1.
After the six layers of processing, the output is a class label, from which the recognition result of the character is obtained.
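The size bookkeeping through the six layers described above (valid 5 × 5 convolutions at stride 1, non-overlapping 2 × 2 pooling at stride 2) can be checked with a short sketch:

```python
def conv_out(h, w, k=5):
    """Valid convolution, stride 1: each dimension shrinks by k - 1."""
    return h - k + 1, w - k + 1

def pool_out(h, w, s=2):
    """Non-overlapping pooling, stride 2: each dimension halves."""
    return h // s, w // s

size = (24, 40)          # input character sub-image
size = conv_out(*size)   # convolutional layer 1 -> (20, 36)
size = pool_out(*size)   # pooling layer 1       -> (10, 18)
size = conv_out(*size)   # convolutional layer 2 -> (6, 14)
size = pool_out(*size)   # pooling layer 2       -> (3, 7)
print(size)
```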
Step 111: and combining the recognition results of the character sub-images in sequence to obtain the recognition result of the license plate.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (7)
1. A license plate character recognition method based on a Caffe deep learning framework, characterized in that the recognition method comprises the following steps:
step 1: acquiring a plurality of color license plate sample images shot under different illumination intensity, inclination angle, occlusion degree and noise pollution conditions;
step 2: preprocessing the acquired license plate sample image to obtain a segmented sub-image sample, selecting the sub-image sample containing characters, and forming a character sub-image sample set S;
step 3: further processing each sample s in the character sub-image sample set S to obtain a processed character sub-image sample set S';
step 4: storing the samples in the character sub-image sample set S' in a list, and applying a classification mark to each item in the list;
step 5: constructing a six-layer Caffe deep learning network structure, the six layers being, in order, convolution layer 1, pooling layer 1, convolution layer 2, pooling layer 2, full connection layer 1 and full connection layer 2; after the network structure is established, initializing the parameters of the network structure, the initial parameters being generated randomly by the system;
step 6: inputting each sample in the character sub-image sample set S' and the corresponding classification mark thereof into a Caffe deep learning network structure for character recognition, comparing the recognition result with the classification mark of the image sample, and adjusting the parameters of the network structure according to the comparison result to obtain a trained Caffe deep learning network structure;
step 7: acquiring a license plate image captured by an electronic police camera for license plate character recognition;
step 8: preprocessing the acquired license plate image to obtain segmented sub-images, removing the sub-image containing a Chinese character according to the position of the sub-image in the license plate image, and retaining the other sub-images containing characters;
step 9: further processing each character sub-image to obtain a processed character sub-image;
step 10: performing character recognition on each processed character sub-image by using the Caffe deep learning network structure obtained by training in the step 6 to obtain a recognition result;
step 11: combining the recognition results of the character sub-images in sequence to obtain the recognition result of the license plate;
said further processing in said steps 3 and 9 comprises:
1) normalizing the image to n × m;
2) performing inclination correction on the normalized image;
3) scaling the characters: first binarizing the tilt-corrected image to obtain a black-and-white image; scanning downward from the first row of the black-and-white image to find the first pixel with value 255, denoted point f, and judging whether a pixel with value 255 also exists among the coordinates adjacent below it; if so, taking point f as the starting point of the character in the vertical direction and recording its row coordinate fx; scanning upward from the last row of the black-and-white image to find the first pixel with value 255, denoted point l, and judging whether a pixel with value 255 also exists among the coordinates adjacent above it; if so, taking point l as the end point of the character in the vertical direction and recording its row coordinate lx; computing the height h = |fx - lx| and comparing h with the set standard height H; if h/H is smaller than the set first threshold, enlarging the tilt-corrected image by bilinear interpolation from size n × m to size (H/h)n × (H/h)m, then cropping the enlarged image and keeping the central portion of size n × m; if h/H is larger than the set second threshold, reducing the tilt-corrected image by equal-interval sampling from size n × m to size (H/h)n × (H/h)m, then padding the image borders to fill it back to size n × m; if h/H is greater than or equal to the set first threshold and less than or equal to the set second threshold, not scaling the tilt-corrected image.
2. The license plate character recognition method of the Caffe deep learning framework as claimed in claim 1, wherein: the calculation process of the convolution layer 1 and the convolution layer 2 in the Caffe deep learning network structure is as follows:
x_j = f( Σ_{i ∈ M_j} x_i * k_ij + b_j )    (1)

In formula (1), M_j denotes the set of input feature maps acted on by the convolution kernel k_ij, x_i is the i-th input feature map in the set M_j, b_j is the additive bias of the convolution kernel k_ij, f(·) is the ReLU activation function, and x_j is the convolution output feature map;
the calculation formula of pooling layers 1 and 2 is as follows:

x_j^out = f'( β_j · down(x_j^in) + b_j )    (2)

In formula (2), down(·) denotes the pooling kernel function, f'(·) is the activation function, for which the ReLU function is adopted, β_j and b_j are respectively the pooling weight and additive bias corresponding to each output feature map, x_j^in is the pooling input feature map, and x_j^out is the pooling output feature map;
the calculation formula of full connection layer 1 and full connection layer 2 is as follows:

x^out = f''( w · x^in + b )    (3)
4. The license plate character recognition method of the Caffe deep learning framework according to claim 2 or 3, characterized in that: after each sample and the corresponding classification label are input into the Caffe deep learning network structure in step 6, the following processing is also included:
the image output from the full connection layer 2 is subjected to the processing of formula (4):
y_i = x_i - max(x_1, ..., x_n)    (4)
wherein y_i is a pixel of the processed output image and x_i is a pixel of the output image of full connection layer 2; y_i is then normalized to obtain the probability value p_i of each output node:

p_i = e^{y_i} / Σ_{j=1}^{n} e^{y_j}    (5)
Finally, a loss function is calculated through the loss layer; the loss value L(θ) is computed as shown in formula (6):

L(θ) = -log(p_k)    (6)

where k is the classification mark of the input sample.
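Formulas (4) and (5) amount to a numerically stabilized softmax over the outputs of full connection layer 2, followed by a loss at the loss layer; a sketch, where the exact form of formula (6) is reconstructed as the negative log-probability of the true class (an assumption, since the formula image is not reproduced here):

```python
import numpy as np

def softmax_loss(x, label):
    """x: outputs of full connection layer 2; label: the classification mark."""
    y = x - np.max(x)                  # formula (4): shift by the max for stability
    p = np.exp(y) / np.exp(y).sum()    # formula (5): probability of each output node
    loss = -np.log(p[label])           # formula (6): cross-entropy (assumed form)
    return p, loss
```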
5. The license plate character recognition method of the Caffe deep learning framework as claimed in claim 4, wherein: the step 6 further comprises:
calculating the Caffe deep learning network structure parameters by back propagation; first, the gradient of the loss layer is calculated as shown in formula (7):

∂L(θ)/∂y_i = p_i - 1{i = k}    (7)

where 1{i = k} is 1 when i equals the classification mark k of the input sample and 0 otherwise;
in formula (7), L(θ) is the output of the loss layer, y_i is a pixel of the output image after processing by said formula (4), and p_i is the probability value of each output node obtained by said formula (5);
then, the full connection layer parameter gradients ∂L/∂w and ∂L/∂b are calculated as shown in formula (8):

∂L/∂w = δ · (x^in)^T,  ∂L/∂b = δ    (8)

where δ is the gradient of the full connection layer output, x^out is the full connection layer output feature map, and x^in is the full connection layer input feature map;
the gradient passed back through the pooling layer is calculated using equation (10):

δ_j^conv = β_j · ( up(δ_j^pool) ∘ f'(x_j^pool) )    (10)

In equation (10), up(·) is the inverse operation of the down-sampling, δ_j^pool is the gradient output by the pooling layer, and x_j^pool is the output feature map of the pooling layer;
finally, the convolution layer parameter gradient ∂L/∂k_ij is calculated as shown in formula (11):

∂L/∂k_ij = Σ_{u,v} (δ_j)_{u,v} · (x_i)_{u,v}    (11)

In formula (11), x_i denotes the input feature map acted on by the convolution kernel k_ij, δ_j is the gradient of the feature map generated after the action of the convolution kernel k_ij, and u, v range over the set of input and output feature map positions related to the convolution kernel k_ij;
through this back propagation process, the gradient of each layer's network parameters is obtained, and the parameters are then updated layer by layer to complete the learning process of the network.
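The loss-layer and full-connection-layer gradients above can be sketched with the standard softmax and linear-layer gradient forms, assumed here to match formulas (7) and (8); names are illustrative:

```python
import numpy as np

def loss_grad(p, label):
    """Loss-layer gradient: p_i minus the one-hot indicator of the label."""
    g = p.copy()
    g[label] -= 1.0
    return g

def fc_grads(delta, x_in):
    """Full connection layer gradients: dL/dw = delta . x_in^T, dL/db = delta."""
    return np.outer(delta, np.ravel(x_in)), delta
```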
6. The license plate character recognition method of the Caffe deep learning framework as claimed in claim 1, wherein: in step 4, marks 0-9 are used to mark samples whose character in the sample image is 0-9, respectively, and marks 10-35 are used to mark samples whose character in the sample image is A-Z, respectively.
7. The license plate character recognition method of the Caffe deep learning framework as claimed in claim 1, wherein: the preprocessing comprises the steps of license plate positioning, license plate image extraction and character segmentation operation of the image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710823771.4A CN108108746B (en) | 2017-09-13 | 2017-09-13 | License plate character recognition method based on Caffe deep learning framework |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710823771.4A CN108108746B (en) | 2017-09-13 | 2017-09-13 | License plate character recognition method based on Caffe deep learning framework |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108108746A CN108108746A (en) | 2018-06-01 |
CN108108746B true CN108108746B (en) | 2021-04-09 |
Family
ID=62207466
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710823771.4A Expired - Fee Related CN108108746B (en) | 2017-09-13 | 2017-09-13 | License plate character recognition method based on Caffe deep learning framework |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108108746B (en) |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109214443A (en) * | 2018-08-24 | 2019-01-15 | 北京第视频科学技术研究院有限公司 | Car license recognition model training method, licence plate recognition method, device and equipment |
CN109325489B (en) * | 2018-09-14 | 2021-05-28 | 浙江口碑网络技术有限公司 | Image recognition method and device, storage medium and electronic device |
CN109766805B (en) * | 2018-12-28 | 2022-12-09 | 安徽清新互联信息科技有限公司 | Deep learning-based double-layer license plate character recognition method |
CN109815956B (en) * | 2018-12-28 | 2022-12-09 | 安徽清新互联信息科技有限公司 | License plate character recognition method based on self-adaptive position segmentation |
CN109754011B (en) * | 2018-12-29 | 2019-11-12 | 北京中科寒武纪科技有限公司 | Data processing method, device and Related product based on Caffe |
CN110427937B (en) * | 2019-07-18 | 2022-03-22 | 浙江大学 | Inclined license plate correction and indefinite-length license plate identification method based on deep learning |
CN110598703B (en) * | 2019-09-24 | 2022-12-20 | 深圳大学 | OCR (optical character recognition) method and device based on deep neural network |
CN110659640B (en) * | 2019-09-27 | 2021-11-30 | 深圳市商汤科技有限公司 | Text sequence recognition method and device, electronic equipment and storage medium |
CN110956133A (en) * | 2019-11-29 | 2020-04-03 | 上海眼控科技股份有限公司 | Training method of single character text normalization model, text recognition method and device |
CN111209858B (en) * | 2020-01-06 | 2023-06-20 | 电子科技大学 | Real-time license plate detection method based on deep convolutional neural network |
CN111291761B (en) * | 2020-02-17 | 2023-08-04 | 北京百度网讯科技有限公司 | Method and device for recognizing text |
CN111401139B (en) * | 2020-02-25 | 2024-03-29 | 云南昆钢电子信息科技有限公司 | Method for obtaining mine underground equipment position based on character image intelligent recognition |
CN113496227A (en) * | 2020-04-08 | 2021-10-12 | 顺丰科技有限公司 | Training method and device of character recognition model, server and storage medium |
CN111539426B (en) * | 2020-04-27 | 2023-03-14 | 合肥工业大学 | High-precision license plate recognition system based on FPGA |
CN113392814B (en) * | 2021-08-16 | 2021-11-02 | 冠传网络科技(南京)有限公司 | Method and device for updating character recognition model and storage medium |
CN113761961B (en) * | 2021-09-07 | 2023-08-04 | 杭州海康威视数字技术股份有限公司 | Two-dimensional code identification method and device |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101183427A (en) * | 2007-12-05 | 2008-05-21 | 浙江工业大学 | Computer vision based peccancy parking detector |
CN105335745A (en) * | 2015-11-27 | 2016-02-17 | 小米科技有限责任公司 | Recognition method, device and equipment for numbers in images |
CN105893968A (en) * | 2016-03-31 | 2016-08-24 | 华南理工大学 | Text-independent end-to-end handwriting recognition method based on deep learning |
CN105975968A (en) * | 2016-05-06 | 2016-09-28 | 西安理工大学 | Caffe architecture based deep learning license plate character recognition method |
CN106127248A (en) * | 2016-06-24 | 2016-11-16 | 平安科技(深圳)有限公司 | Car plate sorting technique based on degree of depth study and system |
CN106295645A (en) * | 2016-08-17 | 2017-01-04 | 东方网力科技股份有限公司 | A kind of license plate character recognition method and device |
CN106384112A (en) * | 2016-09-08 | 2017-02-08 | 西安电子科技大学 | Rapid image text detection method based on multi-channel and multi-dimensional cascade filter |
CN106503707A (en) * | 2016-10-21 | 2017-03-15 | 浙江宇视科技有限公司 | Licence plate recognition method and device under the conditions of a kind of infrared light filling |
CN106650740A (en) * | 2016-12-15 | 2017-05-10 | 深圳市华尊科技股份有限公司 | License plate identification method and terminal |
CN106778730A (en) * | 2016-12-29 | 2017-05-31 | 深圳爱拼信息科技有限公司 | A kind of adaptive approach and system for quickly generating OCR training samples |
US20170177965A1 (en) * | 2015-12-17 | 2017-06-22 | Xerox Corporation | Coarse-to-fine cascade adaptations for license plate recognition with convolutional neural networks |
CN107025452A (en) * | 2016-01-29 | 2017-08-08 | 富士通株式会社 | Image-recognizing method and image recognition apparatus |
2017
- 2017-09-13 CN CN201710823771.4A patent/CN108108746B/en not_active Expired - Fee Related
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101183427A (en) * | 2007-12-05 | 2008-05-21 | 浙江工业大学 | Computer vision based peccancy parking detector |
CN105335745A (en) * | 2015-11-27 | 2016-02-17 | 小米科技有限责任公司 | Recognition method, device and equipment for numbers in images |
US20170177965A1 (en) * | 2015-12-17 | 2017-06-22 | Xerox Corporation | Coarse-to-fine cascade adaptations for license plate recognition with convolutional neural networks |
CN107025452A (en) * | 2016-01-29 | 2017-08-08 | 富士通株式会社 | Image-recognizing method and image recognition apparatus |
CN105893968A (en) * | 2016-03-31 | 2016-08-24 | 华南理工大学 | Text-independent end-to-end handwriting recognition method based on deep learning |
CN105975968A (en) * | 2016-05-06 | 2016-09-28 | 西安理工大学 | Caffe architecture based deep learning license plate character recognition method |
CN106127248A (en) * | 2016-06-24 | 2016-11-16 | 平安科技(深圳)有限公司 | Car plate sorting technique based on degree of depth study and system |
CN106295645A (en) * | 2016-08-17 | 2017-01-04 | 东方网力科技股份有限公司 | A kind of license plate character recognition method and device |
CN106384112A (en) * | 2016-09-08 | 2017-02-08 | 西安电子科技大学 | Rapid image text detection method based on multi-channel and multi-dimensional cascade filter |
CN106503707A (en) * | 2016-10-21 | 2017-03-15 | 浙江宇视科技有限公司 | Licence plate recognition method and device under the conditions of a kind of infrared light filling |
CN106650740A (en) * | 2016-12-15 | 2017-05-10 | 深圳市华尊科技股份有限公司 | License plate identification method and terminal |
CN106778730A (en) * | 2016-12-29 | 2017-05-31 | 深圳爱拼信息科技有限公司 | A kind of adaptive approach and system for quickly generating OCR training samples |
Non-Patent Citations (3)
Title |
---|
A Novel Kernel PCA Support Vector Machine Algorithm with Feature Transition Function; Wang Lianhong, et al.; Proceedings of the 26th Chinese Control Conference; 20071231; entire document *
License plate character recognition based on wavelet kernel LS-SVM; Guang Yang; 2011 3rd International Conference on Computer Research and Development; 20110505; entire document *
CNN-based license plate digit character recognition algorithm; Ou Xianfeng, et al.; Journal of Chengdu Technological University; 20161231; Vol. 19, No. 4; entire document *
Also Published As
Publication number | Publication date |
---|---|
CN108108746A (en) | 2018-06-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108108746B (en) | License plate character recognition method based on Caffe deep learning framework | |
CN109784333B (en) | Three-dimensional target detection method and system based on point cloud weighted channel characteristics | |
CN108229479B (en) | Training method and device of semantic segmentation model, electronic equipment and storage medium | |
US8750619B2 (en) | Character recognition | |
CN108121991B (en) | Deep learning ship target detection method based on edge candidate region extraction | |
CN109903331B (en) | Convolutional neural network target detection method based on RGB-D camera | |
CN106682629B (en) | Identification algorithm for identity card number under complex background | |
CN107038416B (en) | Pedestrian detection method based on binary image improved HOG characteristics | |
CN110766020A (en) | System and method for detecting and identifying multi-language natural scene text | |
CN113888461A (en) | Method, system and equipment for detecting defects of hardware parts based on deep learning | |
CN111709980A (en) | Multi-scale image registration method and device based on deep learning | |
CN114693661A (en) | Rapid sorting method based on deep learning | |
CN113657528B (en) | Image feature point extraction method and device, computer terminal and storage medium | |
CN112084952B (en) | Video point location tracking method based on self-supervision training | |
CN113052170A (en) | Small target license plate recognition method under unconstrained scene | |
CN110472632B (en) | Character segmentation method and device based on character features and computer storage medium | |
CN110516731B (en) | Visual odometer feature point detection method and system based on deep learning | |
CN107704864B (en) | Salient object detection method based on image object semantic detection | |
CN111079585B (en) | Pedestrian re-identification method combining image enhancement with pseudo-twin convolutional neural network | |
CN107358244A (en) | A kind of quick local invariant feature extraction and description method | |
CN116363535A (en) | Ship detection method in unmanned aerial vehicle aerial image based on convolutional neural network | |
CN110795995A (en) | Data processing method, device and computer readable storage medium | |
CN106530300A (en) | Flame identification algorithm of low-rank analysis | |
CN110414301B (en) | Train carriage crowd density estimation method based on double cameras | |
CN110705568A (en) | Optimization method for image feature point extraction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20210409 Termination date: 20210913 |