CN108108746B - License plate character recognition method based on Caffe deep learning framework - Google Patents


Info

Publication number
CN108108746B
CN108108746B (application CN201710823771.4A)
Authority
CN
China
Prior art keywords
image
layer
license plate
output
deep learning
Prior art date
Legal status
Expired - Fee Related
Application number
CN201710823771.4A
Other languages
Chinese (zh)
Other versions
CN108108746A
Inventor
欧先锋
张国云
彭鑫
吴健辉
郭龙源
涂兵
周建婷
Current Assignee
Hunan Institute of Science and Technology
Original Assignee
Hunan Institute of Science and Technology
Priority date
Filing date
Publication date
Application filed by Hunan Institute of Science and Technology
Priority to CN201710823771.4A
Publication of CN108108746A
Application granted
Publication of CN108108746B
Status: Expired - Fee Related
Anticipated expiration

Classifications

    • G06F18/24: Pattern recognition; classification techniques
    • G06F18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V10/267: Image preprocessing; segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06V20/54: Scenes; surveillance or monitoring of activities of traffic, e.g. cars on the road, trains or boats
    • G06V20/63: Scenes; scene text, e.g. street names
    • G06V20/625: Scenes; text, e.g. of license plates; license plates
    • G06V30/10: Character recognition

Abstract

In many license plate character recognition scenarios, the segmented characters are heavily polluted by noise: uneven segmentation, strong illumination contrast, severe occlusion and the like. For such noise-polluted characters, the invention provides a license plate character recognition method based on the Caffe deep learning framework. A convolutional neural network is built under the Caffe framework and its parameters are trained to obtain a network structure with strong robustness and high recognition accuracy. At the same time, by applying scaling, tilt correction and normalization to both the training samples and the images to be recognized, the method overcomes the low accuracy of existing license plate character recognition methods on tilted, noisy and similar-looking characters, greatly improving the recognition accuracy of license plate characters.

Description

License plate character recognition method based on Caffe deep learning framework
Technical field:
The invention belongs to the technical field of license plate recognition, and particularly relates to a license plate character recognition method based on the Caffe deep learning framework.
Background art:
License plate character recognition is an important technology in fields such as public transportation and security protection, and has significant practical application value. Current license plate recognition methods fall into two classes. The first class comprises recognition methods not based on machine learning, such as template matching; these need no training and their processing is relatively simple, but in complex environments they are easily disturbed by noise and other factors, and recognizing a single character takes a long time. The second class comprises recognition methods based on machine learning, for example BP-neural-network methods and methods combining an SVM with orthogonal Gaussian moments; these achieve good recognition of specific license plate characters, but they are easily affected by the number of training samples and have low robustness.
The present application provides a license plate character recognition method based on the Caffe deep learning framework. Experimental results show that the proposed method achieves a good recognition effect; in particular, in complex environments with noise interference it is strongly robust, and the recognition rate of license plate characters still exceeds 95%. These advantages give the method good potential value in practical applications.
Summary of the invention:
Aiming at the typical problems in current license plate character recognition, the present application provides a license plate character recognition method based on the Caffe deep learning framework: the Caffe deep learning framework is introduced and a convolutional neural network is built on top of it to realize license plate character recognition.
In order to achieve this purpose, the technical scheme provided by the invention is as follows:
A license plate character recognition method based on the Caffe deep learning framework comprises the following steps:
Step 1: acquire multiple color license plate sample images shot under different illumination intensities, inclination angles, occlusion degrees and noise pollution conditions;
Step 2: preprocess the acquired license plate sample images to obtain segmented sub-image samples, and select the sub-image samples containing characters to form a character sub-image sample set S;
Step 3: further process each sample s in the character sub-image sample set S to obtain a processed character sub-image sample set S';
Step 4: store the samples of the character sub-image sample set S' in a list and attach a classification label to each item in the list;
Step 5: construct a six-layer Caffe deep learning network structure, the six layers being, in order, convolution layer 1, pooling layer 1, convolution layer 2, pooling layer 2, fully-connected layer 1 and fully-connected layer 2; after the network structure is built, initialize its parameters, the initial parameters being generated randomly by the system;
Step 6: input each sample in the character sub-image sample set S' together with its classification label into the Caffe deep learning network structure for character recognition, compare the recognition result with the label of the image sample, and adjust the parameters of the network structure according to the comparison result to obtain a trained Caffe deep learning network structure;
Step 7: capture a license plate image with an electronic police camera for license plate character recognition;
Step 8: preprocess the captured license plate image to obtain segmented sub-images, remove the sub-image containing a Chinese character according to its position in the license plate image, and keep the other sub-images containing characters;
Step 9: further process each character sub-image to obtain processed character sub-images;
Step 10: perform character recognition on each processed character sub-image with the Caffe deep learning network structure trained in step 6 to obtain a recognition result;
Step 11: combine the recognition results of the character sub-images in order to obtain the recognition result of the license plate;
the further processing in step 3 and step 9 comprises:
1) normalizing the image to n × m;
2) performing inclination correction on the normalized image;
3) carrying out a scaling operation on the characters, as follows. First binarize the tilt-corrected image to obtain a black-and-white image. Scan downward from the first row of the black-and-white image and find the first pixel with value 255, denoted point f; judge whether a pixel with value 255 also exists among the coordinates adjacent below it, and if so take point f as the starting point of the character in the vertical direction and record its row coordinate fx. Scan upward from the last row of the black-and-white image and find the first pixel with value 255, denoted point l; judge whether a pixel with value 255 also exists among the coordinates adjacent above it, and if so take point l as the end point of the character in the vertical direction and record its row coordinate lx. Compute the character height h = |fx − lx| and compare h with the set standard height H:
if h/H is smaller than the set first threshold, enlarge the tilt-corrected image by bilinear interpolation from size n × m to size (H/h)n × (H/h)m, then crop the enlarged image and keep the central n × m part;
if h/H is larger than the set second threshold, shrink the tilt-corrected image by equal-interval sampling from size n × m to size (H/h)n × (H/h)m, then pad the borders of the image to size n × m;
if h/H is greater than or equal to the set first threshold and less than or equal to the set second threshold, apply no scaling to the tilt-corrected image.
Furthermore, the calculation process of convolution layer 1 and convolution layer 2 in the Caffe deep learning network structure is as follows:

$$x_j^{\ell} = f\Big(\sum_{i \in M_j} x_i^{\ell-1} * k_{ij}^{\ell} + b_j^{\ell}\Big) \qquad (1)$$

In formula (1), $M_j$ represents the set of input feature maps acted on by convolution kernel $k_{ij}^{\ell}$, $x_i^{\ell-1}$ is the $i$-th input feature map in the set $M_j$, $b_j^{\ell}$ is the additive bias of convolution kernel $k_{ij}^{\ell}$, $f(\cdot)$ is the ReLU activation function, and $x_j^{\ell}$ is the convolution output feature map.

The calculation formula of pooling layer 1 and pooling layer 2 is:

$$x_j^{\ell} = f'\big(\beta_j^{\ell}\,\mathrm{down}(x_j^{\ell-1}) + b_j^{\ell}\big) \qquad (2)$$

In formula (2), $\mathrm{down}(\cdot)$ represents the pooling kernel function, $f'(\cdot)$ is the activation function (the ReLU function is adopted), $\beta_j^{\ell}$ and $b_j^{\ell}$ are respectively the pooling weight and additive bias corresponding to each output feature map, $x_j^{\ell-1}$ is the pooling input feature map, and $x_j^{\ell}$ is the pooling output feature map.

The calculation formula of fully-connected layer 1 and fully-connected layer 2 is:

$$x_j^{\ell} = f''\Big(\sum_i w_{ij}^{\ell}\, x_i^{\ell-1} + b_j^{\ell}\Big) \qquad (3)$$

In formula (3), $w_{ij}^{\ell}$ is the fully-connected weight, $b_j^{\ell}$ is the fully-connected bias, $x_i^{\ell-1}$ is the fully-connected input feature map, $x_j^{\ell}$ is the fully-connected output feature vector, and $f''(\cdot)$ is the ReLU activation function.
Further, the additive bias takes the same value in every layer, i.e. $b_j^{(1)} = b_j^{(2)} = \cdots = b_j^{(6)}$.
Further, after each sample and its corresponding classification label are input into the Caffe deep learning network structure in step 6, the following processing is also performed:

the image output by fully-connected layer 2 is processed according to formula (4):

$$y_i = x_i - \max(x_1, \ldots, x_n) \qquad (4)$$

where $y_i$ is a pixel of the processed output image and $x_i$ a pixel of the output image of fully-connected layer 2. The $y_i$ are then normalized to obtain the probability value $p_i$ of each output node:

$$p_i = \frac{e^{y_i}}{\sum_{k=1}^{n} e^{y_k}} \qquad (5)$$

Finally, a loss function is computed by the loss layer; the loss value $L(\theta)$ is given by formula (6):

$$L(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{j} I\{y^{(i)} = j\}\log p_j^{(i)} \qquad (6)$$

In formula (6), $j$ is the class label corresponding to $y^{(i)}$, $I\{\cdot\}$ is the indicator function, which outputs 1 when $y^{(i)} = j$ and 0 otherwise, and $m$ is the batch size over which the average loss is computed; $\theta$ denotes the parameters to be optimized, i.e. the weights $w$ and biases $b$.
Further, step 6 also comprises:

computing the Caffe deep learning network structure parameters by back-propagation. First, the gradient $\delta^{loss}$ of the loss layer is computed by formula (7):

$$\delta_i^{loss} = \frac{\partial L(\theta)}{\partial y_i} = p_i - I\{y^{(i)} = i\} \qquad (7)$$

In formula (7), $L(\theta)$ is the output of the loss layer, $y_i$ is a pixel of the output image after processing by formula (4), and $p_i$ is the probability value of each output node computed by formula (5).

Then the fully-connected layer parameter gradients $\partial L/\partial w_{ij}^{\ell}$ and $\partial L/\partial b_j^{\ell}$ are computed by formulas (8) and (9):

$$\frac{\partial L}{\partial w_{ij}^{\ell}} = x_i^{\ell-1}\,\delta_j^{\ell} \qquad (8)$$

$$\frac{\partial L}{\partial b_j^{\ell}} = \delta_j^{\ell} \qquad (9)$$

where $\delta_j^{\ell}$ is the gradient at the output of the fully-connected layer, $x_j^{\ell}$ is the fully-connected output feature map, and $x_i^{\ell-1}$ is the fully-connected input feature map.

The next step is to compute the pooling layer gradient $\delta_j^{pool}$, as shown in formula (10):

$$\delta_j^{pool} = \mathrm{up}(\delta_j^{\ell+1}) \circ f'(x_j^{pool}) \qquad (10)$$

In formula (10), $\mathrm{up}(\cdot)$ is the inverse operation of down-sampling, $\delta_j^{pool}$ is the gradient output by the pooling layer, and $x_j^{pool}$ is the output feature map of the pooling layer.

Finally, the convolution layer parameter gradient $\partial L/\partial k_{ij}^{\ell}$ is computed by formula (11):

$$\frac{\partial L}{\partial k_{ij}^{\ell}} = \sum_{u,v}\big(\delta_j^{\ell}\big)_{uv}\big(p_i^{\ell-1}\big)_{uv} \qquad (11)$$

In formula (11), $p_i^{\ell-1}$ denotes the patch of the input feature map acted on by convolution kernel $k_{ij}^{\ell}$, $\delta_j^{\ell}$ is the gradient of the feature map generated after kernel $k_{ij}^{\ell}$ acts, and $u, v$ range over the set of input and output feature map positions related to kernel $k_{ij}^{\ell}$.

Through this back-propagation process, the gradient of every layer's network parameters is obtained; the parameters are then updated layer by layer, completing the learning process of the network.
Furthermore, in step 4, labels 0-9 are used to mark samples whose character in the sample image is a digit 0-9, and labels 10-35 are used to mark samples whose character is a letter A-Z.
Furthermore, the preprocessing comprises license plate location, license plate image extraction and character segmentation of the image.
The beneficial effects of the invention are as follows: the license plate character recognition method based on the Caffe deep learning framework solves the prior-art problem that recognition performance differs across pictures shot in different environments; the method needs no manually defined features, the algorithm is highly robust, and high recognition accuracy is obtained even in complex environments with noise interference.
Description of the drawings:
FIG. 1 is a flow chart of a license plate character recognition method based on a Caffe deep learning framework in the invention.
FIG. 2 is a license plate image preprocessing process of the present invention.
FIG. 3 is a Caffe deep learning network structure of the present invention.
The specific embodiments are as follows:
for a further understanding of the invention, reference will now be made in detail to the embodiments illustrated in the drawings and examples, but the invention is not limited to the embodiments.
The invention discloses a license plate character recognition method based on a Caffe deep learning framework, which has a flow diagram shown in figure 1 and comprises the following steps.
Step 101: multiple color license plate sample images taken under different conditions. The different conditions are shooting by selecting different conditions of illumination intensity, inclination angle, shielding degree and noise pollution. And comprehensively considering the training speed and the reliability of the training result, and selecting to shoot 2000 license plate sample images.
Step 102: and preprocessing the acquired license plate sample image to obtain a segmented sub-image, and selecting the sub-image containing characters to form a character sub-image sample set S. The preprocessing process is shown in fig. 2 and includes license plate positioning, license plate image extraction, and character segmentation operation.
Step 103: and further processing each sample S in the character sub-image sample set S to obtain a processed character sub-image sample set S'.
Wherein, the further processing of the sample specifically comprises the following substeps:
Step A: normalize the sample image s to size n × m; in this application the sample image size is set to 24 × 40.
Step B: apply tilt correction to the normalized sample image to obtain a sample image s'. Tilt correction may use the widely adopted Hough transform.
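A minimal sketch of such Hough-based tilt correction, assuming OpenCV and NumPy; the Canny and Hough thresholds, the median-angle estimate and the rotation sign convention are illustrative choices rather than parameters fixed by this application:

```python
import cv2
import numpy as np

def deskew(gray):
    """Sketch of Hough-based tilt correction: estimate the dominant stroke
    angle and rotate the character upright (illustrative thresholds)."""
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLines(edges, 1, np.pi / 180, threshold=20)
    if lines is None:
        return gray                        # no line detected: leave the image as-is
    thetas = lines[:, 0, 1]                # theta = 0 for a perfectly vertical line
    skew = np.where(thetas > np.pi / 2, thetas - np.pi, thetas)
    angle = float(np.degrees(np.median(skew)))   # signed tilt from vertical
    h, w = gray.shape
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(gray, M, (w, h), flags=cv2.INTER_LINEAR)
```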
Step C: carry out a scaling operation on the character. Although every sample has the same size after normalization, the proportion of the character inside different samples is not consistent, and such differences between samples harm the accuracy of the model, so the character size must be unified. First binarize the sample image s' to obtain a black-and-white image. Scan downward from the first row of the black-and-white image and find the first pixel with value 255, denoted point f; judge whether a pixel with value 255 also exists among the coordinates adjacent below it, and if so take point f as the starting point of the character in the vertical direction and record its row coordinate fx. Scan upward from the last row of the black-and-white image and find the first pixel with value 255, denoted point l; judge whether a pixel with value 255 also exists among the coordinates adjacent above it, and if so take point l as the end point of the character in the vertical direction and record its row coordinate lx. Compute the character height h = |fx − lx| and compare it with the set standard height H:
if h/H is smaller than the set first threshold, enlarge the sample image s' by bilinear interpolation from size n × m to size (H/h)n × (H/h)m, then crop the enlarged image and keep the central n × m part;
if h/H is larger than the set second threshold, shrink the sample image s' by equal-interval sampling from size n × m to size (H/h)n × (H/h)m, then pad the borders of the image to size n × m;
if h/H is greater than or equal to the set first threshold and less than or equal to the set second threshold, do not scale the sample image s'.
In this step, the first and second thresholds may be set to 0.9 and 1.1 respectively.
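As a concrete illustration of step C, the sketch below (assuming NumPy and OpenCV) measures the character height on a binarized copy and rescales by the factor H/h. The standard height H = 32 is a made-up value, Otsu thresholding stands in for the unspecified binarization, and the start/end detection is simplified to the first and last rows containing white pixels:

```python
import cv2
import numpy as np

def unify_character_height(img, H=32, lo=0.9, hi=1.1):
    """Scale a tilt-corrected n x m character image so the character height
    approaches the standard height H (thresholds 0.9 / 1.1 as in the text)."""
    n_rows, n_cols = img.shape
    _, bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    rows = np.where((bw == 255).any(axis=1))[0]   # rows that contain character pixels
    if rows.size < 2:
        return img
    fx, lx = int(rows[0]), int(rows[-1])          # vertical start / end of the character
    h = abs(fx - lx)
    if h == 0 or lo <= h / H <= hi:
        return img                                # within tolerance: no scaling
    scale = H / h
    resized = cv2.resize(img, (int(n_cols * scale), int(n_rows * scale)),
                         # bilinear when enlarging, plain sampling when shrinking
                         interpolation=cv2.INTER_LINEAR if scale > 1 else cv2.INTER_NEAREST)
    rh, rw = resized.shape
    if scale > 1:                                 # enlarged: keep the central n x m part
        top, left = (rh - n_rows) // 2, (rw - n_cols) // 2
        return resized[top:top + n_rows, left:left + n_cols]
    pad_t, pad_l = (n_rows - rh) // 2, (n_cols - rw) // 2
    return cv2.copyMakeBorder(resized, pad_t, n_rows - rh - pad_t,   # shrunk: pad back
                              pad_l, n_cols - rw - pad_l,
                              cv2.BORDER_CONSTANT, value=0)
```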
Step 104: store the samples of the character sub-image sample set S' in a list and attach a classification label to each item in the list. For example, labels 0-9 can mark samples whose character is a digit 0-9, and labels 10-35 can mark samples whose character is a letter A-Z.
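A direct way to realize this labeling (a small sketch; the image-list file layout is an assumption, since Caffe typically consumes an image/label list or an LMDB):

```python
# labels 0-9 for digits, 10-35 for letters A-Z
CLASSES = [str(d) for d in range(10)] + [chr(c) for c in range(ord('A'), ord('Z') + 1)]
LABEL_OF = {ch: i for i, ch in enumerate(CLASSES)}   # e.g. LABEL_OF['A'] == 10

def write_list_file(samples, path="train.txt"):
    """samples: iterable of (image_path, character) pairs, one list entry each."""
    with open(path, "w") as f:
        for img_path, ch in samples:
            f.write(f"{img_path} {LABEL_OF[ch]}\n")
```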
step 105: constructing a six-layer Caffe deep learning network structure, and then initializing parameters of the network structure. The six layers are sequentially a convolution layer 1, a pooling layer 1, a convolution layer 2, a pooling layer 2, a full-connection layer 1 and a full-connection layer 2, and the structure of the six layers is shown in fig. 3. The initial parameters are randomly generated by the system.
Step 106: and (3) training the Caffe deep learning network structure by using each sample in the character sub-image sample set S' to obtain the trained Caffe deep learning network structure.
Inputting each sample and the corresponding classification mark in the character sub-image sample set S' into a Caffe deep learning network structure for character recognition, comparing the recognition result with the classification mark of the image sample, and adjusting the parameters of the network structure according to the comparison result to obtain the trained Caffe deep learning network structure.
First, a sample image is input to convolution layer 1 for convolution processing; the convolution computation follows formula (1):

$$x_j^{\ell} = f\Big(\sum_{i \in M_j} x_i^{\ell-1} * k_{ij}^{\ell} + b_j^{\ell}\Big) \qquad (1)$$

In formula (1), $M_j$ represents the set of input feature maps acted on by convolution kernel $k_{ij}^{\ell}$, $x_i^{\ell-1}$ is the $i$-th input feature map in the set $M_j$, $b_j^{\ell}$ is the additive bias of kernel $k_{ij}^{\ell}$, $f(\cdot)$ is the activation function (this application uses ReLU), and $x_j^{\ell}$ is the convolution output feature map. The input to convolution layer 1 is a 24 × 40 sample image, the convolution kernel size is 5 × 5 and the stride is 1, so the output size after convolution layer 1 is 20 × 36.
Next, the output image of convolution layer 1 is input to pooling layer 1. The purpose of pooling is to reduce the spatial resolution of the convolution layer by down-sampling. A pooling layer can be regarded as a special convolution layer: its pooling kernel is set according to algorithm requirements and is not learned, and the pooling kernel acts on each input feature map one-to-one to generate a corresponding output feature map. The number of output feature maps therefore equals the number of input feature maps, but the data are compressed and robustness is enhanced. Pooling divides into mean pooling and maximum pooling; both take the form of formula (2):

$$x_j^{\ell} = f'\big(\beta_j^{\ell}\,\mathrm{down}(x_j^{\ell-1}) + b_j^{\ell}\big) \qquad (2)$$

In formula (2), $\mathrm{down}(\cdot)$ represents the pooling kernel function, $f'(\cdot)$ is the activation function (ReLU is adopted), and $\beta_j^{\ell}$ and $b_j^{\ell}$ are the pooling weight and additive bias corresponding to each output feature map, learned during training; $x_j^{\ell-1}$ is the pooling input feature map and $x_j^{\ell}$ the pooling output feature map. Pooling layer 1 uses maximum pooling with a 2 × 2 pooling kernel and stride 2, so the network output size after pooling is 10 × 18.
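A NumPy sketch of the 2 × 2, stride-2 maximum pooling; the β and b terms of formula (2) are dropped for brevity (i.e. β = 1, b = 0):

```python
import numpy as np

def max_pool_2x2(x):
    """down(.) of formula (2) as 2x2 max pooling with stride 2."""
    H2, W2 = x.shape[0] // 2, x.shape[1] // 2
    return x[:H2 * 2, :W2 * 2].reshape(H2, 2, W2, 2).max(axis=(1, 3))

fmap = np.random.rand(36, 20)            # a 20 x 36 feature map from conv layer 1
print(max_pool_2x2(fmap).shape)          # -> (18, 10): the 10 x 18 map after pool 1
```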
The output image of the pooling layer 1 is input to the convolutional layer 2 for processing, the convolutional parameters and the convolution calculation method of the convolutional layer 2 are the same as those of the convolutional layer 1, and the size of the output picture of the layer is 6 × 14.
The fourth layer is pooling layer 2; its parameters and pooling computation are the same as those of pooling layer 1, except that this layer uses mean pooling; the output picture size is 3 × 7.
Forward propagation through a fully-connected layer resembles the BP neural network algorithm: before computation, the input feature maps are expanded from two-dimensional maps into a one-dimensional vector, after which the fully-connected computation is performed:

$$x_j^{\ell} = f''\Big(\sum_i w_{ij}^{\ell}\, x_i^{\ell-1} + b_j^{\ell}\Big) \qquad (3)$$

In formula (3), $w_{ij}^{\ell}$ is the fully-connected weight and $b_j^{\ell}$ the fully-connected bias, both adjusted automatically through training; $x_i^{\ell-1}$ is the fully-connected input feature map, $x_j^{\ell}$ the fully-connected output feature vector, and $f''(\cdot)$ the activation function. The image processed by pooling layer 2 is input to fully-connected layer 1; this layer has 400 neuron nodes, is fully connected to the output neurons of the fourth layer, and its output passes through the ReLU activation function.
The sixth layer is a fully-connected layer 2, 10 output neurons of the layer correspond to 10 types of labels, and the fully-connected calculation mode of the layer is the same as that of the fully-connected layer 1.
In the above six layers, the additive bias takes the same value in every layer, i.e. $b_j^{(1)} = b_j^{(2)} = \cdots = b_j^{(6)}$.
During training, to ensure that the maximum of the output is 0, the feature map output by fully-connected layer 2 is first shifted according to formula (4):

$$y_i = x_i - \max(x_1, \ldots, x_n) \qquad (4)$$

where $y_i$ is a pixel of the processed output image and $x_i$ a pixel of the output image of fully-connected layer 2. The $y_i$ are then normalized to obtain the probability value $p_i$ of each output node:

$$p_i = \frac{e^{y_i}}{\sum_{k=1}^{n} e^{y_k}} \qquad (5)$$

Finally, to measure recognition accuracy, a loss function is computed by the loss layer; the invention adopts Softmax as the loss function output, and the loss value $L(\theta)$ is computed by formula (6):

$$L(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{j} I\{y^{(i)} = j\}\log p_j^{(i)} \qquad (6)$$

In formula (6), $j$ is the class label corresponding to $y^{(i)}$ and $I\{\cdot\}$ is the indicator function, outputting 1 when $y^{(i)} = j$ and 0 otherwise; $m$ is the batch size over which the average loss is computed; and $\theta$ denotes the parameters to be optimized, i.e. the weights $w$ and biases $b$.
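Formulas (4), (5) and (6) together form the numerically stable softmax cross-entropy; a compact NumPy sketch:

```python
import numpy as np

def softmax_loss(scores, labels):
    """scores: (m, n) outputs of fc layer 2 for a batch; labels: (m,) classes.
    Implements the shift (4), normalization (5) and average loss (6)."""
    y = scores - scores.max(axis=1, keepdims=True)        # (4): row maximum becomes 0
    p = np.exp(y) / np.exp(y).sum(axis=1, keepdims=True)  # (5): output probabilities
    m = scores.shape[0]
    loss = -np.log(p[np.arange(m), labels]).mean()        # (6): average over the batch
    return loss, p

scores = np.random.randn(8, 36)                # batch of 8 samples, 36 classes
labels = np.random.randint(0, 36, size=8)
loss, p = softmax_loss(scores, labels)
print(round(float(loss), 3), p.sum(axis=1))    # each row of p sums to 1
```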
after the training samples are identified through the Caffe deep learning network, the gradient obtained by back propagation of each layer of network parameters needs to be adjusted according to the identification result, and the gradient is reversely calculated layer by the convolution layer, the pooling layer and the loss layer through a chain type derivation method. Gradient of loss layer
Figure GDA0001604011080000075
The calculation formula is as shown in formula (7):
Figure GDA0001604011080000076
y in formula (7)iIs the pixel of the output image after processing of equation (4), L (θ) is the output of the loss layer, piThe probability value for each output node calculated for equation (5). Then, calculating parameters of the full connection layer
Figure GDA0001604011080000077
And
Figure GDA0001604011080000078
the calculation formula is shown as follows:
Figure GDA0001604011080000079
Figure GDA00016040110800000710
Figure GDA00016040110800000711
is the gradient of the output of the fully connected layer,
Figure GDA00016040110800000712
in order to output the characteristic diagram for the full connection layer,
Figure GDA00016040110800000713
inputting a feature map for the full connection layer. Since the application uses the ReLU activation function, the gradient reverse conduction needs to be multiplied
Figure GDA00016040110800000714
That is, when the output of the full connection layer is greater than 0, the gradient continues to feed back, and when the output is less than 0, the gradient stops feeding back.
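A batch-form NumPy sketch of formulas (7)-(9) with the ReLU gating just described (function and variable names are illustrative):

```python
import numpy as np

def loss_gradient(p, labels):
    """(7): dL/dy_i = p_i - I{y == i}, averaged over the batch as in (6)."""
    g = p.copy()
    g[np.arange(p.shape[0]), labels] -= 1.0
    return g / p.shape[0]

def fc_backward(delta_out, x_in, x_out):
    """delta_out: (m, n_out) gradient arriving at the fc output;
    x_in: (m, n_in) fc input; x_out: (m, n_out) fc output."""
    delta = delta_out * (x_out > 0)      # ReLU gate: pass gradient only where output > 0
    dW = x_in.T @ delta                  # (8): shape (n_in, n_out), summed over batch
    db = delta.sum(axis=0)               # (9)
    return dW, db, delta
```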
The next step is to compute the pooling layer gradient $\delta_j^{pool}$, as shown in formula (10):

$$\delta_j^{pool} = \mathrm{up}(\delta_j^{\ell+1}) \circ f'(x_j^{pool}) \qquad (10)$$

In formula (10), $\mathrm{up}(\cdot)$ is the inverse operation of down-sampling, $\delta_j^{pool}$ is the gradient output by the pooling layer, and $x_j^{pool}$ is the output feature map of the pooling layer.
The convolution layer parameter gradient $\partial L/\partial k_{ij}^{\ell}$ is computed as follows:

$$\frac{\partial L}{\partial k_{ij}^{\ell}} = \sum_{u,v}\big(\delta_j^{\ell}\big)_{uv}\big(p_i^{\ell-1}\big)_{uv} \qquad (11)$$

In formula (11), $p_i^{\ell-1}$ denotes the patch of the input feature map acted on by convolution kernel $k_{ij}^{\ell}$, and $\delta_j^{\ell}$ is the gradient of the feature map generated after kernel $k_{ij}^{\ell}$ acts; $u, v$ range over the set of input and output feature map positions related to kernel $k_{ij}^{\ell}$.
Through this back-propagation process, the gradient of every layer's network parameters is obtained; the parameters are then updated layer by layer, completing the learning process of the network.
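The layer-by-layer update itself is ordinary gradient descent; in practice Caffe's solver performs this step, but a one-function sketch makes the rule explicit (the learning rate is an illustrative value):

```python
def sgd_update(params, grads, lr=0.01):
    """In-place SGD step over all layers: theta <- theta - lr * dL/dtheta."""
    for theta, g in zip(params, grads):  # one (weight-or-bias array, gradient) per layer
        theta -= lr * g
```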
Step 107: and (4) capturing a license plate image through an electronic police for license plate character recognition.
Step 108: and preprocessing the acquired license plate image to obtain a segmented sub-image, removing the sub-image containing the Chinese characters according to the position of the sub-image in the license plate image, and reserving other sub-images containing characters. The preprocessing in this step is the same as the preprocessing in step 102.
Since, as is well known, the first character of a license plate is usually a Chinese character, removing the sub-image containing the Chinese character only requires discarding the leftmost segmented sub-image when selecting the character-containing sub-images.
Step 109: and further processing each character sub-image to obtain a processed character sub-image.
The further processing of the character subimage t specifically comprises the following substeps:
step a: carrying out normalization processing on the character sub-image t, and processing the size of the character sub-image t into n multiplied by m; in the present application, n is set to be 24 and m is set to be 40.
Step b: apply tilt correction to the normalized image to obtain an image t'. Tilt correction may use the widely adopted Hough transform.
Step c: carry out a scaling operation on the character. First binarize the image t' to obtain a black-and-white image. Scan downward from the first row of the black-and-white image and find the first pixel with value 255, denoted point f; judge whether a pixel with value 255 also exists among the coordinates adjacent below it, and if so take point f as the starting point of the character in the vertical direction and record its row coordinate fx. Scan upward from the last row of the black-and-white image and find the first pixel with value 255, denoted point l; judge whether a pixel with value 255 also exists among the coordinates adjacent above it, and if so take point l as the end point of the character in the vertical direction and record its row coordinate lx. Compute the character height h = |fx − lx| and compare it with the set standard height H:
if h/H is smaller than the set first threshold, enlarge the image t' by bilinear interpolation from size n × m to size (H/h)n × (H/h)m, then crop the enlarged image and keep the central n × m part;
if h/H is larger than the set second threshold, shrink the image t' by equal-interval sampling from size n × m to size (H/h)n × (H/h)m, then pad the borders of the image to size n × m;
if h/H is greater than or equal to the set first threshold and less than or equal to the set second threshold, do not scale the image t'.
In this step, the first and second thresholds may be set to 0.9 and 1.1 respectively.
Step 110: and (4) performing character recognition on each processed character sub-image by using the Caffe deep learning network structure obtained by training in the step 106 to obtain a recognition result.
The identification process is as follows:
First, the character sub-image is input to convolution layer 1 for convolution processing; the convolution computation follows formula (12):

$$x_j^{\ell} = f\Big(\sum_{i \in M_j} x_i^{\ell-1} * k_{ij}^{\ell} + b_j^{\ell}\Big) \qquad (12)$$

In formula (12), $M_j$ represents the set of input feature maps acted on by convolution kernel $k_{ij}^{\ell}$, $x_i^{\ell-1}$ is the $i$-th input feature map in the set $M_j$, $b_j^{\ell}$ is the additive bias of kernel $k_{ij}^{\ell}$, $f(\cdot)$ is the activation function (this application uses ReLU), and $x_j^{\ell}$ is the convolution output feature map. The input to convolution layer 1 is a 24 × 40 image, the convolution kernel size is 5 × 5 and the stride is 1, so the output size after convolution layer 1 is 20 × 36.
Next, the output image of convolution layer 1 is input to pooling layer 1; the pooling computation follows formula (13):

$$x_j^{\ell} = f'\big(\beta_j^{\ell}\,\mathrm{down}(x_j^{\ell-1}) + b_j^{\ell}\big) \qquad (13)$$

In formula (13), $\mathrm{down}(\cdot)$ represents the pooling kernel function, $f'(\cdot)$ is the activation function (the ReLU function is adopted), and $\beta_j^{\ell}$ and $b_j^{\ell}$ are the pooling weight and additive bias corresponding to each output feature map, learned during training; $x_j^{\ell-1}$ is the pooling input feature map and $x_j^{\ell}$ the pooling output feature map. Pooling layer 1 uses maximum pooling with a 2 × 2 pooling kernel and stride 2, so the network output size after pooling is 10 × 18.
The output image of the pooling layer 1 is input to the convolutional layer 2 for processing, the convolutional parameters and the convolution calculation method of the convolutional layer 2 are the same as those of the convolutional layer 1, and the size of the output picture of the layer is 6 × 14.
The output image of convolution layer 2 is input to pooling layer 2 for further processing; its parameters and pooling computation are the same as those of pooling layer 1, except that this layer uses mean pooling; the output image size is 3 × 7.
The output image of pooling layer 2 is input to fully-connected layer 1 for fully-connected processing; the computation follows formula (14):

$$x_j^{\ell} = f''\Big(\sum_i w_{ij}^{\ell}\, x_i^{\ell-1} + b_j^{\ell}\Big) \qquad (14)$$

In formula (14), $w_{ij}^{\ell}$ is the fully-connected weight, $b_j^{\ell}$ the fully-connected bias, $x_i^{\ell-1}$ the fully-connected input feature map, $x_j^{\ell}$ the fully-connected output feature vector, and $f''(\cdot)$ the activation function. Fully-connected layer 1 has 400 neuron nodes, is fully connected to the output neurons of pooling layer 2, and its output passes through the ReLU activation function.
The image output by the full connection layer 1 is further input into a full connection layer 2, 10 output neurons of the layer correspond to 10 types of labels, and the full connection calculation mode of the layer is the same as that of the full connection layer 1.
After the six layers of processing, the output is a category label, and the recognition result of the character is obtained from this label.
Step 111: and combining the recognition results of the character sub-images in sequence to obtain the recognition result of the license plate.
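At deployment time, steps 110 and 111 map onto a few pycaffe calls, sketched below; the file names and the 'prob' output blob of a Softmax deploy layer are assumptions:

```python
import numpy as np
import caffe

net = caffe.Net('deploy.prototxt', 'plate_char.caffemodel', caffe.TEST)
CLASSES = [str(d) for d in range(10)] + [chr(c) for c in range(ord('A'), ord('Z') + 1)]

def recognize_plate(char_imgs):
    """char_imgs: preprocessed 24 x 40 grayscale character sub-images, in order."""
    chars = []
    for img in char_imgs:
        net.blobs['data'].data[...] = img[np.newaxis, np.newaxis, :, :] / 255.0
        out = net.forward()
        chars.append(CLASSES[int(out['prob'][0].argmax())])  # step 110: classify
    return ''.join(chars)                                    # step 111: combine in order
```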
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (7)

1. A license plate character recognition method based on a Caffe deep learning framework, characterized in that the method comprises the following steps:
Step 1: acquire multiple color license plate sample images shot under different illumination intensities, inclination angles, occlusion degrees and noise pollution conditions;
Step 2: preprocess the acquired license plate sample images to obtain segmented sub-image samples, and select the sub-image samples containing characters to form a character sub-image sample set S;
Step 3: further process each sample s in the character sub-image sample set S to obtain a processed character sub-image sample set S';
Step 4: store the samples of the character sub-image sample set S' in a list and attach a classification label to each item in the list;
Step 5: construct a six-layer Caffe deep learning network structure, the six layers being, in order, convolution layer 1, pooling layer 1, convolution layer 2, pooling layer 2, fully-connected layer 1 and fully-connected layer 2; after the network structure is built, initialize its parameters, the initial parameters being generated randomly by the system;
Step 6: input each sample in the character sub-image sample set S' together with its classification label into the Caffe deep learning network structure for character recognition, compare the recognition result with the label of the image sample, and adjust the parameters of the network structure according to the comparison result to obtain a trained Caffe deep learning network structure;
Step 7: capture a license plate image with an electronic police camera for license plate character recognition;
Step 8: preprocess the captured license plate image to obtain segmented sub-images, remove the sub-image containing a Chinese character according to its position in the license plate image, and keep the other sub-images containing characters;
Step 9: further process each character sub-image to obtain processed character sub-images;
Step 10: perform character recognition on each processed character sub-image with the Caffe deep learning network structure trained in step 6 to obtain a recognition result;
Step 11: combine the recognition results of the character sub-images in order to obtain the recognition result of the license plate;
said further processing in said steps 3 and 9 comprises:
1) normalizing the image to n × m;
2) performing inclination correction on the normalized image;
3) carrying out a scaling operation on the characters, as follows: first binarize the tilt-corrected image to obtain a black-and-white image; scan downward from the first row of the black-and-white image and find the first pixel with value 255, denoted point f; judge whether a pixel with value 255 also exists among the coordinates adjacent below it, and if so take point f as the starting point of the character in the vertical direction and record its row coordinate fx; scan upward from the last row of the black-and-white image and find the first pixel with value 255, denoted point l; judge whether a pixel with value 255 also exists among the coordinates adjacent above it, and if so take point l as the end point of the character in the vertical direction and record its row coordinate lx; compute the height h = |fx − lx| and compare h with the set standard height H:
if h/H is smaller than the set first threshold, enlarge the tilt-corrected image by bilinear interpolation from size n × m to size (H/h)n × (H/h)m, then crop the enlarged image and keep the central n × m part;
if h/H is larger than the set second threshold, shrink the tilt-corrected image by equal-interval sampling from size n × m to size (H/h)n × (H/h)m, then pad the borders of the image to size n × m;
if h/H is greater than or equal to the set first threshold and less than or equal to the set second threshold, apply no scaling to the tilt-corrected image.
2. The license plate character recognition method of the Caffe deep learning framework as claimed in claim 1, wherein the calculation process of convolution layer 1 and convolution layer 2 in the Caffe deep learning network structure is:

$$x_j^{\ell} = f\Big(\sum_{i \in M_j} x_i^{\ell-1} * k_{ij}^{\ell} + b_j^{\ell}\Big) \qquad (1)$$

in formula (1), $M_j$ represents the set of input feature maps acted on by convolution kernel $k_{ij}^{\ell}$, $x_i^{\ell-1}$ is the $i$-th input feature map in the set $M_j$, $b_j^{\ell}$ is the additive bias of convolution kernel $k_{ij}^{\ell}$, $f(\cdot)$ is the ReLU activation function, and $x_j^{\ell}$ is the convolution output feature map;

the calculation formula of pooling layer 1 and pooling layer 2 is:

$$x_j^{\ell} = f'\big(\beta_j^{\ell}\,\mathrm{down}(x_j^{\ell-1}) + b_j^{\ell}\big) \qquad (2)$$

in formula (2), $\mathrm{down}(\cdot)$ represents the pooling kernel function, $f'(\cdot)$ is the activation function (the ReLU function is adopted), $\beta_j^{\ell}$ and $b_j^{\ell}$ are respectively the pooling weight and additive bias corresponding to each output feature map, $x_j^{\ell-1}$ is the pooling input feature map, and $x_j^{\ell}$ is the pooling output feature map;

the calculation formula of fully-connected layer 1 and fully-connected layer 2 is:

$$x_j^{\ell} = f''\Big(\sum_i w_{ij}^{\ell}\, x_i^{\ell-1} + b_j^{\ell}\Big) \qquad (3)$$

in formula (3), $w_{ij}^{\ell}$ is the fully-connected weight, $b_j^{\ell}$ is the fully-connected bias, $x_i^{\ell-1}$ is the fully-connected input feature map, $x_j^{\ell}$ is the fully-connected output feature vector, and $f''(\cdot)$ is the ReLU activation function.
3. The license plate character recognition method of the Caffe deep learning framework as claimed in claim 2, wherein the additive bias of each layer is equal, i.e. $b_j^{(1)} = b_j^{(2)} = \cdots = b_j^{(6)}$.
4. The license plate character recognition method of the Caffe deep learning framework according to claim 2 or 3, characterized in that after each sample and its corresponding classification label are input into the Caffe deep learning network structure in step 6, the following processing is also performed:

the image output by fully-connected layer 2 is processed according to formula (4):

$$y_i = x_i - \max(x_1, \ldots, x_n) \qquad (4)$$

wherein $y_i$ is a pixel of the processed output image and $x_i$ a pixel of the output image of fully-connected layer 2; the $y_i$ are then normalized to obtain the probability value $p_i$ of each output node:

$$p_i = \frac{e^{y_i}}{\sum_{k=1}^{n} e^{y_k}} \qquad (5)$$

finally a loss function is computed by the loss layer, the loss value $L(\theta)$ being given by formula (6):

$$L(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{j} I\{y^{(i)} = j\}\log p_j^{(i)} \qquad (6)$$

in formula (6), $j$ is the class label corresponding to $y^{(i)}$, $I\{\cdot\}$ is the indicator function, outputting 1 when $y^{(i)} = j$ and 0 otherwise, $m$ is the batch size over which the average loss is computed, and $\theta$ denotes the parameters to be optimized, i.e. the weights $w$ and biases $b$.
5. The license plate character recognition method of the Caffe deep learning framework as claimed in claim 4, wherein step 6 further comprises:

computing the Caffe deep learning network structure parameters by back-propagation; first, the gradient $\delta^{loss}$ of the loss layer is computed by formula (7):

$$\delta_i^{loss} = \frac{\partial L(\theta)}{\partial y_i} = p_i - I\{y^{(i)} = i\} \qquad (7)$$

in formula (7), $L(\theta)$ is the output of the loss layer, $y_i$ is a pixel of the output image after processing by formula (4), and $p_i$ is the probability value of each output node computed by formula (5);

then the fully-connected layer parameter gradients $\partial L/\partial w_{ij}^{\ell}$ and $\partial L/\partial b_j^{\ell}$ are computed by formulas (8) and (9):

$$\frac{\partial L}{\partial w_{ij}^{\ell}} = x_i^{\ell-1}\,\delta_j^{\ell} \qquad (8)$$

$$\frac{\partial L}{\partial b_j^{\ell}} = \delta_j^{\ell} \qquad (9)$$

where $\delta_j^{\ell}$ is the gradient at the output of the fully-connected layer, $x_j^{\ell}$ is the fully-connected output feature map, and $x_i^{\ell-1}$ is the fully-connected input feature map;

the next step is to compute the pooling layer gradient $\delta_j^{pool}$, as shown in formula (10):

$$\delta_j^{pool} = \mathrm{up}(\delta_j^{\ell+1}) \circ f'(x_j^{pool}) \qquad (10)$$

in formula (10), $\mathrm{up}(\cdot)$ is the inverse operation of down-sampling, $\delta_j^{pool}$ is the gradient output by the pooling layer, and $x_j^{pool}$ is the output feature map of the pooling layer;

finally the convolution layer parameter gradient $\partial L/\partial k_{ij}^{\ell}$ is computed by formula (11):

$$\frac{\partial L}{\partial k_{ij}^{\ell}} = \sum_{u,v}\big(\delta_j^{\ell}\big)_{uv}\big(p_i^{\ell-1}\big)_{uv} \qquad (11)$$

in formula (11), $p_i^{\ell-1}$ denotes the patch of the input feature map acted on by convolution kernel $k_{ij}^{\ell}$, $\delta_j^{\ell}$ is the gradient of the feature map generated after convolution kernel $k_{ij}^{\ell}$ acts, and $u, v$ range over the set of input and output feature map positions related to kernel $k_{ij}^{\ell}$;

through this back-propagation process the gradient of every layer's network parameters is obtained, and the parameters are then updated layer by layer to complete the learning process of the network.
6. The license plate character recognition method of the Caffe deep learning framework as claimed in claim 1, wherein: in the step 4, the marks 0-9 are used for marking samples with characters of 0-9 in the sample image respectively, and the marks 10-35 are used for marking samples with characters of A-Z in the sample image respectively.
7. The license plate character recognition method of the Caffe deep learning framework as claimed in claim 1, wherein: the preprocessing comprises the steps of license plate positioning, license plate image extraction and character segmentation operation of the image.
CN201710823771.4A 2017-09-13 2017-09-13 License plate character recognition method based on Caffe deep learning framework Expired - Fee Related CN108108746B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710823771.4A | 2017-09-13 | 2017-09-13 | License plate character recognition method based on Caffe deep learning framework (CN108108746B)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710823771.4A | 2017-09-13 | 2017-09-13 | License plate character recognition method based on Caffe deep learning framework (CN108108746B)

Publications (2)

Publication Number | Publication Date
CN108108746A | 2018-06-01
CN108108746B | 2021-04-09

Family

ID=62207466

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710823771.4A Expired - Fee Related CN108108746B (en) 2017-09-13 2017-09-13 License plate character recognition method based on Caffe deep learning framework

Country Status (1)

Country Link
CN (1) CN108108746B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109214443A (en) * 2018-08-24 2019-01-15 北京第视频科学技术研究院有限公司 Car license recognition model training method, licence plate recognition method, device and equipment
CN109325489B (en) * 2018-09-14 2021-05-28 浙江口碑网络技术有限公司 Image recognition method and device, storage medium and electronic device
CN109766805B (en) * 2018-12-28 2022-12-09 安徽清新互联信息科技有限公司 Deep learning-based double-layer license plate character recognition method
CN109815956B (en) * 2018-12-28 2022-12-09 安徽清新互联信息科技有限公司 License plate character recognition method based on self-adaptive position segmentation
CN109754011B (en) * 2018-12-29 2019-11-12 北京中科寒武纪科技有限公司 Data processing method, device and Related product based on Caffe
CN110427937B (en) * 2019-07-18 2022-03-22 浙江大学 Inclined license plate correction and indefinite-length license plate identification method based on deep learning
CN110598703B (en) * 2019-09-24 2022-12-20 深圳大学 OCR (optical character recognition) method and device based on deep neural network
CN110659640B (en) * 2019-09-27 2021-11-30 深圳市商汤科技有限公司 Text sequence recognition method and device, electronic equipment and storage medium
CN110956133A (en) * 2019-11-29 2020-04-03 上海眼控科技股份有限公司 Training method of single character text normalization model, text recognition method and device
CN111209858B (en) * 2020-01-06 2023-06-20 电子科技大学 Real-time license plate detection method based on deep convolutional neural network
CN111291761B (en) * 2020-02-17 2023-08-04 北京百度网讯科技有限公司 Method and device for recognizing text
CN111401139B (en) * 2020-02-25 2024-03-29 云南昆钢电子信息科技有限公司 Method for obtaining mine underground equipment position based on character image intelligent recognition
CN113496227A (en) * 2020-04-08 2021-10-12 顺丰科技有限公司 Training method and device of character recognition model, server and storage medium
CN111539426B (en) * 2020-04-27 2023-03-14 合肥工业大学 High-precision license plate recognition system based on FPGA
CN113392814B (en) * 2021-08-16 2021-11-02 冠传网络科技(南京)有限公司 Method and device for updating character recognition model and storage medium
CN113761961B (en) * 2021-09-07 2023-08-04 杭州海康威视数字技术股份有限公司 Two-dimensional code identification method and device

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101183427A (en) * 2007-12-05 2008-05-21 浙江工业大学 Computer vision based peccancy parking detector
CN105335745A (en) * 2015-11-27 2016-02-17 小米科技有限责任公司 Recognition method, device and equipment for numbers in images
US20170177965A1 (en) * 2015-12-17 2017-06-22 Xerox Corporation Coarse-to-fine cascade adaptations for license plate recognition with convolutional neural networks
CN107025452A (en) * 2016-01-29 2017-08-08 富士通株式会社 Image-recognizing method and image recognition apparatus
CN105893968A (en) * 2016-03-31 2016-08-24 华南理工大学 Text-independent end-to-end handwriting recognition method based on deep learning
CN105975968A (en) * 2016-05-06 2016-09-28 西安理工大学 Caffe architecture based deep learning license plate character recognition method
CN106127248A (en) * 2016-06-24 2016-11-16 平安科技(深圳)有限公司 Car plate sorting technique based on degree of depth study and system
CN106295645A (en) * 2016-08-17 2017-01-04 东方网力科技股份有限公司 A kind of license plate character recognition method and device
CN106384112A (en) * 2016-09-08 2017-02-08 西安电子科技大学 Rapid image text detection method based on multi-channel and multi-dimensional cascade filter
CN106503707A (en) * 2016-10-21 2017-03-15 浙江宇视科技有限公司 Licence plate recognition method and device under the conditions of a kind of infrared light filling
CN106650740A (en) * 2016-12-15 2017-05-10 深圳市华尊科技股份有限公司 License plate identification method and terminal
CN106778730A (en) * 2016-12-29 2017-05-31 深圳爱拼信息科技有限公司 A kind of adaptive approach and system for quickly generating OCR training samples

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
A Novel Kernel PCA Support Vector Machine Algorithm with Feature Transition Function; Wang Lianhong, et al.; Proceedings of the 26th Chinese Control Conference; 2007-12-31; full text *
License plate character recognition based on wavelet kernel LS-SVM; Guang Yang; 2011 3rd International Conference on Computer Research and Development; 2011-05-05; full text *
CNN-based license plate digit character recognition algorithm (基于CNN的车牌数字字符识别算法); Ou Xianfeng (欧先锋), et al.; Journal of Chengdu Technological University (成都工业学院学报); 2016-12-31; vol. 19, no. 4; full text *

Also Published As

Publication number Publication date
CN108108746A (en) 2018-06-01

Similar Documents

Publication Publication Date Title
CN108108746B (en) License plate character recognition method based on Caffe deep learning framework
CN109784333B (en) Three-dimensional target detection method and system based on point cloud weighted channel characteristics
CN108229479B (en) Training method and device of semantic segmentation model, electronic equipment and storage medium
US8750619B2 (en) Character recognition
CN108121991B (en) Deep learning ship target detection method based on edge candidate region extraction
CN109903331B (en) Convolutional neural network target detection method based on RGB-D camera
CN106682629B (en) Identification algorithm for identity card number under complex background
CN107038416B (en) Pedestrian detection method based on binary image improved HOG characteristics
CN110766020A (en) System and method for detecting and identifying multi-language natural scene text
CN113888461A (en) Method, system and equipment for detecting defects of hardware parts based on deep learning
CN111709980A (en) Multi-scale image registration method and device based on deep learning
CN114693661A (en) Rapid sorting method based on deep learning
CN113657528B (en) Image feature point extraction method and device, computer terminal and storage medium
CN112084952B (en) Video point location tracking method based on self-supervision training
CN113052170A (en) Small target license plate recognition method under unconstrained scene
CN110472632B (en) Character segmentation method and device based on character features and computer storage medium
CN110516731B (en) Visual odometer feature point detection method and system based on deep learning
CN107704864B (en) Salient object detection method based on image object semantic detection
CN111079585B (en) Pedestrian re-identification method combining image enhancement with pseudo-twin convolutional neural network
CN107358244A (en) A kind of quick local invariant feature extraction and description method
CN116363535A (en) Ship detection method in unmanned aerial vehicle aerial image based on convolutional neural network
CN110795995A (en) Data processing method, device and computer readable storage medium
CN106530300A (en) Flame identification algorithm of low-rank analysis
CN110414301B (en) Train carriage crowd density estimation method based on double cameras
CN110705568A (en) Optimization method for image feature point extraction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210409

Termination date: 20210913