CN107273832B - License plate recognition method and system based on integral channel characteristics and convolutional neural network


Info

Publication number
CN107273832B
Authority
CN
China
Prior art keywords
neural network
convolutional neural
sample image
image
gradient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710416776.5A
Other languages
Chinese (zh)
Other versions
CN107273832A (en)
Inventor
房建宏 (Fang Jianhong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Institute Of Traffic Sciences
Original Assignee
Qingdao Institute Of Traffic Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Institute Of Traffic Sciences filed Critical Qingdao Institute Of Traffic Sciences
Priority to CN201710416776.5A
Publication of CN107273832A
Application granted
Publication of CN107273832B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/625 License plates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition


Abstract

The invention provides a license plate recognition method and system based on integral channel features and a convolutional neural network. The license plate recognition method comprises the following steps: acquiring a sample image of the license plate image and generating a convolutional neural network detector from the sample image; acquiring an image to be detected and computing a feature pyramid of the image to be detected at different scales; detecting over the feature pyramid with a sliding window using the convolutional neural network detector to obtain target candidate regions at different scales; and distinguishing characters from non-characters in the target candidate regions with a first fully-connected layer of the convolutional neural network detector, and recognizing the characters in the target candidate regions with a second fully-connected layer of the convolutional neural network detector.

Description

License plate recognition method and system based on integral channel characteristics and convolutional neural network
Technical Field
The invention relates to computer vision recognition technology, and in particular to a license plate recognition method and system based on integral channel features and a convolutional neural network.
Background
Automatic detection and recognition of the license plate numbers of passing vehicles at expressway entrances, parking lot entrances and similar locations is an important technology in intelligent traffic systems. In practice, adverse conditions such as strong illumination, large side angles and blur make license plate detection and recognition difficult. Currently, there are three major types of license plate detection methods:
The first is an edge-information-based approach. This method combines edge information with Hough transformation, morphological operations and the like to obtain candidate regions for the target in the image, then processes these regions with specific prior knowledge such as region edge density, aspect ratio and shape, screening layer by layer to finally obtain the region where the license plate is located. Faradji et al. (F. Faradji, A. H. Rezaie, and M. Ziaratban, "A morphological-based license plate location," in Proc. IEEE Int. Conf. Image Process., pp. 57-60, Sep.-Oct. 2007) propose a method that combines edge information and morphological operations: target candidate regions are obtained by performing edge detection on the image followed by morphological operations, the candidates are then screened by their geometric attributes, and the license plate is finally localized. The drawbacks of this method are that it is easily disturbed by objects with rich texture and similar shapes, and that a deformed license plate is easily missed.
The second is a color-information-based method, comprising modules such as color segmentation and target localization; a multilayer perceptron segments the color image, and potential license plate regions are then extracted by projection. For example, W. Jia et al. (W. Jia, H. Zhang, X. He, and M. Piccardi, "Mean shift for accurate license plate localization," in Proc. IEEE Conf. Intell. Transp. Syst., pp. 566-571, Sep. 2005) obtain a number of candidate regions by segmenting the color image with the mean shift algorithm, then classify whether each candidate is a license plate region according to geometric attributes and edge density to produce the final detection result. The drawbacks of this method are its sensitivity to illumination changes and its susceptibility to interference from image regions with the same color characteristics.
The third is a machine-learning-based method, such as that proposed by L. Dlagnekov et al. (L. Dlagnekov and S. Belongie, "Recognizing cars," Dept. Comput. Sci. Eng., UCSD, San Diego, Tech. Rep. CS2005-083, 2005), which classifies candidate regions with an AdaBoost classifier based on Haar-like features to localize the license plate. The drawbacks of this method are a high false alarm rate with many false detections, and difficulty in detecting the license plate region completely, so machine learning alone does not perform well.
Disclosure of Invention
The embodiment of the invention mainly aims to provide a license plate recognition method and system based on integral channel characteristics and a convolutional neural network, so as to solve the problem of license plate detection and recognition in a complex scene.
In order to achieve the above object, an embodiment of the present invention provides a license plate recognition method based on integral channel features and a convolutional neural network, where the license plate recognition method includes: obtaining a sample image of a license plate image, and generating a convolutional neural network detector from the sample image; acquiring an image to be detected, and computing a feature pyramid of the image to be detected at different scales; detecting over the feature pyramid with a sliding window using the convolutional neural network detector to obtain target candidate regions at different scales; and distinguishing characters from non-characters in the target candidate regions using a first fully-connected layer of the convolutional neural network detector, and recognizing the characters in the target candidate regions using a second fully-connected layer of the convolutional neural network detector.
In an embodiment, generating the convolutional neural network detector from the sample image specifically includes: forming a training set from the sample images, where the sample images comprise positive samples (images containing license plates) and negative samples (background images not containing license plates); computing the integral channel features of each sample image in the training set; pooling the integral channel features of each sample image to generate its pooled features; and inputting the pooled features into a decision-tree forest, optimizing the distribution function of AdaBoost using the AdaBoost algorithm and spatial distribution probabilities, and generating the convolutional neural network detector.
In an embodiment, computing the integral channel features of each sample in the training set specifically includes: Step a: convert the sample image to HSV color space and compute the color features of the three channels:
Feature_H(i, j) = H(i, j); Feature_S(i, j) = S(i, j); Feature_V(i, j) = V(i, j);
where i, j are the spatial coordinates of the sample image, and H, S and V are the values of the three channels: H represents hue, ranging from 0 to 360, with red at 0, green at 120 and blue at 240; S represents the saturation of the color; V represents its value (brightness). Step b: compute the gradient direction histogram features of the sample image. Step c: generate the integral channel features from the three color-channel features and the gradient direction histogram features.
Further, step b, computing the gradient direction histogram features of the sample image, specifically includes: Step b1: compute the gradient of each pixel in the sample image:
G_x(x, y) = H(x+1, y) - H(x-1, y);
G_y(x, y) = H(x, y+1) - H(x, y-1);
where H(x, y) is the pixel value in gray space, and G_x(x, y), G_y(x, y) are the gradients in the horizontal and vertical directions at (x, y), respectively. Step b2: compute the gradient magnitude G(x, y) and direction alpha(x, y) from the gradient values:
G(x, y) = sqrt(G_x(x, y)^2 + G_y(x, y)^2);
alpha(x, y) = arctan(G_y(x, y) / G_x(x, y));
Step b3: for a cell of the specified size, project the gradient of each pixel in the cell into different intervals according to its direction, generating the gradient direction histogram of the whole cell; the angular range of each interval is 360/N, where N is the number of gradient directions. Step b4: determine the gradient direction histogram features from the histogram:
Feature_n(i, j) = G(i, j) if alpha(i, j) falls in interval n, and 0 otherwise;
where i, j are the coordinates in the sample image and n = 1, 2, ..., N is the interval number corresponding to the different gradient directions.
In an embodiment, computing the feature pyramid of the image to be detected at different scales specifically includes: Step 1: compute the multi-channel features of the image to be detected at different scales by the following formula:
F_s = Omega(R(I, s));
where F_s is the feature corresponding to scale s, R(I, s) denotes resampling the image I by scale s, F = Omega(I) denotes the channel features corresponding to the image, and Omega corresponds to the different feature channels. Step 2: from the multi-channel features, compute the feature maps of the image to be detected at different scales by the following formula:
F_s' ≈ F_s * (s'/s)^(-lambda_Omega);
where F_s' is the feature map of the image to be detected at scale s' and lambda_Omega is the scale factor corresponding to channel Omega; given the feature F_s computed at one scale s, the feature map at another scale s' is approximated from it according to the scale ratio between the images at the different scales. Step 3: generate the feature pyramid from the feature maps of the image to be detected at each scale.
In an embodiment, the process of computing lambda_Omega for each channel Omega is as follows: collect, over a data set, the mean of the channel features as a function of scale:
mu_s = E[ f_Omega(I_s) / f_Omega(I) ];
using the formula
f_Omega(I_s) / f_Omega(I) = s^(-lambda_Omega) + epsilon;
the relationship between mu_s and lambda_Omega is obtained:
mu_s = s^(-lambda_Omega) + E[epsilon];
where E[ ] denotes the expected value and epsilon the error term; f_Omega(I_s) is a weighted sum over all channels, i.e. f_Omega(I) = sum_{i,j,k} omega_{i,j,k} F(i, j, k), where omega_{i,j,k} is the weight of the corresponding channel element and k indexes the channels.
An embodiment of the present invention also provides a license plate recognition system based on integral channel features and a convolutional neural network, the license plate recognition system comprising: a convolutional neural network detector generating unit for acquiring a sample image of the license plate image and generating a convolutional neural network detector from the sample image; a feature pyramid generating unit for acquiring an image to be detected and computing its feature pyramid at different scales; a target candidate region acquisition unit for detecting over the feature pyramid with a sliding window using the convolutional neural network detector to obtain target candidate regions at different scales; and a recognition unit for distinguishing characters from non-characters in the target candidate regions using a first fully-connected layer of the convolutional neural network detector and recognizing the characters in the target candidate regions using a second fully-connected layer of the convolutional neural network detector.
In an embodiment, the convolutional neural network detector generating unit is specifically configured to: form a training set from the sample images, where the sample images comprise positive samples (images containing license plates) and negative samples (background images not containing license plates); compute the integral channel features of each sample image in the training set; pool the integral channel features of each sample image to generate its pooled features; and input the pooled features into a decision-tree forest, optimizing the distribution function of AdaBoost using the AdaBoost algorithm and spatial distribution probabilities to generate the convolutional neural network detector.
In an embodiment, computing the integral channel features of each sample in the training set specifically includes: Step a: convert the sample image to HSV color space and compute the color features of the three channels:
Feature_H(i, j) = H(i, j); Feature_S(i, j) = S(i, j); Feature_V(i, j) = V(i, j);
where i, j are the spatial coordinates of the sample image, and H, S and V are the values of the three channels: H represents hue, ranging from 0 to 360, with red at 0, green at 120 and blue at 240; S represents the saturation of the color; V represents its value (brightness). Step b: compute the gradient direction histogram features of the sample image. Step c: generate the integral channel features from the three color-channel features and the gradient direction histogram features.
Further, step b, computing the gradient direction histogram features of the sample image, specifically includes: Step b1: compute the gradient of each pixel in the sample image:
G_x(x, y) = H(x+1, y) - H(x-1, y);
G_y(x, y) = H(x, y+1) - H(x, y-1);
where H(x, y) is the pixel value in gray space, and G_x(x, y), G_y(x, y) are the gradients in the horizontal and vertical directions at (x, y), respectively. Step b2: compute the gradient magnitude G(x, y) and direction alpha(x, y) from the gradient values:
G(x, y) = sqrt(G_x(x, y)^2 + G_y(x, y)^2);
alpha(x, y) = arctan(G_y(x, y) / G_x(x, y));
Step b3: for a cell of the specified size, project the gradient of each pixel in the cell into different intervals according to its direction, generating the gradient direction histogram of the whole cell; the angular range of each interval is 360/N, where N is the number of gradient directions. Step b4: determine the gradient direction histogram features from the histogram:
Feature_n(i, j) = G(i, j) if alpha(i, j) falls in interval n, and 0 otherwise;
where i, j are the coordinates in the sample image and n = 1, 2, ..., N is the interval number corresponding to the different gradient directions.
In an embodiment, the feature pyramid generating unit is specifically configured to: Step 1: compute the multi-channel features of the image to be detected at different scales by the following formula:
F_s = Omega(R(I, s));
where F_s is the feature corresponding to scale s, R(I, s) denotes resampling the image I by scale s, F = Omega(I) denotes the channel features corresponding to the image, and Omega corresponds to the different feature channels. Step 2: from the multi-channel features, compute the feature maps of the image to be detected at different scales by the following formula:
F_s' ≈ F_s * (s'/s)^(-lambda_Omega);
where F_s' is the feature map of the image to be detected at scale s' and lambda_Omega is the scale factor corresponding to channel Omega; given the feature F_s computed at one scale s, the feature map at another scale s' is approximated from it according to the scale ratio between the images at the different scales. Step 3: generate the feature pyramid from the feature maps of the image to be detected at each scale.
In an embodiment, the process of computing lambda_Omega for each channel Omega is as follows: collect, over a data set, the mean of the channel features as a function of scale:
mu_s = E[ f_Omega(I_s) / f_Omega(I) ];
using the formula
f_Omega(I_s) / f_Omega(I) = s^(-lambda_Omega) + epsilon;
the relationship between mu_s and lambda_Omega is obtained:
mu_s = s^(-lambda_Omega) + E[epsilon];
where E[ ] denotes the expected value and epsilon the error term; f_Omega(I_s) is a weighted sum over all channels, i.e. f_Omega(I) = sum_{i,j,k} omega_{i,j,k} F(i, j, k), where omega_{i,j,k} is the weight of the corresponding channel element and k indexes the channels.
The recognition method achieves good detection results under different illumination conditions (different brightness levels by day and at night) and weather conditions (sunny, rainy and other states), accurately recognizes the key information of the license plate, and can be applied to license plate detection and recognition both in a static mode (on a single captured frame) and in a dynamic mode (on a continuous video stream). It is powerful, flexible in application, fast, adaptable and economical with resources.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required for the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive labor.
FIG. 1 is a flowchart of a license plate recognition method based on integral channel features and a convolutional neural network according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a specific process for generating a convolutional neural network detector from a sample image according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating a specific process of computing integral channel characteristics of each sample image in a training set according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a license plate recognition system based on an integral channel feature and a convolutional neural network according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a license plate recognition method and system based on integral channel characteristics and a convolutional neural network. The present invention will be described in detail below with reference to the accompanying drawings.
The embodiment of the invention provides a license plate recognition method based on integral channel characteristics and a convolutional neural network, which mainly comprises the following steps of:
step S101: acquiring a sample image of the license plate image, and generating a convolutional neural network detector according to the sample image;
step S102: acquiring an image to be detected, and calculating to generate a feature pyramid of the image to be detected in different scales;
step S103: detecting the characteristic pyramid through a sliding window by using a convolutional neural network detector to obtain target candidate regions under different scales;
step S104: and distinguishing characters from non-characters in the target candidate area by using a first fully-connected layer of the convolutional neural network detector, and identifying the characters in the target candidate area by using a second fully-connected layer of the convolutional neural network detector.
Through steps S101 to S104, the license plate recognition method based on integral channel features and a convolutional neural network of the embodiment of the present invention computes image feature maps by learning the integral channel features of the object, scans them with a sliding-window method, rapidly detects the target using a cascaded decision-tree forest, and recognizes the license plate using a convolutional neural network.
Each step of the license plate recognition method based on the integral channel feature and the convolutional neural network of the embodiment of the present invention is further described below with reference to specific embodiments.
In the step S101, a sample image of the license plate image is obtained, and the convolutional neural network detector is generated according to the sample image.
Specifically, as shown in fig. 2, this step S101 mainly includes the following processes:
step S1011: and forming a training set by using the sample images, wherein the sample images comprise positive samples and negative samples, the positive samples are images containing license plates, and the negative samples are background images not containing license plates. In specific implementation, the sample image may include samples of license plates under different inclination angles, different illumination conditions and different degrees of dirt interference as positive samples, and corresponding regions without license plates are selected as negative samples, and the positive samples and the negative samples form a training set together.
Step S1012: and calculating integral channel characteristics of each sample image in the training set. As shown in fig. 3, the process of calculating the integral channel feature of each sample in the training set specifically includes the following steps:
Step S301: convert the sample image to HSV color space and compute the color features of the three channels:
Feature_H(i, j) = H(i, j); Feature_S(i, j) = S(i, j); Feature_V(i, j) = V(i, j);
where i, j are the spatial coordinates of the sample image, and H, S and V are the values of the three channels: H represents hue, ranging from 0 to 360, with red at 0, green at 120 and blue at 240; S represents the saturation of the color; V represents its value (brightness);
step S302: calculating the gradient direction histogram characteristics of the sample image; the method specifically comprises the following steps:
Step b1: calculate the gradient of each pixel in the sample image:
G_x(x, y) = H(x+1, y) - H(x-1, y);
G_y(x, y) = H(x, y+1) - H(x, y-1);
where H(x, y) is the pixel value in gray space, and G_x(x, y), G_y(x, y) are the gradients in the horizontal and vertical directions at (x, y), respectively;
Step b2: calculate the gradient magnitude G(x, y) and direction alpha(x, y) from the gradient values:
G(x, y) = sqrt(G_x(x, y)^2 + G_y(x, y)^2);
alpha(x, y) = arctan(G_y(x, y) / G_x(x, y));
Step b3: for a cell of the specified size, project the gradient of each pixel in the cell into different intervals according to its direction, generating the gradient direction histogram of the whole cell; the angular range of each interval is 360/N, where N is the number of gradient directions;
Step b4: determine the gradient direction histogram features from the histogram:
Feature_n(i, j) = G(i, j) if alpha(i, j) falls in interval n, and 0 otherwise;
where i, j are the coordinates in the sample image and n = 1, 2, ..., N is the interval number corresponding to the different gradient directions.
After the gradient direction histogram features of the sample image are computed, step S303 generates the integral channel features Feature(i, j) from the three color-channel features and the gradient direction histogram features; that is, the integral channel features at position (i, j) of the sample image are the combination of the color features and the gradient features.
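As an illustration only, the following is a minimal Python sketch of this channel computation using OpenCV and NumPy; the bin count N and the exact channel layout are assumptions, since the patent gives the defining equations only as images.

```python
import cv2
import numpy as np

def integral_channel_features(bgr_image, num_bins=6):
    # Step a: three HSV color channels
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV).astype(np.float32)
    h_chan, s_chan, v_chan = cv2.split(hsv)

    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY).astype(np.float32)
    # Step b1: centered differences G_x, G_y
    gx = np.zeros_like(gray); gy = np.zeros_like(gray)
    gx[:, 1:-1] = gray[:, 2:] - gray[:, :-2]
    gy[1:-1, :] = gray[2:, :] - gray[:-2, :]
    # Step b2: gradient magnitude and direction in [0, 2*pi)
    mag = np.sqrt(gx ** 2 + gy ** 2)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)

    # Steps b3-b4: one channel per orientation interval (360/N degrees each),
    # each pixel contributing its magnitude to the interval its angle falls in
    bin_idx = np.minimum((ang / (2 * np.pi) * num_bins).astype(int),
                         num_bins - 1)
    hog_chans = [np.where(bin_idx == n, mag, 0.0) for n in range(num_bins)]

    # Step c / S303: combine color and gradient channels into one feature array
    return np.stack([h_chan, s_chan, v_chan] + hog_chans, axis=-1)
```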
After the integral channel features of the sample images in the training set are calculated, step S1013 is performed: and pooling the integral channel characteristics of the sample image to generate pooled characteristics of the sample image.
Specifically, the pooling process is as follows: the obtained integral feature channels are divided into regions, with the size of each region determined by the pooling size. Max pooling (taking the maximum of the pixel values in an image region as the result for that region) or average pooling (taking the average of the pixel values in an image region as the result for that region) is then performed on each region to obtain the pooled channel features.
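A minimal sketch of this pooling step, assuming non-overlapping square regions; the pool size is an illustrative parameter:

```python
import numpy as np

def pool_channels(channels, pool_size=4, mode="max"):
    # channels: H x W x C integral channel features
    h, w = channels.shape[:2]
    h2, w2 = h // pool_size, w // pool_size
    # split each channel into pool_size x pool_size regions
    blocks = channels[:h2 * pool_size, :w2 * pool_size].reshape(
        h2, pool_size, w2, pool_size, -1)
    if mode == "max":                      # max pooling over each region
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))        # average pooling over each region
```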
In one embodiment, during pooling the features before and after pooling are processed so that their feature dimensions are unified, and the results are combined into the final feature used to train the convolutional neural network detector. The processing here includes operations such as normalization, a common treatment of feature vectors; for example, a 4 x 4 result feature map is in actual form a 1 x 16 feature vector, which is normalized (e.g., scaled to unit norm; the exact constraint is given only as an image in the original). Unifying the feature dimensions mainly means zero-padding the pooled features, i.e. padding zeros at the head and tail ends of the pooled features so that the pooled feature vector has the same dimension as before pooling.
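A small sketch of this post-pooling processing, under the stated assumption that the normalization constraint is unit L2 norm:

```python
import numpy as np

def unify_features(pooled, target_dim):
    vec = pooled.reshape(-1).astype(np.float64)     # e.g. 4x4 map -> 1x16 vector
    norm = np.linalg.norm(vec)
    if norm > 0:
        vec = vec / norm                            # assumed unit-norm constraint
    pad = target_dim - vec.size
    # zero-pad head and tail so the dimension matches the pre-pooling feature
    return np.pad(vec, (pad // 2, pad - pad // 2))
```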
Then, step S1014 is executed: inputting the pooled features into a decision tree forest, optimizing a distribution function of Adaboost by adopting an Adaboost algorithm and spatial distribution probability to generate a convolutional neural network detector, and training the convolutional neural network detector.
Standard AdaBoost directly sums the trained weak classifiers; that is, each weak classifier has a default weight of 1. In this embodiment, the samples misclassified by the first (n-1) weak classifiers are used in training the n-th weak classifier, optimizing the distribution function of AdaBoost, and when the weak classifiers are finally combined their weights are not all 1; instead, different weights are assigned according to each weak classifier's performance during training and testing.
In a specific implementation, the detector is trained in four stages. The first stage uses 64 decision trees. (A decision tree is a tree-shaped decision diagram with associated probability outcomes, a visual method of statistical probability analysis: each non-leaf node represents a test on a feature attribute, each branch represents the attribute's output over some value range, and each leaf node stores a category. Classification with a decision tree starts from the root node, tests the corresponding feature attribute of the item to be classified, and follows the branch selected by the attribute's value until a leaf node is reached; the category stored at that leaf is the decision result.) The training data are fed into the decision-tree forest (a classifier consisting of multiple unrelated decision trees: when data enter the forest, each tree classifies the sample, and the category receiving the most votes over all trees is the final result), and the samples it misclassifies are passed on as "important training data" to the next stage, the second stage, which has 128 decision trees, and so on until all four stages are trained. The four stages use 64, 128, 256 and 1024 decision trees, respectively; these numbers follow parameter choices common in similar applications.
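For illustration, a sketch of this staged training using scikit-learn's AdaBoostClassifier as the boosted decision-tree forest; the stage sizes follow the 64/128/256/1024 scheme above, while the tree depth and the exact way hard samples are emphasized are assumptions:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

def train_cascade(X, y, stage_sizes=(64, 128, 256, 1024)):
    stages, Xs, ys = [], X, y
    for n_trees in stage_sizes:
        # boosting assigns each weak tree its own weight, not a default of 1
        stage = AdaBoostClassifier(DecisionTreeClassifier(max_depth=2),
                                   n_estimators=n_trees)
        stage.fit(Xs, ys)
        stages.append(stage)
        # misclassified samples become "important training data" for the
        # next stage (here simply duplicated to increase their influence)
        wrong = stage.predict(Xs) != ys
        if wrong.any():
            Xs = np.concatenate([Xs, Xs[wrong]])
            ys = np.concatenate([ys, ys[wrong]])
    return stages
```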
In the step S102, the image to be detected is obtained, and the feature pyramid of the image to be detected with different scales is calculated and generated.
In an embodiment, the step S102 specifically includes the following processes:
Step 1: calculate the multi-channel features of the image to be detected at different scales by the following formula:
F_s = Omega(R(I, s))   (1-1)
where F_s is the feature corresponding to scale s, R(I, s) denotes resampling the image I by scale s, F = Omega(I) denotes the channel features corresponding to the image, and Omega corresponds to the different feature channels;
Step 2: from the multi-channel features, calculate the feature maps of the image to be detected at different scales by the following formula:
F_s' ≈ F_s * (s'/s)^(-lambda_Omega)   (1-2)
where F_s' is the feature map of the image to be detected at scale s' and lambda_Omega is the scale factor corresponding to channel Omega; given the feature F_s computed at one scale s, the feature map at another scale s' is approximated from it according to the scale ratio between the images at the different scales.
The scale factor lambda_Omega corresponding to each channel Omega in equations (1-1) and (1-2) above is computed as follows:
Collect, over a data set, the statistics of the mean of the channel features as the scale changes:
mu_s = E[ f_Omega(I_s) / f_Omega(I) ]   (1-3)
By the formula:
f_Omega(I_s) / f_Omega(I) = s^(-lambda_Omega) + epsilon   (1-4)
the following relationship between mu_s and lambda_Omega is obtained:
mu_s = s^(-lambda_Omega) + E[epsilon]   (1-5)
In the above formulas, E[ ] denotes the expected value and epsilon the error term; f_Omega(I_s) is a weighted sum over all channels, i.e.:
f_Omega(I) = sum_{i,j,k} omega_{i,j,k} F(i, j, k)   (1-6)
where omega_{i,j,k} is the weight of the corresponding channel element and k indexes the channels. Combining equations (1-3) to (1-6), lambda_Omega can be obtained.
Step 3: generate the feature pyramid from the feature maps of the image to be detected at each scale.
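A sketch of this fast-pyramid construction, under the assumptions that the channel computation preserves image resolution and that a full computation is done only every few scales (the `real_every` parameter is illustrative); `compute_channels` stands in for the integral-channel computation above:

```python
import cv2
import numpy as np

def feature_pyramid(image, compute_channels, scales, lambdas, real_every=4):
    pyramid, last_real = {}, None
    for idx, s in enumerate(scales):
        resized = cv2.resize(image, None, fx=s, fy=s)     # R(I, s)
        if idx % real_every == 0 or last_real is None:
            feats = compute_channels(resized)             # F_s = Omega(R(I, s)), eq. (1-1)
            last_real = (s, feats)
        else:
            s0, f0 = last_real
            ratio = s / s0
            # per-channel power-law approximation F_s' ~ F_s * (s'/s)^(-lambda), eq. (1-2)
            feats = np.stack(
                [cv2.resize(f0[..., k], (resized.shape[1], resized.shape[0]))
                 * ratio ** (-lambdas[k]) for k in range(f0.shape[-1])],
                axis=-1)
        pyramid[s] = feats
    return pyramid
```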
In step S103, the convolutional neural network detector is used to detect over the feature pyramid through a sliding window, obtaining target candidate regions at different scales. The features of the image to be detected at the different scales are obtained with the feature pyramid computation above, and the trained detector is slid over the image at each scale to obtain the target candidate regions.
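An illustrative sliding-window pass over one pyramid level; the window size, stride and score threshold are assumptions, and `detector` stands in for the trained cascade:

```python
def detect_at_scale(feats, detector, scale, win=(32, 96), stride=4, thr=0.0):
    # feats: channel features at this pyramid scale; detector: callable
    # scoring one feature window (e.g. the cascade trained above)
    boxes = []
    wh, ww = win
    for y in range(0, feats.shape[0] - wh + 1, stride):
        for x in range(0, feats.shape[1] - ww + 1, stride):
            score = detector(feats[y:y + wh, x:x + ww])
            if score > thr:
                # map the window back to original-image coordinates
                boxes.append((x / scale, y / scale,
                              (x + ww) / scale, (y + wh) / scale, score))
    return boxes
```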
In one embodiment, non-maximum suppression (sorting the overlapping candidate regions by detection score, keeping the highest-scoring one as the detection result and removing the others) is applied to the obtained target candidate regions, and a suitable threshold is used to screen out the final detection regions. In the non-maximum suppression, the overlap ratio of the detected bounding boxes is computed as follows:
overlap = IoU(bb1, bb2), where bb1 and bb2 are two intersecting detection boxes, inter(bb1, bb2) is the area of the intersection of the two rectangular boxes, union(bb1, bb2) is the area of their union, and IoU(bb1, bb2) = inter(bb1, bb2) / union(bb1, bb2).
If overlap is greater than the threshold tau, only the box bbs_i with the highest score is retained. In the invention, tau is set to 0.5. The score of each box is accumulated from the thresholds corresponding to the nodes of the decision trees.
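A minimal sketch of this non-maximum suppression with the tau = 0.5 threshold stated above:

```python
def iou(a, b):
    # a, b: (x1, y1, x2, y2, score)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, tau=0.5):
    # keep the highest-scoring box among any group whose IoU exceeds tau
    boxes = sorted(boxes, key=lambda b: b[4], reverse=True)
    kept = []
    for b in boxes:
        if all(iou(b, k) <= tau for k in kept):
            kept.append(b)
    return kept
```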
In the step S104, the first fully-connected layer of the convolutional neural network detector is used to distinguish between characters and non-characters in the target candidate region, and the second fully-connected layer of the convolutional neural network detector is used to identify characters in the target candidate region.
The convolutional neural network is trained with the positive and negative samples in the data set and extracts the features of the license plate region in the detection result; a fully-connected layer is trained to distinguish the features of the seven characters, separating the seven characters within the license plate region; and a fully-connected layer is trained for character recognition, classifying the characters to recognize the characters in the license plate region.
The two fully-connected layers are both essentially classifiers and are attached in parallel: after the front convolutional part produces a feature map, the two fully-connected layers are connected to it side by side. (The feature map is the set of numerical matrices obtained after the image to be detected passes through each layer of the convolutional neural network, much as an image is stored as a two-dimensional matrix in a computer; in a convolutional neural network the front convolutional layers mainly perform feature extraction and the rear fully-connected layers mainly perform classification, so the data produced by the convolutional part is called the feature map and serves as the features extracted from the image to be detected.) The feature map, together with the labeled bounding box information, is first fed into the first fully-connected layer for training, which only needs to classify license plate versus non-license plate. (The bounding box information is the rectangular-box annotation of the license plate region in the training data: each training picture has a corresponding xml file containing the rectangle of the license plate in the picture, e.g. the coordinates of the upper-left and lower-right corner points, and the character string of the license plate content. The convolutional neural network is trained under the guidance of this annotation, which defines the desired result; the network parameters are adjusted according to the difference between the network's prediction and the actual annotation so that the prediction matches the actual data as closely as possible. This is how the convolutional neural network learns to recognize.) The training of the second fully-connected layer needs the feature map and bounding box as well as the annotation of the license plate content, and performs character recognition, a multi-class classification. The two layers can be trained in parallel, since their classification tasks are different and training involves no dependence between them. During testing, however, the feature map must first pass through the first fully-connected layer to determine the license plate region, and the result is then fed into the second fully-connected layer for character recognition; that is, in the testing stage the second fully-connected layer depends on the result of the first.
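For illustration, a PyTorch sketch of a shared convolutional trunk with the two parallel fully-connected heads; the layer sizes and the character-class count are assumptions, not values from the patent:

```python
import torch
import torch.nn as nn

class PlateNet(nn.Module):
    def __init__(self, num_chars=65):   # assumed character vocabulary size
        super().__init__()
        self.conv = nn.Sequential(       # front part: feature-map extraction
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d((4, 4)))
        self.fc1 = nn.Linear(64 * 4 * 4, 2)          # plate vs. non-plate
        self.fc2 = nn.Linear(64 * 4 * 4, num_chars)  # character recognition

    def forward(self, x):
        f = self.conv(x).flatten(1)
        # two heads attached in parallel to the same feature map
        return self.fc1(f), self.fc2(f)
```

At test time, matching the dependency described above, the fc2 output would only be consulted for regions that fc1 classifies as license plate.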
In the license plate recognition method based on integral channel features and a convolutional neural network of the embodiment of the present invention, the integral channel features fully mine the information of the license plate in its different states, so that the license plate is characterized more accurately, and the pooling operation effectively copes with the influence of deformation. The cascaded decision-tree forest detector detects the target quickly. The sliding window fully searches the space of the image and locates the region where the license plate appears. The learned model extends well to different scenes and copes well with other factors such as varying illumination.
The recognition method of the embodiments of the invention achieves good detection results under different illumination conditions (different brightness levels by day and at night) and weather conditions (sunny, rainy and other states), accurately recognizes the key information of the license plate, and can be applied to license plate detection and recognition both in a static mode (on a single captured frame) and in a dynamic mode (on a continuous video stream). It is powerful, flexible in application, fast, adaptable and economical with resources.
The embodiment of the present invention further provides a license plate recognition system based on the integral channel feature and the convolutional neural network, as shown in fig. 4, the license plate recognition system based on the integral channel feature and the convolutional neural network mainly includes: the device comprises a convolutional neural network detector generating unit 1, a characteristic pyramid generating unit 2, a target candidate region acquiring unit 3 and a recognition unit 4.
The convolutional neural network detector generating unit 1 is used for acquiring a sample image of a license plate image and generating a convolutional neural network detector according to the sample image; the characteristic pyramid generating unit 2 is used for acquiring an image to be detected and calculating and generating characteristic pyramids of different scales of the image to be detected; the target candidate region acquisition unit 3 is configured to detect the feature pyramid through a sliding window by using a convolutional neural network detector, and acquire target candidate regions at different scales; the recognition unit 4 is configured to distinguish characters from non-characters in the target candidate region by using a first fully-connected layer of the convolutional neural network detector, and recognize characters in the target candidate region by using a second fully-connected layer of the convolutional neural network detector.
Through the cooperative work of these components, the license plate recognition system based on integral channel features and a convolutional neural network computes image feature maps by learning the integral channel features of the object, scans them with a sliding-window method, rapidly detects the target using a cascaded decision-tree forest, and recognizes the license plate using a convolutional neural network.
The following further describes each component and functions of the license plate recognition system based on the integral channel feature and the convolutional neural network according to the embodiment of the present invention with reference to specific embodiments.
The convolutional neural network detector generating unit 1 is configured to obtain a sample image of the license plate image, and generate a convolutional neural network detector according to the sample image.
Specifically, as shown in fig. 2, the convolutional neural network detector generating unit 1 mainly performs the following processes:
step S1011: and forming a training set by using the sample images, wherein the sample images comprise positive samples and negative samples, the positive samples are images containing license plates, and the negative samples are background images not containing license plates. In specific implementation, the sample image may include samples of license plates under different inclination angles, different illumination conditions and different degrees of dirt interference as positive samples, and corresponding regions without license plates are selected as negative samples, and the positive samples and the negative samples form a training set together.
Step S1012 calculates the integral channel characteristics of each sample image in the training set. As shown in fig. 3, the process of calculating the integral channel feature of each sample in the training set specifically includes the following steps:
Step S301: convert the sample image to HSV color space and compute the color features of the three channels:
Feature_H(i, j) = H(i, j); Feature_S(i, j) = S(i, j); Feature_V(i, j) = V(i, j);
where i, j are the spatial coordinates of the sample image, and H, S and V are the values of the three channels: H represents hue, ranging from 0 to 360, with red at 0, green at 120 and blue at 240; S represents the saturation of the color; V represents its value (brightness);
step S302: calculating the gradient direction histogram characteristics of the sample image; the method specifically comprises the following steps:
Step b1: calculate the gradient of each pixel in the sample image:
G_x(x, y) = H(x+1, y) - H(x-1, y);
G_y(x, y) = H(x, y+1) - H(x, y-1);
where H(x, y) is the pixel value in gray space, and G_x(x, y), G_y(x, y) are the gradients in the horizontal and vertical directions at (x, y), respectively;
Step b2: calculate the gradient magnitude G(x, y) and direction alpha(x, y) from the gradient values:
G(x, y) = sqrt(G_x(x, y)^2 + G_y(x, y)^2);
alpha(x, y) = arctan(G_y(x, y) / G_x(x, y));
Step b3: for a cell of the specified size, project the gradient of each pixel in the cell into different intervals according to its direction, generating the gradient direction histogram of the whole cell; the angular range of each interval is 360/N, where N is the number of gradient directions;
Step b4: determine the gradient direction histogram features from the histogram:
Feature_n(i, j) = G(i, j) if alpha(i, j) falls in interval n, and 0 otherwise;
where i, j are the coordinates in the sample image and n = 1, 2, ..., N is the interval number corresponding to the different gradient directions.
After the gradient direction histogram features of the sample image are computed, step S303 generates the integral channel features Feature(i, j) from the three color-channel features and the gradient direction histogram features; that is, the integral channel features at position (i, j) of the sample image are the combination of the color features and the gradient features.
After the integral channel features of each sample image in the training set are calculated, the convolutional neural network detector generating unit 1 performs step S1013: and pooling the integral channel characteristics of the sample image to generate pooled characteristics of the sample image.
Specifically, the pooling process is as follows: the obtained integral feature channels are divided into regions, with the size of each region determined by the pooling size. Max pooling (taking the maximum of the pixel values in an image region as the result for that region) or average pooling (taking the average of the pixel values in an image region as the result for that region) is then performed on each region to obtain the pooled channel features.
In one embodiment, during pooling the features before and after pooling are processed so that their feature dimensions are unified, and the results are combined into the final feature used to train the convolutional neural network detector. The processing here includes operations such as normalization, a common treatment of feature vectors; for example, a 4 x 4 result feature map is in actual form a 1 x 16 feature vector, which is normalized (e.g., scaled to unit norm; the exact constraint is given only as an image in the original). Unifying the feature dimensions mainly means zero-padding the pooled features, i.e. padding zeros at the head and tail ends of the pooled features so that the pooled feature vector has the same dimension as before pooling.
Then, the convolutional neural network detector generating unit 1 performs step S1014: inputting the pooled features into a decision tree forest, optimizing a distribution function of Adaboost by adopting an Adaboost algorithm and spatial distribution probability to generate a convolutional neural network detector, and training the convolutional neural network detector.
Standard AdaBoost directly sums the trained weak classifiers; that is, each weak classifier has a default weight of 1. In this embodiment, the samples misclassified by the first (n-1) weak classifiers are used in training the n-th weak classifier, optimizing the distribution function of AdaBoost, and when the weak classifiers are finally combined their weights are not all 1; instead, different weights are assigned according to each weak classifier's performance during training and testing.
In a specific implementation, the detector is trained in four stages. The first stage uses 64 decision trees. (A decision tree is a tree-shaped decision diagram with associated probability outcomes, a visual method of statistical probability analysis: each non-leaf node represents a test on a feature attribute, each branch represents the attribute's output over some value range, and each leaf node stores a category. Classification with a decision tree starts from the root node, tests the corresponding feature attribute of the item to be classified, and follows the branch selected by the attribute's value until a leaf node is reached; the category stored at that leaf is the decision result.) The training data are fed into the decision-tree forest (a classifier consisting of multiple unrelated decision trees: when data enter the forest, each tree classifies the sample, and the category receiving the most votes over all trees is the final result), and the samples it misclassifies are passed on as "important training data" to the next stage, the second stage, which has 128 decision trees, and so on until all four stages are trained. The four stages use 64, 128, 256 and 1024 decision trees, respectively; these numbers follow parameter choices common in similar applications.
The feature pyramid generation unit 2 is configured to obtain an image to be detected, and calculate and generate feature pyramids of different scales of the image to be detected.
The feature pyramid generation unit 2 specifically executes the following processes:
Step 1: calculate the multi-channel features of the image to be detected at different scales by the following formula:
F_s = Omega(R(I, s))   (1-1)
where F_s is the feature corresponding to scale s, R(I, s) denotes resampling the image I by scale s, F = Omega(I) denotes the channel features corresponding to the image, and Omega corresponds to the different feature channels;
Step 2: from the multi-channel features, calculate the feature maps of the image to be detected at different scales by the following formula:
F_s' ≈ F_s * (s'/s)^(-lambda_Omega)   (1-2)
where F_s' is the feature map of the image to be detected at scale s' and lambda_Omega is the scale factor corresponding to channel Omega; given the feature F_s computed at one scale s, the feature map at another scale s' is approximated from it according to the scale ratio between the images at the different scales.
The scale factor lambda_Omega corresponding to each channel Omega in equations (1-1) and (1-2) above is computed as follows:
Collect, over a data set, the statistics of the mean of the channel features as the scale changes:
mu_s = E[ f_Omega(I_s) / f_Omega(I) ]   (1-3)
By the formula:
f_Omega(I_s) / f_Omega(I) = s^(-lambda_Omega) + epsilon   (1-4)
the following relationship between mu_s and lambda_Omega is obtained:
mu_s = s^(-lambda_Omega) + E[epsilon]   (1-5)
In the above formulas, E[ ] denotes the expected value and epsilon the error term; f_Omega(I_s) is a weighted sum over all channels, i.e.:
f_Omega(I) = sum_{i,j,k} omega_{i,j,k} F(i, j, k)   (1-6)
where omega_{i,j,k} is the weight of the corresponding channel element and k indexes the channels. Combining equations (1-3) to (1-6), lambda_Omega can be obtained.
Step 3: generate the feature pyramid from the feature maps of the image to be detected at each scale.
The target candidate region acquisition unit 3 is configured to detect over the feature pyramid with a sliding window using the convolutional neural network detector, obtaining target candidate regions at different scales. The features of the image to be detected at the different scales are obtained with the feature pyramid computation above, and the trained detector is slid over the image at each scale to obtain the target candidate regions.
In one embodiment, non-maximum suppression (sorting the overlapping candidate regions by detection score, keeping the highest-scoring one as the detection result and removing the others) is applied to the obtained target candidate regions, and a suitable threshold is used to screen out the final detection regions. In the non-maximum suppression, the overlap ratio of the detected bounding boxes is computed as follows:
overlap = IoU(bb1, bb2), where bb1 and bb2 are two intersecting detection boxes, inter(bb1, bb2) is the area of the intersection of the two rectangular boxes, union(bb1, bb2) is the area of their union, and IoU(bb1, bb2) = inter(bb1, bb2) / union(bb1, bb2).
If overlap is greater than the threshold tau, only the box bbs_i with the highest score is retained. In the invention, tau is set to 0.5. The score of each box is accumulated from the thresholds corresponding to the nodes of the decision trees.
The above-mentioned recognition unit 4 is configured to distinguish between a character and a non-character in the target candidate region by using the first fully-connected layer of the convolutional neural network detector, and recognize the character in the target candidate region by using the second fully-connected layer of the convolutional neural network detector.
The convolutional neural network is trained with the positive and negative samples in the data set and extracts the features of the license plate region in the detection result; a fully-connected layer is trained to distinguish the features of the seven characters, separating the seven characters within the license plate region; and a fully-connected layer is trained for character recognition, classifying the characters to recognize the characters in the license plate region.
The two fully-connected layers are both essentially classifiers and are attached in parallel: after the front convolutional part produces a feature map, the two fully-connected layers are connected to it side by side. (The feature map is the set of numerical matrices obtained after the image to be detected passes through each layer of the convolutional neural network, much as an image is stored as a two-dimensional matrix in a computer; in a convolutional neural network the front convolutional layers mainly perform feature extraction and the rear fully-connected layers mainly perform classification, so the data produced by the convolutional part is called the feature map and serves as the features extracted from the image to be detected.) The feature map, together with the labeled bounding box information, is first fed into the first fully-connected layer for training, which only needs to classify license plate versus non-license plate. (The bounding box information is the rectangular-box annotation of the license plate region in the training data: each training picture has a corresponding xml file containing the rectangle of the license plate in the picture, e.g. the coordinates of the upper-left and lower-right corner points, and the character string of the license plate content. The convolutional neural network is trained under the guidance of this annotation, which defines the desired result; the network parameters are adjusted according to the difference between the network's prediction and the actual annotation so that the prediction matches the actual data as closely as possible. This is how the convolutional neural network learns to recognize.) The training of the second fully-connected layer needs the feature map and bounding box as well as the annotation of the license plate content, and performs character recognition, a multi-class classification. The two layers can be trained in parallel, since their classification tasks are different and training involves no dependence between them. During testing, however, the feature map must first pass through the first fully-connected layer to determine the license plate region, and the result is then fed into the second fully-connected layer for character recognition; that is, in the testing stage the second fully-connected layer depends on the result of the first.
With the license plate recognition system based on integral channel characteristics and a convolutional neural network described above, the integral channel characteristics fully mine the information of the license plate in its different states, so that the plate is characterized more accurately, and the pooling operation effectively copes with the influence of deformation. The cascaded decision-tree forest detector allows the target to be detected quickly. The sliding window fully searches the space of the image and locates the regions where a license plate may appear. The model obtained through learning extends better to different scenes and copes better with other factors such as varying illumination.
The recognition system of the embodiment of the invention achieves a good detection effect under different illumination conditions (different degrees of brightness in the daytime and at night) and weather conditions (sunny days, rainy days and so on), accurately recognizes the key information of the license plate, and can be applied to license plate detection and recognition both in a static mode (on a single captured frame) and in a dynamic mode (on a continuous video stream). It is powerful, flexible in application, fast in operation, highly adaptable, and occupies few resources.
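To make the multi-scale search concrete, here is a rough sketch of a feature-pyramid sliding-window scan. The scale factors, window size, and stride are assumed values chosen for illustration, not parameters disclosed by the patent.

    import cv2  # OpenCV, assumed available for image resizing

    def pyramid_windows(image, win_w=128, win_h=48, stride=16,
                        scales=(1.0, 0.75, 0.5, 0.375)):
        """Yield (box_in_original_coords, patch) for every window at every scale."""
        for s in scales:
            scaled = cv2.resize(image, None, fx=s, fy=s)
            h, w = scaled.shape[:2]
            for y in range(0, h - win_h + 1, stride):
                for x in range(0, w - win_w + 1, stride):
                    patch = scaled[y:y + win_h, x:x + win_w]
                    # Map the window back to original-image coordinates.
                    yield (x / s, y / s, win_w / s, win_h / s), patch

    # Usage: `detector` is any callable that scores a window as plate/non-plate.
    # candidates = [box for box, patch in pyramid_windows(img) if detector(patch)]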
It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be implemented by program instructions executed on the relevant hardware, and the program may be stored in a computer-readable storage medium such as a ROM/RAM, magnetic disk or optical disk.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (4)

1. A license plate recognition method based on integral channel characteristics and a convolutional neural network is characterized by comprising the following steps:
obtaining a sample image of a license plate image, and generating a convolutional neural network detector according to the sample image;
acquiring an image to be detected, and calculating to generate a feature pyramid of the image to be detected with different scales;
detecting the characteristic pyramid through a sliding window by using the convolutional neural network detector to obtain target candidate regions under different scales;
distinguishing characters from non-characters in the target candidate regions by using a first fully-connected layer of the convolutional neural network detector, and recognizing the characters in the target candidate regions by using a second fully-connected layer of the convolutional neural network detector;
wherein generating a convolutional neural network detector according to the sample image specifically comprises:
forming a training set from the sample images, wherein the sample images comprise positive samples and negative samples, the positive samples being images containing a license plate and the negative samples being background images containing no license plate;
calculating integral channel characteristics of each sample image in the training set;
pooling the integral channel characteristics of the sample image to generate pooled characteristics of the sample image;
inputting the pooled features into a decision tree forest, training it with the Adaboost algorithm while optimizing the Adaboost weight distribution function through the spatial distribution probability, and generating the convolutional neural network detector;
wherein calculating the integral channel characteristics of each sample image in the training set specifically comprises:
step a: converting the sample image to the HSV color space and calculating the color features of the three channels:
C_H(i,j) = H(i,j), C_S(i,j) = S(i,j), C_V(i,j) = V(i,j);
wherein i, j represent the spatial coordinates of the sample image, and H, S and V are the values of the three channels: H represents the hue, with values from 0 to 360, red corresponding to 0, green to 120 and blue to 240; S represents the saturation of the color; V represents the value (brightness) of the color;
step b: calculating gradient direction histogram characteristics of the sample image;
step c: and generating the integral channel characteristics according to the characteristics of the three channels and the gradient direction histogram characteristics.
2. The license plate recognition method based on the integral channel feature and the convolutional neural network of claim 1, wherein step b, calculating the gradient direction histogram features of the sample image, specifically comprises:
step b1: calculating the horizontal and vertical gradients of each pixel in the sample image:
G_x(x,y) = H(x+1,y) - H(x-1,y);
G_y(x,y) = H(x,y+1) - H(x,y-1);
wherein H(x,y) denotes the pixel value in gray space, and G_x(x,y), G_y(x,y) denote the gradients in the horizontal and vertical directions at (x,y), respectively;
step b2: calculating the gradient magnitude G(x,y) and the gradient direction α(x,y) from these gradients:
G(x,y) = √(G_x(x,y)² + G_y(x,y)²);
α(x,y) = arctan(G_y(x,y) / G_x(x,y));
step b3: for units of a specified size, projecting the gradient of each pixel in the unit into different intervals according to its direction, generating a gradient direction histogram of the whole unit; wherein the angle range of each interval is 360/N, and N is the number of gradient directions;
step b4: determining the gradient direction histogram features from the gradient direction histogram:
Q_n(i,j) = G(i,j) · 1[α(i,j) ∈ interval n];
wherein i, j are the coordinates of the sample image; n = 1, 2, …, N indexes the intervals corresponding to the different gradient directions; and 1[·] denotes the indicator function.
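The gradient steps b1 to b4 can be read as the following sketch, with the grayscale input H, an assumed unit (cell) size, and an assumed bin count N:

    import numpy as np

    def gradient_histogram_features(gray, n_bins=6, cell=8):
        """Steps b1-b4 over a grayscale image; cell size and n_bins are assumed."""
        H = gray.astype(np.float32)
        # b1: centered differences G_x, G_y (image borders left at zero).
        Gx = np.zeros_like(H)
        Gy = np.zeros_like(H)
        Gx[:, 1:-1] = H[:, 2:] - H[:, :-2]
        Gy[1:-1, :] = H[2:, :] - H[:-2, :]
        # b2: gradient magnitude and direction over the full 360-degree range.
        G = np.sqrt(Gx ** 2 + Gy ** 2)
        theta = np.arctan2(Gy, Gx) % (2 * np.pi)
        bins = np.minimum((theta / (2 * np.pi) * n_bins).astype(int), n_bins - 1)
        # b3/b4: per-unit histogram; each pixel votes its magnitude into the
        # interval (angle range 360/N) that contains its direction.
        h, w = H.shape
        hist = np.zeros((h // cell, w // cell, n_bins), dtype=np.float32)
        for cy in range(h // cell):
            for cx in range(w // cell):
                sl = (slice(cy * cell, (cy + 1) * cell),
                      slice(cx * cell, (cx + 1) * cell))
                for n in range(n_bins):
                    hist[cy, cx, n] = G[sl][bins[sl] == n].sum()
        return hist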
3. A license plate recognition system based on integral channel characteristics and a convolutional neural network is characterized by comprising:
the convolutional neural network detector generating unit is used for acquiring a sample image of the license plate image and generating a convolutional neural network detector according to the sample image;
the characteristic pyramid generating unit is used for acquiring an image to be detected and calculating to generate characteristic pyramids of different scales of the image to be detected;
a target candidate region acquisition unit, configured to detect the feature pyramid through a sliding window by using the convolutional neural network detector, and acquire target candidate regions at different scales;
the recognition unit is used for distinguishing characters from non-characters in the target candidate area by using a first full-connection layer of the convolutional neural network detector and recognizing the characters in the target candidate area by using a second full-connection layer of the convolutional neural network detector;
the convolutional neural network detector generating unit is specifically configured to:
forming a training set from the sample images, wherein the sample images comprise positive samples and negative samples, the positive samples being images containing a license plate and the negative samples being background images containing no license plate;
calculating integral channel characteristics of each sample image in the training set;
pooling the integral channel characteristics of the sample image to generate pooled characteristics of the sample image;
inputting the pooled features into a decision tree forest, training it with the Adaboost algorithm while optimizing the Adaboost weight distribution function through the spatial distribution probability, and generating the convolutional neural network detector;
wherein calculating the integral channel characteristics of each sample image in the training set specifically comprises:
step a: converting the sample image to the HSV color space and calculating the color features of the three channels:
C_H(i,j) = H(i,j), C_S(i,j) = S(i,j), C_V(i,j) = V(i,j);
wherein i, j represent the spatial coordinates of the sample image, and H, S and V are the values of the three channels: H represents the hue, with values from 0 to 360, red corresponding to 0, green to 120 and blue to 240; S represents the saturation of the color; V represents the value (brightness) of the color;
step b: calculating gradient direction histogram characteristics of the sample image;
step c: and generating the integral channel characteristics according to the characteristics of the three channels and the gradient direction histogram characteristics.
4. The license plate recognition system based on the integral channel feature and the convolutional neural network of claim 3, wherein step b, calculating the gradient direction histogram features of the sample image, specifically comprises:
step b1: calculating the horizontal and vertical gradients of each pixel in the sample image:
G_x(x,y) = H(x+1,y) - H(x-1,y);
G_y(x,y) = H(x,y+1) - H(x,y-1);
wherein H(x,y) denotes the pixel value in gray space, and G_x(x,y), G_y(x,y) denote the gradients in the horizontal and vertical directions at (x,y), respectively;
step b2: calculating the gradient magnitude G(x,y) and the gradient direction α(x,y) from these gradients:
G(x,y) = √(G_x(x,y)² + G_y(x,y)²);
α(x,y) = arctan(G_y(x,y) / G_x(x,y));
step b3: for units of a specified size, projecting the gradient of each pixel in the unit into different intervals according to its direction, generating a gradient direction histogram of the whole unit; wherein the angle range of each interval is 360/N, and N is the number of gradient directions;
step b4: determining the gradient direction histogram features from the gradient direction histogram:
Q_n(i,j) = G(i,j) · 1[α(i,j) ∈ interval n];
wherein i, j are the coordinates of the sample image; n = 1, 2, …, N indexes the intervals corresponding to the different gradient directions; and 1[·] denotes the indicator function.
CN201710416776.5A 2017-06-06 2017-06-06 License plate recognition method and system based on integral channel characteristics and convolutional neural network Active CN107273832B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710416776.5A CN107273832B (en) 2017-06-06 2017-06-06 License plate recognition method and system based on integral channel characteristics and convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710416776.5A CN107273832B (en) 2017-06-06 2017-06-06 License plate recognition method and system based on integral channel characteristics and convolutional neural network

Publications (2)

Publication Number Publication Date
CN107273832A CN107273832A (en) 2017-10-20
CN107273832B true CN107273832B (en) 2020-09-22

Family

ID=60064600

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710416776.5A Active CN107273832B (en) 2017-06-06 2017-06-06 License plate recognition method and system based on integral channel characteristics and convolutional neural network

Country Status (1)

Country Link
CN (1) CN107273832B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110069651B (en) * 2017-10-23 2023-04-07 腾讯科技(北京)有限公司 Picture screening method and device and storage medium
CN108154149B (en) * 2017-12-08 2021-12-10 济南中维世纪科技有限公司 License plate recognition method based on deep learning network sharing
CN109325449B (en) * 2018-01-04 2022-06-21 苏州中科天启遥感科技有限公司 Convolutional neural network target detection framework based on sample updating
CN108960026B * 2018-03-10 2019-04-12 王梦梦 Unmanned aerial vehicle flight positioning system
CN108520278A (en) * 2018-04-10 2018-09-11 陕西师范大学 A kind of road surface crack detection method and its evaluation method based on random forest
CN111801689A (en) * 2018-04-17 2020-10-20 赫尔实验室有限公司 System for real-time object detection and recognition using image and size features
WO2020003345A1 (en) * 2018-06-25 2020-01-02 オリンパス株式会社 Arithmetic processing device
CN110020651B (en) * 2019-04-19 2022-07-08 福州大学 License plate detection and positioning method based on deep learning network
CN111127327B (en) * 2019-11-14 2024-04-12 贝壳技术有限公司 Picture inclination detection method and device
CN112381150A (en) * 2020-11-17 2021-02-19 上海科技大学 Confrontation sample detection method based on sample robustness difference

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104239867B (en) * 2014-09-17 2017-12-01 深圳市捷顺科技实业股份有限公司 License plate locating method and system
CN104809443B (en) * 2015-05-05 2018-12-28 上海交通大学 Detection method of license plate and system based on convolutional neural networks
US9785855B2 (en) * 2015-12-17 2017-10-10 Conduent Business Services, Llc Coarse-to-fine cascade adaptations for license plate recognition with convolutional neural networks
CN106295507B (en) * 2016-07-25 2019-10-18 华南理工大学 A kind of gender identification method based on integrated convolutional neural networks
CN106529522A (en) * 2016-11-07 2017-03-22 湖南源信光电科技有限公司 License plate location method based on integral channel features

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Fast Feature Pyramids for Object Detection; Piotr Dollár et al.; IEEE Transactions on Pattern Analysis and Machine Intelligence; IEEE; 16 January 2014; vol. 36, no. 8; pp. 1532-1545 *

Also Published As

Publication number Publication date
CN107273832A (en) 2017-10-20

Similar Documents

Publication Publication Date Title
CN107273832B (en) License plate recognition method and system based on integral channel characteristics and convolutional neural network
CN110543837B (en) Visible light airport airplane detection method based on potential target point
US8509526B2 (en) Detection of objects in digital images
CN110929593B (en) Real-time significance pedestrian detection method based on detail discrimination
CN111767882A (en) Multi-mode pedestrian detection method based on improved YOLO model
WO2012139228A1 (en) Video-based detection of multiple object types under varying poses
CN109902576B (en) Training method and application of head and shoulder image classifier
US10255511B2 (en) Real time traffic sign recognition
CN104036284A (en) Adaboost algorithm based multi-scale pedestrian detection method
CN103632170A (en) Pedestrian detection method and device based on characteristic combination
Lyu et al. Small object recognition algorithm of grain pests based on SSD feature fusion
CN111915583B (en) Vehicle and pedestrian detection method based on vehicle-mounted thermal infrared imager in complex scene
CN110008899B (en) Method for extracting and classifying candidate targets of visible light remote sensing image
Naufal et al. Preprocessed mask RCNN for parking space detection in smart parking systems
CN114049572A (en) Detection method for identifying small target
Mammeri et al. North-American speed limit sign detection and recognition for smart cars
CN111274964B (en) Detection method for analyzing water surface pollutants based on visual saliency of unmanned aerial vehicle
Ahmed et al. Traffic sign detection and recognition model using support vector machine and histogram of oriented gradient
CN114708566A (en) Improved YOLOv 4-based automatic driving target detection method
CN113128476A (en) Low-power consumption real-time helmet detection method based on computer vision target detection
CN115620090A (en) Model training method, low-illumination target re-recognition method and device and terminal equipment
Dousai et al. Detecting humans in search and rescue operations based on ensemble learning
CN112347967B (en) Pedestrian detection method fusing motion information in complex scene
CN114596548A (en) Target detection method, target detection device, computer equipment and computer-readable storage medium
Chughtai et al. Traffic Surveillance System: Robust Multiclass Vehicle Detection and Classification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant