CN113392837A - License plate recognition method and device based on deep learning - Google Patents

License plate recognition method and device based on deep learning

Info

Publication number
CN113392837A
CN113392837A (application CN202110775812.3A)
Authority
CN
China
Prior art keywords
license plate, loss function, image frame, picture, recognized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110775812.3A
Other languages
Chinese (zh)
Inventor
闫军
丁丽珠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Super Vision Technology Co Ltd
Original Assignee
Super Vision Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Super Vision Technology Co Ltd filed Critical Super Vision Technology Co Ltd
Priority to CN202110775812.3A priority Critical patent/CN113392837A/en
Publication of CN113392837A publication Critical patent/CN113392837A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods


Abstract

An embodiment of the invention provides a license plate recognition method and device based on deep learning. The method comprises: acquiring a plurality of image frames containing vehicle license plate information, and extracting features from each image frame through a convolutional neural network to obtain a feature representation of each image frame; detecting the feature representation of each image frame through a predetermined target detection network to obtain the category and position information of the license plate detection frame of each image frame, and segmenting the license plate of each image frame to obtain a feature map of each segmented image frame; acquiring the original license plate label information in each image frame, and training a license plate recognition model; and inputting a picture to be recognized into the license plate recognition model to recognize the license plate characters in the picture. The invention effectively alleviates the loss of recognition accuracy caused by large license plate inclination angles, makes full use of the features of the license plate, and greatly improves both recognition accuracy and recognition efficiency.

Description

License plate recognition method and device based on deep learning
Technical Field
The invention relates to the technical field of intelligent transportation, in particular to a license plate recognition method and device based on deep learning.
Background
License plate recognition technology plays an important role in many tasks such as urban traffic management, vehicle identification, parking lot charging management, and violation processing; however, license plate recognition remains a challenging task because of factors such as illumination conditions, weather conditions, image clarity, the shooting angle of the license plate, and the color of the license plate.
License plate recognition comprises two tasks: first, license plate detection, i.e., locating the license plate region in a captured image; and second, license plate character recognition, i.e., recognizing the visible characters in the detected license plate region. In the prior art there are two main approaches. The first is the two-stage method, which generally detects the license plate position first and then recognizes the characters on that basis; this method must train and optimize the detection module and the recognition module separately, so the highly correlated and complementary relationship between the two modules cannot be fully exploited. The second is the end-to-end method, which trains detection and recognition jointly to obtain the recognition result directly, but existing end-to-end methods have the following shortcomings. First, they cannot be trained in a fully end-to-end manner: the detector is trained first, and the recognizer is then loaded for continued training, because license plate recognition requires accurate localization, and localization in early iterations is often inaccurate, which in turn degrades recognition. Second, because real-world shooting angles vary, the license plate region in the image is not perfectly horizontal but appears in various shapes, causing inaccurate recognition in practical applications. Third, existing recognition methods based on the output of a fully connected network or a recurrent neural network depend heavily on data volume and require massive license plate data for learning.
Disclosure of Invention
The embodiment of the invention provides a license plate recognition method and device based on deep learning, which realize end-to-end license plate recognition and greatly improve the precision of license plate recognition.
In one aspect, an embodiment of the present invention provides a license plate recognition method based on deep learning, including:
acquiring a plurality of image frames containing vehicle license plate information, extracting the characteristics of each image frame through a convolutional neural network, and fusing the extracted characteristics of each image frame to obtain the characteristic representation of each image frame;
detecting feature representation of each image frame through a preset target detection network to obtain category and position information of a license plate detection frame of each image frame, and segmenting the license plate of each image frame according to the category and position information to obtain a feature map of each segmented image frame;
acquiring original license plate label information in each image frame, and training to obtain a license plate recognition model according to the original license plate label information and the feature map of each image frame;
and inputting the picture to be recognized into the license plate recognition model for license plate recognition, and recognizing to obtain license plate characters in the picture to be recognized.
Further, the extracting features of each image frame through a convolutional neural network, and fusing the extracted features of each image frame to obtain a feature representation of each image frame includes:
adding a characteristic pyramid network structure in a convolutional neural network, and extracting high-level semantic characteristics of each image frame from different scales through the characteristic pyramid network structure;
and performing feature fusion on the extracted high-level semantic features of each image frame to obtain feature representation of each image frame.
Further, the segmenting the license plate of each image frame according to the category and the position information to obtain a feature map of each segmented image frame includes:
and extracting feature maps of a second predetermined size corresponding to each image frame by performing, according to the category and position information of the license plate detection frame of each image frame, a first predetermined number of convolution operations and a second predetermined number of deconvolution operations on each image frame of a first predetermined size.
Further, before the step of training to obtain a license plate recognition model according to the original license plate label information and the feature map of each image frame, the method comprises the following steps:
and converting each original license plate label into a global character feature map aiming at the global characters of the license plate and a single character feature map aiming at the single characters of the license plate according to the information of each original license plate label.
Further, the training according to the original license plate label information and the feature map of each image frame to obtain a license plate recognition model includes:
calculating each global character feature map, each single character feature map and the feature map of each image frame through a preset loss function to obtain a calculation result;
and correcting the calculation result through the loss function based on the information of each original license plate label, and training to obtain a license plate recognition model.
Further, the predetermined loss function is a multitask loss function, and the multitask loss function comprises a loss function in a license plate target detection stage and a loss function in a license plate segmentation stage;
the multitask loss function is determined by respectively setting weight coefficients of a loss function in a license plate target detection stage and a loss function in a license plate segmentation stage.
Further, the loss function of the license plate target detection stage comprises a position regression function and a classification loss function, and the loss function of the license plate target detection stage is determined by respectively setting the weight coefficients of the position regression function and the classification loss function;
the loss functions of the license plate segmentation stage comprise a global license plate instance segmentation loss function and a semantic segmentation loss function representing each character, wherein the global license plate instance segmentation loss function is calculated by using a binary cross entropy loss function, and the semantic segmentation loss function is calculated by using a weighted loss function.
Further, the step of inputting the picture to be recognized into the license plate recognition model for license plate recognition to obtain license plate characters in the picture to be recognized by recognition includes:
inputting the picture to be recognized into the license plate recognition model to obtain a license plate detection frame of the picture to be recognized;
detecting a license plate detection frame of the picture to be recognized through a preset target detection network in the license plate recognition model to obtain a global character feature map of the picture to be recognized and a single character feature map of the picture to be recognized;
calculating, through a predetermined algorithm and according to the license plate detection frame of the picture to be recognized, the average pixel value of each character region in the single-character feature map of the picture to be recognized, and generating a license plate character sequence;
and obtaining license plate characters in the picture to be recognized according to the license plate character sequence.
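The decoding step described in the claims above, averaging pixel values per character region and assembling a character sequence, can be sketched in Python. This is a hypothetical, simplified illustration: the toy charset, the left-to-right ordering of regions by x coordinate, and the argmax-over-averages decision rule are assumptions for the sake of the example, not the patent's specified algorithm.

```python
# Toy charset for illustration only (the patent uses 84 classes:
# 10 digits, 24 letters excluding I and O, and 50 Chinese characters).
CHARSET = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ"

def decode_plate(score_maps, regions):
    """Decode a plate string from per-class score maps.

    score_maps: dict mapping class index -> 2D list (H x W) of scores.
    regions:    list of (x0, y0, x1, y1) character boxes, any order.
    For each region, the class whose score map has the highest average
    pixel value over that region is chosen; regions are read left to right.
    """
    chars = []
    for (x0, y0, x1, y1) in sorted(regions):  # sort by x0 => left-to-right
        best_cls, best_avg = None, float("-inf")
        for cls, fmap in score_maps.items():
            vals = [fmap[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            avg = sum(vals) / len(vals)  # average pixel value of the region
            if avg > best_avg:
                best_cls, best_avg = cls, avg
        chars.append(CHARSET[best_cls])
    return "".join(chars)
```

For instance, with two 2 x 4 score maps where class 0 scores high on the left half and class 1 on the right half, and regions covering those halves, the decoder returns "01".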
On the other hand, the embodiment of the invention provides a license plate recognition device based on deep learning, which comprises:
the extraction and fusion module is used for acquiring a plurality of image frames containing vehicle license plate information, extracting the characteristics of each image frame through a convolutional neural network, and fusing the extracted characteristics of each image frame to obtain the characteristic representation of each image frame;
the detection and segmentation module is used for detecting the feature representation of each image frame through a preset target detection network to obtain the category and position information of the license plate detection frame of each image frame, and segmenting the license plate of each image frame according to the category and position information to obtain the feature map of each segmented image frame;
the training module is used for acquiring original license plate label information in each image frame and training to obtain a license plate recognition model according to the original license plate label information and the feature map of each image frame;
and the recognition module is used for inputting the picture to be recognized into the license plate recognition model for license plate recognition, and recognizing license plate characters in the picture to be recognized.
Further, the extraction and fusion module comprises:
the extracting unit is used for adding a characteristic pyramid network structure in the convolutional neural network and extracting high-level semantic features of each image frame from different scales through the characteristic pyramid network structure;
and the fusion unit is used for performing feature fusion on the extracted high-level semantic features of each image frame to obtain feature representation of each image frame.
Further, the detection and segmentation module includes:
and the extraction unit is used for performing, according to the category and position information of the license plate detection frame of each image frame, a first predetermined number of convolution operations and a second predetermined number of deconvolution operations on each image frame of a first predetermined size, and extracting a feature map of a second predetermined size corresponding to each image frame.
Further, the device comprises:
and the conversion module is used for converting each original license plate label into a global character feature map aiming at the global characters of the license plate and a single character feature map aiming at the single characters of the license plate according to the information of each original license plate label.
Further, the training module includes:
the calculation unit is used for calculating each global character feature map, each single-character feature map, and the feature map of each image frame through a predetermined loss function to obtain a calculation result;
and the training unit is used for correcting the calculation result through the loss function based on the original license plate label information, and training to obtain a license plate recognition model.
Further, the predetermined loss function is a multitask loss function, and the multitask loss function comprises a loss function in a license plate target detection stage and a loss function in a license plate segmentation stage;
the multitask loss function is determined by respectively setting weight coefficients of a loss function in a license plate target detection stage and a loss function in a license plate segmentation stage.
Further, the loss function of the license plate target detection stage comprises a position regression function and a classification loss function, and the loss function of the license plate target detection stage is determined by respectively setting the weight coefficients of the position regression function and the classification loss function;
the loss functions of the license plate segmentation stage comprise a global license plate instance segmentation loss function and a semantic segmentation loss function representing each character, wherein the global license plate instance segmentation loss function is calculated by using a binary cross entropy loss function, and the semantic segmentation loss function is calculated by using a weighted loss function.
Further, the recognition module is specifically configured to:
Inputting the picture to be recognized into the license plate recognition model to obtain a license plate detection frame of the picture to be recognized;
detecting a license plate detection frame of the picture to be recognized through a preset target detection network in the license plate recognition model to obtain a global character feature map of the picture to be recognized and a single character feature map of the picture to be recognized;
calculate, through a predetermined algorithm and according to the license plate detection frame of the picture to be recognized, the average pixel value of each character region in the single-character feature map of the picture to be recognized, and generate a license plate character sequence;
and obtaining license plate characters in the picture to be recognized according to the license plate character sequence.
The technical scheme has the following beneficial effects: the invention uses a small amount of relatively accurate character-level license plate annotation data; a multilayer convolutional neural network extracts features of different sizes from the image; a license plate detection frame is generated through the predetermined target detection network and then segmented to obtain an accurate recognition result. This realizes end-to-end license plate recognition, effectively alleviates the loss of recognition accuracy caused by large inclination angles, makes full use of the features of the license plate, and greatly improves recognition accuracy and efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a flowchart of a license plate recognition method based on deep learning according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a license plate recognition device based on deep learning according to another embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art from the given embodiments without creative effort fall within the protection scope of the present invention.
The above technical solutions of the embodiments of the present invention are described in detail below with reference to application examples:
the application example of the invention aims to realize the end-to-end license plate recognition and greatly improve the precision of the license plate recognition.
In one possible implementation mode, in the license plate recognition process, a license plate recognition model is pre-trained; in the process of training a license plate recognition model, firstly, acquiring a plurality of image frames containing vehicle license plate information, extracting the characteristics of each image frame through a convolutional neural network, and fusing the extracted characteristics of each image frame to obtain the characteristic representation of each image frame; subsequently, detecting the feature representation of each image frame through a predetermined target detection network to obtain the category and position information of the license plate detection frame of each image frame, and segmenting the license plate of each image frame according to the category and position information to obtain the feature map of each segmented image frame; acquiring original license plate label information in each image frame, and training to obtain a license plate recognition model according to the original license plate label information and the feature map of each image frame; and during license plate recognition, inputting the picture to be recognized into a license plate recognition model for license plate recognition, and recognizing to obtain license plate characters in the picture to be recognized.
It should be noted that, in the embodiment of the present invention, the convolutional neural network includes, but is not limited to, AlexNet, ResNet (Residual Network), VGG (Visual Geometry Group network), etc., and the predetermined target detection network includes, but is not limited to, Faster R-CNN (Faster Region-based Convolutional Neural Network, an improved R-CNN), YOLO (You Only Look Once, a deep-neural-network-based object recognition and localization algorithm), SSD (Single Shot MultiBox Detector), etc. The pictures containing vehicle license plate information cover images of the same vehicle captured by real road-traffic scene monitoring under different viewing angles, backgrounds, and illumination intensities, and the content covered by the vehicle images includes license plate appearance information, license plate character information, and the like.
In one possible implementation manner, extracting features of each image frame through a convolutional neural network, and fusing the extracted features of each image frame to obtain a feature representation of each image frame, includes: adding a characteristic pyramid network structure in a convolutional neural network, and extracting high-level semantic characteristics of each image frame from different scales through the characteristic pyramid network structure; and performing feature fusion on the extracted high-level semantic features of each image frame to obtain feature representation of each image frame.
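The top-down fusion performed by a feature pyramid network can be sketched as follows. This is a minimal illustration assuming nearest-neighbour 2x upsampling and element-wise addition for the lateral connection; the patent does not specify the fusion operator, and the function names are illustrative.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D feature map (list of lists)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in (0, 1)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                   # duplicate each row
    return out

def fpn_merge(top, lateral):
    """One FPN top-down step: upsample the coarser map, add the lateral map."""
    up = upsample2x(top)
    h, w = len(lateral), len(lateral[0])
    return [[up[y][x] + lateral[y][x] for x in range(w)] for y in range(h)]
```

Applying the step repeatedly from the coarsest level downward yields the multi-scale fused representation; in a real network the lateral maps would first pass through 1x1 convolutions to align channel counts.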
The step of segmenting the license plate of each image frame according to the category and the position information to obtain a feature map of each segmented image frame includes: and extracting feature maps with second preset sizes corresponding to the image frames respectively by performing convolution operation for the image frames with the first preset sizes and deconvolution operation for the second preset times respectively according to the types and the position information of the license plate detection frames of the image frames.
For example, in the license plate recognition process, a license plate recognition model is pre-trained. In the process of training the model, a plurality of image frames containing vehicle license plate information are first acquired, a Feature Pyramid Network (FPN) structure is added to the convolutional neural network, and high-level semantic features of each image frame are extracted at different scales through the FPN; feature fusion is then performed on these features to obtain the feature representation of each image frame. Next, the feature representation of each image frame is detected through the predetermined target detection network to obtain the category and position information of the license plate detection frame of each image frame, and, according to this information, a first predetermined number of convolution operations and a second predetermined number of deconvolution operations are performed on each image frame of a first predetermined size to extract a feature map of a second predetermined size for each image frame. For example, if the input feature size is 16 × 64 × 256 (H × W × C), where H is height, W is width, and C is the number of channels, further feature extraction through four convolution operations and one deconvolution operation changes the output feature size to 32 × 128 × N, i.e., N output channels of size 32 × 128, where N = 86: one global license plate instance feature map, 84 license plate character feature maps, and 1 license plate character background feature map.
The global license plate instance feature map accurately locates the license plate character region; the 84 character feature maps cover 10 Arabic numerals (0-9), 24 capital letters (A-Z excluding I and O), and 50 Chinese characters; the background feature map covers the background region outside the character regions.
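The shape arithmetic in the example above (16 × 64 input, four convolutions, one deconvolution, 32 × 128 output) can be checked with the standard output-size formulas. The sketch below assumes 3 × 3 same-padded stride-1 convolutions and a 2 × 2 stride-2 deconvolution; the patent does not give kernel sizes, so these are illustrative choices that reproduce the quoted sizes.

```python
def conv_out(size, k, s=1, p=0):
    """Output spatial size of a convolution: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * p - k) // s + 1

def deconv_out(size, k, s=2, p=0):
    """Output spatial size of a transposed convolution: (size - 1)*s - 2p + k."""
    return (size - 1) * s - 2 * p + k

h, w = 16, 64                    # input feature map, H x W
for _ in range(4):               # four same-padded 3x3 convolutions
    h, w = conv_out(h, 3, 1, 1), conv_out(w, 3, 1, 1)
h, w = deconv_out(h, 2, 2), deconv_out(w, 2, 2)  # one 2x2 stride-2 deconvolution
```

With these assumptions the four convolutions leave 16 × 64 unchanged and the deconvolution doubles it to 32 × 128, matching the sizes quoted above; the channel count N = 86 is set by the final layer's filter count.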
In a possible implementation manner, before the step of training to obtain a license plate recognition model according to each original license plate label information and the feature map of each image frame, the method includes: and converting each original license plate label into a global character feature map aiming at the global characters of the license plate and a single character feature map aiming at the single characters of the license plate according to the information of each original license plate label.
The method for training to obtain the license plate recognition model according to the original license plate label information and the feature map of each image frame comprises the following steps: calculating each global character feature map, each single character feature map and the feature map of each image frame through a preset loss function to obtain a calculation result; and correcting the calculation result through the loss function based on the information of each original license plate label, and training to obtain a license plate recognition model.
For example, before license plate recognition, license plate label information is manually annotated on a plurality of image frames containing vehicle license plate information, and in the process of training the model the original license plate labels in each image are converted into a global character feature map for the global license plate characters and a single-character feature map for the individual characters. The original license plate label comprises a global character label and a single-character label for each character. The global character label is denoted P = {p1, p2, ..., pm}, where pi represents the position information of each license plate region; the single-character label is C = {c1 = (cc1, cl1), c2 = (cc2, cl2), ..., cn = (ccn, cln)}, where ccj and clj represent the category and position of each license plate character, respectively. For the conversion of the global character label, a mask of the same size as the input feature map (the first predetermined size, e.g., 16 × 64 × 256 (H × W × C) in the example above) is first initialized with all pixel values set to 0, and the pixel values of the region inside the license plate frame are then set to 1 according to the coordinate position of the global character label, yielding the global character feature map label. For the generation of the single-character feature map label, a mask of the same size as the output feature map (the second predetermined size, e.g., 32 × 128 × N in the example above) is first initialized with all pixel values set to 0, and the pixel values inside each character's bounding box are then set to that character's category index value, from 1 to 84, yielding the character-level feature map label. Then, each global character feature map, each single-character feature map, and the feature map of each image frame are calculated through a predetermined loss function, such as a multitask loss function, to obtain a calculation result; the calculation result is corrected through the loss function based on the original license plate label information, and the license plate recognition model is obtained by training.
It should be noted that, as will be understood by those skilled in the art, the mask is a binary image composed of 0 and 1. When a mask is applied in a certain function, the 1-value region is processed, and the masked 0-value region is not included in the calculation. The image mask is defined by the specified data values, data ranges, limited or unlimited values, regions of interest, and annotation files, and any combination of the above options may also be applied as input to create the mask.
It should be further noted that, in the embodiment of the present invention, the data labeling differs from the traditional manner: a traditional license plate labeling method generally uses a rectangular frame to represent the whole license plate, with labeling information comprising the vertex coordinate of the top left corner of the license plate, the width and height of the license plate, and the license plate characters. The embodiment of the invention instead adopts a quadrilateral labeling mode in which four vertices represent the position of the whole license plate, i.e. the x and y coordinates of the upper left, upper right, lower right and lower left corners of the license plate are labeled respectively; this labeling mode represents an inclined license plate more accurately. In addition, each character in the license plate area is labeled with a polygonal frame, yielding the position of the license plate, the position of each license plate character, and the character category. Although this labeling mode is more complex than the traditional one, the license plate recognition algorithm achieves a better effect using only tens of thousands of labeled images, compared with the millions of labeled images required by conventional license plate recognition algorithms; achieving a good license plate recognition effect with less labeled data greatly improves the efficiency of license plate recognition.
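As a purely illustrative example of the quadrilateral labeling format described above (all field names and coordinate values here are hypothetical, not part of the invention), one plate's annotation might be recorded as:

```python
# Hypothetical annotation record for one license plate.
# plate_quad lists (x, y) for the upper-left, upper-right, lower-right and
# lower-left vertices; each character gets its own polygon and class.
annotation = {
    "plate_quad": [[412, 300], [540, 308], [538, 352], [410, 344]],
    "characters": [
        {"class": "京", "polygon": [[415, 305], [430, 305], [430, 340], [415, 340]]},
        {"class": "A",  "polygon": [[432, 306], [447, 306], [447, 341], [432, 341]]},
        # ... one entry per remaining character
    ],
}
```

The four-vertex plate outline captures the shear of an inclined plate that a single (x, y, w, h) rectangle cannot, which is the advantage the text claims for this labeling mode.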
In a possible implementation manner, the predetermined loss function is a multitask loss function, and the multitask loss function includes a loss function in a license plate target detection stage and a loss function in a license plate segmentation stage; the multitask loss function is determined by respectively setting weight coefficients of a loss function in a license plate target detection stage and a loss function in a license plate segmentation stage; the loss function of the license plate target detection stage comprises a position regression function and a classification loss function, and the loss function of the license plate target detection stage is determined by respectively setting the weight coefficients of the position regression function and the classification loss function; the loss functions of the license plate segmentation stage comprise a global license plate instance segmentation loss function and a semantic segmentation loss function representing each character, wherein the global license plate instance segmentation loss function is calculated by using a binary cross entropy loss function, and the semantic segmentation loss function is calculated by using a weighted loss function.
For example, in the process of training the license plate recognition model, as described in the previous example, each global character feature map, each single character feature map, and the feature map of each image frame are sent to a multitask loss function, where the multitask loss function includes a loss function in a license plate target detection stage and a loss function in a segmentation stage, and specifically, the multitask loss function is represented by the following formula one:
L = α1·Ldet + α2·Lseg (formula one)
wherein Ldet represents the loss function of the license plate target detection stage, Lseg represents the loss function of the license plate character segmentation stage, and α1 and α2 are weight coefficients, each set to 1.
The loss function Ldet of the license plate target detection stage is given by the following formula two:
Ldet = λ1·Lreg + λ2·Lcls (formula two)
wherein Lreg represents the position regression loss function, Lcls represents the classification loss function, and λ1 and λ2 represent weight coefficients, each set to 1; the regression loss function Lreg and the classification loss function Lcls may adopt loss functions commonly used in target detection.
The loss function Lseg of the license plate character segmentation stage is given by the following formula three:
Lseg = β1·Lins + β2·Lchar (formula three)
wherein β1 and β2 represent weight coefficients, each set to 1, and Lins represents the global license plate instance segmentation loss function, which is calculated using the binary cross entropy loss function in the following formula four:
Lins = −[y·log(p) + (1 − y)·log(1 − p)] (formula four)

wherein p represents the predicted probability, and y = 0 and y = 1 represent the actual class being 0 or 1; Lchar represents the semantic segmentation loss function of each character, and is calculated using the weighted softmax loss function in the following formula five:
Lchar = −(1/N) · Σ(i=1..N) W(Yi) · log(σ(X)i,Yi) (formula five)

wherein

σ(X)i,c = exp(Xi,c) / Σ(k=1..C) exp(Xi,k)

is the standard form of the softmax function and represents the output probability of the current element, Xi,c is the output of the current element, and Σ(k=1..C) exp(Xi,k) is the sum of the outputs of all elements; N represents the number of pixels in each feature map, C represents the number of categories of license plate characters, Y represents the label corresponding to the output feature map X, and W represents a weight used to balance the loss values of the characters and the background. Where the number of pixels of the background class is Nneg and the category index value of the background class is 0, the weight is calculated according to the following formula six:

W(0) = (N − Nneg) / Nneg, and W(c) = 1 for c > 0 (formula six)
The difference between the predicted values and the values in the labeled license plate label information is calculated through the multitask loss function, the gradient is calculated according to the multitask loss function, and the parameters of the license plate recognition network model are trained and optimized.
It should be noted that, as will be understood by those skilled in the art, a gradient is a vector indicating that the directional derivative of a function at a given point takes its maximum value along the gradient direction; that is, at that point the function changes fastest along the direction of the gradient, and the maximum rate of change equals the modulus of the gradient.
In a possible implementation manner, the inputting the picture to be recognized into the license plate recognition model for license plate recognition, and recognizing to obtain license plate characters in the picture to be recognized includes: inputting the picture to be recognized into the license plate recognition model to obtain a license plate detection frame of the picture to be recognized; detecting a license plate detection frame of the picture to be recognized through a preset target detection network in the license plate recognition model to obtain a global character feature map of the picture to be recognized and a single character feature map of the picture to be recognized; calculating the average pixel value of each character region in the single character characteristic diagram of the picture to be recognized through a preset algorithm according to the license plate detection frame of the picture to be recognized, and generating a license plate character sequence; and obtaining license plate characters in the picture to be recognized according to the license plate character sequence.
For example, when a license plate is recognized, the picture to be recognized is input, candidate frames of license plate regions are obtained through the convolutional neural network and the license plate detection network, partially redundant frames are removed using the Non-Maximum Suppression (NMS) algorithm, and the retained license plate detection output frames are sent to the segmentation branch to generate a global character feature map and a single-character feature map. The contour of the text area is then computed on the global character feature map to directly obtain an accurately predicted license plate position bounding box, and a license plate character sequence is generated on the single-character feature map using a predetermined algorithm, such as a pixel voting algorithm. The pixel voting algorithm first binarizes the background feature map with a binarization threshold of 192 and obtains all license plate character regions from the connected regions in the binarized feature map; then, for each license plate character region, the average pixel value of that region in every character feature map, i.e. the character category probability of the region, is calculated, and the character category with the maximum average value is assigned to the region; the character category of each region is then read out in order, and the complete license plate characters are obtained according to the left-to-right reading and writing habit.
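The pixel voting step might be sketched like this. It is an illustrative simplification assuming NumPy score maps: the connected-region extraction is reduced to left-to-right column runs, whereas a real implementation would use a proper connected-component routine such as `scipy.ndimage.label` or `cv2.connectedComponents`:

```python
import numpy as np

def pixel_vote(background_map, char_maps, threshold=192):
    """Binarize the background feature map at `threshold`, then for each
    character region average every per-class score map over the region and
    assign the class whose average score is largest.
    background_map: (H, W) array; char_maps: sequence of (H, W) score maps,
    one per character class."""
    binary = background_map >= threshold
    # Split foreground columns into left-to-right runs as a stand-in for
    # connected components (assumes characters do not overlap horizontally).
    cols = np.where(binary.any(axis=0))[0]
    regions, run = [], ([int(cols[0])] if len(cols) else [])
    for c in cols[1:]:
        if int(c) == run[-1] + 1:
            run.append(int(c))
        else:
            regions.append(run)
            run = [int(c)]
    if run:
        regions.append(run)
    chars = []
    for run in regions:
        region = binary[:, run]                      # mask of this character
        means = [m[:, run][region].mean() for m in char_maps]
        chars.append(int(np.argmax(means)))          # class with max average
    return chars                                     # left-to-right sequence
```

Because the regions are collected in increasing column order, the returned class indices already follow the left-to-right reading habit mentioned in the text.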
The embodiment of the invention provides a license plate recognition device based on deep learning, which can realize the method embodiment provided above, and for specific function realization, reference is made to the description in the method embodiment, and details are not repeated here.
It should be understood that the specific order or hierarchy of steps in the processes disclosed is an example of exemplary approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the processes may be rearranged without departing from the scope of the present disclosure. The accompanying method claims present elements of the various steps in a sample order, and are not intended to be limited to the specific order or hierarchy presented.
In the foregoing detailed description, various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments of the subject matter require more features than are expressly recited in each claim. Rather, as the following claims reflect, invention lies in less than all features of a single disclosed embodiment. Thus, the following claims are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate preferred embodiment of the invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
What has been described above includes examples of one or more embodiments. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the aforementioned embodiments, but one of ordinary skill in the art may recognize that many further combinations and permutations of various embodiments are possible. Accordingly, the embodiments described herein are intended to embrace all such alterations, modifications and variations that fall within the scope of the appended claims. Furthermore, to the extent that the term "includes" is used in either the detailed description or the claims, such term is intended to be inclusive in a manner similar to the term "comprising" as "comprising" is interpreted when employed as a transitional word in a claim. Furthermore, any use of the term "or" in the specification of the claims is intended to mean a "non-exclusive or".
Those of skill in the art will further appreciate that the various illustrative logical blocks, units, and steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate the interchangeability of hardware and software, various illustrative components, elements, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design requirements of the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present embodiments.
The various illustrative logical blocks, or elements, described in connection with the embodiments disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor, an Application Specific Integrated Circuit (ASIC), a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a digital signal processor and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a digital signal processor core, or any other similar configuration.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may be stored in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. For example, a storage medium may be coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC, which may be located in a user terminal. In the alternative, the processor and the storage medium may reside in different components in a user terminal.
In one or more exemplary designs, the functions described above in connection with the embodiments of the invention may be implemented in hardware, software, firmware, or any combination of the three. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media include both computer storage media and communication media that facilitate transfer of a computer program from one place to another. Storage media may be any available media that can be accessed by a general purpose or special purpose computer. For example, such computer-readable media can include, but are not limited to, RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store program code in the form of instructions or data structures and which can be read by a general-purpose or special-purpose computer or processor. Additionally, any connection is properly termed a computer-readable medium; thus, if the software is transmitted from a website, server, or other remote source via a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wirelessly, e.g. by infrared, radio, or microwave, those media are included in the definition of computer-readable medium. Disks (disk) and discs (disc), as used herein, include compact discs, laser discs, optical discs, DVDs, floppy disks and Blu-ray discs, where disks usually reproduce data magnetically, while discs usually reproduce data optically with lasers. Combinations of the above may also be included in the computer-readable medium.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (16)

1. A license plate recognition method based on deep learning is characterized by comprising the following steps:
acquiring a plurality of image frames containing vehicle license plate information, extracting the characteristics of each image frame through a convolutional neural network, and fusing the extracted characteristics of each image frame to obtain the characteristic representation of each image frame;
detecting feature representation of each image frame through a preset target detection network to obtain category and position information of a license plate detection frame of each image frame, and segmenting the license plate of each image frame according to the category and position information to obtain a feature map of each segmented image frame;
acquiring original license plate label information in each image frame, and training to obtain a license plate recognition model according to the original license plate label information and the feature map of each image frame;
and inputting the picture to be recognized into the license plate recognition model for license plate recognition, and recognizing to obtain license plate characters in the picture to be recognized.
2. The method of claim 1, wherein the extracting features of each image frame through a convolutional neural network and fusing the extracted features of each image frame to obtain a feature representation of each image frame comprises:
adding a characteristic pyramid network structure in a convolutional neural network, and extracting high-level semantic characteristics of each image frame from different scales through the characteristic pyramid network structure;
and performing feature fusion on the extracted high-level semantic features of each image frame to obtain feature representation of each image frame.
3. The method according to claim 1 or 2, wherein the segmenting the license plate of each image frame according to the category and the position information to obtain the feature map of each segmented image frame comprises:
and performing a convolution operation a first preset number of times and a deconvolution operation a second preset number of times on each image frame of a first preset size, according to the category and position information of the license plate detection frame of each image frame, to extract a feature map of a second preset size corresponding to each image frame.
4. The method of claim 3, wherein before the step of training a license plate recognition model according to the original license plate label information and the feature map of the image frames, the method comprises:
and converting each original license plate label into a global character feature map aiming at the global characters of the license plate and a single character feature map aiming at the single characters of the license plate according to the information of each original license plate label.
5. The method of claim 4, wherein the training of the license plate recognition model according to the original license plate label information and the feature map of the image frames comprises:
calculating each global character feature map, each single character feature map and the feature map of each image frame through a preset loss function to obtain a calculation result;
and correcting the calculation result through the loss function based on the information of each original license plate label, and training to obtain a license plate recognition model.
6. The method of claim 5, wherein the predetermined loss function is a multitask loss function, the multitask loss function including a loss function in a license plate target detection stage and a loss function in a license plate segmentation stage;
the multitask loss function is determined by respectively setting weight coefficients of a loss function in a license plate target detection stage and a loss function in a license plate segmentation stage.
7. The method of claim 6, wherein the loss function of the license plate target detection stage comprises a position regression function and a classification loss function, and the loss function of the license plate target detection stage is determined by respectively setting weight coefficients of the position regression function and the classification loss function;
the loss functions of the license plate segmentation stage comprise a global license plate instance segmentation loss function and a semantic segmentation loss function representing each character, wherein the global license plate instance segmentation loss function is calculated by using a binary cross entropy loss function, and the semantic segmentation loss function is calculated by using a weighted loss function.
8. The method of claim 7, wherein the step of inputting the picture to be recognized into the license plate recognition model for license plate recognition to obtain license plate characters in the picture to be recognized comprises:
inputting the picture to be recognized into the license plate recognition model to obtain a license plate detection frame of the picture to be recognized;
detecting a license plate detection frame of the picture to be recognized through a preset target detection network in the license plate recognition model to obtain a global character feature map of the picture to be recognized and a single character feature map of the picture to be recognized;
calculating the average pixel value of each character region in the single character characteristic diagram of the picture to be recognized through a preset algorithm according to the license plate detection frame of the picture to be recognized, and generating a license plate character sequence;
and obtaining license plate characters in the picture to be recognized according to the license plate character sequence.
9. A license plate recognition device based on deep learning, characterized by comprising:
the extraction and fusion module is used for acquiring a plurality of image frames containing vehicle license plate information, extracting the characteristics of each image frame through a convolutional neural network, and fusing the extracted characteristics of each image frame to obtain the characteristic representation of each image frame;
the detection and segmentation module is used for detecting the feature representation of each image frame through a preset target detection network to obtain the category and position information of the license plate detection frame of each image frame, and segmenting the license plate of each image frame according to the category and position information to obtain the feature map of each segmented image frame;
the training module is used for acquiring original license plate label information in each image frame and training to obtain a license plate recognition model according to the original license plate label information and the feature map of each image frame;
and the recognition module is used for inputting the picture to be recognized into the license plate recognition model for license plate recognition, and recognizing license plate characters in the picture to be recognized.
10. The apparatus of claim 9, wherein the extraction and fusion module comprises:
the extracting unit is used for adding a characteristic pyramid network structure in the convolutional neural network and extracting high-level semantic features of each image frame from different scales through the characteristic pyramid network structure;
and the fusion unit is used for performing feature fusion on the extracted high-level semantic features of each image frame to obtain feature representation of each image frame.
11. The apparatus according to claim 9 or 10, wherein the detection and segmentation module comprises:
and the extraction unit is used for performing convolution operation for a first preset time and deconvolution operation for a second preset time on each image frame with a first preset size according to the category and the position information of the license plate detection frame of each image frame, and extracting and obtaining a feature map with a second preset size corresponding to each image frame.
12. The apparatus of claim 11, comprising:
and the conversion module is used for converting each original license plate label into a global character feature map aiming at the global characters of the license plate and a single character feature map aiming at the single characters of the license plate according to the information of each original license plate label.
13. The apparatus of claim 12, wherein the training module comprises:
the calculation unit is used for calculating each global character feature map, each single-character feature map and the feature map of each image frame through a preset loss function to obtain a calculation result;
and the training unit is used for correcting the calculation result through the loss function based on the original license plate label information, and training to obtain a license plate recognition model.
14. The apparatus of claim 13, wherein the predetermined loss function is a multitasking loss function, the multitasking loss function including a loss function in a license plate target detection stage and a loss function in a license plate segmentation stage;
the multitask loss function is determined by respectively setting weight coefficients of a loss function in a license plate target detection stage and a loss function in a license plate segmentation stage.
15. The apparatus of claim 13, wherein the loss function of the license plate target detection stage comprises a position regression function and a classification loss function, and the loss function of the license plate target detection stage is determined by setting weight coefficients of the position regression function and the classification loss function respectively;
the loss functions of the license plate segmentation stage comprise a global license plate instance segmentation loss function and a semantic segmentation loss function representing each character, wherein the global license plate instance segmentation loss function is calculated by using a binary cross entropy loss function, and the semantic segmentation loss function is calculated by using a weighted loss function.
16. Device according to claim 15, characterized in that the identification module is, in particular, adapted to
Inputting the picture to be recognized into the license plate recognition model to obtain a license plate detection frame of the picture to be recognized;
detecting a license plate detection frame of the picture to be recognized through a preset target detection network in the license plate recognition model to obtain a global character feature map of the picture to be recognized and a single character feature map of the picture to be recognized;
calculating the average pixel value of each character region in the single character characteristic diagram of the picture to be recognized through a preset algorithm according to the license plate detection frame of the picture to be recognized, and generating a license plate character sequence;
and obtaining license plate characters in the picture to be recognized according to the license plate character sequence.
CN202110775812.3A 2021-07-09 2021-07-09 License plate recognition method and device based on deep learning Pending CN113392837A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110775812.3A CN113392837A (en) 2021-07-09 2021-07-09 License plate recognition method and device based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110775812.3A CN113392837A (en) 2021-07-09 2021-07-09 License plate recognition method and device based on deep learning

Publications (1)

Publication Number Publication Date
CN113392837A true CN113392837A (en) 2021-09-14

Family

ID=77625542

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110775812.3A Pending CN113392837A (en) 2021-07-09 2021-07-09 License plate recognition method and device based on deep learning

Country Status (1)

Country Link
CN (1) CN113392837A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115376119A (en) * 2022-10-25 2022-11-22 珠海亿智电子科技有限公司 License plate recognition method and device, license plate recognition equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109145900A (en) * 2018-07-30 2019-01-04 中国科学技术大学苏州研究院 A kind of licence plate recognition method based on deep learning
WO2019076867A1 (en) * 2017-10-20 2019-04-25 Connaught Electronics Ltd. Semantic segmentation of an object in an image
CN110378331A (en) * 2019-06-10 2019-10-25 南京邮电大学 A kind of end-to-end Vehicle License Plate Recognition System and its method based on deep learning
CN111429484A (en) * 2020-03-31 2020-07-17 电子科技大学 Multi-target vehicle track real-time construction method based on traffic monitoring video
CN111767915A (en) * 2019-04-02 2020-10-13 顺丰科技有限公司 License plate detection method, device, equipment and storage medium
US20200372265A1 (en) * 2019-05-20 2020-11-26 Samsung Electronics Co., Ltd. Advanced driver assist system, method of calibrating the same, and method of detecting object in the same
CN112287941A (en) * 2020-11-26 2021-01-29 国际关系学院 License plate recognition method based on automatic character region perception


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115376119A (en) * 2022-10-25 2022-11-22 珠海亿智电子科技有限公司 License plate recognition method and device, license plate recognition equipment and storage medium
CN115376119B (en) * 2022-10-25 2023-04-14 珠海亿智电子科技有限公司 License plate recognition method and device, license plate recognition equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111325203B (en) American license plate recognition method and system based on image correction
CN111476284B (en) Image recognition model training and image recognition method and device and electronic equipment
CN110647829A (en) Bill text recognition method and system
CN110969129B (en) End-to-end tax bill text detection and recognition method
CN111950453A (en) Optional-shape text recognition method based on selective attention mechanism
CN112232371B (en) American license plate recognition method based on YOLOv3 and text recognition
CN113723377B (en) Traffic sign detection method based on LD-SSD network
CN111476210B (en) Image-based text recognition method, system, device and storage medium
CN112580507B (en) Deep learning text character detection method based on image moment correction
CN110688902B (en) Method and device for detecting vehicle area in parking space
Wan et al. A novel neural network model for traffic sign detection and recognition under extreme conditions
CN111523439B (en) Method, system, device and medium for target detection based on deep learning
CN112052845A (en) Image recognition method, device, equipment and storage medium
CN116311310A (en) Universal form identification method and device combining semantic segmentation and sequence prediction
CN113158977A (en) Image character editing method for improving FANnet generation network
CN115116074A (en) Handwritten character recognition and model training method and device
CN110705535A (en) Method for automatically detecting test paper layout character line
CN114898243A (en) Traffic scene analysis method and device based on video stream
CN113392837A (en) License plate recognition method and device based on deep learning
CN115346206B (en) License plate detection method based on improved super-resolution deep convolution feature recognition
KR102026280B1 (en) Method and system for scene text detection using deep learning
Vidhyalakshmi et al. Text detection in natural images with hybrid stroke feature transform and high performance deep Convnet computing
CN116071557A (en) Long tail target detection method, computer readable storage medium and driving device
CN115953744A (en) Vehicle identification tracking method based on deep learning
CN115909241A (en) Lane line detection method, system, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination