WO2021135254A1 - License plate number recognition method and apparatus, electronic device, and storage medium - Google Patents

License plate number recognition method and apparatus, electronic device, and storage medium

Info

Publication number
WO2021135254A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
image
decoding
trained
space
Prior art date
Application number
PCT/CN2020/108989
Other languages
French (fr)
Chinese (zh)
Inventor
曾卓熙
Original Assignee
深圳云天励飞技术股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳云天励飞技术股份有限公司
Publication of WO2021135254A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/625 License plates

Definitions

  • The present invention relates to the field of artificial intelligence technology, and in particular to a method, device, electronic equipment and storage medium for recognizing a license plate number.
  • Image recognition is currently one of the commonly used technologies in traffic, community, and parking lot management. For example, license plate number recognition based on image recognition is used to identify a vehicle's license plate number.
  • Traditional license plate number recognition is generally divided into multiple independent steps, such as:
  • 1. Image normalization: the license plate image is transformed into a "normalized image" through computer vision methods (such as a homography).
  • 2. Image preprocessing: occlusion, dirt, lighting, and similar defects in the image are handled (for example, by binarization).
  • 3. Character segmentation: characters are segmented through computer vision methods (such as edge detection).
  • 4. Character recognition: the segmented characters are recognized (using random forests, support vector machines (SVM), logistic regression, or other machine learning or deep learning methods).
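  • The embodiment of the present invention provides a method for recognizing a license plate number, which can improve the robustness of license plate number recognition.
  • An embodiment of the present invention provides a method for recognizing a license plate number, including:
  • inputting the image to be recognized into a preset feature encoding space for correction and encoding to obtain a feature image with multiple channels, where the image to be recognized includes license plate information, the feature image includes a plurality of feature regions corresponding to the multiple channels, and the channels have a time sequence attribute;
  • inputting the feature image into a preset feature decoding space according to the time sequence attribute, and decoding the feature regions corresponding to the channels according to the time sequence attribute through an attention mechanism;
  • outputting the decoding result according to the time sequence attribute to obtain the recognition result of the image to be recognized.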
  • The preset feature encoding space includes a pre-trained spatial transformation network and a pre-trained encoding network, and inputting the image to be recognized into the preset feature encoding space for correction and encoding to obtain a feature image with multiple channels includes:
  • the number of channels is the same as the number of convolution kernels, and the time sequence attributes of the channels are associated with the calculation order of the convolution kernels.
  • The preset feature decoding space includes a pre-trained attention mechanism and a pre-trained long short-term memory network, and inputting the feature image into the preset feature decoding space according to the time sequence attribute and decoding the feature regions corresponding to the channels according to the time sequence attribute through the attention mechanism includes:
  • sorting the feature regions corresponding to the channels according to the time sequence attribute, and notifying the pre-trained long short-term memory network, according to the sorting, to decode the feature regions in the sorted order.
  • Notifying the pre-trained long short-term memory network according to the sorting to sequentially decode the feature regions includes:
  • while the first feature region is being decoded, the pre-trained attention mechanism outputs a second attention parameter according to the sorting, and the second attention parameter includes the position of the second feature region;
  • after the first feature region is decoded, the second attention parameter notifies the pre-trained long short-term memory network of the position of the second feature region, so that the pre-trained long short-term memory network decodes the second feature region.
  • Decoding the second feature region by the pre-trained long short-term memory network after the decoding of the first feature region is completed includes:
  • taking the decoded features of the first feature region and the second feature region as inputs to the pre-trained long short-term memory network for decoding.
  • The method further includes:
  • annotating, according to the time sequence attribute of the channels, the feature region to which each pixel in the up-sampled feature image belongs, so that the feature region to which each pixel belongs has a time sequence attribute, and obtaining the annotated feature image;
  • Inputting the feature image into the preset feature decoding space according to the time sequence attribute, and decoding the feature regions corresponding to the channels according to the time sequence attribute through the attention mechanism, includes:
  • An embodiment of the present invention provides a license plate number recognition device, including:
  • an encoding module, used to input the image to be recognized into a preset feature encoding space for correction and encoding to obtain a feature image with multiple channels, where the image to be recognized includes license plate information, the feature image includes a plurality of feature regions corresponding to the multiple channels, and the channels have a time sequence attribute;
  • a decoding module, used to input the feature image into a preset feature decoding space according to the time sequence attribute, where the feature regions in the feature image are decoded in sequence according to the time sequence attribute through the attention mechanism;
  • an output module, used to output the decoding result according to the time sequence attribute to obtain the recognition result of the image to be recognized.
  • An embodiment of the present invention provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the steps of the method for recognizing a license plate number provided by the embodiment of the present invention are implemented.
  • The image to be recognized is input into a preset feature encoding space for correction and encoding, and a feature image with multiple channels is obtained. The image to be recognized includes license plate information, the feature image includes a plurality of feature regions corresponding to the multiple channels, and the channels have a time sequence attribute. The feature image is input into the preset feature decoding space according to the time sequence attribute, and the feature regions corresponding to the channels are decoded according to the time sequence attribute through the attention mechanism. The decoding result is output according to the time sequence attribute, and the recognition result of the image to be recognized is obtained.
  • FIG. 1 is a flowchart of a method for recognizing a license plate number provided by an embodiment of the present invention
  • FIG. 2 is a flowchart of another method for recognizing a license plate number provided by an embodiment of the present invention
  • FIG. 3 is a flowchart of another method for recognizing a license plate number provided by an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of the structure of a license plate number recognition device provided by an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of another vehicle license plate number recognition device provided by an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of another vehicle license plate number recognition device provided by an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of another vehicle license plate number recognition device provided by an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of another vehicle license plate number recognition device provided by an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention.
  • FIG. 1 is a flowchart of a method for recognizing a license plate number according to an embodiment of the present invention. As shown in FIG. 1, it includes the following steps:
  • The above-mentioned image to be recognized includes license plate information, the above-mentioned feature image includes a plurality of feature regions corresponding to a plurality of channels, and the above-mentioned channels have time sequence attributes.
  • The above-mentioned image to be recognized can be a static image of a vehicle license plate or an image frame of a dynamic video uploaded by the user, or a static image or an image frame of a dynamic video of a vehicle license plate captured by a camera deployed on a traffic road, at the entrance or exit of a community, or at the entrance or exit of a parking lot.
  • The image to be recognized may contain one or more pieces of license plate information; that is, there may be one or more license plate numbers to be recognized in a single image.
  • The aforementioned feature encoding space may be a fully convolutional network space. The fully convolutional network space can use convolution to predict the correction parameters of the image to be recognized and correct the image according to the predicted parameters.
  • The fully convolutional network space can also use convolution to predict the feature region corresponding to each character in the license plate information.
  • The above correction can be understood as spatially transforming and aligning the image to be recognized, and may include translation, scaling, and rotation of the image.
  • The above-mentioned feature region is determined by the channels of the fully convolutional network, where a channel is an output channel obtained after convolution. Specifically, the feature region is determined by the channel values of the channels.
  • A convolution kernel performs convolution on the image to be recognized to extract the corresponding features, and one convolution kernel corresponds to one channel.
  • For example, the parameters of the license plate image to be recognized are (3, W, H), where W and H are the width and height of the image and 3 is its number of RGB color channels. When the three RGB channels are convolved with one convolution kernel, the output is a single channel obtained by summing the corresponding channel values of the three RGB channels.
  • The channel after the addition is (R1+G1+B1, R2+G2+B2, R3+G3+B3, ..., Rn+Gn+Bn), so it can be considered that one convolution kernel corresponds to one channel.
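  • The summation over the R, G, and B planes can be sketched as follows. This is a minimal illustration under assumptions, not the patented network: the nested-list image layout, the function names `conv_single_kernel` and `conv_layer`, and the toy kernels are all hypothetical, and the operation shown is a plain "valid" cross-correlation without stride, padding, or bias.

```python
def conv_single_kernel(image, kernel):
    """Valid cross-correlation of a 3-plane image with one 3-plane kernel.

    image:  list of 3 planes (R, G, B), each H x W  (hypothetical layout)
    kernel: list of 3 planes, each k x k
    The per-plane responses are summed into a single output channel.
    """
    h, w = len(image[0]), len(image[0][0])
    k = len(kernel[0])
    out = [[0.0] * (w - k + 1) for _ in range(h - k + 1)]
    for y in range(h - k + 1):
        for x in range(w - k + 1):
            total = 0.0
            for c in range(3):  # R, G, B planes
                for dy in range(k):
                    for dx in range(k):
                        total += image[c][y + dy][x + dx] * kernel[c][dy][dx]
            out[y][x] = total  # R_i + G_i + B_i at this position
    return out


def conv_layer(image, kernels):
    """One output channel per kernel, in kernel order."""
    return [conv_single_kernel(image, kern) for kern in kernels]


# Toy 3 x 4 x 5 image and two 3 x 2 x 2 kernels (all ones, all zeros).
image = [[[float(c + y + x) for x in range(5)] for y in range(4)] for c in range(3)]
ones = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(3)]
zeros = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(3)]
channels = conv_layer(image, [ones, zeros])
print(len(channels), len(channels[0]), len(channels[0][0]))  # 2 3 4
```

  • Two kernels yield two output channels, which is the sense in which one convolution kernel corresponds to one channel.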
  • Different feature regions are determined by the channel values. For example, at a given feature point, the channel with the largest channel value indicates that the feature point belongs to the feature region corresponding to that channel. Taking the license plate as an example, an ordinary car license plate is composed of 7 characters.
  • Each character forms a feature region, which corresponds to one channel and can also be called a character region.
  • Each character region is represented by one channel.
  • Different channels represent different character regions, and the character region to which a feature point belongs is the one corresponding to the channel with the largest channel value at that feature point.
  • The feature region of each feature point can be determined by traversing the feature points and finding the maximum channel value at each. Since the license plate number is composed of multiple characters, the feature image output by the feature encoding space must cover the feature regions of all of those characters, so the output of the feature encoding space is a feature image with as many channels as there are characters.
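  • The traversal described above amounts to a per-point argmax over channels. The sketch below assumes a feature image stored as a list of channels, each an H x W nested list; the name `assign_regions` and the toy values are hypothetical.

```python
def assign_regions(feature):
    """Map every feature point to the index of its strongest channel.

    feature: list of C channels, each an H x W nested list (assumed layout).
    Returns an H x W map of channel indices, i.e. character-region labels.
    """
    n_ch = len(feature)
    h, w = len(feature[0]), len(feature[0][0])
    labels = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            labels[y][x] = max(range(n_ch), key=lambda c: feature[c][y][x])
    return labels


feature = [
    [[0.9, 0.8, 0.1, 0.0]],  # channel 0
    [[0.1, 0.2, 0.7, 0.2]],  # channel 1
    [[0.0, 0.0, 0.2, 0.8]],  # channel 2
]
region_map = assign_regions(feature)
print(region_map)  # [[0, 0, 1, 2]]
```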
  • The above-mentioned multiple channels have time sequence attributes, which are determined by the order in which the convolution kernels are computed during encoding. For example, the first channel is obtained after the first convolution kernel performs its convolution, and the second channel is obtained after the second convolution kernel performs its convolution. The calculation order of the convolution kernels therefore gives the channels a time sequence attribute.
  • The feature coding of the image to be recognized is a feature extraction process.
  • The correction of the image to be recognized is a predictive correction, and the quality of the correction is positively correlated with the completeness of the training data.
  • The feature image is the feature image obtained by encoding in the feature encoding space in step 101.
  • The feature image includes as many channels as there are license plate characters. Each channel corresponds to a different feature region, which can also be understood as each channel corresponding to a different character region. Inputting the feature image into the preset feature decoding space according to the time sequence attribute means inputting the channels of the feature image into the feature decoding space in the order given by the time sequence attribute.
  • The feature decoding space decodes the feature region corresponding to each channel in turn to obtain the character represented by that feature region.
  • The above-mentioned feature decoding space may be a neural network based on time sequence, such as a recurrent neural network (RNN) or a long short-term memory network (LSTM).
  • The above-mentioned neural network based on time sequence can make predictions based on the relationship between the previous character and the next character.
  • For example, when the previous character is the Chinese character "Zhe", the probability that the next character is a letter is 100%. That is, when the previous character is a Chinese character, the decoder does not need to consider Chinese characters or digits for the next character and decodes only among the 24 letter classes. In other words, the decoding of each subsequent character depends on the preceding characters.
  • The decoded characters are related to the structure of the license plate number.
  • The license plate includes three parts: the first part is the abbreviation of the province, autonomous region, or municipality; the second part is the code of the licensing authority; and the third part is the serial number.
  • For example, the first part is "Zhe", the second part is "J", and the third part is "L9098".
  • The first part is the abbreviation of a province, autonomous region, or municipality directly under the Central Government, with 31 corresponding Chinese characters.
  • The second part is the code of the licensing authority, which is represented by an uppercase letter.
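  • The three-part structure can be written down as a small position-to-class rule. The sketch below is illustrative only: the romanized token "Zhe" stands in for the Chinese character as in the text, the 24-letter alphabet is assumed to be A to Z without I and O (the text gives only the count), and the helper names `allowed_classes`, `classify`, and `follows_structure` are hypothetical.

```python
import string

# Assumption: the 24 plate letters are A-Z without I and O.
LETTERS = set(string.ascii_uppercase) - {"I", "O"}
DIGITS = set(string.digits)


def allowed_classes(position):
    """Character classes permitted at a 0-based position of an ordinary plate."""
    if position == 0:
        return ("province",)        # part 1: province abbreviation
    if position == 1:
        return ("letter",)          # part 2: licensing-authority code
    return ("letter", "digit")      # part 3: serial number


def classify(token, provinces):
    """Return the class of a decoded token."""
    if token in provinces:
        return "province"
    if token in LETTERS:
        return "letter"
    if token in DIGITS:
        return "digit"
    return "unknown"


def follows_structure(tokens, provinces):
    """True if every token's class is allowed at its position."""
    return all(classify(t, provinces) in allowed_classes(i)
               for i, t in enumerate(tokens))


provinces = {"Zhe"}  # stand-in for the 31 province abbreviations
print(len(LETTERS), len(DIGITS))                                          # 24 10
print(follows_structure(["Zhe", "J", "L", "9", "0", "9", "8"], provinces))  # True
print(follows_structure(["J", "Zhe", "L", "9", "0", "9", "8"], provinces))  # False
```

  • With 31 province characters, 24 letters, and 10 digits, the three classes together cover the 65 characters mentioned for the training data set.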
  • The aforementioned attention mechanism may be a channel attention module (Attention Refinement Module, abbreviated as ARM).
  • The channel attention module can assign a corresponding attention parameter to the feature region of each channel, where the attention parameter is the position of the feature region within its channel. For example, on the channel containing the feature region of the "Zhe" character, the channel value of each feature point in that region is greater than the values of the other channels at the same point. The position of the feature region corresponding to the "Zhe" character is then used as the attention parameter.
  • The feature decoding space is notified to decode at that position according to the attention parameter.
  • The attention mechanism can also act directly on the two-dimensional spatial positions of the feature regions in the feature image. The two-dimensional spatial position of the feature region of each character is calculated from the height and width of the feature image. The attention mechanism assigns attention parameters to these positions in order from top to bottom and from left to right. When decoding begins, the feature decoding space is notified to decode the feature regions in sequence according to the attention parameters.
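  • One simple way to obtain a top-to-bottom, left-to-right order is to sort the regions by centroid. This is a sketch under assumptions rather than the attention mechanism itself: it takes an already computed region map (one integer label per pixel), uses the mean pixel position as the two-dimensional position of a region, and the names `region_centroids` and `reading_order` are hypothetical.

```python
def region_centroids(region_map):
    """Mean (row, col) of the pixels that carry each label."""
    sums = {}
    for y, row in enumerate(region_map):
        for x, label in enumerate(row):
            sy, sx, n = sums.get(label, (0, 0, 0))
            sums[label] = (sy + y, sx + x, n + 1)
    return {label: (sy / n, sx / n) for label, (sy, sx, n) in sums.items()}


def reading_order(region_map):
    """Labels sorted top to bottom, then left to right, by centroid."""
    centroids = region_centroids(region_map)
    return sorted(centroids, key=lambda label: centroids[label])


# Labels are deliberately not in spatial order.
region_map = [
    [2, 2, 0, 0, 1, 1],
    [2, 2, 0, 0, 1, 1],
]
print(reading_order(region_map))  # [2, 0, 1]
```

  • Sorting on the exact centroid row works for this single-row toy case; a real plate image would likely need rows grouped with some tolerance before the left-to-right sort.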
  • The above decoding result is the characters corresponding to the license plate information in the image to be recognized. Because decoding in the feature decoding space follows the time sequence attribute, the decoded characters also carry that attribute, and they are output in that order so that the character ordering of the license plate number is preserved.
  • The image to be recognized is input into a preset feature encoding space for correction and encoding, and a feature image with multiple channels is obtained. The image to be recognized includes license plate information, the feature image includes a plurality of feature regions corresponding to the multiple channels, and the channels have a time sequence attribute. The feature image is input into the preset feature decoding space according to the time sequence attribute, and the feature regions corresponding to the channels are decoded according to the time sequence attribute through the attention mechanism. The decoding result is output according to the time sequence attribute, and the recognition result of the image to be recognized is obtained.
  • The aforementioned pre-trained spatial transformation network may be an STN (Spatial Transformer Network).
  • The spatial transformation network and the coding network can together form a fully convolutional neural network, so that the feature coding space is a fully convolutional neural network.
  • The parameters differ according to the form of transformation applied to the image to be recognized. For example, for a 2D affine transformation, the parameters are output as a 6-dimensional (2x3) vector.
  • The corresponding spatial transformation function is generated from the parameters, and the image to be recognized is transformed by this function into the image expected by the coding network.
  • The image to be recognized is processed by three parts: the Localisation net (localization network), the Grid generator (grid generation), and the Sampler (sampled output).
  • The Localisation net determines the transformation parameter θ required for the input.
  • The Grid generator uses θ and the defined transformation method to find the mapping T(θ) between output and input features.
  • The Sampler combines the position mapping and the transformation parameters to select the input features and outputs them using bilinear interpolation sampling, so that the image to be recognized is transformed into the image expected by the coding network.
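  • The grid generation and bilinear sampling steps can be sketched for a 2x3 affine θ as follows. This is an illustrative sketch under assumptions: θ is chosen by hand rather than predicted by a Localisation net, coordinates are normalized to [-1, 1] with the corner pixels at the extremes, samples outside the image read as zero, and the names `affine_grid` and `bilinear_sample` are hypothetical.

```python
import math


def affine_grid(theta, out_h, out_w):
    """Grid generator: source coordinates in [-1, 1] for each output pixel."""
    grid = []
    for i in range(out_h):
        yt = -1.0 + 2.0 * i / (out_h - 1) if out_h > 1 else 0.0
        row = []
        for j in range(out_w):
            xt = -1.0 + 2.0 * j / (out_w - 1) if out_w > 1 else 0.0
            xs = theta[0][0] * xt + theta[0][1] * yt + theta[0][2]
            ys = theta[1][0] * xt + theta[1][1] * yt + theta[1][2]
            row.append((xs, ys))
        grid.append(row)
    return grid


def bilinear_sample(image, grid):
    """Sampler: bilinear interpolation of a single-channel H x W image."""
    h, w = len(image), len(image[0])

    def pixel(y, x):
        # Zero padding outside the image (an assumption of this sketch).
        return image[y][x] if 0 <= y < h and 0 <= x < w else 0.0

    out = []
    for row in grid:
        out_row = []
        for xs, ys in row:
            px = (xs + 1.0) * (w - 1) / 2.0   # back to pixel coordinates
            py = (ys + 1.0) * (h - 1) / 2.0
            x0, y0 = math.floor(px), math.floor(py)
            fx, fy = px - x0, py - y0
            out_row.append(
                pixel(y0, x0) * (1 - fx) * (1 - fy)
                + pixel(y0, x0 + 1) * fx * (1 - fy)
                + pixel(y0 + 1, x0) * (1 - fx) * fy
                + pixel(y0 + 1, x0 + 1) * fx * fy
            )
        out.append(out_row)
    return out


image = [[float(4 * y + x) for x in range(4)] for y in range(3)]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
mirror = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # horizontal flip
same = bilinear_sample(image, affine_grid(identity, 3, 4))
mirrored = bilinear_sample(image, affine_grid(mirror, 3, 4))
```

  • With the identity θ the output should reproduce the input up to floating-point rounding, and the mirror θ should reverse each row; translation, scaling, and rotation correspond to other choices of the six values.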
  • The STN spatial transformation network is a trainable spatial transformation, so it can adaptively learn spatial transformations for different data through training. Moreover, the STN can not only spatially transform the input but can also be inserted into any layer of the coding network as a network module to spatially transform different feature images. As a result, the coding network can learn invariance to translation, scaling, rotation, and more general distortions, which improves the robustness of its feature coding.
  • The number of the aforementioned channels is the same as the number of convolution kernels, and the time sequence attributes of the channels are related to the calculation order of the convolution kernels.
  • The above-mentioned corrected image is the image to be recognized after correction by the spatial transformation network in the feature coding space.
  • The above-mentioned pre-trained coding network may be a convolutional neural network, which is used to extract the feature region where each character in the license plate information is located.
  • The coding network has multiple calculation layers, and a spatial transformation network can be set between every two calculation layers to spatially transform the channels calculated by the previous layer so that they meet the input expectation of the next layer. Correcting the input of each calculation layer reduces the accumulation of errors and thereby improves recognition accuracy.
  • The coding network is obtained by training on a data set of character images.
  • The data set can be composed of 31 Chinese characters, 24 letters, and 10 digits, a total of 65 characters, with each character represented by multiple images taken under different conditions.
  • The coding network is trained on the data set so that it learns to encode the feature region where each character is located.
  • The weight parameters of the convolution kernels in the coding network are trained so that, when the coding network performs convolution on the image to be recognized, each convolution kernel yields a corresponding channel.
  • That channel corresponds to the feature region to which a character belongs.
  • The encoding network may be a fully convolutional neural network.
  • A fully convolutional network can accept input images of any size, so the image to be recognized does not need to be resized. It uses a deconvolution layer to up-sample the feature image of the last convolutional layer to the size of the input image, so that a prediction can be generated for each pixel while the spatial information of the original input image is retained.
  • The output feature image therefore carries the same spatial information as the image to be recognized; that is, the position of each extracted feature region can be characterized by the distribution of pixels in the spatial information of the image to be recognized, and the pixels of the up-sampled feature image can be traversed and classified.
  • This traversal classification is based on the channels of the feature image: each pixel is assigned to the channel with the highest channel value at that pixel, and thus to the feature region corresponding to that channel.
  • The feature region corresponding to each channel can be labeled so that each pixel of the up-sampled feature image carries its channel's label. This is equivalent to labeling the feature regions in the up-sampled feature image, which gives the feature regions time sequence attributes. In this case, the corresponding channels need not be retained.
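  • The up-sample-then-label step can be sketched as below. Two substitutions are made for brevity and should be read as assumptions: nearest-neighbor repetition stands in for the learned deconvolution layer, and the channel index is used directly as the time sequence label. The names `upsample_nearest` and `label_pixels` are hypothetical.

```python
def upsample_nearest(channel, factor):
    """Repeat every value `factor` times along both axes."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(factor)]
        for _ in range(factor):
            out.append(list(wide))
    return out


def label_pixels(feature, factor):
    """Up-sample each channel, then label every pixel with its strongest channel.

    The returned map carries only labels, so the channels themselves
    no longer need to be kept.
    """
    upsampled = [upsample_nearest(ch, factor) for ch in feature]
    h, w = len(upsampled[0]), len(upsampled[0][0])
    labels = []
    for y in range(h):
        row = []
        for x in range(w):
            row.append(max(range(len(upsampled)),
                           key=lambda c: upsampled[c][y][x]))
        labels.append(row)
    return labels


feature = [
    [[0.9, 0.2]],  # channel 0, computed first
    [[0.1, 0.8]],  # channel 1, computed second
]
labels = label_pixels(feature, 2)
print(labels)  # [[0, 0, 1, 1], [0, 0, 1, 1]]
```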
  • The aforementioned attention mechanism then indicates each feature region and its label in the up-sampled feature image, so that the feature regions are decoded in the feature decoding space according to the time sequence attribute.
  • The time sequence attribute of each channel is reported to the pre-trained attention mechanism.
  • The attention mechanism obtains the channel sequence a according to the time sequence attributes of the channels and calculates the weight α_{t,i} of each channel a_i at the current time t by a formula in which f_att is the attention perception function, a_i is the current input vector, h_{t-1} is the decoding state at the previous moment, and L is the number of channels.
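  • The formula itself does not survive in this text. A common form that is consistent with the symbols defined above, offered here only as an assumption, is softmax attention: e_{t,i} = f_att(a_i, h_{t-1}) and α_{t,i} = exp(e_{t,i}) / Σ_{k=1..L} exp(e_{t,k}). The sketch below computes such weights; the dot-product scoring function `dot_score` is a hypothetical stand-in for the trained f_att.

```python
import math


def attention_weights(channels, prev_state, f_att):
    """Softmax over the L attention scores e_{t,i} = f_att(a_i, h_{t-1})."""
    scores = [f_att(a_i, prev_state) for a_i in channels]
    peak = max(scores)                       # subtract the max for stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_score(a_i, h_prev):
    """Hypothetical f_att: a plain dot product (the real one is learned)."""
    return sum(a * h for a, h in zip(a_i, h_prev))


channels = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # a_1 .. a_L with L = 3
h_prev = [2.0, 0.0]                              # h_{t-1}
weights = attention_weights(channels, h_prev, dot_score)
print([round(w, 3) for w in weights])
```

  • The weights are non-negative and sum to 1, and the channel whose vector best matches the previous decoding state receives the largest weight.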
  • When the aforementioned long short-term memory network decodes the feature region corresponding to the current channel, it obtains the position of the next feature region to be decoded from the output of the attention mechanism.
  • The pre-trained attention mechanism outputs the first attention parameter according to the sorting, and the first attention parameter notifies the pre-trained long short-term memory network to decode the first feature region.
  • While the first feature region is being decoded, the pre-trained attention mechanism outputs the second attention parameter according to the sorting; the second attention parameter includes the position of the second feature region. After the first feature region is decoded, the second attention parameter directs the pre-trained long short-term memory network to the position of the second feature region, so that the network decodes the second feature region.
  • When the second feature region is decoded, the decoded features of the first feature region and the second feature region are, according to the second attention parameter, taken as input to the pre-trained long short-term memory network for decoding. Decoding loops in this way until all feature regions have been decoded in turn.
  • Because the spatial transformation network and the coding network are deployed in the feature encoding space, and the attention mechanism and the long short-term memory network are deployed in the feature decoding space, end-to-end training can be realized; that is, the feature encoding space and the feature decoding space are trained together. Therefore, the image to be recognized does not need to be preprocessed before it is input.
  • The above decoding result is the characters corresponding to the license plate information in the image to be recognized. Because decoding in the feature decoding space follows the time sequence attribute, the decoded characters also carry that attribute, and they are output in that order so that the character ordering of the license plate number is preserved.
  • In this embodiment, the image to be recognized is corrected by the spatial transformation network in the feature encoding space and encoded by the encoding network, and the feature regions are decoded in time sequence in the feature decoding space.
  • This end-to-end form avoids the accumulation of errors across the multiple steps of image preprocessing and improves the robustness of license plate number recognition. Moreover, the entire training process and recognition process pass only through the encoding space and the decoding space, realizing end-to-end license plate number recognition.
  • FIG. 3 is a flowchart of another method for recognizing a license plate number provided by an embodiment of the present invention. The model is composed of an encoder and a decoder: the STN layer is deployed in the encoder to rectify the image to be recognized, a convolutional neural network performs feature extraction, and the decoder combines a long short-term memory network with an attention mechanism.
  • For example, the license plate in the image to be recognized reads "Zhe J·L9098". The input includes image parameters such as the color channels (3, RGB), width (W), and height (H), and the feature image is obtained after feature encoding in the encoding space.
  • The feature regions corresponding to the channels in the feature image are the first feature region, corresponding to a Chinese character; the second feature region, corresponding to a letter; and the third feature region, corresponding to a letter or digit.
  • The attention mechanism sorts the feature regions corresponding to the channels according to the time sequence attribute and indicates each of them in turn.
  • The attention mechanism outputs the first attention parameter, which is composed of the start instruction <start> and the position of the first feature region. The long short-term memory network in the decoder decodes which of the 31 Chinese characters the first feature region belongs to, and the decoding result is "Zhe".
  • The current decoding state is then saved, and the attention mechanism outputs the second attention parameter, which is composed of the previous decoding state and the position of the second feature region. The decoding state of the first feature region and the second feature region are input into the long short-term memory network of the decoder. Because the previous decoding state is a Chinese character state, and under ordinary car license plate rules the probability of a Chinese character being followed by a letter is 100%, the long short-term memory network decodes which of the 24 letters the second feature region belongs to, and the decoding result is "J".
  • The current decoding state is saved again, and the attention mechanism outputs the third attention parameter, which is composed of the previous decoding state and the position of the third feature region. At h3, the decoding state of the second feature region and the third feature region are input into the long short-term memory network of the decoder. Because the previous decoding state is a letter state, and under ordinary car license plate rules the probability of a letter being followed by a Chinese character is 0%, the long short-term memory network decodes which of the 24 letters and 10 digits the third feature region belongs to, and the decoding result is "L".
  • The current decoding state is saved, and the attention mechanism outputs the fourth attention parameter. This continues until the long short-term memory network outputs <end>, at which point recognition is considered complete and the decoding result is output.
  • the decoder combines a long short-term memory (LSTM) network with an attention mechanism, so the encoder + decoder has the characteristics of a deep neural network, and the whole encoder + decoder model can be trained in a data-driven way with deep learning methods. The more complete the training data, the more scenes can be recognized, which improves the robustness of the model.
  • because the encoder + decoder is an end-to-end model, the image does not need to be preprocessed, which improves the speed of license plate number recognition. Because there are no separate preprocessing steps, errors do not accumulate, which improves the recognition accuracy of the license plate number.
  • FIG. 4 is a schematic structural diagram of a license plate number recognition device provided by an embodiment of the present invention. As shown in FIG. 4, it includes:
  • the encoding module 401 is configured to input the image to be recognized into a preset feature encoding space for correction and encoding to obtain a feature image with multiple channels, where the image to be recognized includes license plate information, the feature image includes multiple feature regions corresponding to the multiple channels, and the channels have a time sequence attribute;
  • the decoding module 402 is configured to input the feature image into a preset feature decoding space according to the time sequence attribute and, in the feature decoding space, use an attention mechanism to decode the feature regions in the feature image sequentially according to the time sequence attribute;
  • the output module 403 is configured to output the decoding result according to the time sequence attribute to obtain the recognition result of the image to be recognized.
  • the preset feature encoding space includes a pre-trained spatial transformation network and a pre-trained encoding network.
  • the encoding module 401 includes:
  • the correction unit 4011 is configured to perform correction prediction on the image to be recognized in the pre-trained spatial transformation network, and correct the image to be recognized according to the prediction result to obtain a corrected image;
  • the encoding unit 4012 is configured to input the corrected image into the pre-trained encoding network, and perform convolution calculation on the corrected image through multiple convolution kernels in the encoding network to obtain a feature image with multiple channels.
  • the preset feature decoding space includes a pre-trained attention mechanism and a pre-trained long short-term memory (LSTM) network.
  • the decoding module 402 includes:
  • the attention unit 4021 is configured to report the time sequence attribute of each channel to the pre-trained attention mechanism when the feature image is input into the feature decoding space according to the time sequence attribute;
  • the decoding unit 4022 is configured to sort the feature regions corresponding to the channels according to the time sequence attribute through the pre-trained attention mechanism, and notify the pre-trained LSTM network, in that order, to sequentially decode the feature regions corresponding to the order.
  • the decoding unit 4022 includes:
  • the first decoding subunit 40221 is configured to output a first attention parameter in that order through the pre-trained attention mechanism, and notify, through the first attention parameter, the pre-trained LSTM network to decode the first feature region;
  • the output subunit 40222 is configured to output, by the pre-trained attention mechanism, a second attention parameter in that order while the first feature region is being decoded, where the second attention parameter includes the position of the second feature region;
  • the second decoding subunit 40223 is configured to notify, through the second attention parameter, the pre-trained LSTM network to decode the second feature region after the decoding of the first feature region is completed;
  • the loop subunit 40224 is configured to repeat the decoding until all feature regions have been decoded in sequence.
  • the second decoding subunit 40223 is further configured to, after the first feature region is decoded, take the decoded feature of the first feature region and the second feature region as inputs according to the second attention parameter, and input them into the pre-trained LSTM network for decoding.
  • the device further includes:
  • the up-sampling module 404 is configured to up-sample the feature image so that the size of the feature image is the same as the size of the image to be recognized;
  • the prediction module 405 is configured to perform pixel prediction on the up-sampled feature image according to the channels of the feature image, and predict the feature region to which each pixel in the up-sampled feature image belongs;
  • the labeling module 406 is configured to label, according to the time sequence attribute of the channels, the feature region to which each pixel in the up-sampled feature image belongs, so that the feature region to which each pixel belongs has a time sequence attribute, and a labeled feature image is obtained;
  • the decoding module 402 is further configured to input the labeled feature image into a preset feature decoding space according to the time sequence attribute, and use an attention mechanism to decode the feature regions in the feature decoding space according to the time sequence attribute.
  • the license plate number recognition device provided in the embodiment of the present invention can be applied to mobile phones, monitors, computers, servers and other devices that need to perform license plate number recognition.
  • the license plate number recognition device provided by the embodiment of the present invention can realize each process realized by the license plate number recognition method in the foregoing method embodiment, and can achieve the same beneficial effects. To avoid repetition, details are not repeated here.
  • FIG. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention. As shown in FIG. 9, it includes: a memory 902, a processor 901, and a computer program stored in the memory 902 and executable on the processor 901, wherein:
  • the processor 901 is configured to call a computer program stored in the memory 902, and execute the following steps:
  • the image to be recognized includes license plate information, the feature image includes multiple feature regions corresponding to the multiple channels, and the channels have a time sequence attribute;
  • the decoding result is output according to the time sequence attribute, and the recognition result of the image to be recognized is obtained.
  • the preset feature encoding space includes a pre-trained spatial transformation network and a pre-trained encoding network.
  • the step, executed by the processor 901, of inputting the image to be recognized into the preset feature encoding space for correction and encoding to obtain a feature image with multiple channels includes:
  • the number of the channels is the same as the number of the convolution kernels, and the timing attributes of the channels are associated with the order of calculation of the convolution kernels.
  • the preset feature decoding space includes a pre-trained attention mechanism and a pre-trained long short-term memory (LSTM) network.
  • the step, executed by the processor 901, of inputting the feature image into the preset feature decoding space according to the time sequence attribute and decoding the feature regions corresponding to the channels according to the time sequence attribute through the attention mechanism includes:
  • the feature regions corresponding to the channels are sorted according to the time sequence attribute, and the pre-trained LSTM network is notified, in that order, to sequentially decode the feature regions corresponding to the order.
  • the step, executed by the processor 901, of notifying the pre-trained LSTM network, in that order, to sequentially decode the feature regions corresponding to the order includes:
  • when the first feature region is being decoded, the pre-trained attention mechanism outputs a second attention parameter in that order, and the second attention parameter includes the position of the second feature region;
  • the step, executed by the processor 901, of notifying the pre-trained LSTM network through the second attention parameter to decode the second feature region after the first feature region is decoded includes:
  • the decoded feature of the first feature region and the second feature region are taken as inputs and input into the pre-trained LSTM network for decoding.
  • the processor 901 further executes the following steps:
  • according to the time sequence attribute of the channels, the feature region to which each pixel in the up-sampled feature image belongs is labeled, so that the feature region to which each pixel belongs has a time sequence attribute, and a labeled feature image is obtained;
  • the step, executed by the processor 901, of inputting the feature image into a preset feature decoding space according to the time sequence attribute and decoding the feature regions corresponding to the channels according to the time sequence attribute through an attention mechanism includes:
  • the labeled feature image is input into a preset feature decoding space according to the time series attribute, and the feature area is decoded in the feature decoding space according to the time series attribute through an attention mechanism.
  • the above-mentioned electronic device may be applied to devices such as mobile phones, monitors, computers, servers, etc., that require license plate number recognition.
  • the electronic device provided in the embodiment of the present invention can implement each process implemented by the license plate number recognition method in the foregoing method embodiment, and can achieve the same beneficial effects. To avoid repetition, details are not described herein again.
  • the embodiment of the present invention also provides a computer-readable storage medium.
  • the computer-readable storage medium stores a computer program which, when executed by a processor, implements the processes of the license plate number recognition method provided by the embodiments of the present invention and can achieve the same technical effect. To avoid repetition, details are not repeated here.
  • the program can be stored in a computer-readable storage medium and, when executed, may include the procedures of the above-mentioned method embodiments.
  • the storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), etc.
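
The FIG. 3 decoding walkthrough above narrows the candidates at each step according to the previous decoding state: a Chinese character is always followed by a letter, and a letter is never followed by a Chinese character. The following is a minimal Python sketch of that restriction only, not of the LSTM itself. The per-region score dictionaries stand in for the network output, `PROVINCES` is a small illustrative subset of the 31 Chinese characters, and the 24-letter set is assumed to be A to Z without I and O. All names are hypothetical.

```python
import string
from typing import Dict, List

PROVINCES = ["京", "沪", "浙", "粤"]  # illustrative subset, not the full 31
LETTERS = [c for c in string.ascii_uppercase if c not in "IO"]  # assumed 24 letters
DIGITS = list(string.digits)  # 10 digits


def candidates(prev_kind: str) -> List[str]:
    """Candidate characters allowed after the previous decoding state."""
    if prev_kind == "start":
        return PROVINCES            # first region: Chinese character only
    if prev_kind == "chinese":
        return LETTERS              # Chinese character is followed by a letter
    return LETTERS + DIGITS         # afterwards: letter or digit, never Chinese


def kind_of(char: str) -> str:
    """Classify a decoded character so the next step knows the previous state."""
    if char in PROVINCES:
        return "chinese"
    return "letter" if char in LETTERS else "digit"


def constrained_decode(step_scores: List[Dict[str, float]]) -> str:
    """Pick, per region in time order, the best-scoring allowed character."""
    prev_kind, decoded = "start", []
    for scores in step_scores:
        allowed = candidates(prev_kind)
        best = max(allowed, key=lambda c: scores.get(c, float("-inf")))
        decoded.append(best)
        prev_kind = kind_of(best)
    return "".join(decoded)


# Toy scores for three regions; "J" and "9" score highest in the first two
# regions but are not allowed there, so they are ignored.
scores = [{"浙": 0.9, "J": 0.95}, {"J": 0.8, "9": 0.9}, {"L": 0.6, "9": 0.5}]
print(constrained_decode(scores))
```

In a trained model the same effect would more likely be learned from data than hard-coded; the explicit candidate sets here only make the rule in the walkthrough visible.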

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A vehicle license plate number recognition method and apparatus, an electronic device, and a storage medium. The method comprises: inputting an image to be recognized into a preset feature encoding space to perform correction and encoding to obtain a feature image having a plurality of channels (101), the image to be recognized comprising license plate information, the feature image comprising a plurality of feature regions corresponding to the plurality of channels, and the channels having time sequence attributes; inputting the feature image into a preset feature decoding space according to the time sequence attributes, and decoding the feature regions corresponding to the channels by means of an attention mechanism according to the time sequence attributes (102); and outputting the decoding results according to the time sequence attributes to obtain the recognition result of the image to be recognized (103), such that the accumulation of errors in multiple steps can be avoided, and the robustness of license plate number recognition can be improved, and in addition, end-to-end license plate number recognition can be achieved using only the encoding space and the decoding space in the entire recognition process.

Description

License plate number recognition method, device, electronic equipment and storage medium
Technical field
The present invention relates to the field of artificial intelligence technology, and in particular to a license plate number recognition method, device, electronic equipment and storage medium.
Background art
Image recognition is one of the technologies commonly used in traffic, residential community, and parking lot management, for example, image-based license plate number recognition that identifies a vehicle's license plate number. Traditional license plate number recognition is generally divided into multiple independent steps, such as:
1. Image normalization: the license plate image is transformed into a front view using computer vision methods (such as a homography matrix).
2. Image preprocessing: occlusion, dirt, lighting, and similar conditions in the image are handled (for example by binarization).
3. Character segmentation: the characters are segmented using computer vision methods (such as edge detection).
4. Character recognition: the segmented characters are recognized using machine learning or deep learning methods (such as random forest, support vector machine (SVM), or logistic regression).
As a result, errors made in each step may accumulate, which degrades the final recognition result and makes it hard to locate the step in which a problem occurred. Moreover, traditional license plate recognition places relatively high demands on the input image, with strict requirements on angle and clarity. These limitations impose strict requirements on camera installation and on the monitored scene, and the recognition rate is easily affected by weather, lighting, and similar conditions. Therefore, traditional license plate recognition is susceptible to many factors that degrade the recognition result, and it suffers from poor robustness.
Summary of the invention
The embodiments of the present invention provide a license plate number recognition method that can improve the robustness of license plate number recognition.
In a first aspect, an embodiment of the present invention provides a license plate number recognition method, including:
inputting an image to be recognized into a preset feature encoding space for correction and encoding to obtain a feature image with multiple channels, where the image to be recognized includes license plate information, the feature image includes multiple feature regions corresponding to the multiple channels, and the channels have a time sequence attribute;
inputting the feature image into a preset feature decoding space according to the time sequence attribute and, in the feature decoding space, decoding the feature regions in the feature image sequentially according to the time sequence attribute through an attention mechanism;
outputting the decoding result according to the time sequence attribute to obtain the recognition result of the image to be recognized.
Optionally, the preset feature encoding space includes a pre-trained spatial transformation network and a pre-trained encoding network, and inputting the image to be recognized into the preset feature encoding space for correction and encoding to obtain a feature image with multiple channels includes:
performing correction prediction on the image to be recognized in the pre-trained spatial transformation network, and correcting the image to be recognized according to the prediction result to obtain a corrected image;
inputting the corrected image into the pre-trained encoding network, and performing convolution calculation on the corrected image through multiple convolution kernels in the encoding network to obtain a feature image with multiple channels, where the number of channels is the same as the number of convolution kernels, and the time sequence attribute of the channels is associated with the order in which the convolution kernels are computed.
Optionally, the preset feature decoding space includes a pre-trained attention mechanism and a pre-trained long short-term memory (LSTM) network, and inputting the feature image into the preset feature decoding space according to the time sequence attribute and decoding the feature regions corresponding to the channels according to the time sequence attribute through the attention mechanism includes:
when the feature image is input into the feature decoding space according to the time sequence attribute, reporting the time sequence attribute of each channel to the pre-trained attention mechanism;
sorting the feature regions corresponding to the channels according to the time sequence attribute through the pre-trained attention mechanism, and notifying the pre-trained LSTM network, in that order, to sequentially decode the feature regions corresponding to the order.
Optionally, notifying the pre-trained long short-term memory (LSTM) network, in that order, to sequentially decode the feature regions corresponding to the order includes:
outputting a first attention parameter in that order through the pre-trained attention mechanism, and notifying, through the first attention parameter, the pre-trained LSTM network to decode a first feature region;
while the first feature region is being decoded, outputting, by the pre-trained attention mechanism, a second attention parameter in that order, where the second attention parameter includes the position of a second feature region;
after the decoding of the first feature region is completed, decoding the second feature region by the pre-trained LSTM network;
and so on, until all feature regions have been decoded in sequence.
Optionally, decoding the second feature region by the pre-trained long short-term memory (LSTM) network after the decoding of the first feature region is completed includes:
after the first feature region is decoded, taking the decoded feature of the first feature region and the second feature region as inputs, and inputting them into the pre-trained LSTM network for decoding.
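The sequential decoding described above, in which each step receives the state saved by the previous step together with the next feature region, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the patent's implementation: `toy_step` is a lookup table standing in for the pre-trained LSTM, and the region identifiers, the `(time index, region id)` pairs, and all function names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

START = "<start>"  # start instruction carried by the first attention parameter

# Hypothetical step signature: (previous state, region) -> (character, new state)
DecodeStep = Callable[[str, str], Tuple[str, str]]


def decode_in_order(regions: Sequence[Tuple[int, str]], step: DecodeStep) -> str:
    """Decode feature regions one by one in time-sequence order.

    `regions` holds (time index, region id) pairs. Each step receives the
    state saved by the previous step together with the next region.
    """
    state = START
    chars: List[str] = []
    for _, region in sorted(regions):  # order the regions by time index
        char, state = step(state, region)
        chars.append(char)
    return "".join(chars)


def toy_step(prev_state: str, region: str) -> Tuple[str, str]:
    """Stand-in for the LSTM: looks the character up instead of predicting it."""
    table = {"r0": "Zhe", "r1": "J", "r2": "L"}
    char = table[region]
    return char, f"{prev_state}>{char}"  # the new state records what was decoded


print(decode_in_order([(2, "r2"), (0, "r0"), (1, "r1")], toy_step))
```

The loop stops when the regions run out; a model that emits an explicit end token would instead stop when that token is produced.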
Optionally, after inputting the image to be recognized into the preset feature encoding space for correction and encoding to obtain a feature image with multiple channels, the method further includes:
up-sampling the feature image so that the size of the feature image is the same as the size of the image to be recognized;
performing pixel prediction on the up-sampled feature image according to the channels of the feature image, to predict the feature region to which each pixel in the up-sampled feature image belongs;
labeling, according to the time sequence attribute of the channels, the feature region to which each pixel in the up-sampled feature image belongs, so that the feature region to which each pixel belongs has a time sequence attribute, to obtain a labeled feature image;
and inputting the feature image into the preset feature decoding space according to the time sequence attribute and decoding the feature regions corresponding to the channels according to the time sequence attribute through the attention mechanism includes:
inputting the labeled feature image into the preset feature decoding space according to the time sequence attribute, and decoding the feature regions in the feature decoding space according to the time sequence attribute through the attention mechanism.
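The up-sampling, per-pixel prediction, and labeling steps above can be sketched in Python as follows. This is a minimal sketch under assumptions the source does not state: nearest-neighbour interpolation is used for up-sampling, a pixel is assigned to the channel with the largest value, and `time_index` is a hypothetical list mapping each channel to its position in the time sequence.

```python
from typing import List

Plane = List[List[float]]


def upsample_nearest(channel: Plane, out_h: int, out_w: int) -> Plane:
    """Resize one channel to (out_h, out_w) by nearest-neighbour lookup."""
    in_h, in_w = len(channel), len(channel[0])
    return [
        [channel[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]


def label_pixels(channels: List[Plane], out_h: int, out_w: int,
                 time_index: List[int]) -> List[List[int]]:
    """Label every up-sampled pixel with the time index of its feature region."""
    upsampled = [upsample_nearest(c, out_h, out_w) for c in channels]
    labels = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            values = [u[i][j] for u in upsampled]
            best_channel = values.index(max(values))  # region the pixel belongs to
            row.append(time_index[best_channel])      # attach the time attribute
        labels.append(row)
    return labels


# Two 1x2 channels up-sampled to a 2x4 "image": the left half belongs to the
# region decoded first, the right half to the region decoded second.
channels = [[[0.9, 0.1]], [[0.2, 0.7]]]
for row in label_pixels(channels, 2, 4, time_index=[0, 1]):
    print(row)
```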
In a second aspect, an embodiment of the present invention provides a license plate number recognition device, including:
an encoding module, configured to input an image to be recognized into a preset feature encoding space for correction and encoding to obtain a feature image with multiple channels, where the image to be recognized includes license plate information, the feature image includes multiple feature regions corresponding to the multiple channels, and the channels have a time sequence attribute;
a decoding module, configured to input the feature image into a preset feature decoding space according to the time sequence attribute and, in the feature decoding space, decode the feature regions in the feature image sequentially according to the time sequence attribute through an attention mechanism;
an output module, configured to output the decoding result according to the time sequence attribute to obtain the recognition result of the image to be recognized.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the steps of the license plate number recognition method provided by the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the license plate number recognition method provided by the embodiments of the present invention.
In the embodiments of the present invention, the image to be recognized is input into a preset feature encoding space for correction and encoding to obtain a feature image with multiple channels, where the image to be recognized includes license plate information, the feature image includes multiple feature regions corresponding to the multiple channels, and the channels have a time sequence attribute; the feature image is input into a preset feature decoding space according to the time sequence attribute, and the feature regions corresponding to the channels are decoded according to the time sequence attribute through an attention mechanism; the decoding result is output according to the time sequence attribute to obtain the recognition result of the image to be recognized. Correction and feature encoding of the license plate image are performed in the feature encoding space, and the feature regions are decoded in time order in the feature decoding space. This avoids the accumulation of errors across multiple steps and improves the robustness of license plate number recognition. Moreover, the entire recognition process passes only through the encoding space and the decoding space, so end-to-end license plate number recognition can be achieved.
Description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative work.
FIG. 1 is a flowchart of a method for recognizing a license plate number provided by an embodiment of the present invention;
FIG. 2 is a flowchart of another method for recognizing a license plate number provided by an embodiment of the present invention;
FIG. 3 is a flowchart of another method for recognizing a license plate number provided by an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a license plate number recognition device provided by an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of another license plate number recognition device provided by an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of another license plate number recognition device provided by an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of another license plate number recognition device provided by an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of another license plate number recognition device provided by an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative work fall within the protection scope of the present invention.
Referring to FIG. 1, FIG. 1 is a flowchart of a method for recognizing a license plate number provided by an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
101. Input the image to be recognized into a preset feature encoding space for correction and encoding to obtain a feature image with multiple channels.
Here, the image to be recognized includes license plate information, the feature image includes multiple feature regions corresponding to the multiple channels, and the channels have a time sequence attribute.
The image to be recognized may be a static image of a vehicle license plate, or an image frame of a video, uploaded by a user; it may also be a static image or a video frame of a vehicle license plate captured by cameras deployed on roads, at the entrances and exits of residential communities, or at the entrances and exits of parking lots.
The image to be recognized may contain one or more pieces of license plate information, that is, one image may contain one or more license plate numbers to be recognized.
The feature encoding space may be a fully convolutional network space. Through convolution calculation, the fully convolutional network space can predict the correction parameters of the image to be recognized and correct the image according to the predicted parameters. Through convolution calculation, it can also predict the feature region corresponding to each character in the license plate information.
The correction can be understood as spatially transforming and aligning the image to be recognized, which may include translating, scaling, and rotating it.
The feature regions are determined by the channels in the fully convolutional network, where a channel is an output channel obtained after convolution. Specifically, the regions are determined by the channel values. In a fully convolutional network, convolution kernels are applied to the image to be recognized to extract the corresponding features, and each convolution kernel yields one channel. For example, suppose the license plate to be recognized has parameters (3, W, H), where W and H are the width and height of the plate and 3 is the number of RGB color channels. After one convolution kernel has been convolved with each of the three RGB channels, the output is a single channel formed by summing the corresponding values: if the per-channel results are (R1, R2, R3, ..., Rn), (G1, G2, G3, ..., Gn), and (B1, B2, B3, ..., Bn), the summed channel is (R1+G1+B1, R2+G2+B2, R3+G3+B3, ..., Rn+Gn+Bn). One convolution kernel can therefore be regarded as yielding one channel. Different feature regions are determined from the different channel values at the same feature point. For example, the channel with the largest value at a feature point indicates that the point belongs to the feature region corresponding to that channel. Taking a license plate as a further example, an ordinary car license plate consists of 7 characters. During convolution these 7 characters need to be segmented. Each character becomes a feature region, also called a character region, and corresponds to one channel. After the plate image has been convolved, each character region is represented by one channel. Different channels represent different character regions, and the character region to which a feature point belongs is the one corresponding to the channel with the largest value at that point.
Therefore, the feature region corresponding to each feature point can be determined by traversing the feature points and finding the channel with the maximum value at each one. Because a license plate number is a combination of several characters, the feature image output by the feature encoding space must contain the feature regions in which those characters are located, so the output of the feature encoding space is a feature image whose number of channels corresponds to the number of characters. These channels have a time-sequence attribute, which is determined by the order in which the convolution kernels are computed during encoding. For example, the first convolution kernel produces the first channel and the second convolution kernel produces the second channel, so the computation order of the kernels gives the channels their time-sequence attribute.
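The rule described above, that a feature point belongs to the region of the channel with the largest value at that point, can be sketched as a per-point argmax over the channel axis. This is a minimal illustration only: the function name `assign_regions`, the (C, H, W) array layout, and the toy values are assumptions and are not taken from the patent.

```python
import numpy as np

def assign_regions(feature_map: np.ndarray) -> np.ndarray:
    """For every feature point, return the index of the channel with the largest value.

    feature_map has shape (C, H, W), with one channel per character region
    (an assumed layout for this sketch).
    """
    return np.argmax(feature_map, axis=0)

# Toy feature image: 3 channels over a 1x6 strip (hypothetical values).
feature_map = np.array([
    [[0.9, 0.8, 0.1, 0.2, 0.0, 0.1]],  # channel 0
    [[0.1, 0.3, 0.7, 0.9, 0.2, 0.1]],  # channel 1
    [[0.0, 0.1, 0.2, 0.1, 0.8, 0.6]],  # channel 2
])
region_map = assign_regions(feature_map)
print(region_map)  # [[0 0 1 1 2 2]]
```

Here the first two feature points fall in the region of channel 0, the next two in channel 1, and the last two in channel 2, which mirrors how consecutive character regions would be separated along the plate.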
It should be understood that, in the feature encoding space, feature encoding of the image to be recognized is a feature extraction process, and correction of the image is a predictive correction whose quality is positively correlated with the completeness of the training data. No complicated image preprocessing is needed before the license plate number is recognized; the image to be recognized can be input directly.
102. Input the feature image into a preset feature decoding space according to the time-sequence attribute, and, in the feature decoding space, decode the feature regions of the feature image in turn according to the time-sequence attribute by means of an attention mechanism.
In this step, the feature image is the one obtained by encoding in the feature encoding space in step 101. It includes channels corresponding to the number of license plate characters, and each channel corresponds to a different feature region, or equivalently to a different character region. Inputting the feature image into the preset feature decoding space according to the time-sequence attribute means inputting the channels of the feature image into the feature decoding space in time-sequence order.
The feature decoding space decodes the feature region corresponding to each channel in turn, to obtain the character represented by that feature region.
The time-sequence attributes of the channels are maintained by the attention mechanism. Because the channels of the feature image carry time-sequence attributes when the feature image is input into the feature decoding space, the attention mechanism sorts the channels by time sequence and outputs the attention parameters corresponding to that order, so that the channels are decoded according to the time-sequence attribute.
The feature decoding space may be a sequence-based neural network, such as a recurrent neural network (RNN) or a long short-term memory network (LSTM). Such a network can make predictions based on the relationship between the preceding character and the following character. For example, under the applicable license plate numbering rules, in the car license plate "浙J·L9098", when the preceding character is the Chinese character "浙" (Zhe), the probability that the next character is a letter is 100%. That is, when the preceding character belongs to the Chinese-character class, decoding the next character does not need to consider the Chinese-character or digit classes and only needs to choose among the 24 letter classes. In effect, each later character is decoded in dependence on the earlier ones.
It should be noted that the decoded characters are related to the structure of the license plate number. Taking the civilian license plate commonly used in China as an example, the plate has three parts: the first part is the abbreviation of the province, autonomous region, or municipality; the second part is the code of the issuing authority; and the third part is the serial number. In the car license plate "浙J·L9098", the first part is "浙", the second part is "J", and the third part is "L9098". According to the domestic administrative divisions, the first part is an abbreviation character with 31 possible Chinese characters. The second character is the issuing authority code, represented by an uppercase letter, with 24 possible characters (the uppercase letters I and O are easily confused with the digits 1 and 0, so they are excluded from license plate numbering, leaving 24). The digits 0-9 contribute 10 characters, so 65 characters in total are available for decoding. In conventional decoding, which is not sequence-based and does not consider the previous decoding result, every character on the plate has to be decoded from all 65 characters. With sequence-based decoding, the first character only needs to be decoded among the 31 Chinese characters, the second only among the 24 letters, and each remaining character only among the 34 letters and digits.
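The reduction of the search space described above (65 candidates per character without sequence information, versus 31, 24, and 34 with it) can be illustrated by masking a 65-way score vector according to character position. This is a sketch under stated assumptions: the helper names `position_mask` and `constrained_argmax`, the ordering of `ALPHABET`, and the toy scores are hypothetical, and the province list is the usual set of 31 abbreviations rather than anything enumerated in the patent.

```python
import string
import numpy as np

# 31 provincial abbreviations, 24 letters (no I or O), 10 digits: 65 classes in total.
PROVINCES = list("京津沪渝冀豫云辽黑湘皖鲁新苏浙赣鄂桂甘晋蒙陕吉闽贵粤青藏川宁琼")
LETTERS = [c for c in string.ascii_uppercase if c not in "IO"]
DIGITS = list(string.digits)
ALPHABET = PROVINCES + LETTERS + DIGITS

def position_mask(position: int) -> np.ndarray:
    """Boolean mask over ALPHABET for the classes allowed at a plate position."""
    n_prov, n_let = len(PROVINCES), len(LETTERS)
    mask = np.zeros(len(ALPHABET), dtype=bool)
    if position == 0:
        mask[:n_prov] = True                    # province abbreviation
    elif position == 1:
        mask[n_prov:n_prov + n_let] = True      # issuing-authority letter
    else:
        mask[n_prov:] = True                    # serial number: letters or digits
    return mask

def constrained_argmax(scores: np.ndarray, position: int) -> str:
    """Pick the best-scoring class among those allowed at this position."""
    masked = np.where(position_mask(position), scores, -np.inf)
    return ALPHABET[int(np.argmax(masked))]

# Toy scores: "8" scores highest overall, but is only allowed from position 2 on.
scores = np.zeros(len(ALPHABET))
scores[ALPHABET.index("浙")] = 0.4
scores[ALPHABET.index("J")] = 0.5
scores[ALPHABET.index("8")] = 0.9
print([constrained_argmax(scores, p) for p in range(3)])  # ['浙', 'J', '8']
```

In the patent the constraint is learned by the sequence model rather than imposed by a hard mask; the mask is used here only to make the candidate counts concrete.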
Of course, the above takes only one common civilian license plate as an example and should not be regarded as limiting the present invention. License plates for other purposes have different number structures, for example police, driver-training, entry-exit, embassy, military, armed police, civil aviation, trailer, agricultural, and personalized license plates.
The attention mechanism may be a channel attention module (Attention Refinement Module, ARM). The channel attention module can assign a corresponding attention parameter to the feature region of each channel, where the attention parameter is the position of the corresponding feature region within its channel. For example, if the channel values of the feature points in the channel containing the feature region of the character "浙" are all greater than the values in the other channels, the position of that feature region is used as the attention parameter, and when decoding starts, the feature decoding space is notified according to this attention parameter to decode that position.
The attention mechanism may also act directly on the two-dimensional spatial position of each feature region in the feature image. The two-dimensional position of the feature region of each character is computed from the height and width of the feature image, and the attention mechanism assigns attention parameters to these positions in order from top to bottom and from left to right. When decoding starts, the feature decoding space is notified according to these attention parameters to decode the feature regions in turn.
103. Output the decoding result according to the time-sequence attribute, to obtain the recognition result of the image to be recognized.
The decoding result is the characters corresponding to the license plate information in the image to be recognized. Because decoding in the feature decoding space follows the time-sequence attribute, the decoded characters also have a time-sequence attribute, and outputting them in that order yields the character order of the license plate number.
In this embodiment of the present invention, the image to be recognized is input into a preset feature encoding space for correction and encoding, to obtain a feature image with multiple channels, where the image to be recognized includes license plate information, the feature image includes multiple feature regions corresponding to the multiple channels, and the channels have a time-sequence attribute; the feature image is input into a preset feature decoding space according to the time-sequence attribute, and the feature regions corresponding to the channels are decoded according to the time-sequence attribute by means of an attention mechanism; and the decoding result is output according to the time-sequence attribute, to obtain the recognition result of the image to be recognized. Correcting and encoding the license plate image in the feature encoding space, and decoding the feature regions in time sequence in the feature decoding space, avoids the accumulation of errors across multiple steps and improves the robustness of license plate number recognition. Moreover, the whole recognition process passes only through the encoding space and the decoding space, so end-to-end license plate number recognition can be achieved.
It should be noted that the license plate number recognition method provided by the embodiments of the present invention can be applied to mobile phones, monitors, computers, servers, and other devices that need to recognize license plate numbers.
Optionally, referring to FIG. 2, FIG. 2 is a flowchart of another license plate number recognition method provided by an embodiment of the present invention. It differs from the embodiment of FIG. 1 in that the preset feature encoding space includes a pre-trained spatial transformation network and a pre-trained encoding network, and the preset feature decoding space includes a pre-trained attention mechanism and a pre-trained long short-term memory network. As shown in FIG. 2, the method includes the following steps:
201. Perform correction prediction on the image to be recognized in the pre-trained spatial transformation network, and correct the image to be recognized according to the prediction result, to obtain a corrected image.
In this step, the pre-trained spatial transformation network may be an STN (Spatial Transformer Network). The spatial transformation network and the encoding network may together form a fully convolutional neural network, so that the feature encoding space is a fully convolutional neural network.
The spatial transformation network may be placed before the encoding network. In this way, the image to be recognized can be transformed by the spatial transformation network to meet the image requirements of the encoding network, which can be understood as transforming an arbitrary input image into the input image expected by the encoding network.
In the spatial transformation network, the parameters of the spatial transformation are computed. These parameters differ according to the type of transformation applied to the image to be recognized; for example, for a 2D affine transformation the parameters are output as a 6-dimensional (2x3) vector. After the parameters have been computed, a corresponding spatial transformation function is generated from them, and the image to be recognized is transformed by this function into the image expected by the encoding network.
Specifically, the STN processes the image to be recognized in three parts: the localisation net, the grid generator, and the sampler. The localisation net determines the transformation parameters θ required for the input. The grid generator uses θ and the defined transformation type to find the mapping T(θ) between output and input features. The sampler combines the position mapping and the transformation parameters to select input features and produces the output by bilinear interpolation, so that the image to be recognized is transformed into the image expected by the encoding network. In the localisation net, several convolution or fully connected operations are followed by a regression layer that regresses the transformation parameters θ. Because θ is a regression prediction, the STN is a trainable spatial transformation network and can adaptively learn, through training, the spatial transformation appropriate to different data. Moreover, the STN can not only spatially transform the input, but can also be inserted as a network module into any layer of the encoding network to spatially transform different feature images. As a result, the encoding network can learn invariance to translation, scaling, rotation, and other common distortions, which improves the robustness of its feature encoding.
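The grid generator and sampler stages described above can be sketched for the 2D affine case with a 2x3 parameter matrix θ. This is an illustrative NumPy sketch, not the patent's implementation: the function names `affine_grid` and `bilinear_sample`, the normalized [-1, 1] coordinate convention, the single-channel image, and the border clamping are all assumptions. In a trained STN, θ would be predicted by the localisation net; here it is supplied by hand.

```python
import numpy as np

def affine_grid(theta: np.ndarray, height: int, width: int) -> np.ndarray:
    """Grid generator: map each output pixel to source (x, y) in normalized [-1, 1] space."""
    ys, xs = np.meshgrid(np.linspace(-1, 1, height),
                         np.linspace(-1, 1, width), indexing="ij")
    target = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # (H, W, 3) homogeneous coords
    return target @ theta.T                                  # (H, W, 2) source coords

def bilinear_sample(image: np.ndarray, grid: np.ndarray) -> np.ndarray:
    """Sampler: bilinear interpolation of a 2D image at the grid locations (borders clamped)."""
    h, w = image.shape
    x = (grid[..., 0] + 1) * (w - 1) / 2   # back to pixel coordinates
    y = (grid[..., 1] + 1) * (h - 1) / 2
    x0 = np.clip(np.floor(x).astype(int), 0, w - 1)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 1)
    x1 = np.clip(x0 + 1, 0, w - 1)
    y1 = np.clip(y0 + 1, 0, h - 1)
    wx = np.clip(x - x0, 0.0, 1.0)
    wy = np.clip(y - y0, 0.0, 1.0)
    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bottom = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

image = np.arange(20, dtype=float).reshape(4, 5)
identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])   # leaves the image unchanged
flip_x = np.array([[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])    # mirrors left-right
same = bilinear_sample(image, affine_grid(identity, 4, 5))
mirrored = bilinear_sample(image, affine_grid(flip_x, 4, 5))
```

The six entries of θ cover translation, scaling, rotation, and shear, which matches the 6-dimensional (2x3) output mentioned for the 2D affine case.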
202. Input the corrected image into the pre-trained encoding network, and perform convolution on the corrected image with multiple convolution kernels in the encoding network, to obtain a feature image with multiple channels.
The number of channels is the same as the number of convolution kernels, and the time-sequence attribute of the channels is associated with the order in which the convolution kernels are computed.
The corrected image is the image to be recognized after correction by the spatial transformation network in the feature encoding space.
The pre-trained encoding network may be a convolutional neural network, used to extract the feature region in which each character of the license plate information is located.
In a possible embodiment, the encoding network has multiple computation layers, and a spatial transformation network may be placed between every two computation layers to spatially transform the channels computed by the previous layer so that they meet the input expectations of the next layer. In other words, the input of every computation layer is corrected, which reduces the accumulation of errors and thereby improves recognition accuracy.
The encoding network is obtained by training on a data set of character images. The data set may be built from 31 Chinese characters, 24 letters, and 10 digits, 65 characters in total, with each character corresponding to multiple images under different conditions. Training on this data set allows the encoding network to learn to encode the feature region to which a character belongs, and thus to obtain the feature region in which the character is located. Specifically, training yields the weight parameters of the convolution kernels in the encoding network, so that when the encoding network convolves the image to be recognized, each convolution kernel produces a corresponding channel, and that channel corresponds to the feature region to which a character belongs.
Optionally, in a possible embodiment, the encoding network may be a fully convolutional neural network. A fully convolutional network can accept input images of any size, so the size of the image to be recognized does not need to be adjusted. The fully convolutional network uses a deconvolution layer to upsample the feature image of the last convolutional layer so that it has the same size as the input image. A prediction can thus be produced for every pixel while the spatial information of the original input image is preserved. Therefore, when the encoding network is a fully convolutional neural network, the output feature image has the same spatial information as the image to be recognized. That is, the position of each extracted feature region can be expressed as a distribution of pixels in the spatial layout of the image to be recognized, and the pixels of the upsampled feature image can be traversed and classified. This classification is based on the channels of the feature image: each pixel is assigned to the channel with the highest value at that pixel, and thus to the feature region corresponding to that channel.
In addition, when the feature image of the last convolutional layer is upsampled, the feature region corresponding to each channel can be labeled, so that every pixel of the upsampled feature image carries a channel label. This amounts to labeling the feature regions of the upsampled feature image so that the feature regions have a time-sequence attribute. In this case the corresponding channels need not be retained; the attention mechanism prompts on the feature regions and their labels in the upsampled feature image, so that the feature regions are decoded in the feature decoding space according to the time-sequence attribute.
203. When the feature image is input into the feature decoding space according to the time-sequence attribute, report the time-sequence attribute of each channel to the pre-trained attention mechanism.
204. Sort the feature regions corresponding to the channels by time-sequence attribute using the pre-trained attention mechanism, and notify the pre-trained long short-term memory network, in that order, to decode the corresponding feature regions in turn.
In steps 203 and 204, the attention mechanism obtains the channel sequence $a$ from the time-sequence attributes of the channels, and computes the weight $a_{t,i}$ of each channel $a_i$ at the current time $t$, which can be calculated using the formula:
$e_{ti} = f_{att}(a_i, h_{t-1})$
where $f_{att}$ is the attention function, $a_i$ is the current input vector, $h_{t-1}$ is the decoding state at the previous time step, and $L$ is the number of channels.
One channel is output according to the time sequence of the input channels and the corresponding weights, by means of
[Formula image: PCTCN2020108989-appb-000001]
and this channel is input to the long short-term memory network for decoding.
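To make the attention step concrete, the following sketch scores each channel against the previous decoder state and combines the channels into a single output. Note the assumptions: the patent's second formula is available only as an image, so the softmax normalization over the L channels and the weighted sum used here follow common practice for attention and are not confirmed by the source. The names `attention_step` and `dot_score` are hypothetical, and the dot product merely stands in for the unspecified attention function.

```python
import numpy as np

def dot_score(a_i: np.ndarray, h_prev: np.ndarray) -> float:
    """Hypothetical stand-in for the attention function f_att(a_i, h_{t-1})."""
    return float(a_i @ h_prev)

def attention_step(channels: np.ndarray, h_prev: np.ndarray, score_fn=dot_score):
    """One attention step over L channel vectors of dimension D.

    Returns the weights over the L channels and the combined output vector.
    The softmax and the weighted sum are assumptions (see the lead-in).
    """
    scores = np.array([score_fn(a_i, h_prev) for a_i in channels])  # e_{ti}, i = 1..L
    scores = scores - scores.max()            # subtract the max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ channels              # (L,) @ (L, D) -> (D,)
    return weights, context

# Toy example: 3 channels, decoder state aligned with channel 1.
channels = np.eye(3)
h_prev = np.array([0.0, 5.0, 0.0])
weights, context = attention_step(channels, h_prev)
print(int(weights.argmax()))  # 1
```

With a sharply peaked weight vector the combined output is close to a single channel, which is consistent with the description of one channel being passed to the long short-term memory network at each step.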
When the long short-term memory network decodes the feature region corresponding to the current channel, it obtains the location of the next feature region to be decoded from the output of the attention mechanism.
Specifically, the pre-trained attention mechanism outputs a first attention parameter according to the order, and the first attention parameter notifies the pre-trained long short-term memory network to decode the first feature region. While the first feature region is being decoded, the pre-trained attention mechanism outputs a second attention parameter according to the order, where the second attention parameter includes the position of the second feature region. After decoding of the first feature region is complete, the second attention parameter notifies the pre-trained long short-term memory network to attend to the position of the second feature region, so that the network decodes the second feature region. Further, after the first feature region has been decoded, the decoded feature of the first feature region and the second feature region are taken as input, according to the second attention parameter, and fed into the pre-trained long short-term memory network for decoding. Decoding is repeated in this way until all feature regions have been decoded in turn.
It should be noted that, because the spatial transformation network and the encoding network are deployed in the feature encoding space, and the attention mechanism and the long short-term memory network are deployed in the feature decoding space, end-to-end training can be achieved. That is, the feature encoding space and the feature decoding space can be trained with a single data set. Therefore, no image preprocessing is needed before the image to be recognized is input into the feature decoding space.
205. Output the decoding result according to the time-sequence attribute, to obtain the recognition result of the image to be recognized.
The decoding result is the characters corresponding to the license plate information in the image to be recognized. Because decoding in the feature decoding space follows the time-sequence attribute, the decoded characters also have a time-sequence attribute, and outputting them in that order yields the character order of the license plate number.
In this embodiment of the present invention, the image of the license plate number to be recognized is corrected by the spatial transformation network in the feature encoding space, the corrected image is feature-encoded by the encoding network, and the feature regions are decoded in time sequence in the feature decoding space. This is an end-to-end form of decoding, which avoids the accumulation of errors across the multiple steps of image preprocessing and improves the robustness of license plate number recognition. Moreover, the whole training process and recognition process pass only through the encoding space and the decoding space, so end-to-end license plate number recognition can be achieved.
As shown in FIG. 3, FIG. 3 is a flowchart of another license plate number recognition method provided by an embodiment of the present invention. The model consists of an encoder and a decoder. The encoder deploys an STN layer to correct the image to be recognized and a convolutional neural network to extract features; the decoder is an architecture that combines a long short-term memory (LSTM) network with an attention mechanism.
In the example of FIG. 3, the license plate in the image to be recognized reads "浙J·L9098", and the input image parameters are the color channels (3, RGB), the width (W), and the height (H). After feature encoding in the encoding space, a feature image is obtained. According to the time-sequence attribute of the channels, the feature regions corresponding to the channels of the feature image are: a first feature region corresponding to a Chinese character, a second feature region corresponding to a letter, and third to seventh feature regions each corresponding to a letter or a digit. When the feature image is input into the decoding space, the attention mechanism sorts the feature regions corresponding to the channels by time-sequence attribute and prompts the decoder with each region in turn.
At h0, the attention mechanism outputs the first attention parameter, which consists of the start instruction <start> plus the position of the first feature region.
At h1, the first feature region is input into the LSTM network in the decoder. During decoding, the LSTM network determines which of the 31 Chinese characters the first feature region represents; the decoding result is "浙". The current decoding state is then saved, and the attention mechanism outputs the second attention parameter, which consists of the previous decoding state plus the position of the second feature region.
At h2, the decoding state of the first feature region and the second feature region are input into the LSTM network. Because the previous decoding state is a Chinese-character state, and under the rules for ordinary vehicle license plates a Chinese character is followed by a letter with probability 100%, the LSTM network determines which of the 24 letters the second feature region represents; the decoding result is "J". The current decoding state is saved, and the attention mechanism outputs the third attention parameter, which consists of the previous decoding state plus the position of the third feature region.
At h3, the decoding state of the second feature region and the third feature region are input into the LSTM network. Because the previous decoding state is a letter state, and under the rules for ordinary vehicle license plates a letter is followed by a Chinese character with probability 0%, the LSTM network determines which of the 24 letters and 10 digits the third feature region represents; the decoding result is "L". The current decoding state is saved, and the attention mechanism outputs the fourth attention parameter. Decoding continues in this way until the LSTM network outputs <end>, at which point recognition is considered complete and the decoding result is output.
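The position-dependent alphabets in the walkthrough above (31 Chinese characters, then 24 letters, then 24 letters plus 10 digits) can be illustrated with a minimal sketch. It is not the patented decoder: the per-region score dictionaries stand in for LSTM outputs, and the concrete alphabets (the province abbreviation list, and the letters A to Z without I and O) are assumptions, since the source gives only the counts.

```python
import string

# Assumed alphabets (the source states only the counts 31, 24 and 10).
PROVINCES = "京津沪渝冀豫云辽黑湘皖鲁新苏浙赣鄂桂甘晋蒙陕吉闽贵粤青藏川宁琼"
LETTERS = "".join(c for c in string.ascii_uppercase if c not in "IO")
DIGITS = string.digits


def allowed_symbols(position):
    """Return the symbols permitted at a 0-based plate position."""
    if position == 0:
        return PROVINCES          # first feature region: Chinese character
    if position == 1:
        return LETTERS            # second feature region: letter only
    return LETTERS + DIGITS       # third to seventh regions: letter or digit


def constrained_decode(scores_per_region):
    """Pick, for each region in order, the best-scoring allowed symbol.

    scores_per_region is a list of dicts mapping symbol -> score; it is a
    hypothetical stand-in for the per-step output of the LSTM decoder.
    """
    plate = []
    for position, scores in enumerate(scores_per_region):
        permitted = allowed_symbols(position)
        candidates = {s: v for s, v in scores.items() if s in permitted}
        plate.append(max(candidates, key=candidates.get))
    return "".join(plate)


# Toy scores: the digit "1" outscores "J" at position 1 but is not allowed there.
toy_scores = [
    {"浙": 0.9, "J": 0.1},
    {"1": 0.6, "J": 0.4},
    {"L": 0.8, "1": 0.2},
    {"9": 0.7, "B": 0.3},
    {"0": 0.9, "D": 0.1},
    {"9": 0.6, "8": 0.4},
    {"8": 0.6, "B": 0.4},
]
print(constrained_decode(toy_scores))
```

In the described method the constraint is learned by the network from the previous decoding state rather than applied as a hard mask; the mask here only makes the effect of the plate rules visible.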
In the embodiment of the present invention, the encoder deploys an STN layer to correct the image to be recognized and a convolutional neural network to extract features, and the decoder combines a long short-term memory network with an attention mechanism. The encoder plus decoder therefore has the characteristics of a deep neural network, and the whole encoder plus decoder model can be trained in a data-driven way using deep learning methods. The more complete the training data, the more scenes the model can recognize, which improves the robustness of the model. In addition, because the encoder plus decoder is an end-to-end model, the image does not need to be preprocessed, which increases the speed of license plate number recognition. Because there are no multiple preprocessing steps, errors do not accumulate, which improves the accuracy of license plate number recognition.
It should be noted that the license plate number recognition method provided by the embodiments of the present invention can be applied to devices that need to perform license plate number recognition, such as mobile phones, monitors, computers, and servers.
Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a license plate number recognition apparatus provided by an embodiment of the present invention. As shown in FIG. 4, the apparatus includes:
an encoding module 401, configured to input the image to be recognized into a preset feature encoding space for correction and encoding to obtain a feature image with multiple channels, wherein the image to be recognized includes license plate information, the feature image includes multiple feature regions corresponding to the multiple channels, and the channels have a time-sequence attribute;
a decoding module 402, configured to input the feature image into a preset feature decoding space according to the time-sequence attribute, and to decode the feature regions in the feature image one after another according to the time-sequence attribute through an attention mechanism in the feature decoding space;
an output module 403, configured to output the decoding result according to the time-sequence attribute to obtain the recognition result of the image to be recognized.
Optionally, as shown in FIG. 5, the preset feature encoding space includes a pre-trained spatial transformation network and a pre-trained encoding network, and the encoding module 401 includes:
a correction unit 4011, configured to perform correction prediction on the image to be recognized in the pre-trained spatial transformation network, and to correct the image to be recognized according to the prediction result to obtain a corrected image;
an encoding unit 4012, configured to input the corrected image into the pre-trained encoding network, and to perform convolution calculation on the corrected image through multiple convolution kernels in the encoding network to obtain a feature image with multiple channels, wherein the number of channels is the same as the number of convolution kernels, and the time-sequence attribute of the channels is associated with the order in which the convolution kernels are computed.
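The relationship stated for the encoding unit (one output channel per convolution kernel, with channel order following kernel order) can be shown with a small pure-Python sketch. The image, the kernels, and the function names are illustrative assumptions; a real encoding network would use a deep learning framework and learned kernels.

```python
def conv2d_valid(image, kernel):
    """Valid (no padding, stride 1) 2-D cross-correlation of a single-channel image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def encode(image, kernels):
    """Apply each kernel in order; channel k is produced by kernel k."""
    return [conv2d_valid(image, kernel) for kernel in kernels]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
kernels = [
    [[1, 0], [0, 1]],   # sums the main diagonal of each 2x2 window
    [[0, 1], [1, 0]],   # sums the anti-diagonal of each 2x2 window
    [[1, 1], [1, 1]],   # sums the whole 2x2 window
]
feature_image = encode(image, kernels)
print(len(feature_image), "channels from", len(kernels), "kernels")
```

Because `encode` iterates over the kernels in list order, the index of a channel records the order in which its kernel was computed, which is one simple way to read the "time-sequence attribute" of a channel.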
Optionally, as shown in FIG. 6, the preset feature decoding space includes a pre-trained attention mechanism and a pre-trained long short-term memory network, and the decoding module 402 includes:
an attention unit 4021, configured to report the time-sequence attribute of each channel to the pre-trained attention mechanism when the feature image is input into the feature decoding space according to the time-sequence attribute;
a decoding unit 4022, configured to sort the feature regions corresponding to the channels according to the time-sequence attribute through the pre-trained attention mechanism, and to notify the pre-trained long short-term memory network, in the sorted order, to decode the feature regions one after another.
Optionally, as shown in FIG. 7, the decoding unit 4022 includes:
a first decoding subunit 40221, configured to output a first attention parameter in the sorted order through the pre-trained attention mechanism, and to notify the pre-trained long short-term memory network through the first attention parameter to decode the first feature region;
an output subunit 40222, configured so that, while the first feature region is being decoded, the pre-trained attention mechanism outputs a second attention parameter in the sorted order, the second attention parameter including the position of the second feature region;
a second decoding subunit 40223, configured to notify the pre-trained long short-term memory network through the second attention parameter to decode the second feature region after decoding of the first feature region is completed;
a loop subunit 40224, configured to repeat the decoding until all feature regions have been decoded in order.
Optionally, as shown in FIG. 7, the second decoding subunit 40223 is further configured to, after decoding of the first feature region is completed, take the decoded feature of the first feature region and the second feature region as input according to the second attention parameter, and input them into the pre-trained long short-term memory network for decoding.
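The control flow of the decoding unit (sort regions by time-sequence attribute, decode them one at a time, and feed the previous decoding state into the next step) can be sketched as a simple loop. The `step` callable is a hypothetical stand-in for one step of the long short-term memory network, and the string state is purely illustrative.

```python
def decode_regions(regions, step):
    """Decode feature regions one at a time, carrying the previous state forward.

    regions: list of (timing_index, region) pairs, in any order.
    step:    hypothetical stand-in for one LSTM step; takes (prev_state, region)
             and returns (symbol, new_state).
    """
    ordered = sorted(regions, key=lambda pair: pair[0])  # order by timing attribute
    state = "<start>"                                    # input to the first step
    symbols = []
    for _, region in ordered:
        symbol, state = step(state, region)              # previous state + next region
        symbols.append(symbol)
    return symbols, state


def toy_step(prev_state, region):
    """Toy step: the region already holds its symbol; the state records history."""
    return region, f"{prev_state}>{region}"


regions = [(2, "L"), (0, "浙"), (1, "J")]
symbols, final_state = decode_regions(regions, toy_step)
print(symbols, final_state)
```

The loop ends when the regions are exhausted; in the description of FIG. 3 the stopping condition is instead the network emitting <end>.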
Optionally, as shown in FIG. 8, the apparatus further includes:
an up-sampling module 404, configured to up-sample the feature image so that the size of the feature image is the same as the size of the image to be recognized;
a prediction module 405, configured to perform pixel-wise prediction on the up-sampled feature image according to the channels of the feature image, predicting the feature region to which each pixel in the up-sampled feature image belongs;
a labeling module 406, configured to label, according to the time-sequence attribute of the channels, the feature region to which each pixel in the up-sampled feature image belongs, so that the feature region to which each pixel belongs has a time-sequence attribute, thereby obtaining a labeled feature image;
the decoding module 402 is further configured to input the labeled feature image into the preset feature decoding space according to the time-sequence attribute, and to decode the feature regions in the feature decoding space according to the time-sequence attribute through the attention mechanism.
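A minimal sketch of the up-sampling, pixel prediction, and labeling steps follows. It assumes nearest-neighbour up-sampling and assigns each pixel to the channel with the strongest response; the source does not specify either choice, so both are illustrative assumptions, as are the function names.

```python
def upsample_nearest(channel, out_h, out_w):
    """Nearest-neighbour up-sampling of one channel to out_h x out_w."""
    in_h, in_w = len(channel), len(channel[0])
    return [
        [channel[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def label_pixels(feature_image, out_h, out_w):
    """Up-sample every channel, then label each pixel with the index of the
    channel that responds most strongly there (assumed prediction rule)."""
    up = [upsample_nearest(ch, out_h, out_w) for ch in feature_image]
    return [
        [max(range(len(up)), key=lambda k: up[k][r][c]) for c in range(out_w)]
        for r in range(out_h)
    ]


# Three channels of size 1x3; channel k responds most strongly in column k.
feature_image = [
    [[0.9, 0.1, 0.0]],
    [[0.1, 0.8, 0.2]],
    [[0.0, 0.1, 0.7]],
]
labels = label_pixels(feature_image, 2, 6)
for row in labels:
    print(row)
```

Since the label of a pixel is a channel index, and channel indices follow the time-sequence attribute, each labeled region inherits a position in the decoding order.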
It should be noted that the license plate number recognition apparatus provided by the embodiments of the present invention can be applied to devices that need to perform license plate number recognition, such as mobile phones, monitors, computers, and servers.
The license plate number recognition apparatus provided by the embodiments of the present invention can implement each process implemented by the license plate number recognition method in the foregoing method embodiments, and can achieve the same beneficial effects. To avoid repetition, details are not described here again.
Referring to FIG. 9, FIG. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention. As shown in FIG. 9, the electronic device includes: a memory 902, a processor 901, and a computer program stored on the memory 902 and executable on the processor 901, wherein:
the processor 901 is configured to call the computer program stored in the memory 902 and execute the following steps:
inputting the image to be recognized into a preset feature encoding space for correction and encoding to obtain a feature image with multiple channels, wherein the image to be recognized includes license plate information, the feature image includes multiple feature regions corresponding to the multiple channels, and the channels have a time-sequence attribute;
inputting the feature image into a preset feature decoding space according to the time-sequence attribute, and decoding the feature regions in the feature image one after another according to the time-sequence attribute through an attention mechanism in the feature decoding space;
outputting the decoding result according to the time-sequence attribute to obtain the recognition result of the image to be recognized.
Optionally, the preset feature encoding space includes a pre-trained spatial transformation network and a pre-trained encoding network, and the step, executed by the processor 901, of inputting the image to be recognized into the preset feature encoding space for correction and encoding to obtain a feature image with multiple channels includes:
performing correction prediction on the image to be recognized in the pre-trained spatial transformation network, and correcting the image to be recognized according to the prediction result to obtain a corrected image;
inputting the corrected image into the pre-trained encoding network, and performing convolution calculation on the corrected image through multiple convolution kernels in the encoding network to obtain a feature image with multiple channels, wherein the number of channels is the same as the number of convolution kernels, and the time-sequence attribute of the channels is associated with the order in which the convolution kernels are computed.
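The "predict, then correct" step of the spatial transformation network can be pictured as resampling the image under predicted affine parameters. The sketch below assumes the parameters are a 2x3 matrix `theta` acting on pixel coordinates and uses nearest-neighbour sampling; spatial transformer layers in practice typically work in normalized coordinates with bilinear sampling, and `theta` would come from a trained localization network rather than being written by hand.

```python
def warp_affine_nearest(image, theta, fill=0):
    """Resample image: output pixel (x, y) is read from theta applied to (x, y, 1)."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            src_x = round(theta[0][0] * x + theta[0][1] * y + theta[0][2])
            src_y = round(theta[1][0] * x + theta[1][1] * y + theta[1][2])
            if 0 <= src_x < w and 0 <= src_y < h:
                out[y][x] = image[src_y][src_x]
    return out


# A sheared "plate": each row is shifted one pixel further to the right.
skewed = [
    [1, 2, 3, 0, 0],
    [0, 1, 2, 3, 0],
    [0, 0, 1, 2, 3],
]
# Hypothetical predicted parameters that undo the shear: src_x = x + y, src_y = y.
theta = [[1, 1, 0], [0, 1, 0]]
rectified = warp_affine_nearest(skewed, theta)
for row in rectified:
    print(row)
```

After the warp every row starts with the same pattern, which is the kind of alignment the encoding network benefits from.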
Optionally, the preset feature decoding space includes a pre-trained attention mechanism and a pre-trained long short-term memory network, and the step, executed by the processor 901, of inputting the feature image into the preset feature decoding space according to the time-sequence attribute and decoding the feature regions corresponding to the channels according to the time-sequence attribute through the attention mechanism includes:
reporting the time-sequence attribute of each channel to the pre-trained attention mechanism when the feature image is input into the feature decoding space according to the time-sequence attribute;
sorting the feature regions corresponding to the channels according to the time-sequence attribute through the pre-trained attention mechanism, and notifying the pre-trained long short-term memory network, in the sorted order, to decode the feature regions one after another.
Optionally, the step, executed by the processor 901, of notifying the pre-trained long short-term memory network, in the sorted order, to decode the feature regions one after another includes:
outputting a first attention parameter in the sorted order through the pre-trained attention mechanism, and notifying the pre-trained long short-term memory network through the first attention parameter to decode the first feature region;
while the first feature region is being decoded, outputting, by the pre-trained attention mechanism, a second attention parameter in the sorted order, the second attention parameter including the position of the second feature region;
after decoding of the first feature region is completed, notifying the pre-trained long short-term memory network through the second attention parameter to decode the second feature region;
repeating the decoding until all feature regions have been decoded in order.
Optionally, the step, executed by the processor 901, of notifying the pre-trained long short-term memory network through the second attention parameter to decode the second feature region after decoding of the first feature region is completed includes:
after decoding of the first feature region is completed, taking the decoded feature of the first feature region and the second feature region as input according to the second attention parameter, and inputting them into the pre-trained long short-term memory network for decoding.
Optionally, after the image to be recognized is input into the preset feature encoding space for correction and encoding to obtain a feature image with multiple channels, the processor 901 further executes the following steps:
up-sampling the feature image so that the size of the feature image is the same as the size of the image to be recognized;
performing pixel-wise prediction on the up-sampled feature image according to the channels of the feature image, predicting the feature region to which each pixel in the up-sampled feature image belongs;
labeling, according to the time-sequence attribute of the channels, the feature region to which each pixel in the up-sampled feature image belongs, so that the feature region to which each pixel belongs has a time-sequence attribute, thereby obtaining a labeled feature image;
the step, executed by the processor 901, of inputting the feature image into the preset feature decoding space according to the time-sequence attribute and decoding the feature regions corresponding to the channels according to the time-sequence attribute through the attention mechanism includes:
inputting the labeled feature image into the preset feature decoding space according to the time-sequence attribute, and decoding the feature regions in the feature decoding space according to the time-sequence attribute through the attention mechanism.
It should be noted that the above electronic device may be a device that needs to perform license plate number recognition, such as a mobile phone, a monitor, a computer, or a server.
The electronic device provided by the embodiments of the present invention can implement each process implemented by the license plate number recognition method in the foregoing method embodiments, and can achieve the same beneficial effects. To avoid repetition, details are not described here again.
An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, it implements each process of the license plate number recognition method provided by the embodiments of the present invention and can achieve the same technical effects. To avoid repetition, details are not described here again.
A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, and when executed, the program may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The above discloses only preferred embodiments of the present invention, which of course cannot be used to limit the scope of rights of the present invention. Equivalent changes made according to the claims of the present invention therefore still fall within the scope covered by the present invention.

Claims (10)

  1. A license plate number recognition method, characterized in that it comprises the following steps:
    inputting an image to be recognized into a preset feature encoding space for correction and encoding to obtain a feature image with multiple channels, wherein the image to be recognized includes license plate information, the feature image includes multiple feature regions corresponding to the multiple channels, and the channels have a time-sequence attribute;
    inputting the feature image into a preset feature decoding space according to the time-sequence attribute, and decoding the feature regions in the feature image one after another according to the time-sequence attribute through an attention mechanism in the feature decoding space;
    outputting the decoding result according to the time-sequence attribute to obtain the recognition result of the image to be recognized.
  2. The method according to claim 1, characterized in that the preset feature encoding space includes a pre-trained spatial transformation network and a pre-trained encoding network, and the inputting of the image to be recognized into the preset feature encoding space for correction and encoding to obtain a feature image with multiple channels comprises:
    performing correction prediction on the image to be recognized in the pre-trained spatial transformation network, and correcting the image to be recognized according to the prediction result to obtain a corrected image;
    performing convolution calculation on the corrected image through multiple convolution kernels in the encoding network to obtain a feature image with multiple channels, wherein the number of channels is the same as the number of convolution kernels, and the time-sequence attribute of the channels is associated with the order in which the convolution kernels are computed.
  3. The method according to claim 1, characterized in that the preset feature decoding space includes a pre-trained attention mechanism and a pre-trained long short-term memory network, and the inputting of the feature image into the preset feature decoding space according to the time-sequence attribute and decoding of the feature regions corresponding to the channels according to the time-sequence attribute through the attention mechanism comprises:
    reporting the time-sequence attribute of each channel to the pre-trained attention mechanism when the feature image is input into the feature decoding space according to the time-sequence attribute;
    sorting the feature regions corresponding to the channels according to the time-sequence attribute through the pre-trained attention mechanism, and notifying the pre-trained long short-term memory network, in the sorted order, to decode the feature regions one after another.
  4. The method according to claim 3, characterized in that the notifying of the pre-trained long short-term memory network, in the sorted order, to decode the feature regions one after another comprises:
    outputting a first attention parameter in the sorted order through the pre-trained attention mechanism, and notifying the pre-trained long short-term memory network through the first attention parameter to decode the first feature region;
    while the first feature region is being decoded, outputting, by the pre-trained attention mechanism, a second attention parameter in the sorted order, the second attention parameter including the position of the second feature region;
    after decoding of the first feature region is completed, notifying the pre-trained long short-term memory network through the second attention parameter to decode the second feature region;
    repeating the decoding until all feature regions have been decoded in order.
  5. The method according to claim 4, characterized in that the notifying of the pre-trained long short-term memory network through the second attention parameter to decode the second feature region after decoding of the first feature region is completed comprises:
    after decoding of the first feature region is completed, taking the decoded feature of the first feature region and the second feature region as input according to the second attention parameter, and inputting them into the pre-trained long short-term memory network for decoding.
  6. The method according to claim 1, characterized in that, after the inputting of the image to be recognized into the preset feature encoding space for correction and encoding to obtain a feature image with multiple channels, the method further comprises:
    up-sampling the feature image so that the size of the feature image is the same as the size of the image to be recognized;
    performing pixel-wise prediction on the up-sampled feature image according to the channels of the feature image, predicting the feature region to which each pixel in the up-sampled feature image belongs;
    labeling, according to the time-sequence attribute of the channels, the feature region to which each pixel in the up-sampled feature image belongs, so that the feature region to which each pixel belongs has a time-sequence attribute, thereby obtaining a labeled feature image;
    the inputting of the feature image into the preset feature decoding space according to the time-sequence attribute and decoding of the feature regions corresponding to the channels according to the time-sequence attribute through the attention mechanism comprises:
    inputting the labeled feature image into the preset feature decoding space according to the time-sequence attribute, and decoding the feature regions in the labeled feature image one after another according to the time-sequence attribute through the attention mechanism in the feature decoding space.
  7. A license plate number recognition apparatus, characterized in that the apparatus comprises:
    an encoding module, configured to input an image to be recognized into a preset feature encoding space for correction and encoding to obtain a feature image with multiple channels, wherein the image to be recognized includes license plate information, the feature image includes multiple feature regions corresponding to the multiple channels, and the channels have a time-sequence attribute;
    a decoding module, configured to input the feature image into a preset feature decoding space according to the time-sequence attribute, and to decode the feature regions in the feature image one after another according to the time-sequence attribute through an attention mechanism in the feature decoding space;
    an output module, configured to output the decoding result according to the time-sequence attribute to obtain the recognition result of the image to be recognized.
  8. The apparatus according to claim 7, characterized in that the preset feature encoding space includes a pre-trained spatial transformation network and a pre-trained encoding network, and the encoding module comprises:
    a correction unit, configured to perform correction prediction on the image to be recognized in the pre-trained spatial transformation network, and to correct the image to be recognized according to the prediction result to obtain a corrected image;
    an encoding unit, configured to input the corrected image into the pre-trained encoding network, and to perform convolution calculation on the corrected image through multiple convolution kernels in the encoding network to obtain a feature image with multiple channels, wherein the number of channels is the same as the number of convolution kernels, and the time-sequence attribute of the channels is associated with the order in which the convolution kernels are computed.
  9. An electronic device, characterized in that it comprises: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the license plate number recognition method according to any one of claims 1 to 6.
  10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the license plate number recognition method according to any one of claims 1 to 6.
PCT/CN2020/108989 2019-12-31 2020-08-13 License plate number recognition method and apparatus, electronic device, and storage medium WO2021135254A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911425285.2A CN111191663B (en) 2019-12-31 2019-12-31 License plate number recognition method and device, electronic equipment and storage medium
CN201911425285.2 2019-12-31

Publications (1)

Publication Number Publication Date
WO2021135254A1 (en) 2021-07-08

Family

ID=70709766

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/108989 WO2021135254A1 (en) 2019-12-31 2020-08-13 License plate number recognition method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN111191663B (en)
WO (1) WO2021135254A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113627349A (en) * 2021-08-12 2021-11-09 南京信息工程大学 Dynamic facial expression recognition method based on self-attention transformation network
CN114612979A (en) * 2022-03-09 2022-06-10 平安科技(深圳)有限公司 Living body detection method and device, electronic equipment and storage medium
CN115546768A (en) * 2022-12-01 2022-12-30 四川蜀道新能源科技发展有限公司 Pavement marking identification method and system based on multi-scale mechanism and attention mechanism
CN115661807A (en) * 2022-12-28 2023-01-31 成都西物信安智能系统有限公司 Method for acquiring license plate information
CN116824116A (en) * 2023-06-26 2023-09-29 爱尔眼科医院集团股份有限公司 Super wide angle fundus image identification method, device, equipment and storage medium

Families Citing this family (9)

Publication number Priority date Publication date Assignee Title
CN111191663B (en) * 2019-12-31 2022-01-11 深圳云天励飞技术股份有限公司 License plate number recognition method and device, electronic equipment and storage medium
CN111860682B (en) * 2020-07-30 2024-06-14 上海高德威智能交通系统有限公司 Sequence recognition method, sequence recognition device, image processing apparatus, and storage medium
CN112149661B (en) * 2020-08-07 2024-06-21 珠海欧比特宇航科技股份有限公司 License plate recognition method, license plate recognition device and license plate recognition medium
CN112215229B (en) * 2020-08-27 2023-07-18 北京英泰智科技股份有限公司 License plate recognition method and device based on lightweight network end-to-end
CN112215224A (en) * 2020-10-22 2021-01-12 深圳市平方科技股份有限公司 Deep learning-based trailer number identification method and device
CN112508018A (en) * 2020-12-14 2021-03-16 北京澎思科技有限公司 License plate recognition method and device and storage medium
CN112633264B (en) * 2021-03-11 2021-06-15 深圳市安软科技股份有限公司 Vehicle attribute identification method and device, electronic equipment and storage medium
CN113159204A (en) * 2021-04-28 2021-07-23 深圳市捷顺科技实业股份有限公司 License plate recognition model generation method, license plate recognition method and related components
CN113554030B (en) * 2021-07-27 2022-08-16 上海大学 Multi-type license plate recognition method and system based on single character attention

Citations (6)

Publication number Priority date Publication date Assignee Title
CN108388896A (en) * 2018-02-09 2018-08-10 杭州雄迈集成电路技术有限公司 License plate recognition method based on dynamic time-sequence convolutional neural network
CN109784340A (en) * 2018-12-14 2019-05-21 北京市首都公路发展集团有限公司 License plate recognition method and device
CN109948604A (en) * 2019-02-01 2019-06-28 北京捷通华声科技股份有限公司 Method, device, electronic equipment and storage medium for recognizing irregularly arranged text
CN110414451A (en) * 2019-07-31 2019-11-05 深圳市捷顺科技实业股份有限公司 End-to-end license plate recognition method, device, equipment and storage medium
CN110427938A (en) * 2019-07-26 2019-11-08 中科视语(北京)科技有限公司 Irregular character recognition device and method based on deep learning
CN111191663A (en) * 2019-12-31 2020-05-22 深圳云天励飞技术有限公司 License plate number recognition method and device, electronic equipment and storage medium

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
CN109784280A (en) * 2019-01-18 2019-05-21 江南大学 Human behavior recognition method based on Bi-LSTM-Attention model
CN110070085B (en) * 2019-04-30 2021-11-02 北京百度网讯科技有限公司 License plate recognition method and device

Patent Citations (6)

Publication number Priority date Publication date Assignee Title
CN108388896A (en) * 2018-02-09 2018-08-10 杭州雄迈集成电路技术有限公司 License plate recognition method based on dynamic time-sequence convolutional neural network
CN109784340A (en) * 2018-12-14 2019-05-21 北京市首都公路发展集团有限公司 License plate recognition method and device
CN109948604A (en) * 2019-02-01 2019-06-28 北京捷通华声科技股份有限公司 Method, device, electronic equipment and storage medium for recognizing irregularly arranged text
CN110427938A (en) * 2019-07-26 2019-11-08 中科视语(北京)科技有限公司 Irregular character recognition device and method based on deep learning
CN110414451A (en) * 2019-07-31 2019-11-05 深圳市捷顺科技实业股份有限公司 End-to-end license plate recognition method, device, equipment and storage medium
CN111191663A (en) * 2019-12-31 2020-05-22 深圳云天励飞技术有限公司 License plate number recognition method and device, electronic equipment and storage medium

Cited By (9)

Publication number Priority date Publication date Assignee Title
CN113627349A (en) * 2021-08-12 2021-11-09 南京信息工程大学 Dynamic facial expression recognition method based on self-attention transformation network
CN113627349B (en) * 2021-08-12 2023-12-05 南京信息工程大学 Dynamic facial expression recognition method based on self-attention transformation network
CN114612979A (en) * 2022-03-09 2022-06-10 平安科技(深圳)有限公司 Living body detection method and device, electronic equipment and storage medium
CN114612979B (en) * 2022-03-09 2024-05-31 平安科技(深圳)有限公司 Living body detection method and device, electronic equipment and storage medium
CN115546768A (en) * 2022-12-01 2022-12-30 四川蜀道新能源科技发展有限公司 Pavement marking identification method and system based on multi-scale mechanism and attention mechanism
CN115546768B (en) * 2022-12-01 2023-04-07 四川蜀道新能源科技发展有限公司 Pavement marking identification method and system based on multi-scale mechanism and attention mechanism
CN115661807A (en) * 2022-12-28 2023-01-31 成都西物信安智能系统有限公司 Method for acquiring license plate information
CN115661807B (en) * 2022-12-28 2023-04-07 成都西物信安智能系统有限公司 Method for acquiring license plate information
CN116824116A (en) * 2023-06-26 2023-09-29 爱尔眼科医院集团股份有限公司 Super wide angle fundus image identification method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN111191663B (en) 2022-01-11
CN111191663A (en) 2020-05-22

Similar Documents

Publication Publication Date Title
WO2021135254A1 (en) License plate number recognition method and apparatus, electronic device, and storage medium
US11475660B2 (en) Method and system for facilitating recognition of vehicle parts based on a neural network
US11256960B2 (en) Panoptic segmentation
Kümmerer et al. DeepGaze II: Reading fixations from deep features trained on object recognition
Sameen et al. Classification of very high resolution aerial photos using spectral‐spatial convolutional neural networks
US20220277549A1 (en) Generative Adversarial Networks for Image Segmentation
CN111222513B (en) License plate number recognition method and device, electronic equipment and storage medium
AU2021354030B2 (en) Processing images using self-attention based neural networks
CN111598182A (en) Method, apparatus, device and medium for training neural network and image recognition
CN115496928B (en) Multi-modal image feature matching method based on multi-feature matching
CN112634296A (en) RGB-D image semantic segmentation method and terminal for guiding edge information distillation through gate mechanism
CN113255659A (en) License plate correction detection and identification method based on MSAFF-YOLOv3
CN112215190A (en) Illegal building detection method based on YOLOV4 model
US20220335572A1 (en) Semantically accurate super-resolution generative adversarial networks
CN115937022A (en) Few-sample image restoration method based on iterative residual error learning
CN109492610A (en) Pedestrian re-identification method, device and readable storage medium
CN111104941B (en) Image direction correction method and device and electronic equipment
CN112149526A (en) Lane line detection method and system based on long-distance information fusion
CN116863384A (en) CNN-Transformer-based self-supervision video segmentation method and system
CN115100552A (en) Unmanned aerial vehicle remote sensing image real-time semantic segmentation method, medium and equipment
Wu et al. STR transformer: a cross-domain transformer for scene text recognition
CN117115014A (en) Blurred image recovery method and device and electronic equipment
CN116168394A (en) Image text recognition method and device
CN113379001B (en) Processing method and device for image recognition model
CN115170414A (en) Knowledge distillation-based single image rain removing method and system

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20908643; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
32PN Ep: public notification in the EP bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 23.11.2022))
122 Ep: PCT application non-entry in European phase (Ref document number: 20908643; Country of ref document: EP; Kind code of ref document: A1)