CN111368645A - Method and device for identifying multi-label license plate, electronic equipment and readable medium - Google Patents

Method and device for identifying multi-label license plate, electronic equipment and readable medium

Info

Publication number
CN111368645A
CN111368645A
Authority
CN
China
Prior art keywords
license plate
image
network
label
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010093786.1A
Other languages
Chinese (zh)
Inventor
苏伟博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Pengsi Technology Co ltd
Original Assignee
Beijing Pengsi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Pengsi Technology Co ltd filed Critical Beijing Pengsi Technology Co ltd
Priority to CN202010093786.1A priority Critical patent/CN111368645A/en
Publication of CN111368645A publication Critical patent/CN111368645A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/50: Context or environment of the image
    • G06V20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/54: Surveillance or monitoring of activities, e.g. for recognising suspicious objects of traffic, e.g. cars on the road, trains or boats
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/60: Type of objects
    • G06V20/62: Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/625: License plates

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

Embodiments of the present disclosure provide a method, an apparatus, an electronic device and a readable medium for recognizing a multi-label license plate. One embodiment of the method comprises: acquiring an image to be recognized; detecting the image to be recognized; in response to detecting that the image to be recognized includes a vehicle image, extracting a license plate image from the vehicle image; performing image processing on the license plate image; inputting the processed license plate image into a pre-trained multi-label license plate recognition convolutional network, which includes a spatial attention network, to obtain license plate image features; and outputting the license plate number based on the license plate image features. This implementation realizes spatial attention over the character positions, which improves the overall recognition performance of the multi-label license plate recognition convolutional network model.

Description

Method and device for identifying multi-label license plate, electronic equipment and readable medium
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and in particular to a method, an apparatus, an electronic device and a readable medium for recognizing a multi-label license plate.
Background
License plate recognition is a technology that automatically captures and recognizes the license plate numbers of moving vehicles against complex backgrounds. It is widely used in systems such as expressway vehicle management, Electronic Toll Collection (ETC) and parking lot management. Traditional license plate recognition suffers a marked drop in accuracy under complex conditions, so a recognition method with higher accuracy and simpler computation is needed.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Some embodiments of the present disclosure provide a method, an apparatus, an electronic device and a readable medium for recognizing a multi-label license plate, to solve the technical problems mentioned in the background above.
In a first aspect, some embodiments of the present disclosure provide a method for recognizing a multi-label license plate, the method comprising: acquiring an image to be recognized; detecting the image to be recognized; in response to detecting that the image to be recognized includes a vehicle image, extracting a license plate image from the vehicle image; performing image processing on the license plate image; inputting the processed license plate image into a pre-trained multi-label license plate recognition convolutional network, which includes a spatial attention network, to obtain license plate image features; and outputting the license plate number based on the license plate image features.
In a second aspect, some embodiments of the present disclosure provide an apparatus for recognizing a multi-label license plate, the apparatus comprising: an acquisition unit configured to acquire an image to be recognized; a detection unit configured to detect the image to be recognized; an extraction unit configured to extract a license plate image from the vehicle image in response to detecting that the image to be recognized includes a vehicle image; an image processing unit configured to perform image processing on the license plate image; an input unit configured to input the processed license plate image into a pre-trained multi-label license plate recognition convolutional network, which includes a spatial attention network, to obtain license plate image features; and an output unit configured to output the license plate number based on the license plate image features.
In a third aspect, some embodiments of the present disclosure also provide an electronic device, including: one or more processors; storage means for storing one or more programs; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the methods as in the embodiments described above.
In a fourth aspect, some embodiments of the disclosure also provide a computer readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to the above embodiments.
At least some of the above embodiments of the present disclosure have the following beneficial effects: the large number of acquired training images prevents under-fitting in the network training result, indirectly improving model accuracy and reducing loss. Detecting the vehicle in the acquired image, extracting the license plate photo, and processing it into a standard picture that matches the training network's input format reduces unnecessary computation for the training network, focuses attention on the license plate number information in the plate, and indirectly improves the recognition accuracy of the network model. The processed license plate image is then input into the improved multi-label license plate recognition network, which outputs the license plate number with higher recognition accuracy.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale.
The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
FIG. 1 is a schematic diagram of one application scenario in which some embodiments of the present disclosure may be applied to a method for identifying a multi-tag license plate;
FIG. 2 is a flow diagram of a method for identifying a multi-tag license plate in accordance with some embodiments of the present disclosure;
FIG. 3 is a schematic diagram of a backbone convolutional network structure, according to some embodiments of the present disclosure;
FIG. 4 is a schematic diagram of a branched convolutional network structure, according to some embodiments of the present disclosure;
FIG. 5 is a schematic diagram of a spatial attention network structure according to some embodiments of the present disclosure;
FIG. 6 is a schematic diagram of an apparatus for identifying a multi-tag license plate according to some embodiments of the present disclosure;
FIG. 7 is a schematic structural diagram of an electronic device suitable for use in implementing some embodiments of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings. The embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that the modifiers "a", "an" and "the" in this disclosure are illustrative rather than limiting; those skilled in the art will understand them to mean "one or more" unless the context clearly dictates otherwise.
In the embodiments of the present disclosure, the vehicle described may be a four-wheeled motor vehicle, or a vehicle type such as a motorcycle, tricycle, bicycle or power-assisted bicycle.
In embodiments of the present disclosure, those skilled in the art will understand the functions performed by the various functional layers (modules) of a neural network. A convolutional layer performs convolution operations to extract feature information from an input image (e.g., of size 227×227), producing a feature map (e.g., of size 13×13). A pooling layer performs a pooling operation on its input, such as max-pooling or mean-pooling. An activation layer introduces non-linearity through an activation function, such as a rectified linear unit (ReLU, Leaky-ReLU, P-ReLU, R-ReLU), a Sigmoid function, or a hyperbolic tangent (tanh) function. A dropout layer alleviates the overfitting problem; its rate may be set to, e.g., 0.4 or 0.5. A fully connected layer converts the feature map output by the convolutional layers into a one-dimensional vector (e.g., of size 1×6).
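As a concrete illustration of the layer operations just listed, the following is a minimal NumPy sketch (not the patent's implementation) of ReLU activation, 2×2 max-pooling, dropout, and a fully connected projection:

```python
import numpy as np

def relu(x):
    # Activation layer: introduces non-linearity by zeroing negative values.
    return np.maximum(x, 0.0)

def max_pool2x2(x):
    # Pooling layer: 2x2 max-pooling with stride 2 on an (H, W) map.
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def dropout(x, rate, rng):
    # Dropout layer: randomly zeroes activations during training to curb
    # overfitting; survivors are rescaled by 1 / (1 - rate).
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

def fully_connected(feature_map, weights, bias):
    # Fully connected layer: flattens the feature map into a 1-D vector,
    # then projects it to the output dimension.
    v = feature_map.reshape(-1)
    return weights @ v + bias
```

A trained network would chain these per-layer operations with learned weights; here they only demonstrate each layer's effect in isolation.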
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
As shown in fig. 1, the schematic diagram 100 includes a collection device 101, a server 102, and a terminal device 103.
When the collection device 101 detects a passing vehicle, an electronic device with image-capture capability, such as a camera, a mobile phone or a video camera, can be started to capture and store license plate photos. Vehicle detection may include, but is not limited to, at least one of: ultrasonic detection, video detection, infrared detection, induction-coil detection and acoustic detection. For example, when a vehicle enters a parking lot, the collection device 101 detects the approaching vehicle via an induction coil and turns on a camera or video camera to capture the vehicle driving into the parking lot.
The collection device 101 sends the captured vehicle photograph to the server 102. The server 102 performs license plate detection on the vehicle picture, then inputs it into the multi-label license plate recognition convolutional network, which outputs the license plate number information. The server 102 may be a server providing various services, such as a background server that provides analysis support for information displayed on the terminal device 103, and can feed the processing result (e.g., the license plate number information) back to the terminal device.
The server may be hardware or software. As hardware, it can be implemented as a distributed cluster of multiple servers, as a single server, or as one or more virtualized instances provisioned from cloud computing resources. As software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
In response to the license plate number sent by the server 102, the terminal device 103 displays the corresponding license plate number. The terminal device 103 may be a variety of electronic devices having a display screen including, but not limited to, smart phones, tablets, e-book readers, MP4 players, laptop and desktop computers, outdoor display screens, and the like.
It should be noted that the method for identifying a multi-tag license plate provided by the embodiments of the present disclosure is generally performed by the server 102. Accordingly, a means for identifying multi-labeled license plates may be provided in server 102.
It should be understood that the number of terminal devices, collection devices, and servers in fig. 1 are merely illustrative. There may be any number of terminal devices, collection devices, and servers, as desired for implementation.
With continued reference to fig. 2, a flow 200 of some embodiments of a method for identifying a multi-labeled license plate is shown. The method for identifying the multi-label license plate comprises the following steps:
step 201, acquiring an image to be identified.
In some embodiments, an execution subject of the multi-label license plate recognition method (e.g., the server 102 shown in Fig. 1) may obtain the license plate photo to be recognized from a collection terminal (e.g., the collection device 101 shown in Fig. 1) through a wired or wireless connection. Wireless connections may include, but are not limited to, 3G/4G/5G WWAN, Wi-Fi, Bluetooth, WiMAX, ZigBee, UWB (ultra-wideband), and other wireless connection types known now or developed in the future.
In some optional implementations of some embodiments, the number of images acquired largely determines the robustness of the trained network: the more images collected, the less the model under-fits and the higher the accuracy of the output license plate number.
In embodiments of the present disclosure, images may be acquired in various ways, for example in a static mode or a video mode. In the static mode, a passing vehicle triggers a ground induction coil, an infrared device, a radar device or the like, which sends a trigger signal to the camera; on receiving the signal, the camera captures an image. In the video mode, no external trigger is needed: the camera records the video stream, including license plate images of vehicles, in real time.
Step 202, detecting the image to be identified.
When capturing photos, the captured image may turn out to be a background image. For example, in intelligent parking lot management, when the barrier gate senses through its induction coil that a vehicle is about to pass, the coil sends the system a trigger signal and the camera is turned on to photograph the approaching vehicle. If no vehicle is actually passing at that moment, the captured image is only background. Since only images containing a license plate carry useful information, the images to be recognized can be screened with a vehicle detection technique.
In some optional implementations of some embodiments, the vehicle detection technique includes: converting the image to be recognized to grayscale; denoising the grayscale image with Gaussian filtering; locating vertical edges in the denoised image; thresholding and extracting the located vertical edges; connecting the extracted vertical edges to determine connected regions; and determining from the connected regions whether a vehicle image is present.
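The screening steps above can be sketched end-to-end in NumPy. This is an illustrative assumption-laden sketch, not the patent's implementation: the edge threshold and the minimum-edge-pixel count (a crude stand-in for full connected-region analysis) are invented values.

```python
import numpy as np

def detect_vehicle_regions(rgb, edge_thresh=50.0, min_edge_pixels=200):
    """Sketch of the pipeline: grayscale -> Gaussian denoise ->
    vertical edges -> threshold -> crude region test."""
    # Grayscale conversion with standard luminance weights.
    gray = rgb[..., 0] * 0.299 + rgb[..., 1] * 0.587 + rgb[..., 2] * 0.114
    # 3x3 Gaussian smoothing via two separable [1, 2, 1] / 4 passes.
    k = np.array([1.0, 2.0, 1.0]) / 4.0
    smooth = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, gray)
    smooth = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, smooth)
    # Vertical edges: absolute horizontal intensity gradient.
    edges = np.abs(np.diff(smooth, axis=1))
    # Thresholding to extract strong vertical edges.
    mask = edges > edge_thresh
    # Crude stand-in for connected-region analysis: enough strong edge
    # pixels in the frame suggests a vehicle rather than pure background.
    return mask.sum() >= min_edge_pixels, mask
```

A flat background image yields almost no strong vertical edges and is rejected, while a frame with high-contrast vertical structure passes on to plate extraction.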
And step 203, extracting a license plate image from the vehicle image obtained in step 202.
Extracting the license plate image removes information that is useless to the multi-label license plate recognition convolutional network.
As an example, the license plate extraction technique finds rectangular connected regions, and rectangles of a certain proportion are taken as candidate license plate regions (since nationally specified license plates all have fixed proportions, finding rectangles of the same proportion locates the rectangle containing the plate). A classification algorithm trained with positive and negative samples, such as a Support Vector Machine (SVM), Logistic Regression (LR) or Softmax, is then applied to the candidate rectangular regions of various sizes to classify whether each region is a license plate region, and the license plate image is extracted.
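The fixed-proportion screen described here can be sketched as a simple aspect-ratio filter. A standard mainland-China plate is roughly 440 mm by 140 mm (ratio about 3.14); the tolerance band below is an assumed illustrative value, and a trained classifier would still be applied to the surviving candidates:

```python
# Width-to-height ratio of a standard Chinese license plate (~440 mm x 140 mm).
PLATE_RATIO = 440.0 / 140.0

def plate_candidates(rects, tol=0.35):
    """rects: iterable of (x, y, w, h) bounding boxes of connected regions.
    Returns the boxes whose width/height ratio is close to a plate's."""
    out = []
    for x, y, w, h in rects:
        if h == 0:
            continue  # degenerate region, skip
        ratio = w / h
        # Keep rectangles within the relative tolerance of the plate ratio.
        if abs(ratio - PLATE_RATIO) / PLATE_RATIO <= tol:
            out.append((x, y, w, h))
    return out
```

For example, a 314x100 box survives the screen while square or screen-shaped boxes are discarded before classification.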
And step 204, processing the license plate image obtained in the step 203.
The resolution of the images determines, to some extent, the computational load of network training. The initial license plate image may be scaled to a fixed resolution, such as 40×128. The scaled license plate image is then input into the backbone convolutional network of the pre-trained multi-label license plate recognition convolutional network.
In some optional implementations of some embodiments, the Open Source Computer Vision Library (OpenCV) may be utilized for the detection and processing of the license plate to be recognized.
For example, the license plate position may be located in the vehicle image by color-based segmentation (color-edge algorithms, color-distance and similarity algorithms, etc.), texture-based segmentation (using texture features along the horizontal direction of the license plate region, such as wavelet texture or horizontal gradient difference texture), edge-detection-based segmentation, mathematical-morphology-based segmentation, and so on.
For example, to speed up license plate detection, the vehicle image may be converted to grayscale, and the image of the located license plate region may be binarized.
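A minimal NumPy sketch of this preprocessing, assuming the 40×128 input size mentioned above and a fixed binarization threshold (a real pipeline might use OpenCV and an adaptive threshold instead):

```python
import numpy as np

def preprocess_plate(rgb, out_h=40, out_w=128, thresh=128):
    """Grayscale conversion, nearest-neighbour resize to the fixed
    40x128 network input, and optional binarization."""
    # Luminance-weighted grayscale conversion.
    gray = rgb[..., 0] * 0.299 + rgb[..., 1] * 0.587 + rgb[..., 2] * 0.114
    h, w = gray.shape
    # Nearest-neighbour resampling grid.
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    resized = gray[rows][:, cols]
    # Fixed-threshold binarization of the located plate region.
    binary = (resized >= thresh).astype(np.float32)
    return resized, binary
```

The resized grayscale image is what would be fed to the backbone network; the binary image is useful for faster classical plate detection.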
And step 205, inputting the processed license plate image into a pre-trained multi-label license plate recognition convolution network to obtain the license plate image characteristics.
In some optional implementations of some embodiments, the multi-tag license plate recognition convolutional network includes a backbone convolutional network and a branch convolutional network.
The backbone convolutional network extracts high-order feature vectors from the license plate image. The branch convolutional network contains as many identical sub-network structures as there are characters in the license plate, and each sub-network recognizes its corresponding character.
For example, domestic license plates typically contain 7 characters, and the branched convolutional network includes 7 branches having the same sub-network structure.
In some optional implementations of some embodiments, each sub-network structure includes a spatial attention network and a fully connected (FC) layer connected to its output. The spatial attention network enhances and suppresses features across channels, and also enhances and suppresses features of the region at the corresponding character position. The fully connected layer classifies each character (e.g., via Softmax).
As an example, the number of classification results of the fully connected layer (i.e., the number of nodes in its output layer) corresponds to the number of possible values of each character in the license plate. For a Chinese license plate there are 65 label classes: 31 province-abbreviation Chinese characters, 24 English letters and 10 digits.
In some embodiments, the fully connected layer is a single layer.
In other embodiments, the fully connected layer comprises multiple layers.
And step 206, outputting the license plate number based on the license plate image characteristics obtained in the step 205.
The corresponding license plate number is finally obtained and output according to the license plate image features produced by the multi-label license plate recognition network.
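The final decoding step can be sketched as follows. The disclosure only states the class counts (31 province characters, 24 letters, 10 digits); the concrete 65-character label set and its ordering below are assumptions for illustration:

```python
import numpy as np

# Hypothetical 65-class label set for a mainland-China plate: 31 province
# abbreviations, 24 letters (I and O excluded), and 10 digits. The exact
# ordering is an assumption for illustration.
PROVINCES = list("京津冀晋蒙辽吉黑沪苏浙皖闽赣鲁豫鄂湘粤桂琼渝川贵云藏陕甘青宁新")
LETTERS = list("ABCDEFGHJKLMNPQRSTUVWXYZ")  # 24 letters
DIGITS = list("0123456789")
CHARSET = PROVINCES + LETTERS + DIGITS       # 65 classes in total

def decode_plate(branch_logits):
    """branch_logits: array of shape (7, 65), one row of class scores per
    branch sub-network. Arg-max each row and map it to its character."""
    idx = np.argmax(branch_logits, axis=1)
    return "".join(CHARSET[i] for i in idx)
```

Each branch contributes one character, so the seven arg-max results concatenate into the whole plate number.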
With continued reference to FIG. 3, FIG. 3 illustrates a structure 300 of a backbone neural network in a multi-tag license plate recognition convolutional network.
In embodiments of the present disclosure, the backbone neural network includes a plurality of convolution blocks, whose convolution parameters (kernel size, stride, padding) may be the same or different. Each convolution block may contain one or more convolutional layers.
Further, a pooling layer is connected to the output of at least one of the convolution blocks.
In some embodiments of the present disclosure, the output of each convolution block is connected to a corresponding pooling layer.
In at least one embodiment of the present disclosure, a specific implementation of the backbone network comprises three convolution blocks, the output of each connected to a corresponding pooling layer.
For example, the license plate photo processed in step 204 is input into the first convolution block of the pre-trained backbone convolutional network of the multi-label license plate recognition convolutional network, and the first convolution block outputs 32 feature maps 301 of size 40×128.
The first convolution block of the backbone performs convolution with 32 kernels of size 3×3, a stride of 1 and padding of 1.
The output of the first convolution block is input to the first pooling layer, which outputs 32 feature maps of size 20×64.
The first pooling layer of the backbone has a window size of 3×3 and a stride of 2.
The output of the first pooling layer is input to the second convolution block, which outputs 64 feature maps 302 of size 20×64.
The second convolution block of the backbone performs convolution with 64 kernels of size 3×3, a stride of 1 and padding of 1.
The output of the second convolution block is input to the second pooling layer, which outputs 64 feature maps of size 10×32.
The second pooling layer of the backbone has a window size of 3×3 and a stride of 2.
The output of the second pooling layer is input to the third convolution block, which outputs 128 feature maps 303 of size 10×32.
The third convolution block of the backbone is a stack of three identical convolutional layers, each with 128 kernels of size 3×3, a stride of 1 and padding of 1.
The output of the third convolution block is input to the third pooling layer, which outputs 128 feature maps of size 5×16.
The third pooling layer of the backbone has a window size of 3×3 and a stride of 2.
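The feature-map sizes quoted for the three stages can be checked with a small shape tracer. Note one assumption: the text does not state the pooling padding, but the reported sizes (e.g., 40 pooled to 20 with a 3×3 window and stride 2) require an effective padding of 1, which is used here:

```python
def conv_out(size, kernel, stride, pad):
    # Standard output-size formula for convolution/pooling.
    return (size + 2 * pad - kernel) // stride + 1

def trace_backbone(h=40, w=128):
    """Trace feature-map shapes through the three conv blocks and pooling
    layers of Fig. 3. Conv: 3x3, stride 1, pad 1 (size-preserving).
    Pool: 3x3 window, stride 2, with an assumed pad of 1."""
    shapes = []
    for channels in (32, 64, 128):
        h = conv_out(h, 3, 1, 1)   # convolution block keeps spatial size
        w = conv_out(w, 3, 1, 1)
        shapes.append((channels, h, w))
        h = conv_out(h, 3, 2, 1)   # pooling roughly halves spatial size
        w = conv_out(w, 3, 2, 1)
        shapes.append((channels, h, w))
    return shapes
```

Running the tracer on a 40×128 input reproduces the sequence 32@40×128, 32@20×64, 64@20×64, 64@10×32, 128@10×32, 128@5×16 stated above.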
It should be readily understood that the feature-map resolutions above are merely examples. For input license plate images of other resolutions, the feature maps output by the convolution and pooling operations have the correspondingly computed sizes.
Referring to fig. 4, fig. 4 illustrates a structure 400 of a branched neural network in a multi-tag license plate recognition convolutional network.
In some embodiments, the output of the backbone convolutional network is input to the branched neural network structure described above.
Taking a Chinese license plate as an example, the branch network consists of seven sub-networks with the same structure: a first branch sub-network 401, a second branch sub-network 402, a third branch sub-network 403, a fourth branch sub-network 404, a fifth branch sub-network 405, a sixth branch sub-network 406 and a seventh branch sub-network 407. Each branch sub-network corresponds to one character of the license plate: the first branch corresponds to label 1, the second to label 2, the third to label 3, the fourth to label 4, the fifth to label 5, the sixth to label 6, and the seventh to label 7. Each sub-network structure comprises a spatial attention network connected to one or more fully connected layers, which integrate the locally class-discriminative information from the convolution blocks or pooling layers, and finally outputs scores for the 65 character classes. The character output by each branch represents one character of the license plate, and the outputs of the seven branches together represent the whole license plate number.
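Structurally, the seven branches can be sketched as identical callables applied to the shared backbone features. The weights here are random placeholders rather than trained values, and the spatial attention step is reduced to a simple per-position mask:

```python
import numpy as np

def make_branch(rng, feat_shape, n_classes=65):
    """One branch sub-network, reduced to its two pieces: a (stubbed)
    spatial-attention mask and a fully connected classifier."""
    mask = rng.random(feat_shape)                       # stand-in for the attention map
    w = rng.standard_normal((n_classes, int(np.prod(feat_shape)))) * 0.01
    b = np.zeros(n_classes)
    def branch(features):
        attended = features * mask                      # per-position re-weighting
        return w @ attended.reshape(-1) + b             # class scores for this label
    return branch

rng = np.random.default_rng(0)
feat = rng.random((128, 5, 16))                         # backbone output shape
branches = [make_branch(rng, feat.shape) for _ in range(7)]  # one branch per character
logits = np.stack([br(feat) for br in branches])        # shape (7, 65)
```

All seven branches read the same backbone features; only their learned masks and classifier weights differ, which is what lets each branch attend to its own character position.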
Referring to fig. 5, fig. 5 shows a block diagram 500 of a spatial attention network.
In some embodiments, the output of the backbone convolutional network is input into the spatial attention network and output to the fully-connected layers of all branches, respectively.
In one embodiment of the present disclosure, the spatial attention network includes a first convolution block, a second block forming the upper branch, a third block forming the lower branch, and a fourth layer performing element-wise multiplication.
For example, the upper branch block includes: a first pooling layer, a second convolution block, a third fully connected layer, a fourth deconvolution block and a readjustment (reshape) layer.
In fig. 5, the output of the backbone convolutional network is input to the first convolutional block of the spatial attention network, as indicated by reference numeral 501.
Wherein, the number of convolution kernels of the first convolution block of the spatial attention network is 128, the convolution kernel size is 1 × 1, the step size is 1, and the padding is 0. The convolution outputs 128 10 × 32 feature maps.
In fig. 5, the feature map output by the first convolution block of the spatial attention network is input to the first pooling layer of the upper branch, as indicated by reference numeral 502.
The first pooling layer of the upper branch has a 2×2 window and a stride of 2.
In fig. 5, the output of the first pooling layer of the upper branch in step 502 is input to the second convolution block of the upper branch, as indicated by reference numeral 503.
The second convolution block of the upper branch has a kernel size of 1×1, a stride of 1 and padding of 0. It outputs 8 feature maps of size 3×8, reducing the feature-map channels to 1/16, which cuts the amount of computation and adds non-linearity.
In fig. 5, the output of the second convolution block of the upper branch is input to the third fully connected layer of the upper branch, which outputs 128 feature maps of size 1 × 1, as indicated by reference numeral 504.
The weights of the fully connected layer cover the entire feature map. For the first branch of the multi-label network, the weights corresponding to the position region of the first character are strengthened, and the weights corresponding to regions outside the first character's position are weakened, implementing a character-position spatial attention mechanism. The third fully connected layer of the upper branch is followed by an activation function.
In fig. 5, for the first branch, the activation function (e.g., sigmoid) outputs values such that the weight of the first character region is closer to 1 and the weight of regions other than the first character is closer to 0. The output of the activation function is input to the fourth deconvolution block of the upper branch, and the resolution of the output feature map is readjusted (e.g., 5200 feature maps of 1 × 1 are output and readjusted into 65 feature maps of 5 × 16), as indicated by reference numeral 505. Deconvolution, also called transposed convolution, is the inverse process of convolution. The deconvolution block and the readjustment layer are set so as to output feature maps of the same spatial resolution as the first convolution block of the spatial attention network, for example 65 feature maps of size 5 × 16.
In fig. 5, the deconvolved and readjusted output of the upper branch is element-wise multiplied with the output of the first convolution block of the spatial attention network to produce a feature map with spatial attention, as indicated by reference numeral 506. In the resulting feature map, the license plate symbol information is emphasized, and irrelevant information other than the license plate symbols is suppressed.
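The full upper-branch flow (pooling, fully connected layer, sigmoid activation, upsampling back to the input resolution, element-wise multiplication) can be sketched as follows. This is a simplification under stated assumptions: nearest-neighbour upsampling stands in for the learned deconvolution plus readjustment layer, the fully connected layer is assumed shape-preserving, and the channel count is reduced to 16 (the patent's example uses 128) to keep the sketch small.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a (C, H, W) tensor."""
    c, h, w = x.shape
    return x[:, :h - h % 2, :w - w % 2].reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def upsample_2x(x):
    """Nearest-neighbour upsampling: a stand-in for the patent's learned
    deconvolution (transposed convolution) and readjustment layers."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def spatial_attention(feat, w_fc):
    """feat: (C, H, W) output of the first convolution block.
    Returns feat weighted element-wise by a sigmoid attention mask."""
    pooled = max_pool_2x2(feat)                      # upper branch: pooling
    logits = (w_fc @ pooled.reshape(-1)).reshape(pooled.shape)  # FC layer (assumed shape)
    mask = sigmoid(logits)                           # near 1 on character region, near 0 elsewhere
    mask_full = upsample_2x(mask)                    # restore feat's resolution
    return feat * mask_full                          # element-wise multiplication layer

rng = np.random.default_rng(0)
feat = rng.standard_normal((16, 10, 32))             # reduced channels for illustration
w_fc = rng.standard_normal((16 * 5 * 16, 16 * 5 * 16)) * 0.01
out = spatial_attention(feat, w_fc)
print(out.shape)  # same resolution as the input feature map
```

The output has the same shape as the input feature map, which is what allows it to be passed unchanged to the fully connected layers of each branch.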
Although not described in detail in the present disclosure, those skilled in the art will appreciate that the multi-label license plate recognition convolutional network described above can be obtained through a training process using sample data.
With further reference to fig. 6, to implement the methods illustrated in the above figures, the present disclosure provides some embodiments of an apparatus for recognizing a multi-label license plate. These apparatus embodiments correspond to the method embodiments illustrated in fig. 2, and the apparatus may be applied in various electronic devices.
As shown in fig. 6, the apparatus for recognizing a multi-labeled license plate includes: an acquisition unit 601 configured to acquire an image to be recognized; a detection unit 602 configured to detect the image to be recognized; an extracting unit 603 configured to extract a license plate image from the vehicle image in response to detecting that the image to be recognized includes the vehicle image; an image processing unit 604 configured to perform image processing on the license plate image; an input unit 605 configured to input the processed license plate image into a pre-trained multi-label license plate recognition convolutional network to obtain license plate image features, where the multi-label license plate recognition convolutional network includes a spatial attention network; an output unit 606 configured to output a license plate number based on the license plate image feature.
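The units enumerated above form a linear pipeline with one early exit (no vehicle detected). A hedged Python sketch of how such units might be wired together is given below; the class name, the callable signatures, and the trivial stubs are all hypothetical stand-ins for the real detector, preprocessor, and convolutional network.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical stand-ins for the units of the recognition apparatus (fig. 6).
@dataclass
class PlateRecognizer:
    detect: Callable      # detection unit 602: finds a vehicle in the image
    extract: Callable     # extraction unit 603: crops the license plate region
    preprocess: Callable  # image processing unit 604
    network: Callable     # input unit 605: multi-label recognition network
    decode: Callable      # output unit 606: features -> plate number string

    def run(self, image) -> Optional[str]:
        vehicle = self.detect(image)
        if vehicle is None:  # no vehicle image detected: nothing to recognize
            return None
        plate_img = self.extract(vehicle)
        feats = self.network(self.preprocess(plate_img))
        return self.decode(feats)

# Stub wiring for illustration only; real units would wrap trained models.
recognizer = PlateRecognizer(
    detect=lambda img: img,   # pretend the whole image is the vehicle
    extract=lambda v: v,
    preprocess=lambda p: p,
    network=lambda p: list(p),
    decode=lambda f: "".join(f),
)
print(recognizer.run("京A12345"))  # prints the stubbed plate number
```

The early return models the "in response to detecting that the image to be recognized comprises a vehicle image" condition: downstream units only run when detection succeeds.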
An embodiment of the present disclosure provides an electronic device, including: one or more processors; storage means for storing one or more programs; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of the embodiments described above.
Referring now to fig. 7, a schematic diagram of an electronic device (e.g., the server shown in fig. 1) 700 suitable for use in implementing some embodiments of the present disclosure is shown. The electronic device in some embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle-mounted terminal (e.g., a car navigation terminal), and the like, and a stationary terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 7, the electronic device 700 may include a processing means (e.g., a central processing unit (CPU), graphics processing unit (GPU), neural network processing unit (NPU), digital signal processor (DSP), application-specific integrated circuit (ASIC), field programmable gate array (FPGA), etc.) 701 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage means 708 into a Random Access Memory (RAM) 703. The RAM 703 also stores various programs and data necessary for the operation of the electronic device 700. The processing device 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
Generally, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 707 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 708 including, for example, magnetic tape, hard disk, etc.; and a communication device 709. The communication means 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. While fig. 7 illustrates an electronic device 700 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 7 may represent one device or may represent multiple devices as desired.
In particular, according to some embodiments of the present disclosure, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, some embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In some such embodiments, the computer program may be downloaded and installed from a network via communications means 709, or may be installed from storage 708, or may be installed from ROM 702. The computer program, when executed by the processing device 701, performs the above-described functions defined in the methods of some embodiments of the present disclosure.
It should be noted that the computer readable medium described in some embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In some embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In some embodiments of the present disclosure, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communications network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the apparatus; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring an image to be identified; detecting an image to be recognized; in response to detecting that the image to be recognized comprises a vehicle image, extracting a license plate image from the vehicle image; carrying out image processing on the license plate image; inputting the processed license plate image into a pre-trained multi-label license plate recognition convolution network to obtain license plate image characteristics, wherein the multi-label license plate recognition convolution network comprises a space attention network; and outputting the license plate number based on the license plate image characteristics.
Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in some embodiments of the present disclosure may be implemented by software, and may also be implemented by hardware. The described units may also be provided in a processor, and may be described as: a processor includes an acquisition unit, a detection unit, an extraction unit, an image processing unit, an input unit, and an output unit. Here, the names of the units do not constitute a limitation to the units themselves in some cases, and for example, the detection unit may also be described as a "unit that detects the above-described image to be recognized".
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
The foregoing description is only exemplary of the preferred embodiments of the present disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the embodiments of the present disclosure is not limited to the specific combination of the above technical features, and also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the embodiments of the present disclosure.

Claims (10)

1. A method for identifying a multi-label license plate, comprising:
acquiring an image to be identified;
detecting the image to be identified;
in response to detecting that the image to be recognized comprises a vehicle image, extracting a license plate image from the vehicle image;
carrying out image processing on the license plate image;
inputting the processed license plate image into a pre-trained multi-label license plate recognition convolutional network to obtain license plate image characteristics, wherein the multi-label license plate recognition convolutional network comprises a space attention network;
and outputting the license plate number based on the license plate image characteristics.
2. The method of claim 1, wherein the spatial attention network comprises:
a first convolution block, a second upper branch block, a third lower branch block and a fourth element multiplication operation layer;
the second upper branch block includes: an upper-branch first pooling layer, an upper-branch second convolution block, an upper-branch third fully connected layer, an upper-branch fourth deconvolution block, and a readjustment layer.
3. The method of claim 1, wherein the multi-label license plate recognition convolutional network comprises:
a backbone convolutional network and a branched convolutional network;
the backbone convolutional network comprises: a plurality of convolution blocks, each convolution block containing one or more convolution layers;
the branched convolutional network comprises: a predetermined number of branched convolutional sub-networks having the same network structure.
4. The method of claim 3, wherein the output of each of the convolution blocks is connected with a pooling layer.
5. The method of claim 3, wherein the number of branched convolutional sub-networks corresponds to the number of characters in a license plate.
6. The method of claim 3, wherein the branched convolutional sub-network comprises:
spatial attention networks and fully connected layers.
7. The method of claim 4, wherein the backbone convolutional network comprises three convolutional blocks, the output of each convolutional block connecting a corresponding pooling layer.
8. An apparatus for identifying a multi-label license plate, comprising:
an acquisition unit configured to acquire an image to be recognized;
a detection unit configured to detect the image to be recognized;
an extraction unit configured to extract a license plate image from the vehicle image in response to detecting that the image to be recognized includes the vehicle image;
an image processing unit configured to perform image processing on the license plate image;
an input unit configured to input the processed license plate image into a pre-trained multi-label license plate recognition convolutional network to obtain license plate image features, wherein the multi-label license plate recognition convolutional network comprises a spatial attention network;
an output unit configured to output a license plate number based on the license plate image feature.
9. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-8.
10. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-8.
CN202010093786.1A 2020-02-14 2020-02-14 Method and device for identifying multi-label license plate, electronic equipment and readable medium Pending CN111368645A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010093786.1A CN111368645A (en) 2020-02-14 2020-02-14 Method and device for identifying multi-label license plate, electronic equipment and readable medium

Publications (1)

Publication Number Publication Date
CN111368645A true CN111368645A (en) 2020-07-03

Family

ID=71211434

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010093786.1A Pending CN111368645A (en) 2020-02-14 2020-02-14 Method and device for identifying multi-label license plate, electronic equipment and readable medium

Country Status (1)

Country Link
CN (1) CN111368645A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107085723A (en) * 2017-03-27 2017-08-22 新智认知数据服务有限公司 A kind of characters on license plate global recognition method based on deep learning model
US20190251369A1 (en) * 2018-02-11 2019-08-15 Ilya Popov License plate detection and recognition system
CN110414451A (en) * 2019-07-31 2019-11-05 深圳市捷顺科技实业股份有限公司 It is a kind of based on end-to-end licence plate recognition method, device, equipment and storage medium
CN110490179A (en) * 2018-05-15 2019-11-22 杭州海康威视数字技术股份有限公司 Licence plate recognition method, device and storage medium
CN110781880A (en) * 2018-07-27 2020-02-11 业纳交通解决方案英国有限公司 Method and device for recognizing license plate of vehicle

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陶帅 (Tao Shuai): "Research on License Plate Recognition Algorithms in Complex Scenes" (复杂场景下的车牌识别算法研究) *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112906730A (en) * 2020-08-27 2021-06-04 腾讯科技(深圳)有限公司 Information processing method and device and computer readable storage medium
CN112906730B (en) * 2020-08-27 2023-11-28 腾讯科技(深圳)有限公司 Information processing method, device and computer readable storage medium
CN112288060A (en) * 2020-11-09 2021-01-29 北京沃东天骏信息技术有限公司 Method and apparatus for identifying a tag
CN113392838A (en) * 2021-08-16 2021-09-14 智道网联科技(北京)有限公司 Character segmentation method and device and character recognition method and device
CN113392838B (en) * 2021-08-16 2021-11-19 智道网联科技(北京)有限公司 Character segmentation method and device and character recognition method and device
CN113903180A (en) * 2021-11-17 2022-01-07 四川九通智路科技有限公司 Method and system for detecting vehicle overspeed on expressway

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200703