CN115953770A - License plate recognition method and device based on video stream, computing equipment and storage medium


Info

Publication number
CN115953770A
Authority
CN
China
Prior art keywords
license plate
plate recognition
channel network
image
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211607153.3A
Other languages
Chinese (zh)
Inventor
胡中华
陈炫憧
彭博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Signalway Technologies Co ltd
Original Assignee
Beijing Signalway Technologies Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Signalway Technologies Co ltd
Priority to CN202211607153.3A
Publication of CN115953770A
Legal status: Pending

Links

Images

Classifications

    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 - Road transport of goods or passengers
    • Y02T 10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 - Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a license plate recognition method based on a video stream, which comprises the following steps: acquiring a video stream to be recognized; sequentially inputting the video stream to be recognized into a first slow channel network and a first fast channel network of a trained vehicle and license plate detection model to detect the positions of the vehicle and the license plate, so as to obtain a license plate region image and time sequence features of the sequence frames; sequentially inputting the license plate region image and the time sequence features of the sequence frames into a second slow channel network and a second fast channel network of a trained license plate recognition model to perform license plate recognition, so as to obtain a license plate recognition result and time sequence features of the sequence frames; and outputting a license plate recognition result of the video stream to be recognized according to the license plate recognition results and time sequence features of the sequence frames. The scheme can improve the speed and precision of license plate recognition on low-computing-power devices.

Description

License plate recognition method and device based on video stream, computing equipment and storage medium
Technical Field
The invention relates to the technical field of computer vision and image recognition, in particular to a license plate recognition method and device based on video streaming, computing equipment and a storage medium.
Background
A license plate recognition method that can accurately and quickly recognize license plates in various complex scenes can effectively improve the efficiency of traffic law enforcement, parking lot management and road traffic management. Currently, license plate recognition is mainly based on deep learning methods, which directly locate license plates in images and recognize the license plate contents.
The prior art discloses a license plate recognition system based on deep learning, in which a recognition function is triggered when a vehicle entering the video is detected: a license plate recognition and positioning model locates the relative position of the license plate in the image, and a license plate content recognition model recognizes the category, size, color and character information of the license plate. This approach requires two models, involves complicated detection steps, treats the license plate as a small target in a large scene, has low detection efficiency, demands a high computing load from the equipment, and is therefore unsuitable for low-computing-power platforms.
Therefore, a video-stream-based license plate recognition method is needed that can improve the precision and efficiency of license plate recognition for complex-scene video streams on lightweight devices, so as to solve the problems in the prior art.
Disclosure of Invention
In view of the above problems, and in order to improve the recognition of license plates in complex-scene video streams on low-computing-power devices, the present scheme provides a license plate recognition method and device based on a video stream, a computing device and a storage medium, which can reduce the dependence of license plate recognition on image resolution, reduce learning difficulty and improve recognition accuracy.
According to a first aspect of the present invention, there is provided a license plate recognition method based on a video stream, including: firstly, acquiring a video stream to be identified; then, sequentially inputting the video stream to be recognized into a first slow channel network and a first fast channel network of the trained vehicle and license plate detection model to detect the position of the vehicle and the license plate so as to obtain a license plate region image and a time sequence characteristic of a sequence frame; then, sequentially inputting the license plate region image and the time sequence characteristics of the sequence frame into a second slow channel network and a second fast channel network of the trained license plate recognition model for license plate recognition to obtain a license plate recognition result and time sequence characteristics of the sequence frame; and finally, outputting a license plate recognition result of the video stream to be recognized according to the license plate recognition result and the time sequence characteristics of the sequence frame.
According to the license plate recognition method, the spatial position relationship between the vehicle and the license plate is learned, and the approximate region of the license plate is determined from the relative position between the vehicle and the license plate, so that the dependence of license plate detection on image resolution can be reduced and the learning difficulty lowered. Different features of the input video frames are extracted alternately at high and low frame rates, so that multi-frame recognition information is integrated while the demand for computing resources is markedly reduced and the recognition precision and speed are improved.
Optionally, in the above method, the trained vehicle and license plate detection model includes a first slow channel network and a first fast channel network connected in parallel, and a first ConvLSTM feature fusion network, where the first slow channel network includes 1 first slow channel module, and the first fast channel network includes 5 first fast channel modules connected in series.
Optionally, in the above method, the first fast channel module uses MobileNetV2 as the backbone, FPN as the neck module and the head module of CenterNet as the prediction branch, and is used for extracting a first feature of the input image at a first time step; the first slow channel module uses MobileNetV3 as the backbone, FPN as the neck module and the head module of CenterNet as the prediction branch, and is used for extracting a second feature of the input image at a second time step, wherein the first time step is smaller than the second time step; the first ConvLSTM feature fusion network is used for fusing the first features obtained by the first fast channel network and the second features obtained by the first slow channel network, so as to obtain the vehicle and license plate positions and time sequence features of the input image from the fused features.
Optionally, in the above method, the trained license plate recognition model includes a second slow channel network and a second fast channel network connected in parallel, and a second ConvLSTM feature fusion network, where the second slow channel network includes 1 second slow channel module, and the second fast channel network includes 3 second fast channel modules connected in series.
Optionally, in the above method, the second fast channel module uses MobileNetV2 as the backbone and CRNN as the recognition prediction network, and is used for extracting a third feature of the input image at a third time step. The second slow channel module uses MobileNetV3 as the backbone and CRNN as the recognition prediction network, and is used for extracting a fourth feature of the input image at a fourth time step, wherein the third time step is smaller than the fourth time step. The second ConvLSTM feature fusion network is used for fusing the third feature obtained by the second fast channel network and the fourth feature obtained by the second slow channel network, so as to obtain a license plate recognition result and time sequence features of the input image from the fused features.
Optionally, in the method, in the task of detecting the license plate region, the following steps may be performed:
s1, inputting a current frame image into a first slow channel network of a trained vehicle and license plate detection model for detection to obtain a vehicle and license plate position and a first time sequence characteristic of the current frame image, and determining a first license plate area image of the current frame image based on the vehicle and license plate position of the current frame image;
s3, inputting the next frame of image and the first time sequence feature into a first fast channel network of the trained vehicle and license plate detection model for detection to obtain the vehicle and license plate position and the third time sequence feature of the next frame of image, and determining a second license plate area image of the next frame based on the vehicle and license plate position of the next frame of image;
and repeating the steps S1 and S3 until the detection of the vehicle and license plate positions of all the sequence frames is completed.
Optionally, in the method, in the license plate recognition task, the following steps may be performed:
s2, inputting the first license plate area image into a second slow channel network of a trained license plate recognition model for recognition to obtain a license plate recognition result and a second time sequence characteristic of the current frame image;
s4, inputting the second license plate area image and the second time sequence feature into a second fast channel network of the trained license plate recognition model for recognition to obtain a license plate recognition result and a fourth time sequence feature of the next frame of image;
and repeating the steps S2 and S4 until the license plate recognition of all the sequence frames is completed.
According to a second aspect of the present invention, there is provided a license plate recognition apparatus based on a video stream, comprising: an acquisition module, a license plate region detection module, a license plate recognition module and an output module. The acquisition module is used for acquiring a video stream to be recognized;
the license plate region detection module is used for sequentially inputting the video stream to be recognized into a first slow channel network and a first fast channel network of a trained vehicle and license plate detection model to detect the position of the vehicle and the license plate so as to obtain a license plate region image and a time sequence characteristic of a sequence frame;
the license plate recognition module is used for sequentially inputting the license plate region image and the time sequence characteristics of the sequence frame into a second slow channel network and a second fast channel network of the trained license plate recognition model for license plate recognition to obtain the license plate recognition result and the time sequence characteristics of the sequence frame; and the output module is used for outputting the license plate recognition result of the video stream to be recognized according to the license plate recognition result of the sequence frame and the time sequence characteristics.
According to a third aspect of the present invention, there is provided a computing device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor performing the video stream-based license plate recognition method according to the first aspect.
According to a fourth aspect of the present invention, there is provided a computer readable storage medium comprising a computer program stored thereon, which is capable of being loaded by a processor and executing the video stream-based license plate recognition method according to the first aspect.
According to the scheme of the invention, the license plate detection problem is converted into learning the spatial positions of the vehicle and the license plate, and the approximate license plate region is determined based on the distance between the vehicle center point and the license plate center point, so that the dependence of license plate detection on the resolution of the input picture can be reduced and the algorithm learning difficulty lowered. The feature extraction tasks are executed alternately by two types of feature extraction networks, a fast channel and a slow channel, which keeps a balance between computation speed and precision, and the recognition precision is markedly improved by integrating multi-frame license plate recognition results. Therefore, the license plate recognition method based on the video stream can run on low-computing-power equipment and is suitable for high-precision, real-time license plate recognition of large-scene license plate video streams.
The above description is only an overview of the technical solutions of the present invention. In order that the technical means of the present invention may be more clearly understood and implemented in accordance with the content of the description, and in order that the above and other objects, features and advantages of the present invention may be more readily apparent, particular embodiments of the invention are set forth below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 shows a schematic block diagram of a computing device 100, according to an embodiment of the invention;
FIG. 2 is a schematic diagram of a network architecture of a trained vehicle and license plate detection model according to an embodiment of the invention;
FIG. 3 is a network architecture diagram of a trained license plate recognition model according to one embodiment of the present invention;
FIG. 4 is a flow chart illustrating a method 400 for license plate recognition based on video streaming according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an interlaced model of a license plate recognition method based on video stream according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram illustrating a license plate recognition apparatus 600 based on a video stream according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
License plate recognition based on a video stream dynamically acquires vehicle information through a camera to obtain a video stream containing the vehicle, and then processes the video stream to recognize the license plate information. In the prior art, the whole image is taken as the detection object, and license plate image segmentation and license plate content recognition are carried out in two steps, so the computation load is large and the result is easily affected by external factors. Accumulated errors also tend to arise between the two models, which affects the recognition accuracy of the overall pipeline.
Compared with image object detection, video object detection is highly redundant and exhibits a large amount of temporal and spatial locality. Making full use of the temporal context can therefore overcome the heavy redundancy between consecutive frames in a video stream and improve detection speed.
Therefore, in order to improve the accuracy and speed of video-stream license plate recognition in complex scenes and reduce the computation load of the model, this scheme provides a license plate recognition method based on a video stream that can be applied on low-computing-power equipment. The spatial relative position of the vehicle and the license plate is obtained through a deep learning algorithm, which reduces the dependence of license plate recognition on the resolution of the input image and lowers the algorithm learning difficulty. Combined with the SlowFast algorithm idea, two networks with different computation frame rates are used for feature extraction, which reduces the computation load of the model while improving recognition accuracy.
FIG. 1 shows a schematic block diagram of a computing device 100, according to an embodiment of the invention. As shown in FIG. 1, in a basic configuration 102, a computing device 100 typically includes system memory 106 and one or more processors 104. A memory bus 108 may be used for communication between the processor 104 and the system memory 106.
Depending on the desired configuration, the processor 104 may be any type of processor, including but not limited to: a microprocessor (μ P), a microcontroller (μ C), a Digital Signal Processor (DSP), or any combination thereof. The processor 104 may include one or more levels of cache, such as a level one cache 110 and a level two cache 112, a processor core 114, and registers 116. The example processor core 114 may include an Arithmetic Logic Unit (ALU), a Floating Point Unit (FPU), a digital signal processing core (DSP core), or any combination thereof. The example memory controller 118 may be used with the processor 104, or in some implementations the memory controller 118 may be an internal part of the processor 104.
Depending on the desired configuration, system memory 106 may be any type of memory, including but not limited to: volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof. The physical memory in the computing device is usually referred to as a volatile memory RAM, and data in the disk needs to be loaded into the physical memory to be read by the processor 104. System memory 106 may include an operating system 120, one or more applications 122, and program data 124. In some implementations, the application 122 can be arranged to execute instructions on an operating system with the program data 124 by the one or more processors 104. Operating system 120 may be, for example, linux, windows, etc., which includes program instructions for handling basic system services and performing hardware dependent tasks. The application 122 includes program instructions for implementing various user-desired functions, and the application 122 may be, for example, a browser, instant messenger, a software development tool (e.g., an integrated development environment IDE, a compiler, etc.), and the like, but is not limited thereto. When the application 122 is installed into the computing device 100, a driver module may be added to the operating system 120.
When the computing device 100 is started, the processor 104 reads program instructions of the operating system 120 from the memory 106 and executes them. Applications 122 run on top of operating system 120, utilizing interfaces provided by operating system 120 and the underlying hardware to implement various user-desired functions. When the user starts the application 122, the application 122 is loaded into the memory 106, and the processor 104 reads the program instructions of the application 122 from the memory 106 and executes the program instructions.
The computing device 100 also includes a storage device 132, the storage device 132 including a removable storage 136 and a non-removable storage 138, the removable storage 136 and the non-removable storage 138 each connected to the storage interface bus 134.
Computing device 100 may also include an interface bus 140 that facilitates communication from various interface devices (e.g., output devices 142, peripheral interfaces 144, and communication devices 146) to the basic configuration 102 via the bus/interface controller 130. The example output device 142 includes a graphics processing unit 148 and an audio processing unit 150. They may be configured to facilitate communication with various external devices, such as a display or speakers, via one or more a/V ports 152. Example peripheral interfaces 144 may include a serial interface controller 154 and a parallel interface controller 156, which may be configured to facilitate communication with external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device) or other peripherals (e.g., printer, scanner, etc.) via one or more I/O ports 158. An example communication device 146 may include a network controller 160, which may be arranged to facilitate communications with one or more other computing devices 162 over a network communication link via one or more communication ports 164.
A network communication link may be one example of a communication medium. Communication media may typically be embodied by computer readable instructions, data structures or program modules in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. A "modulated data signal" is a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of non-limiting example, communication media may include wired media such as a wired network or private-wired network, and various wireless media such as acoustic, radio frequency (RF), microwave, infrared (IR) or other wireless media. The term computer readable media as used herein may include both storage media and communication media. In the computing device 100 according to the present invention, the application 122 includes instructions for performing the video stream-based license plate recognition method 400 of the present invention.
The lightweight video-stream license plate recognition method provided by the invention comprises two tasks: license plate detection and license plate information recognition. The license plate detection part combines license plate detection with vehicle detection, so a vehicle and license plate detection model is designed to detect the vehicle and the approximate license plate region, and license plate detection is assisted by the relative position between the vehicle and the license plate in the video stream.
Based on the idea of slow-network and fast-network (SlowFast) algorithms, the scheme uses two networks with different feature extraction accuracies to extract different features of the video frames. The fast network uses a small time step to quickly extract weak features of the video frames at a higher frame rate, with poorer accuracy. The slow network uses a large time step to extract accurate features of the video frames at a lower frame rate. The larger the time step, the higher the accuracy of the extracted features; the smaller the time step, the lower the accuracy. The video frames are processed alternately by the two feature extraction networks of different accuracies. Finally, the features of the slow network and the fast network are fused through a ConvLSTM feature fusion structure, and the positions of the vehicle and the license plate are detected based on the fused features.
Fig. 2 is a schematic structural diagram of a trained vehicle and license plate detection model according to an embodiment of the invention. As shown in fig. 2, the trained vehicle and license plate detection model includes a first slow channel network and a first fast channel network connected in parallel, and a first ConvLSTM feature fusion network, where the first slow channel network includes 1 first slow channel module, and the first fast channel network includes 5 first fast channel modules connected in series.
The method can also improve system computing efficiency through an asynchronous mode and a quantized model: the fast channel network is used to extract image features, while the slow channel network computes the features and updates the stored features. In addition, quantization can compress the model, eliminating the need for rescaling.
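As an illustrative sketch only, the quantization step can be pictured as post-training quantization of a small stand-in module; the patent does not specify a quantization toolchain, so PyTorch's dynamic quantization API is assumed here purely for illustration.

```python
# Hedged sketch: post-training dynamic quantization with PyTorch, used only to
# illustrate the "quantized model" idea; the framework, layer sizes and class
# count are assumptions, not details from the patent.
import torch
import torch.nn as nn

# Stand-in module; the real targets are the slow/fast channel networks.
model = nn.Sequential(
    nn.Linear(256, 128),
    nn.ReLU(),
    nn.Linear(128, 68),   # assumed number of license plate character classes
)

# Linear weights become int8; activations are quantized dynamically at run time.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 256)
print(quantized(x).shape)   # torch.Size([1, 68])
```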
According to one embodiment of the present invention, the first fast channel module uses MobileNetV2 as the backbone, FPN as the neck module and the head module of CenterNet as the prediction branch, for extracting the first feature of the image at a first time step. The fast channel has a high sampling frame rate and retains temporal resolution, so the first feature is a weak feature with low accuracy.
The first slow channel module uses MobileNetV3 as the backbone, FPN as the neck module and the head module of CenterNet as the prediction branch, and mainly extracts the spatial features of the input image at the second time step. The first time step is smaller than the second time step; for example, the first time step is 2 and the second time step is 16. That is, the sampling frame rate of the slow channel is low, extracting one frame out of every 16, so the second feature is a strong feature with high accuracy.
The MobileNetV2 network mainly uses a bottleneck residual module for feature extraction. The MobileNetV3 network adds an SE structure to the bottleneck residual module, uses ReLU6(x+3)/6 as an approximate replacement for sigmoid, and reduces the number of convolution kernels in the first convolution layer (from 32 to 16). The model initialization time and inference time of MobileNetV3 are longer than those of MobileNetV2, so this scheme uses MobileNetV3 for the slow channel networks and MobileNetV2 for the fast channel networks.
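For reference, the ReLU6-based approximation mentioned above (the hard-sigmoid used in MobileNetV3's SE blocks) can be checked numerically with the short, generic snippet below; it is an illustration, not code from the patent.

```python
# Hard-sigmoid used by MobileNetV3: ReLU6(x + 3) / 6, a piecewise-linear
# approximation of sigmoid(x).
import torch
import torch.nn.functional as F

x = torch.linspace(-6, 6, 7)
h_sigmoid = F.relu6(x + 3) / 6
sigmoid = torch.sigmoid(x)
print(torch.stack([x, h_sigmoid, sigmoid], dim=1))
```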
The FPN feature pyramid network fuses feature maps of different temporal dimensions and feature scales. Finally, CenterNet performs global average pooling and feature vector concatenation on the output of each channel and outputs three prediction branches: a license plate position heatmap branch, a branch predicting the distance from the vehicle center point to the license plate center point, and a license plate size branch.
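The structure of the three prediction branches can be sketched as follows. This is a minimal, hedged illustration: the channel widths, the single-scale neck standing in for the full FPN, the number of heatmap channels, and the use of torchvision's MobileNetV2 are assumptions rather than details fixed by the patent.

```python
# Minimal sketch of a MobileNetV2 backbone with three CenterNet-style prediction
# branches (heatmap, center-to-center distance, plate size). FPN is replaced by a
# single 1x1 "neck" convolution to keep the sketch short.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2

def head(out_channels: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(64, out_channels, 1),
    )

class PlateDetector(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = mobilenet_v2(weights=None).features  # 1280-channel feature map
        self.neck = nn.Conv2d(1280, 64, 1)                   # stand-in for the FPN neck
        self.heatmap = head(2)        # assumed: one heatmap channel each for vehicle and plate centers
        self.center_offset = head(2)  # (dx, dy) from vehicle center to plate center
        self.size = head(2)           # plate width and height

    def forward(self, x):
        f = self.neck(self.backbone(x))
        return torch.sigmoid(self.heatmap(f)), self.center_offset(f), self.size(f)

model = PlateDetector()
hm, offset, size = model(torch.randn(1, 3, 256, 256))
print(hm.shape, offset.shape, size.shape)  # each branch: [1, 2, 8, 8] at stride 32
```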
When the pre-constructed model is trained, 5000 segments of vehicle video stream data from checkpoint (gate) scenes can be collected, each with a frame rate of 25 fps and a duration of 30 seconds. The collected video segments are annotated with the vehicle and license plate positions in each frame and the license plate recognition result, and are fed to the model as training data. The model is trained iteratively for 160 epochs with the CenterNet loss function to obtain the trained vehicle and license plate detection model, where each pass through the first slow channel module is followed by 5 passes through the first fast channel module.
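The patent only names "the CenterNet loss function"; as a point of reference, the heatmap term of that loss is commonly the penalty-reduced focal loss sketched below, with alpha=2 and beta=4 taken from the CenterNet paper (an assumption here) and the offset/size L1 regression terms omitted.

```python
# Sketch of the CenterNet heatmap (penalty-reduced focal) loss. alpha=2, beta=4
# follow the CenterNet paper and are assumed, since the patent only refers to
# "the CenterNet loss function".
import torch

def centernet_heatmap_loss(pred, gt, alpha=2.0, beta=4.0, eps=1e-6):
    """pred, gt: tensors of shape [B, C, H, W]; gt is a Gaussian-splatted heatmap in [0, 1]."""
    pred = pred.clamp(eps, 1 - eps)
    pos = gt.eq(1).float()                                   # exact center locations
    neg = 1.0 - pos
    pos_loss = pos * (1 - pred) ** alpha * torch.log(pred)
    neg_loss = neg * (1 - gt) ** beta * pred ** alpha * torch.log(1 - pred)
    num_pos = pos.sum().clamp(min=1.0)
    return -(pos_loss.sum() + neg_loss.sum()) / num_pos

pred = torch.rand(2, 2, 64, 64)
gt = torch.zeros(2, 2, 64, 64)
gt[:, :, 32, 32] = 1.0                                       # one synthetic center per map
print(centernet_heatmap_loss(pred, gt).item())
```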
The first ConvLSTM feature fusion network is used for fusing the first features obtained by the first fast channel network with the second features obtained by the first slow channel network. ConvLSTM turns the 2D input of an LSTM into a 3D tensor whose last two dimensions are the spatial dimensions (rows and columns). For the data at each moment t, ConvLSTM replaces part of the fully connected operations in the LSTM with convolution operations, i.e., predictions are made from the current input and locally adjacent past states. After feature fusion, the detection boxes and time sequence features of the vehicle and license plate in the input video frames are obtained.
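A minimal ConvLSTM cell illustrating the idea of replacing the LSTM's matrix multiplications with convolutions is sketched below; the kernel size and channel counts are assumptions.

```python
# Minimal ConvLSTM cell sketch: the LSTM gate computations are carried out with
# convolutions over feature maps instead of matrix multiplications, so spatial
# structure is preserved while temporal state is carried between frames.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch, hidden_ch, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        # One convolution produces all four gates (input, forget, output, candidate).
        self.gates = nn.Conv2d(in_ch + hidden_ch, 4 * hidden_ch, kernel_size, padding=pad)
        self.hidden_ch = hidden_ch

    def forward(self, x, state=None):
        b, _, h_dim, w_dim = x.shape
        if state is None:
            h = x.new_zeros(b, self.hidden_ch, h_dim, w_dim)
            c = x.new_zeros(b, self.hidden_ch, h_dim, w_dim)
        else:
            h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o, g = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o), torch.tanh(g)
        c = f * c + i * g
        h = o * torch.tanh(c)
        return h, (h, c)

cell = ConvLSTMCell(in_ch=64, hidden_ch=64)
frames = [torch.randn(1, 64, 8, 8) for _ in range(3)]   # e.g. slow/fast channel features
state = None
for feat in frames:
    out, state = cell(feat, state)
print(out.shape)   # torch.Size([1, 64, 8, 8])
```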
Fig. 3 is a schematic structural diagram of a trained license plate recognition model according to an embodiment of the invention. As shown in fig. 3, the trained license plate recognition model includes a second fast channel network and a second slow channel network connected in parallel, and a second ConvLSTM feature fusion network, where the second fast channel network includes 3 second fast channel modules connected in series, and the second slow channel network includes 1 second slow channel module.
According to one embodiment of the present invention, the second fast channel module uses MobileNetV2 as the backbone and CRNN as the recognition prediction network, for extracting the temporal features of the input image at a third time step. The second slow channel module uses MobileNetV3 as the backbone and CRNN as the recognition prediction network, for extracting the spatial features of the input image at a fourth time step, wherein the third time step is smaller than the fourth time step. That is, the sampling frame rate of the second fast channel network is greater than that of the second slow channel network.
The CRNN network is a convolutional recurrent neural network used to recognize variable-length text sequences end to end. It converts text recognition into a sequence learning problem with temporal dependence, which effectively improves text recognition accuracy.
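A minimal CRNN of the kind described above is sketched below: convolutional features are collapsed along the height axis into a sequence, a bidirectional LSTM models the sequence, and a linear layer produces per-timestep character logits for CTC decoding. The tiny convolutional front end and the character set size are placeholders; the patent's model uses a MobileNet backbone.

```python
# Minimal CRNN sketch for variable-length plate text. The small conv stack and the
# 68-character alphabet (plus one CTC blank) are assumptions for illustration.
import torch
import torch.nn as nn

class TinyCRNN(nn.Module):
    def __init__(self, num_classes=69):          # 68 characters + 1 CTC blank (assumed)
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.rnn = nn.LSTM(64, 128, num_layers=2, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * 128, num_classes)

    def forward(self, x):                         # x: [B, 3, 32, 128] plate crop
        f = self.cnn(x)                           # [B, 64, 8, 32]
        f = f.mean(dim=2)                         # collapse height -> [B, 64, 32]
        f = f.permute(0, 2, 1)                    # [B, T=32, 64]
        out, _ = self.rnn(f)                      # [B, 32, 256]
        return self.fc(out)                       # [B, 32, num_classes]

model = TinyCRNN()
logits = model(torch.randn(2, 3, 32, 128))
print(logits.shape)                               # torch.Size([2, 32, 69])
```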
When the pre-constructed license plate recognition model is trained, 5000 segments of vehicle video stream data from checkpoint scenes can be collected, each with a frame rate of 25 fps and a duration of 30 seconds. The collected video segments are annotated with the vehicle and license plate positions in each frame and the license plate recognition result, and are fed to the model as training data. The model is trained iteratively for 200 epochs with the CTC loss function to obtain the trained license plate recognition model, where each pass through the second slow channel module is followed by 3 passes through the second fast channel module.
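A hedged sketch of a single CTC training step follows; the label encoding, sequence lengths and batch contents are synthetic placeholders, not the patent's training data.

```python
# Sketch of one CTC training step for the recognition model, with synthetic inputs.
import torch
import torch.nn as nn

T, B, C = 32, 2, 69                    # timesteps, batch size, classes (68 chars + blank=0)
logits = torch.randn(T, B, C, requires_grad=True)  # e.g. CRNN output transposed to [T, B, C]
log_probs = logits.log_softmax(dim=2)

targets = torch.randint(1, C, (B, 7))  # two plates of 7 characters each (class 0 = blank)
input_lengths = torch.full((B,), T, dtype=torch.long)
target_lengths = torch.full((B,), 7, dtype=torch.long)

ctc = nn.CTCLoss(blank=0, zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()                        # gradients would normally flow back through the model
print(loss.item())
```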
And the second ConvLSTM feature fusion network is used for fusing the third features obtained by the second fast channel network and the fourth features obtained by the second slow channel network to obtain a license plate recognition result and a time sequence feature of the input image.
FIG. 4 is a flowchart illustrating a license plate recognition method 400 based on video streaming according to an embodiment of the invention. As shown in fig. 4, the method starts with step S410, obtaining a video stream to be identified.
The video stream to be recognized may be a vehicle video stream collected by road camera monitoring equipment, or a vehicle video stream collected at a parking lot entrance or a highway toll gate. Since the video segments containing vehicles in the stream may not be continuous, the video stream data to be recognized needs to be processed first.
For example, the acquired original video stream may be sliced into video segments spanning from the moment the vehicle enters the capture frame to the moment it leaves, as sketched below.
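In the sketch below, OpenCV is assumed only as a way to read frames, and vehicle_present() is a hypothetical placeholder for the decision that, in practice, would come from the vehicle detection model.

```python
# Hedged sketch of slicing an input video into per-vehicle segments.
import cv2

def vehicle_present(frame) -> bool:
    # Placeholder: in practice this decision comes from the vehicle detector.
    return frame.mean() > 10

def slice_video(path):
    cap = cv2.VideoCapture(path)
    segments, current = [], []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if vehicle_present(frame):
            current.append(frame)        # vehicle in frame: extend the current segment
        elif current:
            segments.append(current)     # vehicle left the frame: close the segment
            current = []
    if current:
        segments.append(current)
    cap.release()
    return segments

# segments = slice_video("gate_camera.mp4")  # hypothetical file name
```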
And then, executing step S420, and sequentially inputting the video stream to be recognized into the first slow channel network and the first fast channel network of the trained vehicle and license plate detection model to detect the position of the vehicle and the license plate so as to obtain the license plate region image and the time sequence characteristics of the sequence frame.
The method specifically comprises the following steps: s1, inputting a current frame image into a first slow channel network of a trained vehicle and license plate detection model for detection, and obtaining the position of the vehicle and the license plate of the current frame image and first time sequence characteristics. Then, a first license plate area image of the current frame image is determined based on the vehicle and license plate position of the current frame image. And S3, inputting the next frame of image and the first time sequence feature into a first fast channel network of the trained vehicle and license plate detection model for detection to obtain the vehicle and license plate position and the third time sequence feature of the next frame of image, and determining a second license plate area image of the next frame based on the vehicle and license plate position of the next frame of image. And (4) repeating the steps S1 and S3 until the detection of the positions of the vehicles and the license plates of all the sequence frames is completed.
By alternately using the two feature extraction networks of the fast channel and the slow channel to respectively extract different frame features, the calculation redundancy can be reduced, and the balance between the speed and the accuracy can be achieved.
And then, executing step S430, and sequentially inputting the license plate region image and the time sequence characteristics of the sequence frame into a second slow channel network and a second fast channel network of the trained license plate recognition model for license plate recognition to obtain a license plate recognition result and time sequence characteristics of the sequence frame.
The method specifically comprises the following steps: s2, inputting the first license plate area image into a second slow channel network of the trained license plate recognition model for recognition to obtain a license plate recognition result and a second time sequence characteristic of the current frame image. And S4, inputting the second license plate area image and the second time sequence characteristic into a second fast channel network of the trained license plate recognition model for recognition to obtain a license plate recognition result and a fourth time sequence characteristic of the next frame of image. And repeating the steps S2 and S4 until the license plate recognition of all the sequence frames is completed.
By alternately using the two feature extraction networks of the fast channel and the slow channel to respectively extract different frame features, the calculation redundancy can be reduced, and the balance between the speed and the accuracy can be achieved.
And finally, executing the step S440, and outputting a license plate recognition result of the video stream to be recognized according to the license plate recognition result and the time sequence characteristics of the sequence frame.
The license plate recognition results of the sequence frames can be voted on and cross-compared to obtain the recognition result with the highest accuracy. For example, the occurrences of each candidate character at each position of the license plate are counted across frames, and the character that occurs most often is selected as the final recognition result for that position.
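A compact illustration of such character-wise majority voting is given below; the per-frame plate strings are synthetic examples.

```python
# Character-wise majority voting over per-frame recognition results.
from collections import Counter

def vote(plate_strings):
    """Pick, at each character position, the character seen most often across frames."""
    length = Counter(len(s) for s in plate_strings).most_common(1)[0][0]
    candidates = [s for s in plate_strings if len(s) == length]
    return "".join(
        Counter(s[i] for s in candidates).most_common(1)[0][0] for i in range(length)
    )

frame_results = ["京A12345", "京A12345", "京A1Z345", "京A12345"]
print(vote(frame_results))   # 京A12345
```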
Thus, the license plate recognition procedure for a video stream can be summarized as follows: frame t is input into the first slow channel network to obtain the vehicle position at t, the corresponding approximate license plate region and the first time sequence features; the approximate license plate region of frame t is then cropped and input into the second slow channel network to obtain the license plate recognition result at t and the second time sequence features.
Frame t+1 and the first time sequence features are input into the first fast channel network to obtain the vehicle position at t+1, the corresponding approximate license plate region and the third time sequence features; the approximate license plate region of frame t+1 and the third time sequence features are then input into the second fast channel network to obtain the license plate recognition result at t+1 and its time sequence features. The above steps are repeated according to the slow-fast alternation rule until the vehicle leaves the video picture, and the license plate recognition result is output.
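Putting the pieces together, the slow-fast orchestration described above can be sketched as follows. All four network calls are stand-in functions for the trained detection and recognition models, the cropping helper is a placeholder, and the alternation ratio is a parameter (the description above alternates frame by frame, while the trained detection model chains five fast modules per slow module).

```python
# Hedged, end-to-end sketch of the slow-fast pipeline: detect on each frame, crop the
# approximate plate region, recognize it, and carry the timing state forward.
def detect_slow(frame):                   # first slow channel network (placeholder)
    return {"plate_box": (0, 0, 10, 4)}, "det_state"

def detect_fast(frame, det_state):        # first fast channel network, reuses timing features
    return {"plate_box": (0, 0, 10, 4)}, det_state

def recognize_slow(crop):                 # second slow channel network (placeholder)
    return "京A12345", "rec_state"

def recognize_fast(crop, rec_state):      # second fast channel network (placeholder)
    return "京A12345", rec_state

def crop_plate(frame, box):               # placeholder: cut out the approximate plate region
    return frame

def recognize_stream(frames, fast_per_slow=1):
    det_state = rec_state = None
    results = []
    for i, frame in enumerate(frames):
        if i % (fast_per_slow + 1) == 0:                  # slow step on this frame
            det, det_state = detect_slow(frame)
            text, rec_state = recognize_slow(crop_plate(frame, det["plate_box"]))
        else:                                             # fast step(s) on the following frame(s)
            det, det_state = detect_fast(frame, det_state)
            text, rec_state = recognize_fast(crop_plate(frame, det["plate_box"]), rec_state)
        results.append(text)
    return results                                        # per-frame results, voted afterwards

print(recognize_stream([f"frame_{t}" for t in range(6)]))
```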
FIG. 5 is a diagram illustrating the interleaved model of the license plate recognition method based on a video stream according to an embodiment of the present invention. As shown in FIG. 5, the sequence frames of the vehicle video stream ..., l(t-3), l(t-2), l(t-1), l(t), l(t+1), l(t+2), ... are input first. Then, following the fast-slow alternation principle, the current frame is first input into the first slow channel network and the second slow channel network, the next frame is input into the first fast channel network and the second fast channel network, and so on, until license plate recognition is completed for all sequence frames, yielding the license plate recognition results of all sequence frames ..., D(t-3), D(t-2), D(t-1), D(t), D(t+1), D(t+2), .... Finally, the multiple license plate recognition results are integrated based on the time sequence features to obtain the final license plate recognition result.
Fig. 6 is a schematic structural diagram illustrating a license plate recognition apparatus 600 based on a video stream according to an embodiment of the present invention. As shown in fig. 6, the video stream-based license plate recognition apparatus 600 includes: the license plate recognition system comprises an acquisition module 610, a license plate region detection module 620, a license plate recognition module 630 and an output module 640.
The obtaining module 610 may obtain a video stream to be identified. The license plate region detection module 620 may sequentially input the video stream to be recognized into the first slow channel network and the first fast channel network of the trained vehicle and license plate detection model to perform vehicle and license plate position detection, so as to obtain a license plate region image and a time sequence feature of the sequence frame. The license plate recognition module 630 may sequentially input the license plate region image and the timing sequence feature of the sequence frame into the second slow channel network and the second fast channel network of the trained license plate recognition model for license plate recognition, so as to obtain the license plate recognition result and the timing sequence feature of the sequence frame. The output module 640 may output the license plate recognition result of the video stream to be recognized according to the license plate recognition result and the time sequence characteristics of the sequence frame.
An embodiment of the present application discloses a computer-readable storage medium, which includes a computer program that can be loaded by a processor and execute the license plate recognition method 400 based on video stream.
Wherein the computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device; program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
According to the technical scheme, the license plate detection problem is converted into the spatial positions of the learning vehicle and the license plate, and the approximate region of the license plate is determined based on the distance between the vehicle central point and the license plate central point, so that the dependence of license plate detection on the resolution of an input picture can be reduced, and the algorithm learning difficulty is reduced; and the feature extraction tasks are alternately executed through two types of feature extraction networks of a fast channel and a slow channel, the balance of calculation speed and precision can be kept, and the recognition precision is obviously improved by integrating multi-frame license plate recognition results. Therefore, the license plate recognition method based on the video stream can be operated on low-computing-power equipment, and is suitable for high-precision and real-time license plate recognition of large-scene license plate video streams.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed to reflect the intent: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules or units or components of the devices in the examples disclosed herein may be arranged in a device as described in this embodiment or alternatively may be located in one or more devices different from the devices in this example. The modules in the foregoing examples may be combined into one module or may additionally be divided into multiple sub-modules.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Moreover, those skilled in the art will appreciate that although some embodiments described herein include some features included in other embodiments, not others, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
Furthermore, some of the embodiments are described herein as a method or combination of method elements that can be implemented by a processor of a computer system or by other means of performing a function. A processor with the necessary instructions for implementing a method or method elements thus forms an apparatus for implementing the method or method elements. Further, the elements of the apparatus embodiments described herein are examples of the following apparatus: the apparatus is used to implement the functions performed by the elements for the purpose of carrying out the invention.
As used herein, unless otherwise specified the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object, merely indicate that different instances of like objects are being referred to, and are not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as described herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The present invention has been disclosed by way of illustration and not limitation with respect to the scope of the invention, which is defined by the appended claims.

Claims (10)

1. A license plate recognition method based on video streaming, which is suitable for being executed in a computing device, is characterized by comprising the following steps:
acquiring a video stream to be identified;
sequentially inputting the video stream to be recognized into a first slow channel network and a first fast channel network of a trained vehicle and license plate detection model to detect the position of the vehicle and the license plate so as to obtain a license plate region image and a time sequence characteristic of a sequence frame;
sequentially inputting the license plate region image and the time sequence characteristics of the sequence frame into a second slow channel network and a second fast channel network of a trained license plate recognition model for license plate recognition to obtain license plate recognition results and time sequence characteristics of the sequence frame;
and outputting a license plate recognition result of the video stream to be recognized according to the license plate recognition result and the time sequence characteristics of the sequence frame.
2. The license plate recognition method of claim 1,
the trained vehicle and license plate detection model comprises a first slow channel network, a first fast channel network and a first ConvLSTM feature fusion network which are connected in parallel, wherein the first slow channel network comprises 1 first slow channel module, and the first fast channel network comprises 5 first fast channel modules which are connected in series.
3. The license plate recognition method of claim 2,
the first fast channel module uses MobileNetV2 as the backbone, FPN as the neck module and the head module of CenterNet as the prediction branch, and is used for extracting a first feature of an input image at a first time step;
the first slow channel module uses MobileNetV3 as the backbone, FPN as the neck module and the head module of CenterNet as the prediction branch, and is used for extracting a second feature of the input image at a second time step, wherein the first time step is smaller than the second time step;
the first ConvLSTM feature fusion network is used for fusing a first feature obtained by the first fast channel network and a second feature obtained by the first slow channel network so as to obtain the position and time sequence features of the vehicle and the license plate of the input image according to the fused features.
4. The license plate recognition method of claim 1, wherein the trained license plate recognition model comprises a second slow channel network and a second fast channel network connected in parallel and a second ConvLSTM feature fusion network, the second slow channel network comprises 1 second slow channel module, and the second fast channel network comprises 3 second fast channel modules connected in series.
5. The license plate recognition method of claim 4,
the second fast channel module uses MobileNetV2 as the backbone and CRNN as the recognition prediction network, and is used for extracting a third feature of the input image at a third time step;
the second slow channel module uses MobileNetV3 as the backbone and CRNN as the recognition prediction network, and is used for extracting a fourth feature of the input image at a fourth time step, wherein the third time step is smaller than the fourth time step;
and the second ConvLSTM feature fusion network is used for fusing the third features obtained by the second fast channel network and the fourth features obtained by the second slow channel network so as to obtain a license plate recognition result and a time sequence feature of the input image according to the fused features.
6. The license plate recognition method of claim 1, wherein the step of sequentially inputting the video stream to be recognized into a first slow channel network and a first fast channel network of a trained vehicle and license plate detection model to perform vehicle and license plate position detection to obtain a license plate region image and a time sequence feature of a sequence frame comprises:
s1, inputting a current frame image into a first slow channel network of a trained vehicle and license plate detection model for detection to obtain the position of the vehicle and the license plate of the current frame image and a first time sequence characteristic, and determining a first license plate area image of the current frame image based on the position of the vehicle and the license plate of the current frame image;
s3, inputting the next frame of image and the first time sequence feature into a first fast channel network of a trained vehicle and license plate detection model for detection to obtain the position of the vehicle and the license plate of the next frame of image and a third time sequence feature, and determining a second license plate area image of the next frame based on the position of the vehicle and the license plate of the next frame of image;
and repeatedly executing the steps S1 and S3 until the detection of the vehicle and license plate positions of all the sequence frames is completed.
7. The license plate recognition method of claim 6, wherein the step of sequentially inputting the license plate region image and the timing sequence feature of the sequence frame into a second slow channel network and a second fast channel network of the trained license plate recognition model for license plate recognition to obtain the license plate recognition result and the timing sequence feature of the sequence frame comprises:
s2, inputting the first license plate area image into a second slow channel network of a trained license plate recognition model for recognition to obtain a license plate recognition result and a second time sequence characteristic of the current frame image;
s4, inputting the second license plate area image and the second time sequence feature into a second fast channel network of the trained license plate recognition model for recognition to obtain a license plate recognition result and a fourth time sequence feature of the next frame of image;
and repeating the steps S2 and S4 until the license plate recognition of all the sequence frames is completed.
8. A video stream-based license plate recognition apparatus, comprising:
the acquisition module is used for acquiring a video stream to be identified;
the license plate region detection module is used for sequentially inputting the video stream to be recognized into a first slow channel network and a first fast channel network of a trained vehicle and license plate detection model to detect the position of the vehicle and the license plate so as to obtain a license plate region image and time sequence characteristics of a sequence frame;
the license plate recognition module is used for sequentially inputting the license plate region image and the time sequence characteristics of the sequence frame into a second slow channel network and a second fast channel network of the trained license plate recognition model for license plate recognition to obtain a license plate recognition result and time sequence characteristics of the sequence frame;
and the output module is used for outputting the license plate recognition result of the video stream to be recognized according to the license plate recognition result of the sequence frame and the time sequence characteristics.
9. A computing device, characterized by: comprising a memory, a processor and a computer program stored on said memory and executable on said processor, said processor executing the video stream based license plate recognition method according to any of claims 1-7.
10. A computer-readable storage medium comprising a computer program stored thereon, which can be loaded by a processor and executes the video stream-based license plate recognition method according to any one of claims 1 to 7.
CN202211607153.3A 2022-12-14 2022-12-14 License plate recognition method and device based on video stream, computing equipment and storage medium Pending CN115953770A (en)

Priority Applications (1)

Application Number: CN202211607153.3A (published as CN115953770A); Priority Date: 2022-12-14; Filing Date: 2022-12-14; Title: License plate recognition method and device based on video stream, computing equipment and storage medium

Applications Claiming Priority (1)

Application Number: CN202211607153.3A (published as CN115953770A); Priority Date: 2022-12-14; Filing Date: 2022-12-14; Title: License plate recognition method and device based on video stream, computing equipment and storage medium

Publications (1)

Publication Number: CN115953770A; Publication Date: 2023-04-11

Family

ID=87290045

Family Applications (1)

Application Number: CN202211607153.3A; Title: License plate recognition method and device based on video stream, computing equipment and storage medium; Priority Date: 2022-12-14; Filing Date: 2022-12-14

Country Status (1)

Country Link
CN (1) CN115953770A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination