WO2024011889A1 - Information recognition method, device, and storage medium - Google Patents

Information recognition method, device, and storage medium

Info

Publication number
WO2024011889A1
Authority
WO
WIPO (PCT)
Prior art keywords
character
characters
information
image frames
image
Prior art date
Application number
PCT/CN2023/073724
Other languages
English (en)
French (fr)
Inventor
刘洋
吴沛晗
夏涛
赵鑫
郭明杰
Original Assignee
北京京东乾石科技有限公司 (Beijing Jingdong Qianshi Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京京东乾石科技有限公司
Publication of WO2024011889A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/60: Type of objects
    • G06V20/62: Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/625: License plates
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/19: Recognition using electronic means
    • G06V30/191: Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19147: Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08: Detecting or categorising vehicles

Definitions

  • This application relates to the field of information recognition technology, and provides an information recognition method, device, and storage medium.
  • License plate recognition is a very important link in vehicle management, and it arises in different scenarios, such as recognition at gates and recognition at loading platforms, which usually rely on computer vision methods.
  • Solutions in academia generally pursue accurate recognition of license plates appearing in a single picture, do not consider the need to continuously recognize surveillance images in real-life scenarios, and are relatively poor at coping with interference.
  • Existing solutions in industry are mostly used to recognize the front license plate of a vehicle, with the camera relatively close to the plate and the scene relatively simple. These solutions therefore achieve low recognition accuracy for license plate information in more complex scenarios with more interference.
  • The information recognition method, device, and storage medium provided by the embodiments of the present application can improve the recognition accuracy of license plate information.
  • An embodiment of the present application provides an information recognition method, including:
  • collecting the video stream of a predetermined area in real time, and extracting, based on the sliding-window method, N starting image frames corresponding to the current moment from the video stream, N being an integer greater than 1;
  • if the first object in the N starting image frames is detected to be in the starting state, performing deduplication and optimization on the characters in the image of the region to be measured in each image frame to obtain a character recognition result, and stopping when the first object in N termination image frames is detected to be in the termination state, thereby obtaining multiple character recognition results; the N termination image frames are extracted at a termination moment an arbitrary length of time after the current moment in the video stream, and each image frame belongs to the multiple image frames extracted from the video stream between the current moment and the termination moment; and
  • determining a target character recognition result based on the multiple character recognition results.
  • An embodiment of the present application also provides an information recognition device, including:
  • a collection and extraction module, configured to collect the video stream of a predetermined area in real time and extract, based on the sliding-window method, N starting image frames corresponding to the current moment from the video stream, N being an integer greater than 1;
  • a detection processing module, configured to, if the first object detected in the N starting image frames is in the starting state, perform deduplication and optimization on the characters in the image of the region to be measured in each image frame to obtain a character recognition result, stopping when the first object in N termination image frames is detected to be in the termination state and obtaining multiple character recognition results; the N termination image frames are extracted at a termination moment an arbitrary length of time after the current moment in the video stream, and each image frame belongs to the multiple image frames extracted from the video stream between the current moment and the termination moment; and
  • a determining module, configured to determine a target character recognition result based on the multiple character recognition results.
  • An embodiment of the present application also provides an information recognition device, which includes a memory and a processor. The memory stores a computer program that can run on the processor, and when the processor executes the program, the steps of the above method are implemented.
  • Embodiments of the present application also provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the above method are implemented.
  • In the embodiments of the present application, the video stream of a predetermined area is collected in real time, and N starting image frames corresponding to the current moment are extracted from the video stream based on the sliding-window method, N being an integer greater than 1. If the first object in the N starting image frames is detected to be in the starting state, the characters in the image of the region to be measured in each image frame are deduplicated and optimized to obtain a character recognition result; this stops when the first object in N termination image frames is detected to be in the termination state, yielding multiple character recognition results. The N termination image frames are extracted at a termination moment an arbitrary length of time after the current moment in the video stream, and each image frame belongs to the multiple image frames extracted from the video stream between the current moment and the termination moment. Based on the multiple character recognition results, the target character recognition result is determined. Since this solution captures and recognizes the license plate information during the arrival and departure of the first object, and simultaneously deduplicates and optimizes the recognized characters, it overcomes the difficulty of format diversification in license plate recognition and can thereby improve the recognition accuracy of license plate information.
  • Figure 1 is an optional flow diagram of the information identification method provided by the embodiment of the present application.
  • Figure 2 is an optional schematic diagram of the effect of the information identification method provided by the embodiment of the present application.
  • Figure 3 is an optional flow diagram of the information identification method provided by the embodiment of the present application.
  • Figure 4 is an optional schematic diagram of the effect of the information identification method provided by the embodiment of the present application.
  • Figure 5 is an optional schematic diagram of the effect of the information identification method provided by the embodiment of the present application.
  • Figure 6 is an optional schematic diagram of the effect of the information identification method provided by the embodiment of the present application.
  • Figure 7 is an optional schematic diagram of the effect of the information identification method provided by the embodiment of the present application.
  • Figure 8 is an optional flow diagram of the information identification method provided by the embodiment of the present application.
  • Figure 9 is an optional flow diagram of the information identification method provided by the embodiment of the present application.
  • Figure 10 is an optional schematic diagram of the effect of the information identification method provided by the embodiment of the present application.
  • Figure 11 is an optional flow diagram of the information identification method provided by the embodiment of the present application.
  • Figure 12 is an optional flow diagram of the information identification method provided by the embodiment of the present application.
  • Figure 13 is an optional flow diagram of the information identification method provided by the embodiment of the present application.
  • Figure 14 is an optional schematic diagram of the effect of the information identification method provided by the embodiment of the present application.
  • Figure 15 is a schematic structural diagram of an information identification device provided by an embodiment of the present application.
  • Figure 16 is a schematic diagram of a hardware entity of an information identification device provided by an embodiment of the present application.
  • The terms "first/second/third" are only used to distinguish similar objects and do not imply a specific ordering of objects. It can be understood that "first/second/third" may be interchanged in a specific order or sequence where permitted, so that the embodiments of the present application described here can be performed in an order other than that shown or described.
  • Vehicle management is a crucial link in the logistics field. Effective management and dispatch of vehicles can ensure the safe and efficient flow of goods and ensure the quality of logistics services.
  • Among them, the correct identification of vehicle information is a basic prerequisite for resource allocation in logistics scenarios. For example, when vehicles enter a logistics park to load and unload goods, they need to strictly follow a schedule made in advance; identifying the information of vehicles entering the park helps confirm whether they arrive on time and can start operations as planned. In addition, each vehicle must be strictly matched to its corresponding cargo information.
  • the warehouse needs to start preparations for loading/unloading when the vehicle arrives at the park, and allocate platforms based on resource occupancy. License plate recognition in the platform scene can help verify vehicle information and confirm that the vehicle has arrived at the correct loading and unloading location.
  • the first step is license plate detection, which is to locate the pixel location of the license plate in the picture.
  • the second step is character recognition, which is to identify the content of each character in the license plate and then form a license plate.
  • Current automatic license plate recognition systems based on computer vision can be divided into academic and industrial solutions, with the limitations described above: academic solutions pursue accurate recognition in a single picture and cope poorly with interference, while industrial solutions mostly recognize the front plate with the camera close to the plate.
  • The scene in such industrial solutions is relatively simple, such as the license plate recognition system in a parking lot.
  • In places like platforms used for vehicle loading and unloading, most vehicles reverse into the warehouse to facilitate loading and unloading, so the camera mostly captures the rear license plate of the vehicle.
  • Rear license plates come in two formats, single-line and double-line (the rear plate of a yellow-plate vehicle has two lines). Compared with the front plate, the rear plate of a vehicle in a logistics scene is dirtier, and the character portion of the plate is often blocked.
  • The captured license plate is generally small, prone to occlusion, and affected by lighting, angle, and other factors.
  • Conventional automatic license plate recognition systems therefore achieve very low accuracy in this scenario.
  • Figure 1 is an optional flow diagram of the information recognition method provided by the embodiment of the present application, which will be described in conjunction with the steps shown in Figure 1.
  • S101 Collect the video stream of a predetermined area in real time, and extract N starting image frames corresponding to the current moment in the video stream based on the sliding window method.
  • the information recognition device collects the video stream of a predetermined area in real time, and extracts N starting image frames corresponding to the current moment in the video stream based on the sliding window method.
  • N is an integer greater than 1.
  • the information recognition device collects the video stream of the predetermined area through a camera installed in the predetermined area.
  • In some embodiments, the information recognition device extracts an image frame every predetermined number of frames, and uses the sliding-window method to determine the N starting image frames corresponding to the current moment.
  • the information recognition device extracts the image frame corresponding to the current moment in the video stream, and extracts multiple image frames at forward intervals along the time axis to obtain N starting image frames.
  • the information recognition device may be a terminal or a server connected to a camera in a predetermined area.
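The sliding-window frame extraction described above can be sketched in Python. This is an illustrative sketch only, not the patent's code; the sampling interval and window size N are assumed values:

```python
from collections import deque

def sliding_window(frame_indices, sample_interval=6, n=20):
    """Keep one frame every `sample_interval` frames; the window holds
    the N most recent sampled frames for the current moment.
    (sample_interval=6 and n=20 are illustrative assumptions.)"""
    window = deque(maxlen=n)
    for idx in frame_indices:
        if idx % sample_interval == 0:
            window.append(idx)
    return list(window)
```

As new frames arrive, the deque automatically discards the oldest sample, so the window always reflects the N frames closest to the current moment.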
  • If the information recognition device detects that the first object in the N starting image frames is in the starting state, it performs deduplication and optimization on the characters in the image of the region to be measured in each image frame to obtain a character recognition result, and stops when the first object in N termination image frames is detected to be in the termination state, obtaining multiple character recognition results.
  • The N termination image frames are extracted at a termination moment an arbitrary length of time after the current moment in the video stream; each image frame belongs to the multiple image frames extracted from the video stream between the current moment and the termination moment.
  • If the information recognition device detects that the first object in the N starting image frames is approaching or moving away from the target area, it processes each image frame extracted from the video stream between the current moment and the termination moment, extracting the characters and corresponding character-related information in the image of the region to be measured in each frame.
  • The information recognition device can then deduplicate and optimize the multiple characters using the character-related information, obtaining a character recognition result for each image frame.
  • For example, the information recognition device detects a change in the Y-axis coordinate of the first object across the N starting image frames.
  • It then processes each image frame extracted from the video stream from the current moment onward, extracting the characters and corresponding character-related information in the image of the region to be measured in each frame.
  • Using the character-related information, it deduplicates and optimizes the characters to obtain a character recognition result for each frame, until, in the N termination image frames corresponding to a termination moment one minute after the current moment, it detects that the first object is stationary, at which point it stops and obtains multiple character recognition results.
  • the process of license plate information recognition mainly includes: 1. Vehicle body detection. 2. Determine vehicle status. 3. License plate detection. 4. License plate character recognition.
  • Vehicle body detection is used to confirm whether there are vehicles in the platform area; whether a vehicle arrives at or leaves the platform is then determined from the detected position changes of the vehicle. Since the license plate is also one of the salient features of a vehicle, this solution could also be simplified to using license plate detection to determine whether there is a vehicle in the platform area, and determining arrivals and departures from the position changes of the plate.
  • However, the license plate is a smaller target, harder to detect, and may be blocked, so this solution prefers vehicle body detection to determine whether there are vehicles in the platform area.
  • In addition, this solution also proposes an engineering solution based on edge deployment, with details as follows:
  • the information recognition device uses the target detection method in computer vision to solve the problem of vehicle body detection, and determines whether there is a vehicle in the platform area through the video stream collected by the camera.
  • the information recognition device can identify the rear compartment of the vehicle.
  • a method based on deep learning is used for detection here.
  • the rear compartment of the vehicle needs to be marked with a rectangular frame, as shown in Figure 2.
  • the trunk box in the vehicle can be labeled, and the labeled data can be used for training.
  • Since target detection is a relatively mature technology in the field of computer vision, there are a variety of algorithms to choose from: the You Only Look Once (YOLO) series, Region-CNN (R-CNN) algorithms, and CenterNet. In the embodiment of this application, YOLOv5 is selected; in other embodiments, other algorithms may also be used.
  • the information recognition device determines the status of the vehicles in the current platform area and the occupied status of the platform through the recognized vehicle position changes.
  • the information recognition device can summarize the recognized information into vehicle status, platform status and events.
  • the vehicle status is divided into no vehicle, vehicle in the warehouse, vehicle out of the warehouse, and vehicle parked.
  • the platform status is divided into occupied and released. Events include vehicle arrival at the platform, vehicle departure from the platform, vehicle position adjustment and vehicle temporary parking.
  • the judgment method is as follows:
  • the information recognition device can determine the position change of the rear compartment detected by the first preset detection model.
  • First, a surveillance camera periodically captures images of the predetermined (platform) area, for example at 5 frames per second, and the vehicle body position is detected in each frame.
  • Each time, the information recognition device makes a judgment using the vehicle body positions in the current image frame and the N-1 image frames closest to it. If the vehicle body is detected in at least M of the N starting image frames and the body is moving closer to the platform, the vehicle is determined to be entering the warehouse; if the body is moving away from the platform, the vehicle is determined to be leaving.
  • If the body is detected in at least M frames but its position does not change significantly, the vehicle is in the parked state; if the body is detected in fewer than M frames, there is no vehicle. Since vehicle body recognition may produce false or missed detections, anomaly detection must first be performed before the dynamic judgment to eliminate their interference.
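The M-of-N state judgment above can be sketched as a small Python function. This is an illustrative sketch under assumed inputs (per-frame distances from the detected body to the platform, with None where detection failed) and assumed thresholds, not the patent's implementation:

```python
def vehicle_state(distances, m=3, move_threshold=1.0):
    """distances: body-to-platform distance per window frame, or None
    where no body was detected.  m and move_threshold are assumptions."""
    seen = [d for d in distances if d is not None]
    if len(seen) < m:
        return "no_vehicle"          # body found in fewer than M frames
    delta = seen[-1] - seen[0]       # net position change over the window
    if abs(delta) < move_threshold:
        return "parked"              # detected, but no significant movement
    return "entering" if delta < 0 else "leaving"
```

A real deployment would also run the anomaly filtering mentioned above before this judgment, so that a single false detection does not flip the state.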
  • The initial platform status can be configured according to the actual situation: if a vehicle is on the platform at that time, the status is occupied; if not, it is released.
  • the reporting of events needs to be combined with the status of the vehicle and the platform.
  • An event is defined by a start state and an end state. Approaching the platform and moving away from the platform are the starting states, and the car-free state and parking state are the ending states.
  • When the platform status is released, a vehicle-status change from entering to parked is a vehicle-arrival event; after the event, the platform status is updated to occupied.
  • When the platform status is released, a vehicle-status change from entering to no-vehicle is a temporary-parking event; after the event, the platform status remains released.
  • When the platform status is occupied, a vehicle-status change from entering/leaving to parked is a vehicle-adjustment event; after the event, the platform status remains occupied.
  • When the platform status is occupied, a vehicle-status change from entering/leaving to no-vehicle is a vehicle-departure event; after the event, the platform status is updated to released.
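The four event rules above amount to a lookup on (platform status, start state, end state) plus a platform-status update. A minimal Python sketch follows; the English state and event labels are illustrative paraphrases, not the patent's terms:

```python
# Event table paraphrased from the four rules above.
EVENTS = {
    ("released", "entering", "parked"):     "vehicle_arrived",
    ("released", "entering", "no_vehicle"): "temporary_parking",
    ("occupied", "entering", "parked"):     "position_adjusted",
    ("occupied", "leaving",  "parked"):     "position_adjusted",
    ("occupied", "entering", "no_vehicle"): "vehicle_departed",
    ("occupied", "leaving",  "no_vehicle"): "vehicle_departed",
}

def report_event(platform, start_state, end_state):
    """Map a status transition to an event and the updated platform status."""
    event = EVENTS.get((platform, start_state, end_state))
    if event == "vehicle_arrived":
        platform = "occupied"
    elif event == "vehicle_departed":
        platform = "released"
    return event, platform
```

Keeping the rules in a table makes it easy to audit that every (platform, transition) combination maps to exactly one event.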
  • the information identification device can also form corresponding identification information for each event. After the information recognition device recognizes the target character recognition result, the target character recognition result can be mapped and stored with the identification information of the corresponding event.
  • the information recognition device determines the target character recognition result based on multiple character recognition results.
  • When the information recognition device obtains each character recognition result, it forms character category probability information for all characters in that result.
  • The information recognition device can then determine the character recognition result whose characters have the largest sum of character category probabilities as the target character recognition result.
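The selection rule above (largest sum of per-character category probabilities wins) can be sketched in a few lines of Python; the candidate data below are made-up examples:

```python
def select_target_result(candidates):
    """candidates: list of (plate_string, per_character_probabilities).
    Returns the string whose character-probability sum is largest."""
    best = max(candidates, key=lambda c: sum(c[1]))
    return best[0]
```

Summing (rather than averaging) the probabilities also favors candidates in which more characters were confidently detected, which helps when some frames only capture part of the plate.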
  • In summary, the video stream of a predetermined area is collected in real time, and N starting image frames corresponding to the current moment are extracted from the video stream based on the sliding-window method, N being an integer greater than 1. If the first object in the N starting image frames is detected to be in the starting state, the characters in the image of the region to be measured in each image frame are deduplicated and optimized to obtain a character recognition result; this stops when the first object in N termination image frames is detected to be in the termination state, yielding multiple character recognition results. The N termination image frames are extracted at a termination moment an arbitrary length of time after the current moment in the video stream, and each image frame belongs to the multiple image frames extracted from the video stream between the current moment and the termination moment. Based on the multiple character recognition results, the target character recognition result is determined. Since this solution captures and recognizes the license plate information during the arrival and departure of the first object, and simultaneously deduplicates and optimizes the recognized characters, it overcomes the difficulty of format diversification in license plate recognition and can thereby improve the recognition accuracy of license plate information.
  • Figure 3 is an optional flow diagram of the information recognition method provided by the embodiment of the present application. S102 shown in Figure 1 can also be implemented through S104 to S107, which will be explained in conjunction with each step.
  • the information recognition device detects N starting image frames to obtain a starting detection result.
  • the information recognition device can detect N starting image frames through the Yolov5 model to obtain the starting detection result.
  • other detection models may also be used, which are not limited in the embodiments of this application.
  • If the initial detection result indicates that the first object is approaching or moving away from the target area, the information recognition device takes the first of the N starting image frames as the starting point and processes each image frame extracted from the video stream to obtain the corresponding image of the region to be measured.
  • For example, if the initial detection result indicates that the Y coordinate of the first object's center point is increasing or decreasing, each extracted image frame is processed through the Warped Planar Object Detection Network (WPOD-NET) model to obtain the corresponding image of the region to be measured.
  • After the information recognition device detects the trigger event, it performs license plate recognition on each image frame returned by the surveillance camera between the trigger event and the end of the event; license plate detection technology is needed here.
  • The information recognition device uses WPOD-NET to detect license plates, and the training data must first be labeled.
  • The labeling method is as follows: given a picture containing a license plate, label the four vertices of the plate, starting from the upper-left corner and proceeding counterclockwise, as shown in Figure 4 (the four plate vertices in the figure are connected with straight lines). After annotation is complete, a large amount of annotated data is input into the model for training, yielding trained model parameters; given an input picture containing a license plate, the model then outputs the coordinates of the four plate vertices in the picture.
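The labeling convention (upper-left vertex first, then counterclockwise) can be made concrete with a small helper. Note that in image coordinates the y axis grows downward, so counterclockwise on screen runs top-left, bottom-left, bottom-right, top-right. This is an illustrative sketch, not part of the patent:

```python
def order_plate_vertices(pts):
    """Order four (x, y) plate corners starting from the top-left and
    proceeding counterclockwise in image coordinates (y grows downward)."""
    by_sum = sorted(pts, key=lambda p: p[0] + p[1])
    tl, br = by_sum[0], by_sum[-1]      # smallest/largest x+y
    rest = [p for p in pts if p not in (tl, br)]
    # bottom-left has the largest y - x, top-right the smallest
    bl, tr = sorted(rest, key=lambda p: p[1] - p[0], reverse=True)
    return [tl, bl, br, tr]
```

A consistent vertex ordering matters because the model is trained to regress the four corners in a fixed sequence; shuffled corners would teach it contradictory targets.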
  • Training data are obtained by combining manually annotated data with automatically generated data, increasing data diversity while requiring only a small amount of manual collection and annotation.
  • The specific method is to randomly generate a license plate image, select an annotated image as the background, and paste the generated plate onto the plate position of the image via an affine transformation to create a training sample, so that the four vertices of the generated plate coincide with the four annotated vertex positions. Since the four plate vertices were manually annotated beforehand, these four annotated points can be used directly as labels for the generated images. The generated pictures can also be augmented through brightness transformations and other methods, as shown in Figure 5: an affine transformation replaces the plate "Lu B 12345" in the original image with the generated plate "Su D 12346".
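The geometric core of this paste step is solving an affine transform from corner correspondences; three point pairs fully determine a 2x3 affine matrix. The NumPy sketch below shows only that geometry (the point values are made up, and actual pixel warping would use an image-processing library):

```python
import numpy as np

def affine_from_points(src, dst):
    """Solve the 2x3 affine matrix M so that M @ [x, y, 1] maps each of
    the three src points onto the corresponding dst point."""
    A = np.array([[x, y, 1.0] for x, y in src])   # 3x3
    B = np.array(dst, dtype=float)                # 3x2
    return np.linalg.solve(A, B).T                # 2x3

def apply_affine(M, p):
    x, y = p
    return tuple(M @ np.array([x, y, 1.0]))
```

If the plate region in the background really is an affine image of the flat plate, mapping three generated-plate corners onto three annotated corners sends the fourth corner to the fourth annotated vertex automatically, which is what lets the annotated points double as labels for the generated sample.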
  • the information recognition device uses a preset target detection model to process the image of the area to be detected, and obtains multiple characters and corresponding character-related information.
  • the information recognition device can use the Yolov5 model to process the image of the area to be tested to obtain multiple characters and corresponding character-related information.
  • Other models may be used in other embodiments, and are not limited in the embodiments of this application.
  • the information recognition device needs to recognize the license plate characters to determine the license plate number.
  • This solution uses target detection as the basis for license plate character recognition, that is, the target detection method is used to identify the characters and their positions on the image of the area to be measured.
  • This part uses Yolov5 to train on the training data set. Through the trained target detection model, we can get the category and position of each character, as shown in Figure 6.
  • For example, the target detection model can recognize that the image of the region to be measured includes the characters "F2242", obtaining for each character its category, location information, and category probability: for "F", category F with probability 0.91; for each "2", category 2 with probability 0.90; and for "4", category 4 with probability 0.90.
  • This step can also be completed using other target detection algorithms, but since Yolov5 is a relatively mature algorithm at present, the detection accuracy and operating efficiency can be guaranteed. Yolov5 is used here. Training the Yolov5 model requires a large amount of labeling data. Similar to the license plate detection part, data collection is difficult and labeling is cumbersome.
  • S107 Use character-related information to perform deduplication and optimization processing on multiple characters to obtain a character recognition result composed of characters in a certain sequence until the termination detection result indicates that the first object is in a state of no displacement change, or that there is no first object at the termination moment. Stop when you get multiple character recognition results.
  • the information recognition device uses character-related information to perform deduplication and optimization processing on multiple characters to obtain a character recognition result consisting of characters in a certain sequence. Until the detection result is terminated, it indicates that the first object is in a state of no displacement change, or the detection result is terminated. Stop when there is no first object at the moment and obtain multiple character recognition results. Among them, the termination detection result is obtained by detecting corresponding N termination image frames.
  • the information recognition device uses character-related information to deduplicate repeatedly recognized characters among multiple characters, and eliminates irrelevant characters outside the license plate area to obtain a character recognition result consisting of characters in a certain sequence.
  • the information recognition device stops when the termination detection result indicates that the first object is in a state of no displacement change, or that there is no first object at the termination moment, and obtains multiple character recognition results.
  • the information recognition device first determines the status of the vehicle by detecting the vehicle body, and detects the extracted image frames while the vehicle approaches or moves away. Since this solution captures and recognizes the license plate information during the arrival and departure of the vehicle, and at the same time uses the character-related information to deduplicate and optimize the recognized characters, it overcomes the difficulty of format diversification in the license plate recognition process and thereby improves the recognition accuracy of license plate information.
  • FIG 8 is an optional flow diagram of the information identification method provided by the embodiment of the present application.
  • S104 to S107 shown in Figure 3 can also be implemented through S108 to S113, which will be explained in conjunction with each step.
  • the information recognition device uses the first preset detection model to process the N starting image frames to obtain the starting detection results.
  • the first preset detection model may be the Yolov5 model.
  • each image frame is processed through the second preset detection model to obtain an image of the area to be detected.
  • the second preset detection model may be a WPOD-NET model, and there is no limit to the second preset detection model in the embodiments of this application.
  • if the information recognition device detects that the initial detection result indicates that at least 15 of the 20 initial image frames include vehicles, and the distance between the vehicle and the platform is becoming larger or smaller, then each image frame is processed by the second preset detection model to obtain an image of the area to be detected.
  • the information recognition device uses a preset target detection model to process the image of the area to be detected, and obtains multiple characters, character category probability information and character frame information corresponding to the multiple characters respectively.
  • the preset target detection model can be the Yolov5 model.
  • the character frame information may include: character frame area information, character frame center point coordinate information and character frame vertex coordinate information.
  • S111 Use multiple character frame information to calculate an overlap matrix, and combine the overlap matrix to perform deduplication processing on multiple characters to obtain multiple first characters.
  • the information recognition device uses multiple character frame information to calculate an overlap matrix, and combines the overlap matrix to perform deduplication processing on multiple characters to obtain multiple first characters.
  • the information recognition device can obtain multiple overlap degrees by using the ratio of the overlap area of the character frames of any two characters to the sum of the areas of the character frames of the two characters.
  • the information recognition device can use multiple overlap degrees to construct an overlap degree matrix in order.
  • the information recognition device combines the overlap degree matrix to perform deduplication processing on multiple characters to obtain multiple first characters.
  • the plurality of characters includes: T characters; the plurality of character frame area information includes: T character frame area information; T is an integer greater than 1.
  • based on the character frame area information of the 1st character to the T-th character, the information recognition device calculates the first group of T overlap degrees between the 1st character and each of the 1st to T-th characters, and uses the first group of T overlap degrees to construct the first row of the overlap degree matrix; this continues until the T-th group of T overlap degrees, between the T-th character and each of the 1st to T-th characters, is calculated and used to construct the last row of the overlap degree matrix, thereby obtaining the overlap degree matrix.
  • one character may be recognized as multiple different classes, or as multiple instances of the same class, resulting in an increase in the number of recognized characters.
  • the overlap degree indicator is used to measure the overlapping range of the two character frames.
  • Overlap degree = overlapping area / total area. The overlap degree is calculated pairwise for all characters to form an overlap degree matrix D.
  • Dij represents the overlap degree of character i and character j.
  • confidence represents the category probability of the current character. If Dij is greater than thresh and confidence(i) < confidence(j), the character with the smaller probability is eliminated: the i-th row and i-th column in D are deleted, and the D matrix is traversed again until all values in the D matrix are less than thresh.
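The overlap-based deduplication above can be sketched as follows. This is a minimal illustrative sketch, not the claimed implementation: the box format (x, y, w, h), the sample characters, and the thresh value are assumptions; the overlap degree follows the ratio described earlier (intersection area divided by the sum of the two box areas).

```python
# Sketch of overlap-matrix deduplication: repeatedly find a pair of
# character boxes whose overlap degree exceeds thresh, and delete the
# row/column of the character with the smaller category probability.

def overlap_degree(a, b):
    """Overlap of two boxes (x, y, w, h): intersection / (area_a + area_b)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0, min(ay + ah, by + bh) - max(ay, by))
    return iw * ih / (aw * ah + bw * bh)


def deduplicate(chars, thresh=0.2):
    """chars: list of (label, box, confidence). Returns surviving chars."""
    alive = list(range(len(chars)))
    while True:
        pair = None
        for i in alive:                       # scan the implicit matrix D
            for j in alive:
                if i != j and overlap_degree(chars[i][1], chars[j][1]) > thresh:
                    pair = (i, j)
                    break
            if pair:
                break
        if pair is None:                      # all entries below thresh
            return [chars[i] for i in alive]
        i, j = pair                           # drop the less confident one
        alive.remove(i if chars[i][2] < chars[j][2] else j)


chars = [
    ("2", (26, 20, 12, 22), 0.90),
    ("7", (27, 21, 12, 22), 0.40),  # duplicate reading of the same glyph
    ("4", (54, 20, 12, 22), 0.90),
]
print([c[0] for c in deduplicate(chars)])  # ['2', '4']
```

Note that with this denominator the overlap degree never exceeds 0.5, so thresh must be chosen accordingly.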
  • the information recognition device combines the first character frame information corresponding to the plurality of first characters to eliminate abnormal characters among the plurality of first characters to obtain a plurality of second characters.
  • the information recognition device combines the first character frame center point information corresponding to the multiple first characters to calculate the density of each first character relative to the other characters.
  • the information recognition device eliminates characters with the smallest density to obtain multiple second characters.
  • S113 Use the second character frame information corresponding to the plurality of second characters to perform hierarchical sorting processing on the plurality of second characters to obtain a character recognition result, stopping when the termination detection result indicates that the distance between the first object and the target area does not change in at least M of the N termination image frames, or that only K of the N termination image frames include the first object, and obtaining multiple character recognition results.
  • the information recognition device uses the second character frame information corresponding to the plurality of second characters to perform hierarchical sorting processing on the plurality of second characters to obtain character recognition results, stopping when the termination detection result indicates that the distance between the first object and the target area does not change in at least M of the N termination image frames, or that only K of the N termination image frames include the first object, and obtaining multiple character recognition results. K is an integer greater than or equal to 1 and less than M.
  • the information recognition device can obtain more accurate target character recognition results by performing deduplication processing on multiple characters, eliminating abnormal points, and hierarchical sorting processing.
  • FIG 9 is an optional flow diagram of the information identification method provided by the embodiment of the present application.
  • S111 to S112 shown in Figure 8 can also be implemented through S114 to S118, which will be explained in conjunction with each step.
  • the information recognition device uses multiple character frame area information included in the multiple character frame information to calculate the overlap degree between any two characters, and constructs an overlap degree matrix based on the obtained multiple overlap degrees.
  • the information recognition device compares multiple overlap degrees with preset thresholds in sequence to obtain a comparison result for each overlap degree.
  • if the information recognition device detects that the comparison result indicates that the corresponding first overlap degree is greater than the preset threshold, the corresponding rows and columns of the overlapping characters in the overlap matrix are deleted to obtain a new overlap matrix, until all overlap degrees in the overlap matrix are less than the preset threshold, at which point the target matrix is obtained.
  • the overlapping characters are the characters with the smallest character category probability corresponding to the first degree of overlap.
  • the overlap degree matrix D = [d11, d12, d13, d14, …]
  • d21 is the overlap degree of character 2 and character 1. If the character category probability of character 2 is less than the character category probability of character 1, the information recognition device can delete the second row and the second column in D.
  • the information recognition device determines a plurality of first characters through the characters corresponding to the degree of overlap in any row included in the target matrix.
  • the characters corresponding to the overlap degree of any row of the target matrix include character 2 and character 3.
  • the plurality of first characters may include: character 2 and character 3.
  • S118 Process the plurality of first character frame information through the preset abnormal point detection model, eliminate the abnormal character whose density relative to the other first characters is the lowest among the plurality of first characters, and obtain a plurality of second characters.
  • the information recognition device processes the multiple character frame information through a preset abnormal point detection model, eliminates the abnormal character whose density relative to the other first characters is the lowest among the multiple first characters, and obtains multiple second characters.
  • the preset outlier detection model can be a local outlier factor (LOF) outlier detection model.
  • the preset target detection model incorrectly identified the "1" in the lower left corner.
  • the position of this character is obviously different from the area of the other characters in the set, so the LOF outlier detection algorithm is used to eliminate it.
  • the LOF algorithm mainly determines whether a point is an outlier by comparing the density of each point p with that of its neighbor points: the lower the density of point p, the more likely it is to be identified as an outlier. Density is calculated from the distance between points: the farther apart the points, the lower the density; the closer the points, the higher the density.
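The density idea above can be sketched as follows. This is a simplified stand-in, not the full LOF algorithm: density is taken here as the inverse of the mean distance to the k nearest character centres, whereas real LOF compares local reachability densities. The sample centres (including the stray "1") and k are assumptions.

```python
import math


def density(idx, points, k=2):
    """Inverse mean distance from points[idx] to its k nearest neighbours."""
    p = points[idx]
    dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != idx)
    return 1.0 / (sum(dists[:k]) / k)


def drop_outlier(centres):
    """Eliminate the single character centre with the lowest density."""
    dens = [density(i, centres) for i in range(len(centres))]
    worst = dens.index(min(dens))
    return [p for i, p in enumerate(centres) if i != worst]


# four plate characters on one row plus a stray detection far below them
centres = [(10, 20), (24, 20), (38, 20), (52, 20), (5, 60)]
print(drop_outlier(centres))  # the stray (5, 60) is removed
```

A production system would more likely call a library LOF implementation (e.g. scikit-learn's `LocalOutlierFactor`) on the character centre coordinates.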
  • the information recognition device constructs an overlap matrix by calculating the overlap between any two characters, and then uses the overlap matrix to perform deduplication processing on the multiple characters; the deduplication effect is good, and target character recognition results with higher accuracy can be obtained.
  • FIG 11 is an optional flow diagram of the information identification method provided by the embodiment of the present application.
  • S113 shown in Figure 8 can also be implemented through S119 to S123, which will be explained in conjunction with each step.
  • the information recognition device performs linear fitting on multiple center point information in the multiple second character frame information to obtain a fitting line, and calculates the sum of square errors of the multiple center point information from the fitting line.
  • the information recognition device determines the number of character layers of the character recognition result based on the sum of square errors.
  • the information recognition device determines that if the sum of squared errors is greater than the second threshold, the number of character layers is two; if the sum of squared errors is not greater than the second threshold, the number of character layers is one.
  • the second threshold can be any real number.
  • the information recognition device first performs linear fitting on the center coordinates of all characters and calculates the mean square error (MSE), i.e., the sum of squared errors from all points to the fitted straight line.
  • if the MSE is greater than the threshold thresh, the plate is determined to be a double-layer license plate (because the error would be very small for a single-layer plate); otherwise it is a single-layer license plate. If it is judged to be a double-layer license plate, 5 points are selected from all the characters, giving a total of 21 combinations; linear fitting is performed on each combination to calculate the MSE, and the character combination with the smallest MSE is retained. Because the five characters of the lower license plate lie on a straight line, their error is certainly the smallest among all combinations, as shown in Figure 6, so the "F2242" character combination is retained as the lower-layer license plate.
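The layer-count test above can be sketched as follows. This is an illustrative sketch under stated assumptions: thresh is an arbitrary value, the sample centres are invented (two upper-layer characters and five lower-layer ones), and the "MSE" is computed as the sum of squared errors to a least-squares line, matching the description in the text.

```python
from itertools import combinations


def fit_sse(points):
    """Least-squares line fit y = a*x + b; return the sum of squared errors."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    a = sxy / sxx if sxx else 0.0
    b = my - a * mx
    return sum((y - (a * x + b)) ** 2 for x, y in points)


def split_layers(centres, thresh=5.0):
    """Return (lower, upper) character centres; upper is empty for one layer."""
    if fit_sse(centres) <= thresh:
        return list(centres), []          # single-layer plate
    # double layer: keep the 5-point combination with the smallest error
    best = min(combinations(centres, 5), key=fit_sse)
    lower = list(best)
    upper = [p for p in centres if p not in lower]
    return lower, upper


# two upper-layer characters and five lower-layer ones (the "F2242" row)
centres = [(18, 10), (40, 10),
           (10, 40), (24, 40), (38, 40), (52, 40), (66, 40)]
lower, upper = split_layers(centres)
print(len(lower), len(upper))  # 5 2
```

With 7 characters, `combinations(centres, 5)` yields exactly the 21 combinations mentioned in the text, and the collinear lower row wins with zero error.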
  • the information recognition device extracts a plurality of character combinations including any P second characters from the plurality of second characters.
  • P is an integer greater than 1 and less than T;
  • the information recognition device splices the plurality of second characters in order of the size of their corresponding abscissa information to obtain the character recognition result.
  • the information recognition device performs linear fitting on multiple character combinations, calculates multiple error sums of squares, and determines the P third characters included in the character combination corresponding to the smallest error sum of squares.
  • the information recognition device performs linear fitting on the center point coordinate information of characters in multiple character combinations to obtain multiple sets of fitting lines.
  • the information recognition device calculates the error sum of squares corresponding to the characters in each group, obtaining multiple error sums of squares.
  • the information recognition device splices the remaining characters in order of the size of the abscissa information in their corresponding second character frame information, and then splices the P third characters after them in order of the size of their corresponding abscissa information, to obtain the character recognition result.
  • the remaining characters are characters other than the P third characters among the plurality of second characters.
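The splicing rule above can be sketched as follows: remaining (upper-layer) characters first, then the P third (lower-layer) characters, each layer ordered by the x coordinate of its boxes. The sample characters are invented for illustration.

```python
def splice(upper, lower):
    """Join two character layers, each sorted by x coordinate."""
    order = lambda layer: "".join(ch for _, ch in sorted(layer))
    return order(upper) + order(lower)


upper = [(40, "B"), (18, "A")]                                   # (x, character)
lower = [(24, "2"), (10, "F"), (52, "4"), (38, "2"), (66, "2")]  # "F2242" row
print(splice(upper, lower))  # ABF2242
```

For a single-layer plate the same function applies with an empty upper layer.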
  • the information recognition device performs linear fitting on a plurality of second characters, and determines the number of character layers through the sum of square errors between each character and the fitting line. Since the number of layers of license plate characters is uncertain in real-life scenarios, this solution can more accurately determine the number of character layers to obtain target character recognition results with accurate recognition results.
  • FIG 12 is an optional flow diagram of the information identification method provided by the embodiment of the present application.
  • S101 shown in Figure 1 can also be implemented through S124 to S126, which will be explained in conjunction with each step.
  • the information recognition device extracts the current image frame corresponding to the current moment in the video stream.
  • the camera collects the video stream of the current predetermined area in real time and sends it to the information recognition device.
  • after receiving the real-time video stream, the information recognition device extracts the image frame corresponding to the current moment.
  • the information recognition device takes the current image frame as a starting point in the video stream and extracts an image frame every predetermined time period or a predetermined number of image frames along the time axis until N-1 image frames are extracted.
  • the video stream includes multiple image frames distributed along the time axis.
  • the information recognition device can take the current image frame as a starting point and extract an image frame every 1 second until N-1 image frames are extracted. In the embodiment of this application, there is no limit on the predetermined duration.
  • the video stream includes multiple image frames distributed along the time axis.
  • the information recognition device can use the current image frame as a starting point and extract an image frame every 5 image frames until N-1 image frames are extracted. In the embodiment of the present application, there is no limit to the predetermined number.
  • the information recognition device combines the current image frame and N-1 image frames along the time axis to obtain N starting image frames.
  • the information recognition device combines the current image frame and the N-1 image frames in order from front to back according to their corresponding time points, to obtain N image frames.
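The sliding-window extraction above can be sketched as follows. This is an illustrative sketch: frame indices stand in for decoded images, and the values of N and the frame step are assumptions (the text mentions every 1 second or every 5 frames as examples).

```python
def window_frames(stream, current_idx, n=5, step=5):
    """Take n frames starting at current_idx, one every `step` frames."""
    idxs = range(current_idx, current_idx + n * step, step)
    return [stream[i] for i in idxs if i < len(stream)]


stream = list(range(100))         # pretend each int is a decoded frame
print(window_frames(stream, 40))  # [40, 45, 50, 55, 60]
```

At each new moment the window slides forward with the current frame, so the N frames always cover the most recent stretch of the video stream.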
  • the information recognition device extracts N image frames at the current moment based on the sliding window method. By using the dynamic characteristics of the N image frames, it can accurately determine whether a vehicle has arrived or left the platform.
  • FIG 12 is an optional flow diagram of the information identification method provided by the embodiment of the present application.
  • S103 shown in Figure 1 can also be implemented through S127 to S128, which will be explained in conjunction with each step.
  • the information recognition device adds the character category probability information corresponding to each character in the multiple character recognition results to obtain multiple total probabilities.
  • the information recognition device determines that the character recognition result corresponding to the largest total probability among multiple total probabilities is the target character recognition result.
  • the information recognition device determines which characters are in the upper layer and which are in the lower layer (all characters on a single-layer license plate are treated as the lower layer). Because the relative position of characters on the same layer along the x-axis is determined, the characters of the upper license plate are first spliced together in order of x-coordinate from small to large, and then the characters of the lower license plate are spliced after the upper-layer string, also in order of x-coordinate from small to large. The resulting string is the final license plate number. Since license plate recognition starts when the event is triggered and ends when the event is completed, there will be multiple recognition results for one vehicle.
  • the method of merging is to add the probability of each character in each recognition result for each position, and take the character with the highest overall probability as the character corresponding to the position.
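The merging rule above can be sketched as follows: for each character position, the category probabilities from every recognition result are summed, and the character with the largest total is kept. The sample results are invented for illustration; equal-length results are assumed.

```python
from collections import defaultdict


def merge_results(results):
    """results: list of recognition results, each a list of (char, prob)."""
    merged = []
    for pos in range(len(results[0])):
        totals = defaultdict(float)
        for res in results:          # sum probabilities per candidate char
            ch, p = res[pos]
            totals[ch] += p
        merged.append(max(totals, key=totals.get))
    return "".join(merged)


results = [
    [("F", 0.9), ("2", 0.8), ("2", 0.7)],
    [("F", 0.9), ("7", 0.5), ("2", 0.8)],  # one misread "7"
    [("E", 0.4), ("2", 0.9), ("2", 0.9)],  # one misread "E"
]
print(merge_results(results))  # F22
```

Summing per-position probabilities lets occasional misreads (the "7" and "E" above) be outvoted by the consistent, higher-confidence readings.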
  • the information recognition device determines the target character recognition result with the largest probability sum among multiple character recognition results, so the recognition result obtained is more accurate.
  • FIG. 13 is an optional flow diagram of an information identification method provided by an embodiment of the present application, which will be described in combination with each step.
  • for the target object, the rear compartment needs to be marked in a large number of pictures to form sample data for training.
  • the system process of this solution includes a training process and a usage process.
  • in the training process, it is necessary to collect and label vehicle body, license plate and license plate character data, and at the same time automatically generate highly simulated license plate and license plate character data, to train the detection models for the vehicle body, the license plate and the license plate characters.
  • after the model training is completed, the models are deployed on the edge computing box for actual use.
  • the information recognition device extracts pictures in real time from the video stream of the surveillance camera, extracting one frame every 0.2 seconds.
  • the information recognition device applies the vehicle body recognition module to each extracted picture to determine whether there is a vehicle on the current platform.
  • the information recognition device jointly determines the status of the current vehicle and platform and whether an event occurs through the vehicle recognition status of the current frame and the recognition status of the previous N frames.
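The joint state decision above can be sketched as follows. This is an illustrative sketch under stated assumptions: the per-frame distance series, the threshold m (at least 15 of 20 frames, per the earlier example), and the strictly monotonic arrival/departure test are all simplifications of whatever the deployed logic actually uses.

```python
def vehicle_event(distances, m=15):
    """distances: per-frame vehicle-to-platform distance, None = no vehicle."""
    seen = [d for d in distances if d is not None]
    if len(seen) < m:                 # too few frames contain a vehicle
        return "no_event"
    if all(b < a for a, b in zip(seen, seen[1:])):
        return "arriving"             # distance consistently shrinking
    if all(b > a for a, b in zip(seen, seen[1:])):
        return "departing"            # distance consistently growing
    return "no_event"


frames = [None, 90, 84, 79, 71, 66, 60, 55, 49, 44, 38, 33, 27, 22, 16, 11]
print(vehicle_event(frames))  # arriving
```

An "arriving" or "departing" result would trigger license plate detection on the window's frames; "no_event" means detection stays idle.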
  • the information recognition device detects the license plate and stops detection after the event is over.
  • the information recognition device extends the detected license plate area outwards to a certain range, intercepts it and sends it to the license plate recognition module.
  • the information recognition device outputs the license plate recognition result.
  • each deep learning model is converted into TensorRT format to optimize resource usage.
  • the surveillance camera should be placed at the top of the warehouse door and shoot diagonally downwards to ensure that the platform surveillance footage is collected without affecting production.
  • the camera is directly connected to the edge box, and all modules of vehicle/license plate recognition are deployed on the edge box to analyze the video collected by the camera and report the recognition results to the online system. Since the camera captures a large range of images, this solution only selects the parking spaces corresponding to the platform for monitoring to eliminate interference from other areas.
  • a camera can be configured with multiple monitoring areas, each area is independent of each other and performs vehicle/license plate recognition respectively.
  • the overall process is shown in Figure 14.
  • the camera 100 collects the video stream of the monitoring area and transmits it to the edge box 101.
  • the edge box 101 processes N image frames at each moment in the video stream, triggers event monitoring, and finally outputs the license plate number.
  • Figure 15 is a schematic structural diagram of an information identification device provided by an embodiment of the present application.
  • the embodiment of the present application also provides an information identification device 800, including: a collection and extraction module 803, a detection processing module 804 and a determination module 805.
  • the collection and extraction module is configured to collect the video stream of a predetermined area in real time, and extract N starting image frames corresponding to the current moment in the video stream based on the sliding window method; N is an integer greater than 1;
  • the detection processing module is configured to, if the first object detected in the N starting image frames is in the starting state, perform deduplication and optimization processing on the characters in the image of the area to be measured in each image frame to obtain character recognition results, stopping when the first object in the N termination image frames is detected to be in the termination state, and obtaining a plurality of character recognition results; wherein the N termination image frames are extracted from the video stream at a termination moment any length of time after the current moment; each of the image frames belongs to the multiple image frames extracted from the video stream from the current moment to the termination moment;
  • the determining module is configured to determine a target character recognition result based on the plurality of character recognition results.
  • the detection processing module 804 in the information recognition device 800 is configured to detect the N initial image frames to obtain an initial detection result; if the initial detection result represents that the first object is is close to or far away from the target area, then starting from the first image frame among the N starting image frames, processing each image frame extracted from the video stream to obtain the corresponding The image of the area to be measured is processed; a preset target detection model is used to process the image of the area to be measured to obtain multiple characters and corresponding character-related information; and the multiple characters are deduplicated and optimized using the character-related information.
  • the termination detection result is obtained by detecting the corresponding N termination image frames.
  • the detection processing module 804 in the information recognition device 800 is configured to use the first preset detection model to process the N initial image frames to obtain the initial detection result; if the initial detection result indicates that at least M of the N initial image frames include the first object, and the distance between the first object and the target area is getting larger or smaller, then each image frame is processed through the second preset detection model to obtain the image of the area to be measured; M is an integer greater than 1 and less than N.
  • the detection processing module 804 in the information recognition device 800 is configured to use a preset target detection model to process the image of the area to be detected to obtain the multiple characters, and the multiple characters respectively correspond to character category probability information and character box information.
  • the detection processing module 804 in the information recognition device 800 is configured to calculate an overlap matrix using multiple character frame information, and perform deduplication processing on the multiple characters in combination with the overlap matrix, to obtain a plurality of first characters; combine the first character frame information corresponding to the plurality of first characters to eliminate abnormal characters among the plurality of first characters, to obtain a plurality of second characters; and use the second character frame information respectively corresponding to the plurality of second characters to perform hierarchical sorting processing on the plurality of second characters to obtain the character recognition result, stopping when the termination detection result indicates that the distance between the first object and the target area does not change in at least M of the N termination image frames, or that only K of the N termination image frames include the first object, and obtaining the plurality of character recognition results; K is an integer greater than or equal to 1 and less than M.
  • the detection processing module 804 in the information recognition device 800 is configured to use the multiple character frame area information included in the multiple character frame information to calculate the degree of overlap between any two characters, and construct the overlap degree matrix from the obtained multiple overlap degrees; compare the multiple overlap degrees with the preset threshold in turn to obtain the comparison result of each overlap degree; and, if the comparison result indicates that the corresponding first overlap degree is greater than the preset threshold, delete the corresponding rows and columns of the overlapping characters in the overlap matrix to obtain a new overlap matrix, until the overlap degrees in the overlap matrix are all less than the preset threshold.
  • the plurality of characters includes: T characters; the plurality of character frame area information includes: T character frame area information; T is an integer greater than 1; the detection processing module 804 in the information recognition device 800 is configured to use the multiple character frame area information included in the multiple character frame information to calculate the overlap degree between any two characters, and construct the overlap degree matrix from the obtained multiple overlap degrees, including: based on the character frame area information of the 1st character to the T-th character, calculating the first group of T overlap degrees between the 1st character and each of the 1st to T-th characters; constructing the first row of the overlap degree matrix using the first group of T overlap degrees; and continuing until the T-th group of T overlap degrees, between the T-th character and each of the 1st to T-th characters, is calculated and used to construct the last row of the overlap degree matrix, thereby obtaining the overlap degree matrix.
  • the detection processing module 804 in the information recognition device 800 is configured to process the plurality of first character frame information through the preset abnormal point detection model, and eliminate, from the plurality of first characters, the abnormal character whose density relative to the other first characters is the lowest, to obtain the plurality of second characters.
  • the detection processing module 804 in the information recognition device 800 is configured to perform linear fitting on a plurality of center point information in the plurality of second character frame information to obtain a fitting line, and calculate the sum of squared errors of the multiple center point information from the fitting line; determine the number of character layers of the character recognition result according to the size of the sum of squared errors; if the number of character layers is two, extract from the multiple second characters multiple character combinations each including any P second characters, P being an integer greater than 1 and less than T; perform linear fitting on the multiple character combinations, calculate multiple error sums of squares, and determine the P third characters included in the character combination corresponding to the minimum sum of squared errors; and splice the remaining characters in order of the abscissa information in their corresponding second character frame information, and then splice the P third characters after them in order of their corresponding abscissa information, to obtain the character recognition result; the remaining characters are characters other than the P third characters among the plurality of second characters.
  • the detection processing module 804 in the information recognition device 800 is configured to, if the number of character layers is one, splice the plurality of second characters in order of the size of their corresponding abscissa information to obtain the character recognition result.
  • the determination module 805 in the information recognition device 800 is configured to add the character category probability information corresponding to each character in the multiple character recognition results to obtain multiple total probabilities, and determine that the character recognition result corresponding to the largest total probability among the multiple total probabilities is the target character recognition result.
  • the collection and extraction module 803 in the information recognition device 800 is configured to extract the current image frame corresponding to the current moment in the video stream; taking the current image frame as the starting point in the video stream, extract an image frame every predetermined length of time or every predetermined number of image frames along the time axis until N-1 image frames are extracted; and combine the current image frame and the N-1 image frames along the time axis to obtain the N starting image frames.
  • the collection and extraction module 803 in the information recognition device 800 is configured to collect the video stream of a predetermined area in real time and to extract, based on the sliding-window method, the N starting image frames corresponding to the current moment from the video stream, where N is an integer greater than 1; the detection processing module 804 is configured so that, if the first object in the N starting image frames is detected to be in the starting state, the characters in the region-to-be-measured image of each image frame are deduplicated and optimized to obtain a character recognition result, stopping when the first object in N terminating image frames is detected to be in the terminating state, so that multiple character recognition results are obtained; the N terminating image frames are extracted from the video stream at a terminating moment an arbitrary duration after the current moment, and each image frame belongs to the multiple image frames extracted from the video stream between the current moment and the terminating moment; the determination module 805 determines the target character recognition result based on the multiple character recognition results. Since this solution recognizes the license plate information captured during the arrival and departure of the first object, and at the same time deduplicates and optimizes the recognized characters, it overcomes the difficulty of format diversity in the license plate recognition process and can thereby improve the recognition accuracy of the license plate information.
  • if the above information identification method is implemented in the form of a software function module and sold or used as an independent product, it can also be stored in a computer-readable storage medium.
  • the computer software product is stored in a storage medium and includes a number of instructions to enable an information recognition device (which may be a personal computer, etc.) to execute all or part of the methods described in the embodiments of this application.
  • the aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read Only Memory, ROM), a magnetic disk, or an optical disk.
  • embodiments of the present application are not limited to any specific combination of hardware and software.
  • embodiments of the present application provide a computer-readable storage medium on which a computer program is stored; the steps in the above method are implemented when the computer program is executed by a processor.
  • an embodiment of the present application provides an information identification device, including a memory 802 and a processor 801; the memory 802 stores a computer program that can run on the processor 801, and the steps in the above method are implemented when the processor 801 executes the program.
  • Figure 16 is a schematic diagram of a hardware entity of the information identification device provided by the embodiment of the present application.
  • the hardware entity of the information identification device 800 includes: a processor 801 and a memory 802, wherein:
  • Processor 801 generally controls the overall operation of information recognition device 800.
  • the memory 802 is configured to store instructions and applications executable by the processor 801, and can also cache data to be processed or already processed by the processor 801 and each module in the information recognition device 800 (for example, image data, audio data, voice communication data, and video communication data), which can be implemented through flash memory (FLASH) or random access memory (Random Access Memory, RAM).
  • the disclosed devices and methods can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a division by logical function; in actual implementation there may be other ways of division, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the coupling, direct coupling, or communication connection between the components shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or of other forms.
  • the units described above as separate components may or may not be physically separated; the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • all functional units in the embodiments of the present application can be integrated into one processing unit, or each unit can serve separately as one unit, or two or more units can be integrated into one unit; the above integrated unit can be implemented in the form of hardware, or in the form of hardware plus software functional units.
  • the aforementioned program can be stored in a computer-readable storage medium; when executed, the program performs the steps of the above method embodiments; the aforementioned storage media include various media that can store program code, such as removable storage devices, read-only memory (Read Only Memory, ROM), magnetic disks, or optical disks.
  • if the integrated units mentioned above in this application are implemented in the form of software function modules and sold or used as independent products, they can also be stored in a computer-readable storage medium.
  • the computer software product is stored in a storage medium and includes a number of instructions to enable a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the methods described in the embodiments of this application.
  • the aforementioned storage media include various media that can store program code, such as removable storage devices, ROMs, magnetic disks, or optical disks.


Abstract

The present application provides an information recognition method, apparatus, and storage medium. The method includes: collecting a video stream of a predetermined area in real time, and extracting, based on a sliding-window method, N starting image frames corresponding to the current moment from the video stream; if the first object in the N starting image frames is detected to be in the starting state, performing deduplication and optimization on the characters in the region-to-be-measured image of each image frame to obtain a character recognition result, and stopping when the first object in N terminating image frames is detected to be in the terminating state, thereby obtaining multiple character recognition results, where the N terminating image frames are extracted at a terminating moment an arbitrary duration after the current moment, and each image frame belongs to the multiple image frames extracted from the video stream between the current moment and the terminating moment; and determining a target character recognition result based on the multiple character recognition results. The recognition accuracy of license plate information is thereby improved.

Description

信息识别方法、装置及存储介质
相关申请的交叉引用
本申请基于申请号为202210824930.3、申请日为2022年07月13日的中国专利申请提出,并要求该中国专利申请的优先权,该中国专利申请的全部内容在此引入本申请作为参考。
技术领域
本申请涉及信息识别技术领域,涉及一种信息识别方法、装置及存储介质。
背景技术
在智能物流园区,车辆的管理和装货卸货的管理需要高效协同才能充分保证物流操作的效率。车牌识别在车辆管理中是非常重要的环节,而车牌识别本身也有不同的场景,例如道闸的车牌识别和月台的车牌识别,通常是使用计算机视觉的方法。学术界的解决方案一般追求对单张图片上出现的车牌进行准确识别,但没有考虑在现实场景中需要持续性对监控画面进行识别,对干扰的应对能力比较差。工业界现有的解决方案多用于识别车辆的前车牌且摄像头距离车牌的位置较近,场景也比较单一简单。因此现有的方案在面对比较复杂,干扰较多时的场景,对车牌信息的识别准确率较低。
发明内容
本申请实施例提供的一种信息识别方法、装置及存储介质，可以提高对车牌信息的识别准确率。
本申请的技术方案是这样实现的:
本申请实施例提供了一种信息识别方法,包括:
实时采集预定区域的视频流,并基于滑动窗口的方法在所述视频流中提取出当前时刻对应的N个起始图像帧;N为大于1的整数;
若针对所述N个起始图像帧检测得到其中的第一物体为起始状态,则对每个图像帧中待测区域图像内的字符进行去重优化处理得到字符识别结果,直至针对N个终止图像帧检测得到其中的所述第一物体为终止状态时停止,得到多个字符识别结果;其中,所述N个终止图像帧是在所述视频流中距离所述当前时刻任意时长后的终止时刻提取到的;所述每 个图像帧属于所述当前时刻至所述终止时刻在所述视频流中提取的多个图像帧;
基于所述多个字符识别结果,确定出目标字符识别结果。
本申请实施例还提供了一种信息识别装置,包括:
采集提取模块,被配置为实时采集预定区域的视频流,并基于滑动窗口的方法在所述视频流中提取出当前时刻对应的N个起始图像帧;N为大于1的整数;
检测处理模块,被配置为若针对所述N个起始图像帧检测得到其中的第一物体为起始状态,则对每个图像帧中待测区域图像内的字符进行去重优化处理得到字符识别结果,直至针对N个终止图像帧检测得到其中的所述第一物体为终止状态时停止,得到多个字符识别结果;其中,所述N个终止图像帧是在所述视频流中距离所述当前时刻任意时长后的终止时刻提取到的;所述每个图像帧属于所述当前时刻至所述终止时刻在所述视频流中提取的多个图像帧;
确定模块,被配置为基于所述多个字符识别结果,确定出目标字符识别结果。
本申请实施例还提供了一种信息识别装置,包括存储器和处理器,存储器存储有可在处理器上运行的计算机程序,处理器执行程序时实现上述方法中的步骤。
本申请实施例还提供了一种计算机可读存储介质,其上存储有计算机程序,该计算机程序被处理器执行时实现上述方法中的步骤。
本申请实施例中,实时采集预定区域的视频流,并基于滑动窗口的方法在视频流中提取出当前时刻对应的N个起始图像帧;N为大于1的整数;若针对N个起始图像帧检测得到其中的第一物体为起始状态,则对每个图像帧中待测区域图像内的字符进行去重优化处理得到字符识别结果,直至针对N个终止图像帧检测得到其中的第一物体为终止状态时停止,得到多个字符识别结果;其中,N个终止图像帧是在视频流中距离当前时刻任意时长后的终止时刻提取到的;图像帧属于当前时刻至终止时刻在视频流中提取的多个图像帧;基于多个字符识别结果,确定出目标字符识别结果。由于本方案通过抓取在第一物体到达和离开的期间对车牌信息进行识别,并且同时对识别得到的字符进行去重和优化处理,克服了车牌识别过程中的格式多样化的困难,进而可以提高对车牌信息的识别准确率。
附图说明
图1为本申请实施例提供的信息识别方法的一个可选的流程示意图;
图2为本申请实施例提供的信息识别方法的一个可选的效果示意图;
图3为本申请实施例提供的信息识别方法的一个可选的流程示意图;
图4为本申请实施例提供的信息识别方法的一个可选的效果示意图;
图5为本申请实施例提供的信息识别方法的一个可选的效果示意图;
图6为本申请实施例提供的信息识别方法的一个可选的效果示意图;
图7为本申请实施例提供的信息识别方法的一个可选的效果示意图;
图8为本申请实施例提供的信息识别方法的一个可选的流程示意图;
图9为本申请实施例提供的信息识别方法的一个可选的流程示意图;
图10为本申请实施例提供的信息识别方法的一个可选的效果示意图;
图11为本申请实施例提供的信息识别方法的一个可选的流程示意图;
图12为本申请实施例提供的信息识别方法的一个可选的流程示意图;
图13为本申请实施例提供的信息识别方法的一个可选的流程示意图;
图14为本申请实施例提供的信息识别方法的一个可选的效果示意图;
图15为本申请实施例提供的信息识别装置的结构示意图;
图16为本申请实施例提供的信息识别装置的一种硬件实体示意图。
具体实施方式
为了使本申请的目的、技术方案和优点更加清楚,下面结合附图和实施例对本申请的技术方案进一步详细阐述,所描述的实施例不应视为对本申请的限制,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其它实施例,都属于本申请保护的范围。
在以下的描述中,涉及到“一些实施例”,其描述了所有可能实施例的子集,但是可以理解,“一些实施例”可以是所有可能实施例的相同子集或不同子集,并且可以在不冲突的情况下相互结合。
如果申请文件中出现“第一/第二”的类似描述则增加以下的说明,在以下的描述中,所涉及的术语“第一\第二\第三”仅仅是区别类似的对象,不代表针对对象的特定排序,可以理解地,“第一\第二\第三”在允许的情况下可以互换特定的顺序或先后次序,以使这里描述的本申请实施例能够以除了在这里图示或描述的以外的顺序实施。
除非另有定义,本文所使用的所有的技术和科学术语与属于本申请的技术领域的技术人员通常理解的含义相同。本文中所使用的术语只是为了描述本申请实施例的目的,不是旨在限制本申请。
车辆管理在物流领域是至关重要的环节,对车辆进行有效的管理和调度能够保证货物安全高效地流转,保障物流服务的质量。而对车辆信息的正确识别是物流场景中资源分配 的基础前提条件。例如,车辆进入物流园区装卸货需要严格按照提前制定的排班计划,对进入园区的车辆信息进行识别能够帮助确认车辆是否按时到达,能否按计划开始作业。另外,车辆和相应的货物信息需要严格匹配,仓库需要在车辆到达园区时开始做装货/卸货相应的准备,并根据资源占用的情况分配月台。在月台场景进行车牌识别能够帮助校验车辆信息,确认车辆到达正确的装卸货地点。传统的生产流程中,工作人员将通过手写的方式记录每个车辆的车牌号。在如今的智能化、信息化时代,这种传统的记录方式造成了大量的人力、财力的浪费,同时拖慢了运行效率和吞吐量。因此,车牌自动识别系统被大量使用,不仅能够快速、准确的识别出车辆的车牌号,节省人力,还能为实现自动化调度系统的实现提供支持,同时信息化的数据也更便于保存。
当前一般采用计算机视觉的手段实现车牌的自动识别。大多数基于视觉的车牌自动识别算法通常分为两个步骤,第一步为车牌检测,即在图片中定位到车牌所属的像素位置。第二步为字符识别,即识别出车牌中每个字符的内容,进而组成车牌。当前基于计算机视觉的自动车牌识别系统可以分为学术和工业的解决方案。学术界的解决方案一般追求对单张图片上出现的车牌进行准确识别,但没有考虑在现实场景中需要持续性对监控画面进行识别,对干扰的应对能力比较差。工业界现有的解决方案多用于识别车辆的前车牌且摄像头距离车牌的位置较近,场景也比较单一简单,比如停车场的车牌识别系统。对于像月台这样的用于车辆装货卸货的地方,为方便装卸货,车辆大多为倒车入库,所以摄像头拍摄到的多为车辆的后车牌,相较于前车牌统一的单行格式,后车牌的格式会出现单行和双行(黄牌的后车牌为双行)两种情况,而且相较于前车牌而言物流场景中车辆的后车牌更脏,车牌的字符部分总是会被遮挡。同时在真实的物流场景中,由于摄像头的位置多固定于较高的地方,拍摄到的车牌一般较小且容易出现遮挡以及受光照、角度等的影响,常规车牌的自动识别系统在此场景下的准确度很低。
本申请实施例提供了一种信息识别方法,请参阅图1,为本申请实施例提供的信息识别方法的一个可选的流程示意图,将结合图1示出的步骤进行说明。
S101、实时采集预定区域的视频流,并基于滑动窗口的方法在视频流中提取出当前时刻对应的N个起始图像帧。
本申请实施例中,信息识别装置实时采集预定区域的视频流,并基于滑动窗口的方法在视频流中提取出当前时刻对应的N个起始图像帧。其中,N为大于1的整数。
本申请实施例中,信息识别装置通过设置在预定区域的摄像头采集预定区域的视频流。信息识别装置每隔预定个数的图像帧提取出一个图像帧,并利用滑动窗口的方法确定出当 前时刻对应的N个起始图像帧。
本申请实施例中,信息识别装置在视频流中提取出当前时刻对应的图像帧,并沿着时间轴向前间隔提取出多个图像帧,以得到N个起始图像帧。
本申请实施例中,信息识别装置可以为与预定区域的摄像头连接的终端或者服务器。
S102、若针对N个起始图像帧检测得到其中的第一物体为起始状态,则对每个图像帧中待测区域图像内的字符进行去重优化处理得到字符识别结果,直至针对N个终止图像帧检测得到其中的第一物体为终止状态时停止,得到多个字符识别结果。
本申请实施例中,信息识别装置若针对N个起始图像帧检测得到其中的第一物体为起始状态,则对每个图像帧中待测区域图像内的字符进行去重优化处理得到字符识别结果,直至针对N个终止图像帧检测得到其中的第一物体为终止状态时停止,得到多个字符识别结果。
其中,N个终止图像帧是在视频流中距离当前时刻任意时长后的终止时刻提取到的;图像帧属于当前时刻至终止时刻在视频流中提取的多个图像帧。
本申请实施例中,信息识别装置若针对N个起始图像帧检测得到其中的第一物体为靠近或者远离目标区域的状态,则触发信息识别的过程。信息识别装置对从当前时刻至终止时刻在视频流中提取的每个图像帧进行处理,提取出每个图像帧的待测区域图像内的字符及其对应的字符相关信息。信息识别装置可以通过字符相关信息对多个字符进行去重优化处理得到每个图像帧对应的字符识别结果。当信息识别装置在对图像帧进行检测一定时长后,针对任意的终止时刻的N个终止图像帧检测得到其中的第一物体为静止状态或者N个终止图像帧中无第一物体时停止,得到多个字符识别结果。
示例性的,信息识别装置若针对N个起始图像帧检测得到其中的第一物体的Y轴坐标发生变化。信息识别装置则对从当前时刻起在视频流中提取的每个图像帧进行处理,提取出每个图像帧的待测区域图像内的字符及其对应的字符相关信息。信息识别装置可以通过字符相关信息对多个字符进行去重优化处理得到每个图像帧对应的字符识别结果,直至信息识别装置针对当前时刻一分钟后的终止时刻对应的N个终止图像帧检测得到其中的第一物体为静止时停止,得到多个字符识别结果。
本申请实施例中,车牌信息识别的过程主要包括:1.车体检测。2.车辆状态判断。3.车牌检测。4.车牌字符识别。其中车体检测用于确认月台区域是否有车辆存在,进而通过检测到的车辆的位置变化判断是否有车辆到达和离开月台。由于车牌也是车辆的显著特征之一,本方案也可以简化成使用车牌检测来判断月台区域是否有车辆出现,进而通过 车牌的位置变化判断是否有车辆到达和离开月台。但相比起车体,车牌目标较小,更难检测,并且可能存在车牌被遮挡的情况。因此,本方案首选使用车体检测来判断月台区域是否有车辆。最后,本方案也提出基于边缘部署的工程化方案。具体如下:
本申请实施例中，信息识别装置使用计算机视觉中的目标检测方法解决车体检测问题，通过摄像头采集的视频流判断月台区域是否有车辆。为了有效识别进入和离开月台区域的车辆并与其它角度的车辆进行区分，信息识别装置可以对车辆的后车厢进行识别。这里采用基于深度学习的方法进行检测，在深度学习模型的训练过程中需要先用矩形框对车辆后车厢进行标注，如图2所示。目标对象可以对车辆的后车厢进行标注，并将标注数据用于训练。由于目标检测是计算机视觉领域比较成熟的技术，有多种技术可供选择，例如YOLO（You Only Look Once）系列算法、RCNN（Region-CNN）算法、CenterNet算法。本申请实施例中选用Yolov5，在其他实施例中还可以采用其他的算法。
信息识别装置通过识别到的车辆位置变化的情况判断当前月台区域车辆的状态和月台被占用的状态。为了对车辆和月台的情况有准确细致的判断,本申请实施例中,信息识别装置可以将识别到的信息概括为车辆状态,月台状态和事件。车辆状态分为无车,车辆入库,车辆出库,车辆驻停。月台状态分为占用和释放。事件包括车辆到达月台,车辆离开月台,车辆位置调整和车辆临时停靠。判断方式如下:
信息识别装置可以通过第一预设检测模型检测到的后车厢位置变化判断。为了实现这一目标,需要使用监控摄像头周期性采集预定区域(月台区域)的画面(可以采用1秒5帧的方式),并对每一帧图片检测车体位置。信息识别装置每次使用当前图像帧及距离当前图像帧最近的N-1个图像帧中的车体位置进行判断,如果N个起始图像帧中至少有M个帧中检测到车体,并且车体位置在靠近月台,则确定车辆在入库状态。如果车体位置在远离月台,则确定车辆为出库状态。如果至少M个图像帧中检测到车体,但车体的位置没有明显变化,则为驻停状态。如果少于M个帧中检测到车体,则为无车状态。由于车体识别可能有误识别和漏识别的情况,在对动态判断之前首先要进行异常检测,排除误识别和漏识别的干扰。
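The M-of-N sliding-window judgment described above can be sketched as follows. This is a minimal illustrative sketch, not the disclosed implementation: the function name, the summary of each detection as a box center, the assumption that the dock sits toward the bottom of the image (so a growing y coordinate means approaching), and the movement threshold are all assumptions.

```python
def judge_vehicle_state(frames, m, move_thresh=5.0):
    """Classify the vehicle state from the N most recent frames.

    Each element of `frames` is the detected truck-body box center
    (x, y) for that frame, or None when no body was detected.
    Returns one of: "no_vehicle", "arriving", "leaving", "parked".
    """
    detections = [c for c in frames if c is not None]
    if len(detections) < m:            # fewer than M frames contain a body
        return "no_vehicle"
    # Compare the vertical position of the first and last detections;
    # assuming the dock is at the bottom of the image, a growing y
    # means the vehicle is approaching it.
    dy = detections[-1][1] - detections[0][1]
    if dy > move_thresh:
        return "arriving"
    if dy < -move_thresh:
        return "leaving"
    return "parked"                    # detected, but no clear movement
```

In a full system the "arriving"/"leaving" transitions would additionally be filtered for spurious detections, as the text notes.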
信息识别装置对月台状态的判断和事件的判断需要相辅相成。在系统启动的初始状态,可以根据实际情况对月台状态进行配置,如果当时月台有车,则为占用状态,如果无车,则为释放状态。而对事件的上报,需要结合车辆和月台的状态。一个事件由一个起始状态和一个终止状态共同定义。靠近月台和远离月台为起始状态,无车状态和驻停状态为终止状态。信息识别装置一旦检测到一个起始状态,则触发事件的监控开始对每个图像帧进行 车牌信息识别,在已经检测到起始状态后,如果又检测到一个终止状态,则发现一个完整的事件。结合月台和车辆的状态,事件的定义及相应的月台状态更新如下:
1、在月台状态为释放的情况下,车辆状态从入库状态到驻停状态为车辆到达月台事件,事件结束后,月台状态需要更新为占用。
2、在月台状态为释放的情况下,车辆状态从入库状态到无车状态为车辆临时停靠事件,事件结束后,月台状态仍为释放。
3、在月台状态为占用的情况下,车辆状态从入库状态/出库状态到驻停状态为车辆调整事件,事件结束后,月台状态仍为占用。
4、在月台状态为占用的情况下,车辆状态从入库状态/出库状态到无车状态为车辆离开月台事件,事件结束后,月台状态更新为释放。
5、其它状态的组合为无效事件,直接丢弃,无需上报。
本申请实施例中,信息识别装置还可以针对每个事件形成对应的标识信息。在信息识别装置识别到目标字符识别结果之后,可以将目标字符识别结果与对应的事件的标识信息进行映射存储。
S103、基于多个字符识别结果,确定出目标字符识别结果。
本申请实施例中,信息识别装置基于多个字符识别结果,确定出目标字符识别结果。
本申请实施例中,信息识别装置识别出每个字符识别结果时会形成对应每个字符识别结果中所有字符的字符类别概率信息。信息识别装置可以将所有字符的字符类别概率信息之和最大的字符识别结果确定为目标字符识别结果。
本申请实施例中,实时采集预定区域的视频流,并基于滑动窗口的方法在视频流中提取出当前时刻对应的N个起始图像帧;N为大于1的整数;若针对N个起始图像帧检测得到其中的第一物体为起始状态,则对每个图像帧中待测区域图像内的字符进行去重优化处理得到字符识别结果,直至针对N个终止图像帧检测得到其中的第一物体为终止状态时停止,得到多个字符识别结果;其中,N个终止图像帧是在视频流中距离当前时刻任意时长后的终止时刻提取到的;图像帧属于当前时刻至终止时刻在视频流中提取的多个图像帧;基于多个字符识别结果,确定出目标字符识别结果。由于本方案通过抓取在第一物体到达和离开的期间对车牌信息进行识别,并且同时对识别得到的字符进行去重和优化处理,克服了车牌识别过程中的格式多样化的困难,进而可以提高对车牌信息的识别准确率。
在一些实施例中,参见图3,图3为本申请实施例提供的信息识别方法的一个可选的流程示意图,图1示出的S102还可以通过S104至S107实现,将结合各步骤进行说明。
S104、对N个起始图像帧进行检测得到起始检测结果。
本申请实施例中,信息识别装置对N个起始图像帧进行检测得到起始检测结果。
本申请实施例中,信息识别装置可以通过Yolov5模型对N个起始图像帧进行检测得到起始检测结果。在其他实施例中,还可以采用其他的检测模型,本申请实施例中不做限制。
S105、若起始检测结果表征第一物体为靠近或者远离目标区域的状态,则从N个起始图像帧中的第一图像帧为起点,针对在视频流中提取出的每个图像帧进行处理,得到对应的待测区域图像。
本申请实施例中,若信息识别装置检测到起始检测结果表征第一物体为靠近或者远离目标区域的状态,则从N个起始图像帧中的第一图像帧为起点针对在视频流中提取出的每个图像帧进行处理,得到对应的待测区域图像。
本申请实施例中,若信息识别装置检测到起始检测结果表征第一物体的中心点的Y坐标在增大或者减小,则从N个起始图像帧中的第一图像帧为起点针对在视频流中提取出的每个图像帧通过扭曲平面物体检测网络模型(Warped Planar Object Detection Network,wpod-NET)进行处理,得到对应的待测区域图像。
本申请实施例中,信息识别装置发现触发事件监控以后,在触发事件后到事件结束前,对监控摄像头返回的每一图像帧进行车牌识别。这里需要用到车牌检测技术。信息识别装置使用WPOD-NET进行车牌检测,需要先标注训练数据,标注方式为,给定一张含有车牌的图片,标注车牌的4个顶点,从左上角开始,沿逆时针方向,如图4所示(图中的车牌四个顶点之间已经用直线连接)。完成标注之后,将大量标注数据输入到模型中训练完成,则可以得到训练好的模型参数,在输入一张含有车牌的图片后,能够输出图中车牌的4个顶点的坐标。
本申请实施例中,采用人工标注数据结合自动生成数据的方式收集训练数据,在只需要人工收集和标注少量数据的情况下增加数据多样性。具体的方式为,随机生成车牌图片,并选取一张标注过的图片作为背景,通过仿射变换把生成的车牌粘贴到该图片的车牌位置,作为生成的训练样本,使生成车牌4个顶点的位置与标注的4个顶点位置重合。而由于车牌4个顶点的位置之前已经由人工标注,可以直接把这4个标注点的位置作为生成图片的标记。同时,在生成的图片上也可以通过亮度变换等方式进行增强。如图5所示。可以采用仿射变换的方式将变换前图中的车牌信息“鲁B 12345”变换为“苏D 12346”。
S106、利用预设目标检测模型对待测区域图像进行处理,得到多个字符及对应的字符 相关信息。
本申请实施例中,信息识别装置利用预设目标检测模型对待测区域图像进行处理,得到多个字符及对应的字符相关信息。
本申请实施例中,信息识别装置可以利用Yolov5模型对待测区域图像进行处理,得到多个字符及对应的字符相关信息。在其他实施例中可以采用其他模型,本申请实施例中不做限制。
本申请实施例中，信息识别装置在得到待测区域图像之后，需要对车牌字符进行识别，以判断车牌号码。本方案以目标检测为基础进行车牌字符识别，即，再用目标检测方法识别待测区域图像上的字符及其位置。该部分采用Yolov5在训练数据集上进行训练，通过训练好的目标检测模型，我们可以得到每个字符的类别及位置，如图6所示。目标检测模型可以识别待测区域图像包括字符“F2242”，并得到“F”的字符类别F、位置信息及类别概率0.91；“2”的字符类别2、位置信息及类别概率0.90；“2”的字符类别2、位置信息及类别概率0.90；“4”的字符类别4、位置信息及类别概率0.90；“2”的字符类别2、位置信息及类别概率0.90。这一步也可以使用其它目标检测算法来完成，但由于Yolov5是目前比较成熟的算法，检测准确率和运行效率都能够得到保证，这里使用Yolov5。训练Yolov5模型需要用到大量打标数据，与车牌检测部分类似，数据的收集比较困难，标记也比较繁琐，尤其是，在一个地区采集的图片通常只包含当地所属省份和临近省份的车牌。因此，本方案生成车牌数据，并通过仿射变换粘贴到人工收集图片的背景上，并自动生成车牌字符的标记。同时，使用数据增强的方式让生成的车牌字符更接近实际的效果，如图7所示。可以将“沪0I0R4H”通过仿射变换的方式得到新的车牌信息“沪PPRBS8X”。
S107、利用字符相关信息对多个字符进行去重优化处理,得到一定次序字符组成的字符识别结果,直至终止检测结果表征第一物体为无位移变化的状态,或者终止时刻无第一物体的状态时停止,得到多个字符识别结果。
本申请实施例中,信息识别装置利用字符相关信息对多个字符进行去重优化处理,得到一定次序字符组成的字符识别结果,直至终止检测结果表征第一物体为无位移变化的状态,或者终止时刻无第一物体的状态时停止,得到多个字符识别结果。其中,终止检测结果为对应N个终止图像帧进行检测得到的。
本申请实施例中,信息识别装置利用字符相关信息将多个字符中的重复识别的字符进行去重,并剔除车牌区域外的无关字符,得到一定次序字符组成的字符识别结果。信息识别装置直至终止检测结果表征第一物体为无位移变化的状态或者终止时刻无第一物体的 状态时停止,得到多个字符识别结果。
本申请实施例中,信息识别装置先通过检测车体判断车辆的状态,在车辆靠近或者远离时对提取到图像帧进行检测,由于本方案通过抓取在车辆到达和离开的期间对车牌信息进行识别,并且同时利用字符相关信息对识别得到的字符进行去重和优化处理,克服了车牌识别过程中的格式多样化的困难,进而可以提高对车牌信息的识别准确率。
在一些实施例中,参见图8,图8为本申请实施例提供的信息识别方法的一个可选的流程示意图,图3示出的S104至S107还可以通过S108至S113实现,将结合各步骤进行说明。
S108、利用第一预设检测模型对N个起始图像帧进行处理,得到起始检测结果。
本申请实施例中,信息识别装置利用第一预设检测模型对N个起始图像帧进行处理,得到起始检测结果。
其中,第一预设检测模型可以为Yolov5模型。
S109、若起始检测结果表征N个起始图像帧中至少有M个起始图像帧中包括第一物体,且第一物体与目标区域的距离在变大或者变小,则将每个图像帧通过第二预设检测模型处理,得到待测区域图像。
本申请实施例中,若信息识别装置检测到起始检测结果表征N个起始图像帧中至少有M个起始图像帧中包括第一物体,且第一物体与目标区域的距离在变大或者变小,则将每个图像帧通过第二预设检测模型处理,得到待测区域图像。
其中,第二预设检测模型可以为WPOD-NET模型,本申请实施例中对第二预设检测模型不做限制。
示例性的,若信息识别装置检测到起始检测结果表征20个起始图像帧中至少有15个起始图像帧中包括车辆,且车辆与月台的距离在变大或者变小,则将每个图像帧通过第二预设检测模型处理,得到待测区域图像。
S110、利用预设目标检测模型对待测区域图像进行处理,得到多个字符,及多个字符分别对应的字符类别概率信息和字符框信息。
本申请实施例中,信息识别装置利用预设目标检测模型对待测区域图像进行处理,得到多个字符,及多个字符分别对应的字符类别概率信息和字符框信息。
其中,预设目标检测模型可以为Yolov5模型。本申请实施例中,对预设目标检测模型的种类不做限制。
其中,字符框信息可以包括:字符框面积信息、字符框中心点坐标信息和字符框顶点 坐标信息。
S111、利用多个字符框信息计算得到重叠度矩阵,并结合重叠度矩阵对多个字符进行去重处理,得到多个第一字符。
本申请实施例中,信息识别装置利用多个字符框信息计算得到重叠度矩阵,并结合重叠度矩阵对多个字符进行去重处理,得到多个第一字符。
本申请实施例中,信息识别装置利用任意两个字符的字符框重叠面积比上该两个字符的字符框面积之和,可以得到多个重叠度。信息识别装置可以利用多个重叠度按照次序构建出重叠度矩阵。信息识别装置结合重叠度矩阵对多个字符进行去重处理,得到多个第一字符。
本申请实施例中,所述多个字符包括:T个字符;所述多个字符框面积信息包括:T个字符框面积信息;T为大于1的整数。信息识别装置根据第1个字符的字符框面积信息至第T个字符的字符框面积信息,计算出所述第1个字符分别与所述第1个字符至第T个字符的第一组T个重叠度。信息识别装置利用所述第一组T个重叠度构建出所述重叠度矩阵的第一行,直至计算得到第T个字符分别与所述第1个字符至所述第T个字符的第T组T个重叠度,利用所述第T组T个重叠度构建出所述重叠度矩阵的最后一行,进而得到所述重叠度矩阵。
本申请实施例中,预设目标检测模型最后输出的字符存在一个字符被识别为多个不同的类,或者一个字符被识别出多个相同的类,导致识别的字符数变多。此时这些字符的框存在较大的重叠,重叠度这个指标就是用来衡量两个字符框的重叠范围,重叠度=重叠面积/总的面积。对所有字符两两计算重叠度,形成重叠度矩阵D。
D=[d11,d12,d13,d14
d21,d22,d23,d24
d31,d32,d33,d34
d41,d42,d43,d44]
其中,Dij表示字符i和字符j的重叠度,设置一定的阈值thresh,遍历D矩阵,当dij>thresh判定这两个框重叠范围较大,其中一个为重复识别,于是剔除confidence(confidence表示该字符为当前字符类别的概率)较小的那一个字符,假设confidence(i)<confidence(j),即在D中删除第i行和第i列,然后再次遍历D矩阵,直到D矩阵所有的值都小于thresh。
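The overlap-matrix deduplication above can be sketched as follows. This is an illustrative simplification under stated assumptions: the function names are hypothetical, the overlap degree follows the text's definition (intersection area divided by the sum of the two box areas, so a perfect overlap scores 0.5), and the iterative pairwise removal stands in for deleting row i and column i of the matrix D.

```python
def overlap(box_a, box_b):
    """Overlap degree of two character boxes (x1, y1, x2, y2):
    intersection area / sum of the two box areas, as in the text."""
    ix = max(0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    total = area(box_a) + area(box_b)
    return ix * iy / total if total else 0.0

def dedup_chars(chars, thresh=0.3):
    """chars: list of (label, confidence, box).  While any pair overlaps
    beyond `thresh`, drop the lower-confidence member of that pair —
    equivalent to deleting its row and column from the matrix D."""
    kept = list(chars)
    changed = True
    while changed:
        changed = False
        for i in range(len(kept)):
            for j in range(i + 1, len(kept)):
                if overlap(kept[i][2], kept[j][2]) > thresh:
                    # remove the character with the smaller confidence
                    kept.pop(i if kept[i][1] < kept[j][1] else j)
                    changed = True
                    break
            if changed:
                break
    return kept
```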
S112、结合多个第一字符分别对应的第一字符框信息剔除多个第一字符中的异常字符,得到多个第二字符。
本申请实施例中,信息识别装置结合多个第一字符分别对应的第一字符框信息剔除多个第一字符中的异常字符,得到多个第二字符。
本申请实施例中,信息识别装置结合多个第一字符分别对应的第一字符框中心点信息,计算每个第一字符与其他字符的密度。信息识别装置剔除密度最小的字符,得到多个第二字符。
S113、利用多个第二字符分别对应的第二字符框信息，对多个第二字符进行分层排序处理得到字符识别结果，直至终止检测结果表征N个终止图像帧中至少有M个终止图像帧中的第一物体与目标区域的距离无变化，或者N个起始图像帧中有K个终止图像帧中包括第一物体时停止，得到多个字符识别结果。
本申请实施例中,信息识别装置利用多个第二字符分别对应的第二字符框信息,对多个第二字符进行分层排序处理得到字符识别结果,直至终止检测结果表征N个终止图像帧中至少有M个终止图像帧中包括第一物体,且第一物体与目标区域的距离无变化,或者N个起始图像帧中有K个终止图像帧中包括第一物体时停止,得到多个字符识别结果。K为大于等于1小于M的整数。
本申请实施例中,信息识别装置通过对多个字符进行去重处理,剔除异常点处理和分层排序处理,可以得到更加准确目标字符识别结果。
在一些实施例中,参见图9,图9为本申请实施例提供的信息识别方法的一个可选的流程示意图,图8示出的S111至S112还可以通过S114至S118实现,将结合各步骤进行说明。
S114、利用多个字符框信息包括的多个字符框面积信息,计算任意两个字符之间的重叠度,并通过得到的多个重叠度构建出重叠度矩阵。
本申请实施例中,信息识别装置利用多个字符框信息包括的多个字符框面积信息,计算任意两个字符之间的重叠度,并通过得到的多个重叠度构建出重叠度矩阵。
S115、将多个重叠度依次与预设阈值进行比较,得到每个重叠度的比较结果。
本申请实施例中,信息识别装置将多个重叠度依次与预设阈值进行比较,得到每个重叠度的比较结果。
本申请实施例中,对预设阈值的大小不做限制。
S116、若比较结果表征对应的第一重叠度大于预设阈值,则将重叠字符在重叠度矩阵中对应的行和列删除得到新的重叠度矩阵,直至重叠度矩阵中的重叠度均小于预设阈值时得到目标矩阵。
本申请实施例中,若信息识别装置检测到比较结果表征对应的第一重叠度大于预设阈值,则将重叠字符在重叠度矩阵中对应的行和列删除得到新的重叠度矩阵,直至重叠度矩阵中的重叠度均小于预设阈值时得到目标矩阵。重叠字符为所述第一重叠度对应的字符类别概率最小的字符。
示例性的,结合矩阵D=[d11,d12,d13,d14
d21,d22,d23,d24
d31,d32,d33,d34
d41,d42,d43,d44]
若比较结果表征d21大于预设阈值，d21为字符2和字符1的重叠度。假如，字符2的字符类别概率小于字符1的字符类别概率，则信息识别装置可以将D中的第2行和第2列删除。
S117、通过目标矩阵包括的任意行中的重叠度对应的字符,确定出多个第一字符。
本申请实施例中,信息识别装置通过目标矩阵包括的任意行中的重叠度对应的字符,确定出多个第一字符。
示例性的,若目标矩阵=[d22,d23,
d32,d33]
其中,目标矩阵的任意行的重叠度对应的字符包括字符2和字符3。此时,多个第一字符可以包括:字符2和字符3。
S118、将多个第一字符框信息通过预设异常点检测模型处理,剔除多个第一字符中与其他的第一字符密度之间密度最低的异常字符,得到多个第二字符。
本申请实施例中,信息识别装置将多个字符框信息通过预设异常点检测模型处理,剔除多个第一字符中与其他的第一字符密度之间密度最低的异常字符,得到多个第二字符。
其中,预设异常点检测模型可以为局部离群因子(Local Outlier Factor,LOF),异常点检测模型。本申请实施例中,对预设异常点检测模型的种类不做限制。
如图10所示,预设目标检测模型错误地识别了左下角的“1”,该字符的位置明显异于其他字符集中的区域,所以采用LOF异常点检测算法进行剔除。LOF算法主要是通过比较每个点p和其邻域点的密度来判断该点是否为异常点,如果点p的密度越低,越可能被认定是异常点。至于密度,是通过点之间的距离来计算的,点之间距离越远,密度越低,距离越近,密度越高。
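The density-based removal of the stray character can be sketched as follows. This is a simplified stand-in for the LOF algorithm named in the text, under stated assumptions: the function name is hypothetical, a point's density is taken as the inverse of its mean distance to its k nearest neighbours, and exactly one lowest-density point is dropped (a full LOF implementation compares each point's local density with those of its neighbours instead).

```python
import math

def drop_lowest_density(centers, k=2):
    """Drop the single lowest-density character center as the outlier.

    `centers` is a list of (x, y) character-box centers; density is
    the inverse of the mean distance to the k nearest neighbours."""
    def density(i):
        dists = sorted(math.dist(centers[i], centers[j])
                       for j in range(len(centers)) if j != i)
        return 1.0 / (sum(dists[:k]) / k + 1e-9)

    scores = [density(i) for i in range(len(centers))]
    outlier = scores.index(min(scores))     # lowest density = outlier
    return [p for i, p in enumerate(centers) if i != outlier]
```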
本申请实施例中,信息识别装置通过计算任意两个字符之间的重叠度,构建出重叠度矩阵,进而利用重叠度矩阵对多个字符进行去重处理,去重效果好,可以得到准确度更高 的目标字符识别结果。
在一些实施例中，参见图11，图11为本申请实施例提供的信息识别方法的一个可选的流程示意图，图8示出的S113还可以通过S119至S123实现，将结合各步骤进行说明。
S119、通过多个第二字符框信息中的多个中心点信息进行线性拟合得到拟合线,并计算多个中心点信息距离拟合线的误差平方和。
本申请实施例中,信息识别装置通过多个第二字符框信息中的多个中心点信息进行线性拟合得到拟合线,并计算多个中心点信息距离拟合线的误差平方和。
S120、根据误差平方和的大小确定出字符识别结果的字符层数。
本申请实施例中,信息识别装置根据误差平方和的大小确定出字符识别结果的字符层数。
本申请实施例中,信息识别装置判断若误差平方和大于第二阈值,则字符层数为二层。若误差平方和不大于第二阈值,则字符层数为一层。第二阈值可以为任意实数。
本申请实施例中，信息识别装置先对所有字符的中心坐标做线性拟合，计算均方误差（mean square error，MSE），即所有点到该条拟合出来的直线的误差平方和。当MSE超过一个阈值thresh时判定为双层车牌（因为如果是单层，误差肯定很小），否则为单层车牌。若判断为双层车牌，则从所有字符中选择出5个点，一共21种组合，对每个组合做线性拟合求出MSE，保留最小MSE的字符组合；因为下层车牌的5个字符本身就在一条直线上，所以该组合在所有组合中误差肯定最小。如图6所示，组合“F2242”的字符在一条直线上，为单层车牌。
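The line-fitting layer judgment above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names, the MSE threshold value, and the fixed bottom-row size of 5 (as in the text's double-row example) are assumptions, not disclosed parameters.

```python
import itertools

def fit_mse(points):
    """Least-squares line y = a*x + b through the box centers; returns
    the sum of squared residuals (the MSE criterion in the text)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    var = sum((x - mx) ** 2 for x, _ in points)
    if var == 0:                       # degenerate: all centers share one x
        return sum((y - my) ** 2 for _, y in points)
    a = sum((x - mx) * (y - my) for x, y in points) / var
    b = my - a * mx
    return sum((y - (a * x + b)) ** 2 for x, y in points)

def split_layers(centers, mse_thresh=10.0, bottom_size=5):
    """If one line fits all centers (MSE below the threshold) the plate
    is single-layer.  Otherwise every `bottom_size`-point combination
    is fitted, and the one with minimal MSE is taken as the bottom
    row; the remaining characters form the top row."""
    if fit_mse(centers) <= mse_thresh:
        return list(centers), []       # single layer: no top row
    best = min(itertools.combinations(centers, bottom_size), key=fit_mse)
    top = [c for c in centers if c not in best]
    return list(best), top
```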
S121、若字符层数为二层,则从多个第二字符中提取出包括任意的P个第二字符多个字符组合。
本申请实施例中,若字符层数为二层,则信息识别装置从多个第二字符中提取出包括任意的P个第二字符多个字符组合。P为大于1小于T的整数;
本申请实施例中,若所述字符层数为一层,则信息识别装置将所述多个第二字符按照各自对应所述横坐标信息的大小顺序进行拼接,得到所述字符识别结果。
S122、对多个字符组合分别进行线性拟合,并计算得到多个误差平方和,确定出最小的误差平方和对应的字符组合中包括的P个第三字符。
本申请实施例中,信息识别装置对多个字符组合分别进行线性拟合,并计算得到多个误差平方和,确定出最小的误差平方和对应的字符组合中包括的P个第三字符。
本申请实施例中,信息识别装置对多个字符组合内的字符的中心点坐标信息分别进行线性拟合,得到多个组拟合线。信息识别装置对应每个组内的字符计算出每个组对应的误 差平方和,得到多个误差平方和。
S123、将其余字符按照各自对应的第二字符框信息中的横坐标信息的大小顺序进行拼接,再将P个第三字符按照各自对应的横坐标信息的大小顺序进行拼接,得到字符识别结果。
本申请实施例中,信息识别装置将其余字符按照各自对应的第二字符框信息中的横坐标信息的大小顺序进行拼接,再将P个第三字符按照各自对应的横坐标信息的大小顺序进行拼接,得到字符识别结果。其余字符为多个第二字符中的除P个第三字符之外的字符。
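The row-by-row splicing of S123 can be sketched as follows (an illustrative sketch; the function name and the representation of each character as an (x, character) pair are assumptions). The remaining characters (the top row) come first, followed by the P third characters (the bottom row), each sorted by abscissa.

```python
def splice_plate(top_row, bottom_row):
    """Concatenate characters row by row: within each row, sort by the
    box abscissa (x), then append the bottom row after the top row.
    `top_row`/`bottom_row` are lists of (x, character) pairs; a
    single-layer plate simply passes an empty top row."""
    join = lambda row: "".join(ch for _, ch in sorted(row))
    return join(top_row) + join(bottom_row)
```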
本申请实施例中,信息识别装置对多个第二字符进行线性拟合,通过每个字符与拟合线之间的误差平方和,确定出字符的层数。由于现实场景中车牌字符的层数不确定,所以通过本方案可以更加精准的确定出字符的层数,以得到识别结果准确的目标字符识别结果。
在一些实施例中，参见图12，图12为本申请实施例提供的信息识别方法的一个可选的流程示意图，图1示出的S101还可以通过S124至S126实现，将结合各步骤进行说明。
S124、在视频流中提取出当前时刻对应的当前图像帧。
本申请实施例中,信息识别装置在视频流中提取出当前时刻对应的当前图像帧。
示例性的,摄像头实时采集当前预定区域的视频流传送给信息识别装置。信息识别装置接收到实时的视频流后提取出当前时刻对应的图像帧。
S125、在视频流中以当前图像帧为起点,沿着时间轴每隔预定时长或者预定数量的图像帧提取出一个图像帧,直至提取得到N-1个图像帧。
本申请实施例中,信息识别装置在视频流中以当前图像帧为起点,沿着时间轴每隔预定时长或者预定数量的图像帧提取出一个图像帧,直至提取得到N-1个图像帧。
本申请实施例中,视频流中包括了沿着时间轴分布的多个图像帧。信息识别装置可以以当前图像帧为起点,每隔1秒提取出一个图像帧,直至提取出N-1个图像帧。本申请实施例中,对预定时长不做限制。
本申请实施例中,视频流中包括了沿着时间轴分布的多个图像帧。信息识别装置可以以当前图像帧为起点,每隔5个图像帧提取出一个图像帧,直至提取出N-1个图像帧。本申请实施例中,对预定数量不做限制。
S126、将当前图像帧和N-1个图像帧沿时间轴组合得到N个起始图像帧。
本申请实施例中,信息识别装置将当前图像帧和N-1个图像帧沿时间轴组合得到N个起始图像帧。
本申请实施例中,信息识别装置将当前图像帧和N-1个图像帧,沿着各自对应的时间 点从前到后的顺序进行组合得到N个图像帧。
本申请实施例中,信息识别装置基于滑动窗口的方法提取得到当前时刻的N个图像帧,利用N个图像帧的动态特征,可以准确的判断是否有车辆到达或者离开月台。
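The sampling-plus-sliding-window extraction of S124 to S126 can be sketched as follows (a minimal sketch; the function name, the default window size, and the sampling step are assumptions — the text allows sampling either by a fixed duration or by a fixed number of frames):

```python
from collections import deque

def sliding_windows(frame_stream, n=6, step=5):
    """Sample one frame out of every `step` frames and keep the N most
    recent samples; yield each full window (the "N image frames for
    the current moment") as the stream advances."""
    window = deque(maxlen=n)           # oldest sample drops out automatically
    for i, frame in enumerate(frame_stream):
        if i % step == 0:
            window.append(frame)
            if len(window) == n:
                yield tuple(window)
```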
在一些实施例中，参见图12，图12为本申请实施例提供的信息识别方法的一个可选的流程示意图，图1示出的S103还可以通过S127至S128实现，将结合各步骤进行说明。
S127、将多个字符识别结果中的每个字符对应的字符类别概率信息相加,得到多个总概率。
本申请实施例中,信息识别装置将多个字符识别结果中的每个字符对应的字符类别概率信息相加,得到多个总概率。
S128、确定多个总概率中最大的总概率对应的字符识别结果为目标字符识别结果。
本申请实施例中,信息识别装置确定多个总概率中最大的总概率对应的字符识别结果为目标字符识别结果。
本申请实施例中,信息识别装置确定了字符中哪些是上层,哪些是下层(单层车牌的所有字符看成下层车牌),因为同一层车牌在x轴上的相对位置是确定的,所以先将上层车牌的字符按照x坐标从小到大的位置顺序依次拼接字符,然后将下层车牌的字符按照x坐标从小到大的位置顺序接着上层车牌的字符串拼接字符,最终得到的字符串即为最终的车牌号。由于在事件被触发时开始进行车牌识别,事件完成后结束识别,对于一辆车会有多个识别结果。由于环境等因素的影响,每次的识别结果可能会不一致。因此,需要将多次的识别结果合并。合并的方式为,对每一个位置,将每次识别结果中每个字符的概率相加,取总体概率最大的一个字符作为该位置对应的字符。
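The per-position merging of the multiple recognition runs can be sketched as follows (an illustrative sketch; the function name and the representation of each run as a sequence of (character, probability) pairs are assumptions):

```python
from collections import defaultdict

def merge_results(results):
    """Merge the recognition runs of one event: for every character
    position, sum the class probability of each candidate character
    over all runs and keep the character with the largest total."""
    length = max(len(r) for r in results)
    merged = []
    for pos in range(length):
        votes = defaultdict(float)
        for r in results:
            if pos < len(r):           # runs may disagree on plate length
                ch, p = r[pos]
                votes[ch] += p
        merged.append(max(votes, key=votes.get))
    return "".join(merged)
```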
本申请实施例中,信息识别装置通过在多个字符识别结果中确定出概率和最大的目标字符识别结果,所以得到的识别结果更加准确。
在一些实施例中,参见图13,图13为本申请实施例提供的信息识别方法的一个可选的流程示意图,将结合各步骤进行说明。
S201、标注后车厢数据。
本申请实施例中,目标对象需要在大量的图片中对后车厢进行标注,以形成样本数据进行训练。
S202、训练车体检测模型。
S203、标注车牌及车牌字符数据。
S204、生成车牌及车牌字符及标记数据。
S205、训练车牌及车牌字符检测模型。
本方案的系统流程包括训练流程和使用流程。在训练流程中,需要收集和标注车体,车牌和车牌字符数据,同时自动生成高仿真的车牌及车牌字符数据,训练车体,车牌和车牌字符的检测模型。模型训练完成后,则部署在边缘计算盒子上供实际使用。
S206、从视频流中实时抽取图片。
信息识别装置从监控摄像头的视频流中实时抽取图片,每隔0.2秒抽取一帧。
S207、车体检测。
信息识别装置抽取的图片使用车体识别模块判断当前月台是否有车辆。
S208、车辆动态/月台状态/事件识别。
信息识别装置通过当前帧的车辆识别情况与前N个帧的识别情况共同判断当前车辆和月台的状态以及是否产生事件。
S209、车牌检测。
如果触发事件监控，则信息识别装置进行车牌检测，事件结束后停止检测。
S210、车牌字符检测。
信息识别装置将检测到的车牌区域向外扩展一定范围,截取后输送到车牌识别模块。
S211、生成车牌号。
信息识别装置输出车牌识别结果。
在部署的过程中，每个深度学习模型都转换成TensorRT的格式，以优化资源使用。在实际使用的过程中，监控摄像头要放置在仓库门的顶端，向斜下拍摄，以保证在不影响生产的前提下收集月台监控画面。摄像头直接连接到边缘盒子，车辆/车牌识别的所有模块在边缘盒子上部署，对摄像头收集的视频进行分析，并向线上系统上报识别结果。由于摄像头采集画面的范围较大，本方案只选取月台对应的车位进行监控，以排除其它区域的干扰。一个摄像头可配置多个监控区域，每个区域之间相互独立，分别进行车辆/车牌识别。整体流程如图14所示，摄像头100采集监控区域视频流，传输给边缘盒子101，边缘盒子101对视频流中的每个时刻的N个图像帧进行处理，触发事件监控，最终输出车牌号。
参见图15,图15为本申请实施例提供的信息识别装置的结构示意图。
本申请实施例还提供了一种信息识别装置800,包括:采集提取模块803、检测处理模块804和确定模块805。
采集提取模块,被配置为实时采集预定区域的视频流,并基于滑动窗口的方法在所述视频流中提取出当前时刻对应的N个起始图像帧;N为大于1的整数;
检测处理模块,被配置为若针对所述N个起始图像帧检测得到其中的第一物体为起始状态,则对每个图像帧中待测区域图像内的字符进行去重优化处理得到字符识别结果,直至针对N个终止图像帧检测得到其中的所述第一物体为终止状态时停止,得到多个字符识别结果;其中,所述N个终止图像帧是在所述视频流中距离所述当前时刻任意时长后的终止时刻提取到的;所述每个图像帧属于所述当前时刻至所述终止时刻在所述视频流中提取的多个图像帧;
确定模块,被配置为基于所述多个字符识别结果,确定出目标字符识别结果。
本申请实施例中,信息识别装置800中的检测处理模块804被配置为对所述N个起始图像帧进行检测得到起始检测结果;若所述起始检测结果表征所述第一物体为靠近或者远离目标区域的状态,则从所述N个起始图像帧中的第一图像帧为起点,针对在所述视频流中提取出的所述每个图像帧进行处理,得到对应的所述待测区域图像;利用预设目标检测模型对所述待测区域图像进行处理,得到多个字符及对应的字符相关信息;利用所述字符相关信息对所述多个字符进行去重优化处理,得到一定次序字符组成的所述字符识别结果,直至终止检测结果表征所述第一物体为无位移变化的状态,或者所述终止时刻无所述第一物体的状态时停止,得到所述多个字符识别结果;所述终止检测结果为对应所述N个终止图像帧进行检测得到的。
本申请实施例中,信息识别装置800中的检测处理模块804被配置为利用第一预设检测模型对所述N个起始图像帧进行处理,得到所述起始检测结果;若所述起始检测结果表征所述N个起始图像帧中至少有M个起始图像帧中包括所述第一物体,且所述第一物体与所述目标区域的距离在变大或者变小,则将所述每个图像帧通过第二预设检测模型处理,得到所述待测区域图像;M为大于1小于N的整数。
本申请实施例中,信息识别装置800中的检测处理模块804被配置为利用预设目标检测模型对所述待测区域图像进行处理,得到所述多个字符,及所述多个字符分别对应的字符类别概率信息和字符框信息。
本申请实施例中,信息识别装置800中的检测处理模块804被配置为利用多个字符框信息计算得到重叠度矩阵,并结合所述重叠度矩阵对所述多个字符进行去重处理,得到多个第一字符;结合所述多个第一字符分别对应的第一字符框信息剔除所述多个第一字符中的异常字符,得到多个第二字符;利用所述多个第二字符分别对应的第二字符框信息,对所述多个第二字符进行分层排序处理得到所述字符识别结果,直至所述终止检测结果表征所述N个终止图像帧中至少有M个终止图像帧中的所述第一物体与所述目标区域的距离 无变化,或者所述N个起始图像帧中有K个终止图像帧中包括所述第一物体时停止,得到所述多个字符识别结果;K为大于等于1小于M的整数。
本申请实施例中,信息识别装置800中的检测处理模块804被配置为利用所述多个字符框信息包括的多个字符框面积信息,计算任意两个字符之间的重叠度,并通过得到的多个重叠度构建出所述重叠度矩阵;将所述多个重叠度依次与预设阈值进行比较,得到每个重叠度的比较结果;若所述比较结果表征对应的第一重叠度大于所述预设阈值,则将重叠字符在所述重叠度矩阵中对应的行和列删除得到新的重叠度矩阵,直至所述重叠度矩阵中的所述重叠度均小于所述预设阈值时得到目标矩阵;所述重叠字符为所述第一重叠度对应的所述字符类别概率最小的字符;通过所述目标矩阵包括的任意行中的所述重叠度对应的字符,确定出所述多个第一字符。
本申请实施例中,所述多个字符包括:T个字符;所述多个字符框面积信息包括:T个字符框面积信息;T为大于1的整数;信息识别装置800中的检测处理模块804被配置为所述利用所述多个字符框信息包括的多个字符框面积信息,计算任意两个字符之间的重叠度,并通过得到的多个重叠度构建出所述重叠度矩阵包括:根据第1个字符的字符框面积信息至第T个字符的字符框面积信息,计算出所述第1个字符分别与所述第1个字符至第T个字符的第一组T个重叠度;利用所述第一组T个重叠度构建出所述重叠度矩阵的第一行,直至计算得到第T个字符分别与所述第1个字符至所述第T个字符的第T组T个重叠度,利用所述第T组T个重叠度构建出所述重叠度矩阵的最后一行,进而得到所述重叠度矩阵。
本申请实施例中,信息识别装置800中的检测处理模块804被配置为将多个第一字符框信息通过预设异常点检测模型处理,剔除所述多个第一字符中与其他的第一字符密度之间密度最低的所述异常字符,得到所述多个第二字符。
本申请实施例中,信息识别装置800中的检测处理模块804被配置为通过所述多个第二字符框信息中的多个中心点信息进行线性拟合得到拟合线,并计算所述多个中心点信息距离所述拟合线的误差平方和;根据所述误差平方和的大小确定出所述字符识别结果的字符层数;若所述字符层数为二层,则从所述多个第二字符中提取出包括任意的P个第二字符多个字符组合;P为大于1小于T的整数;对所述多个字符组合分别进行线性拟合,并计算得到多个误差平方和,确定出最小的误差平方和对应的字符组合中包括的P个第三字符;将其余字符按照各自对应的第二字符框信息中的横坐标信息的大小顺序进行拼接,再将所述P个第三字符按照各自对应的横坐标信息的大小顺序进行拼接,得到所述字符识别 结果;所述其余字符为所述多个第二字符中的除所述P个第三字符之外的字符。
本申请实施例中,信息识别装置800中的检测处理模块804被配置为若所述字符层数为一层,则将所述多个第二字符按照各自对应所述横坐标信息的大小顺序进行拼接,得到所述字符识别结果。
本申请实施例中,信息识别装置800中的确定模块805被配置为将所述多个字符识别结果中的每个字符对应的字符类别概率信息相加,得到多个总概率;确定所述多个总概率中最大的总概率对应的字符识别结果为所述目标字符识别结果。
本申请实施例中,信息识别装置800中的采集提取模块803被配置为在所述视频流中提取出所述当前时刻对应的当前图像帧;在所述视频流中以所述当前图像帧为起点,沿着时间轴每隔预定时长或者预定数量的图像帧提取出一个图像帧,直至提取得到N-1个图像帧;将所述当前图像帧和所述N-1个图像帧沿时间轴组合得到所述N个起始图像帧。
本申请实施例中,信息识别装置800中的采集提取模块803被配置为实时采集预定区域的视频流,并基于滑动窗口的方法在视频流中提取出当前时刻对应的N个起始图像帧;N为大于1的整数;通过检测处理模块804若针对N个起始图像帧检测得到其中的第一物体为起始状态,则对每个图像帧中待测区域图像内的字符进行去重优化处理得到字符识别结果,直至针对N个终止图像帧检测得到其中的第一物体为终止状态时停止,得到多个字符识别结果;其中,N个终止图像帧是在视频流中距离当前时刻任意时长后的终止时刻提取到的;图像帧属于当前时刻至终止时刻在视频流中提取的多个图像帧;通过确定模块805基于多个字符识别结果,确定出目标字符识别结果。由于本方案通过抓取在第一物体到达和离开的期间对车牌信息进行识别,并且同时对识别得到的字符进行去重和优化处理,克服了车牌识别过程中的格式多样化的困难,进而可以提高对车牌信息的识别准确率。
需要说明的是,本申请实施例中,如果以软件功能模块的形式实现上述的信息识别方法,并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请实施例的技术方案本质上或者说对相关技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台信息识别装置(可以是个人计算机等)执行本申请各个实施例所述方法的全部或部分。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read Only Memory,ROM)、磁碟或者光盘等各种可以存储程序代码的介质。这样,本申请实施例不限制于任何特定的硬件和软件结合。
对应地,本申请实施例提供一种计算机可读存储介质,其上存储有计算机程序,该计 算机程序被处理器执行时实现上述方法中的步骤。
对应地,本申请实施例提供一种信息识别装置,包括存储器802和处理器801,所述存储器802存储有可在处理器801上运行的计算机程序,所述处理器801执行所述程序时实现上述方法中的步骤。
这里需要指出的是:以上存储介质和装置实施例的描述,与上述方法实施例的描述是类似的,具有同方法实施例相似的有益效果。对于本申请存储介质和装置实施例中未披露的技术细节,请参照本申请方法实施例的描述而理解。
需要说明的是,图16为本申请实施例提供的信息识别装置的一种硬件实体示意图,如图16所示,该信息识别装置800的硬件实体包括:处理器801和存储器802,其中;
处理器801通常控制信息识别装置800的总体操作。
存储器802配置为存储由处理器801可执行的指令和应用,还可以缓存待处理器801以及信息识别装置800中各模块待处理或已经处理的数据(例如,图像数据、音频数据、语音通信数据和视频通信数据),可以通过闪存(FLASH)或随机访问存储器(Random Access Memory,RAM)实现。
应理解,说明书通篇中提到的“一个实施例”或“一实施例”意味着与实施例有关的特定特征、结构或特性包括在本申请的至少一个实施例中。因此,在整个说明书各处出现的“在一个实施例中”或“在一实施例中”未必一定指相同的实施例。此外,这些特定的特征、结构或特性可以任意适合的方式结合在一个或多个实施例中。应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。
需要说明的是,在本文中,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者装置不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者装置所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括该要素的过程、方法、物品或者装置中还存在另外的相同要素。
在本申请所提供的几个实施例中,应该理解到,所揭露的装置和方法,可以通过其它的方式实现。以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,如:多个单元或组件可以结合,或可以集成到另一个系统,或一些特征可以忽略,或不执行。另外,所显示或讨论的各组 成部分相互之间的耦合、或直接耦合、或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性的、机械的或其它形式的。
上述作为分离部件说明的单元可以是、或也可以不是物理上分开的,作为单元显示的部件可以是、或也可以不是物理单元;既可以位于一个地方,也可以分布到多个网络单元上;可以根据实际的需要选择其中的部分或全部单元来实现本实施例方案的目的。
另外,在本申请各实施例中的各功能单元可以全部集成在一个处理单元中,也可以是各单元分别单独作为一个单元,也可以两个或两个以上单元集成在一个单元中;上述集成的单元既可以采用硬件的形式实现,也可以采用硬件加软件功能单元的形式实现。
本领域普通技术人员可以理解:实现上述方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成,前述的程序可以存储于计算机可读取存储介质中,该程序在执行时,执行包括上述方法实施例的步骤;而前述的存储介质包括:移动存储装置、只读存储器(Read Only Memory,ROM)、磁碟或者光盘等各种可以存储程序代码的介质。
或者,本申请上述集成的单元如果以软件功能模块的形式实现并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请实施例的技术方案本质上或者说对相关技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机装置(可以是个人计算机、服务器、或者网络装置等)执行本申请各个实施例所述方法的全部或部分。而前述的存储介质包括:移动存储装置、ROM、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (15)

  1. 一种信息识别方法,包括:
    实时采集预定区域的视频流,并基于滑动窗口的方法在所述视频流中提取出当前时刻对应的N个起始图像帧;N为大于1的整数;
    若针对所述N个起始图像帧检测得到其中的第一物体为起始状态,则对每个图像帧中待测区域图像内的字符进行去重优化处理得到字符识别结果,直至针对N个终止图像帧检测得到其中的所述第一物体为终止状态时停止,得到多个字符识别结果;其中,所述N个终止图像帧是在所述视频流中距离所述当前时刻任意时长后的终止时刻提取到的;所述每个图像帧属于所述当前时刻至所述终止时刻在所述视频流中提取的多个图像帧;
    基于所述多个字符识别结果,确定出目标字符识别结果。
  2. 根据权利要求1所述的信息识别方法,其中,所述若针对所述N个起始图像帧检测得到其中的第一物体为起始状态,则对每个图像帧中待测区域图像内的字符进行去重优化处理得到字符识别结果,直至针对N个终止图像帧检测得到其中的所述第一物体为终止状态时停止,得到多个字符识别结果,包括:
    对所述N个起始图像帧进行检测得到起始检测结果;
    若所述起始检测结果表征所述第一物体为靠近或者远离目标区域的状态,则从所述N个起始图像帧中的第一图像帧为起点,针对在所述视频流中提取出的所述每个图像帧进行处理,得到对应的所述待测区域图像;
    利用预设目标检测模型对所述待测区域图像进行处理,得到多个字符及对应的字符相关信息;
    利用所述字符相关信息对所述多个字符进行去重优化处理,得到一定次序字符组成的所述字符识别结果,直至终止检测结果表征所述第一物体为无位移变化的状态,或者所述终止时刻无所述第一物体的状态时停止,得到所述多个字符识别结果;所述终止检测结果为对应所述N个终止图像帧进行检测得到的。
  3. 根据权利要求2所述的信息识别方法,其中,所述对所述N个起始图像帧进行检测得到起始检测结果,包括:
    利用第一预设检测模型对所述N个起始图像帧进行处理,得到所述起始检测结果;
    所述若所述起始检测结果表征所述第一物体为靠近或者远离目标区域的状态,则从所述N个起始图像帧中的第一图像帧为起点,针对在所述视频流中提取出的所述每个图像帧进行处理,得到对应的所述待测区域图像,包括:
    若所述起始检测结果表征所述N个起始图像帧中至少有M个起始图像帧中包括所述第一物体,且所述第一物体与所述目标区域的距离在变大或者变小,则将所述每个图像帧通过第二预设检测模型处理,得到所述待测区域图像;M为大于1小于N的整数。
  4. 根据权利要求2所述的信息识别方法,其中,所述利用预设目标检测模型对所述待测区域图像进行处理,得到多个字符及对应的字符相关信息,包括:
    利用预设目标检测模型对所述待测区域图像进行处理,得到所述多个字符,及所述多个字符分别对应的字符类别概率信息和字符框信息。
  5. 根据权利要求4所述的信息识别方法,其中,所述利用所述字符相关信息对所述多个字符进行去重优化处理,得到一定次序字符组成的所述字符识别结果,直至终止检测结果表征所述第一物体为无位移变化的状态,或者所述终止时刻无所述第一物体的状态时停止,得到所述多个字符识别结果,包括:
    利用多个字符框信息计算得到重叠度矩阵,并结合所述重叠度矩阵对所述多个字符进行去重处理,得到多个第一字符;
    结合所述多个第一字符分别对应的第一字符框信息剔除所述多个第一字符中的异常字符,得到多个第二字符;
    利用所述多个第二字符分别对应的第二字符框信息,对所述多个第二字符进行分层排序处理得到所述字符识别结果,直至所述终止检测结果表征所述N个终止图像帧中至少有M个终止图像帧中的所述第一物体与所述目标区域的距离无变化,或者所述N个起始图像帧中有K个终止图像帧中包括所述第一物体时停止,得到所述多个字符识别结果;K为大于等于1小于M的整数。
  6. 根据权利要求5所述的信息识别方法,其中,所述利用多个字符框信息计算得到重叠度矩阵,并结合所述重叠度矩阵对所述多个字符进行去重处理,得到多个第一字符,包括:
    利用所述多个字符框信息包括的多个字符框面积信息,计算任意两个字符之间的重叠度,并通过得到的多个重叠度构建出所述重叠度矩阵;
    将所述多个重叠度依次与预设阈值进行比较,得到每个重叠度的比较结果;
    若所述比较结果表征对应的第一重叠度大于所述预设阈值,则将重叠字符在所述重叠度矩阵中对应的行和列删除得到新的重叠度矩阵,直至所述重叠度矩阵中的所述重叠度均小于所述预设阈值时得到目标矩阵;所述重叠字符为所述第一重叠度对应的所述字符类别概率最小的字符;
    通过所述目标矩阵包括的任意行中的所述重叠度对应的字符,确定出所述多个第一字符。
  7. 根据权利要求6所述的信息识别方法,其中,所述多个字符包括:T个字符;所述多个字符框面积信息包括:T个字符框面积信息;T为大于1的整数;
    所述利用所述多个字符框信息包括的多个字符框面积信息,计算任意两个字符之间的重叠度,并通过得到的多个重叠度构建出所述重叠度矩阵包括:
    根据第1个字符的字符框面积信息至第T个字符的字符框面积信息,计算出所述第1个字符分别与所述第1个字符至第T个字符的第一组T个重叠度;
    利用所述第一组T个重叠度构建出所述重叠度矩阵的第一行,直至计算得到第T个字符分别与所述第1个字符至所述第T个字符的第T组T个重叠度,利用所述第T组T个重叠度构建出所述重叠度矩阵的最后一行,进而得到所述重叠度矩阵。
  8. 根据权利要求5所述的信息识别方法,其中,结合所述多个第一字符分别对应的第一字符框信息剔除所述多个第一字符中的异常字符,得到多个第二字符,包括:
    将多个第一字符框信息通过预设异常点检测模型处理,剔除所述多个第一字符中与其他的第一字符密度之间密度最低的所述异常字符,得到所述多个第二字符。
  9. 根据权利要求5所述的信息识别方法,其中,所述利用所述多个第二字符分别对应的第二字符框信息,对所述多个第二字符进行分层排序处理得到所述字符识别结果,包括:
    通过多个第二字符框信息中的多个中心点信息进行线性拟合得到拟合线,并计算所述多个中心点信息距离所述拟合线的误差平方和;
    根据所述误差平方和的大小确定出所述字符识别结果的字符层数;
    若所述字符层数为二层,则从所述多个第二字符中提取出包括任意的P个第二字符多个字符组合;P为大于1小于T的整数;
    对所述多个字符组合分别进行线性拟合,并计算得到多个误差平方和,确定出最小的误差平方和对应的字符组合中包括的P个第三字符;
    将其余字符按照各自对应的第二字符框信息中的横坐标信息的大小顺序进行拼接,再将所述P个第三字符按照各自对应的横坐标信息的大小顺序进行拼接,得到所述字符识别结果;所述其余字符为所述多个第二字符中的除所述P个第三字符之外的字符。
  10. 根据权利要求9所述的信息识别方法,其中,所述根据所述误差平方和的大小确定出所述字符识别结果的字符层数之后,所述方法还包括:
    若所述字符层数为一层,则将所述多个第二字符按照各自对应所述横坐标信息的大小顺序进行拼接,得到所述字符识别结果。
  11. 根据权利要求1-10任一项所述的信息识别方法,其中,所述基于所述多个字符识别结果,确定出目标字符识别结果,包括:
    将所述多个字符识别结果中的每个字符对应的字符类别概率信息相加,得到多个总概率;
    确定所述多个总概率中最大的总概率对应的字符识别结果为所述目标字符识别结果。
  12. 根据权利要求1-10任一项所述的信息识别方法,其中,所述基于滑动窗口的方法在所述视频流中提取出当前时刻对应的N个起始图像帧,包括:
    在所述视频流中提取出所述当前时刻对应的当前图像帧;
    在所述视频流中以所述当前图像帧为起点,沿着时间轴每隔预定时长或者预定数量的图像帧提取出一个图像帧,直至提取得到N-1个图像帧;
    将所述当前图像帧和所述N-1个图像帧沿时间轴组合得到所述N个起始图像帧。
  13. 一种信息识别装置,包括:
    采集提取模块,被配置为实时采集预定区域的视频流,并基于滑动窗口的方法在所述视频流中提取出当前时刻对应的N个起始图像帧;N为大于1的整数;
    检测处理模块,被配置为若针对所述N个起始图像帧检测得到其中的第一物体为起始状态,则对每个图像帧中待测区域图像内的字符进行去重优化处理得到字符识别结果,直至针对N个终止图像帧检测得到其中的所述第一物体为终止状态时停止,得到多个字符识别结果;其中,所述N个终止图像帧是在所述视频流中距离所述当前时刻任意时长后的终止时刻提取到的;所述每个图像帧属于所述当前时刻至所述终止时刻在所述视频流中提取的多个图像帧;
    确定模块,被配置为基于所述多个字符识别结果,确定出目标字符识别结果。
  14. 一种信息识别装置,包括存储器和处理器,所述存储器存储有可在处理器上运行的计算机程序,所述处理器执行所述程序时实现权利要求1至12任一项所述方法中的步骤。
  15. 一种计算机可读存储介质,其上存储有计算机程序,该计算机程序被处理器执行时实现权利要求1至12任一项所述方法中的步骤。
PCT/CN2023/073724 2022-07-13 2023-01-29 Information recognition method and apparatus, and storage medium WO2024011889A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210824930.3A CN115082832A (zh) 2022-07-13 2022-07-13 Information recognition method and apparatus, and storage medium
CN202210824930.3 2022-07-13

Publications (1)

Publication Number Publication Date
WO2024011889A1 true WO2024011889A1 (zh) 2024-01-18

Family

ID=83259900

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/073724 WO2024011889A1 (zh) Information recognition method and apparatus, and storage medium

Country Status (2)

Country Link
CN (1) CN115082832A (zh)
WO (1) WO2024011889A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115082832A (zh) * 2022-07-13 2022-09-20 Beijing Jingdong Qianshi Technology Co., Ltd. Information recognition method and apparatus, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105303153A (zh) * 2014-07-23 2016-02-03 ZTE Corporation Vehicle license plate recognition method and apparatus
CN110070082A (zh) * 2019-04-22 2019-07-30 Suzhou Keda Technology Co., Ltd. License plate recognition method, apparatus, device, and storage medium
CN112115904A (zh) * 2020-09-25 2020-12-22 Zhejiang Dahua Technology Co., Ltd. License plate detection and recognition method and apparatus, and computer-readable storage medium
CN113642560A (zh) * 2021-08-19 2021-11-12 ZKTeco Co., Ltd. License plate character localization method and related device
US20220207889A1 (en) * 2020-12-29 2022-06-30 Streamax Technology Co., Ltd. Method for recognizing vehicle license plate, electronic device and computer readable storage medium
CN115082832A (zh) * 2022-07-13 2022-09-20 Beijing Jingdong Qianshi Technology Co., Ltd. Information recognition method and apparatus, and storage medium


Also Published As

Publication number Publication date
CN115082832A (zh) 2022-09-20

Similar Documents

Publication Publication Date Title
Chen Automatic License Plate Recognition via sliding-window darknet-YOLO deep learning
US11455805B2 (en) Method and apparatus for detecting parking space usage condition, electronic device, and storage medium
US9014432B2 (en) License plate character segmentation using likelihood maximization
WO2020151166A1 (zh) Multi-target tracking method and apparatus, computer apparatus, and readable storage medium
CN104303193B (zh) Clustering-based object classification
WO2017190574A1 (zh) Fast pedestrian detection method based on aggregated channel features
CN112233097B (zh) System and method for detecting other vehicles in road scenes based on spatio-temporal multi-dimensional fusion
Huang et al. DMPR-PS: A novel approach for parking-slot detection using directional marking-point regression
JP7414978B2 (ja) Parking space and orientation angle detection method, apparatus, device, and medium
CN111753797B (zh) Vehicle speed measurement method based on video analysis
Peng et al. Drone-based vacant parking space detection
Lee et al. Available parking slot recognition based on slot context analysis
KR102373753B1 (ko) Deep-learning-based vehicle identification and tracking method and system
CN105631418A (zh) People counting method and apparatus
US11587327B2 (en) Methods and systems for accurately recognizing vehicle license plates
WO2024011889A1 (zh) Information recognition method and apparatus, and storage medium
CN110751619A (zh) Insulator defect detection method
CN115841649A (zh) Multi-scale people counting method for complex urban scenes
CN116311063A (zh) Fine-grained person tracking method and system based on face recognition in surveillance video
CN114219829A (zh) Vehicle tracking method, computer device, and storage apparatus
CN112465854A (zh) Unmanned aerial vehicle tracking method based on an anchor-free detection algorithm
CN114495045A (zh) Perception method, perception apparatus, perception system, and related devices
CN114387578A (zh) Parking space positioning method and system, medium, and unmanned sanitation robot
Sharma et al. Automatic vehicle detection using spatial time frame and object based classification
Kalva et al. Smart Traffic Monitoring System using YOLO and Deep Learning Techniques

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23838391

Country of ref document: EP

Kind code of ref document: A1