CN111931627A - Vehicle re-identification method and device based on multi-mode information fusion - Google Patents

Vehicle re-identification method and device based on multi-mode information fusion

Info

Publication number
CN111931627A
CN111931627A
Authority
CN
China
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010769769.5A
Other languages
Chinese (zh)
Inventor
闫军
刘艳洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intelligent Interconnection Technologies Co ltd
Original Assignee
Intelligent Interconnection Technologies Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intelligent Interconnection Technologies Co ltd filed Critical Intelligent Interconnection Technologies Co ltd
Priority to CN202010769769.5A priority Critical patent/CN111931627A/en
Publication of CN111931627A publication Critical patent/CN111931627A/en
Priority to PCT/CN2020/131777 priority patent/WO2022027873A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/625License plates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08Detecting or categorising vehicles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Abstract

The invention discloses a vehicle re-identification method and device based on multi-modal information fusion, relating to the field of intelligent parking management. The method comprises the following steps: extracting a vehicle image to be queried from a monitoring video; using the vehicle image to be queried as the input of a preset multi-branch monitoring mechanism network model, extracting the global feature and a plurality of local features of the vehicle, and extracting the license plate characters of the vehicle from the license plate region among the local features; and obtaining, from a preset database, the similarity or joint matching probability between different vehicles and the global feature, the local features and the license plate characters of the vehicle, and generating a vehicle re-identification ranking table according to the similarity or joint matching probability between the different vehicles and the vehicle to be queried. The invention can effectively improve the accuracy of vehicle re-identification, better distinguish inter-class and intra-class differences between vehicles, and can adapt to various application requirements in real scenes. The invention can be used in the field of intelligent parking management.

Description

Vehicle re-identification method and device based on multi-mode information fusion
Technical Field
The invention relates to the field of intelligent parking management, in particular to a vehicle re-identification method and device based on multi-mode information fusion.
Background
Vehicles are an important object in city monitoring and have attracted much attention in a large number of vehicle-related tasks such as detection, tracking and scheduling. Vehicle re-identification is the task of finding the same vehicle captured by different cameras, or by the same camera under different illumination and different viewing angles. Through vehicle re-identification technology, the same vehicle can be automatically identified and locked across cameras, which plays an important role in specific tasks such as urban traffic scheduling and tracking of vehicles in violation, and benefits the planning and development of intelligent transportation and smart cities.
Current common vehicle re-identification methods generally fall into the following types. The first uses sensors to address vehicle re-identification, for example geomagnetic sensors, infrared radio-frequency sensors and the like; however, this approach is costly, has complex installation requirements and is not suitable for large-scale deployment. The second tracks and locates vehicles across cameras by recognizing the license plate; however, in many cases the license plate number is recognized inaccurately due to illumination, occlusion, fouling and other causes, which leads to larger vehicle re-identification errors. The third performs vehicle re-identification based on local appearance features of the vehicle; however, local appearance features reflect the vehicle only partially, so the re-identification precision is low.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a vehicle re-identification method and device based on multi-modal information fusion, which address the problems that existing re-identification methods rely only on vehicle appearance or the license plate number, so that the identification features used are limited and the identification precision is low.
To achieve the above object, a vehicle re-identification method based on multi-modal information fusion is provided, characterized in that the method comprises:
extracting a vehicle image to be queried from a monitoring video;
using the vehicle image to be queried as the input of a preset multi-branch monitoring mechanism network model, extracting the global feature and a plurality of local features of the vehicle, and extracting the license plate characters of the vehicle from the license plate region among the plurality of local features;
obtaining, from a preset database, the similarity or joint matching probability between different vehicles and the global feature, the plurality of local features and the license plate characters of the vehicle, wherein the global features, local features and license plate characters corresponding to the different vehicles are stored in the preset database;
and generating a vehicle re-identification ranking table according to the similarity or joint matching probability between the different vehicles and the vehicle to be queried.
Further, the method further comprises:
extracting the spatio-temporal information of the vehicle image to be queried from the monitoring video;
predicting the relative driving direction of the vehicle according to the joint matching probability between different vehicles and the vehicle to be queried and the spatio-temporal information of the vehicle image to be queried;
and obtaining the spatio-temporal matching probability between the vehicle and the different vehicles according to the topological relation among different cameras, the relative driving direction of the vehicle and a preset vehicle spatio-temporal transfer model, wherein the preset vehicle spatio-temporal transfer model is established from historical vehicle driving spatio-temporal data corresponding to the different vehicles.
Further, the step of generating the vehicle re-identification ranking table according to the similarity or joint matching probability between different vehicles and the vehicle to be queried comprises:
generating a vehicle re-identification ranking table according to the joint matching probability or similarity between different vehicles and the vehicle to be queried and the spatio-temporal matching probability between the vehicle and the different vehicles.
Further, the step of extracting the vehicle image to be queried from the monitoring video comprises:
performing vehicle target detection on a to-be-queried vehicle picture acquired from the monitoring video, and acquiring a vehicle target image completely marked by a bounding box;
and removing redundant background information from the vehicle target image to obtain the vehicle image to be queried.
Further, the global feature is the overall appearance feature of the vehicle, and the plurality of local features comprise the vehicle head appearance feature, the vehicle tail appearance feature and the license plate region appearance feature.
Further, the step of obtaining, from a preset database, the similarity between different vehicles and the global feature, the plurality of local features and the license plate characters of the vehicle comprises:
constructing a triplet loss function and calculating the feature distances between the different vehicles and the vehicle according to their overall appearance features, vehicle head appearance features, vehicle tail appearance features, license plate region appearance features and license plate character features, so as to obtain the similarity between the different vehicles and the vehicle.
Further, the step of obtaining the joint matching probability between different vehicles and the vehicle comprises:
calculating, based on a Bayesian probability model, according to the formula Pa = PF × θ × Ptpo, where Pa is the joint probability that a candidate vehicle matches the vehicle to be queried, PF is the joint matching probability of the vehicle head appearance feature, vehicle tail appearance feature, license plate region appearance feature and overall vehicle appearance feature between the candidate vehicle and the vehicle to be queried, Ptpo is the license plate matching probability between the candidate vehicle and the vehicle to be queried, and θ is the confidence of the license plate recognition.
Further, the present invention provides a vehicle re-identification apparatus based on multi-modal information fusion, the apparatus comprising:
an extraction module, used for extracting the vehicle image to be queried from the monitoring video;
the extraction module is further used for taking the vehicle image to be queried as the input of a preset multi-branch monitoring mechanism network model, extracting the global feature and a plurality of local features of the vehicle, and extracting the license plate characters of the vehicle from the license plate region among the plurality of local features;
an acquisition module, used for obtaining, from a preset database, the similarity or joint matching probability between different vehicles and the global feature, the plurality of local features and the license plate characters of the vehicle, wherein the global features, local features and license plate characters corresponding to the different vehicles are stored in the preset database;
and a generating module, used for generating a vehicle re-identification ranking table according to the similarity or joint matching probability between the different vehicles and the vehicle to be queried.
Further, the apparatus further comprises: a prediction module;
the extraction module is further used for extracting the spatio-temporal information of the vehicle image to be queried from the monitoring video;
the prediction module is used for predicting the relative driving direction of the vehicle according to the joint matching probability between different vehicles and the vehicle to be queried and the spatio-temporal information of the vehicle image to be queried;
the acquisition module is further used for obtaining the spatio-temporal matching probabilities between the vehicle and the different vehicles according to the topological relations among different cameras, the relative driving direction of the vehicle and a preset vehicle spatio-temporal transfer model, wherein the preset vehicle spatio-temporal transfer model is established from historical vehicle driving spatio-temporal data corresponding to the different vehicles.
Further, the generating module is specifically configured to generate a vehicle re-identification ranking table according to joint probabilities or similarities of matching between different vehicles and a vehicle to be queried and spatiotemporal matching probabilities between the vehicle and the different vehicles.
Further, the extraction module is specifically configured to perform vehicle target detection on a to-be-queried vehicle picture acquired from the monitoring video, acquire a vehicle target image completely marked by a bounding box, and remove redundant background information from the vehicle target image to obtain the vehicle image to be queried.
Further, the global feature is the overall appearance feature of the vehicle, and the plurality of local features comprise the vehicle head appearance feature, the vehicle tail appearance feature and the license plate region appearance feature.
Further, the acquisition module is specifically configured to construct a triplet loss function and calculate the feature distances between different vehicles and the vehicle according to their overall appearance features, vehicle head appearance features, vehicle tail appearance features, license plate region appearance features and license plate character features, so as to obtain the similarities between the different vehicles and the vehicle.
Further, the acquisition module is specifically configured to calculate, based on a Bayesian probability model, according to the formula Pa = PF × θ × Ptpo, where Pa is the joint probability that a candidate vehicle matches the vehicle to be queried, PF is the joint matching probability of the vehicle head appearance feature, vehicle tail appearance feature, license plate region appearance feature and overall vehicle appearance feature between the candidate vehicle and the vehicle to be queried, Ptpo is the license plate matching probability between the candidate vehicle and the vehicle to be queried, and θ is the confidence of the license plate recognition.
According to the vehicle re-identification method and device based on multi-modal information fusion, a preset multi-branch monitoring mechanism network model is used to compare the global feature, the plurality of local features and the license plate characters of the vehicle respectively, realizing feature matching based on the fusion of global appearance features, local appearance features and license plate characters; compared with existing feature matching that relies only on a single feature or on license plate characters, this can effectively improve the accuracy of vehicle re-identification. Furthermore, the time and geographic position information corresponding to the vehicles is integrated to form a multi-modal fused probabilistic feature, which distinguishes inter-class and intra-class differences between vehicles well and can meet the application requirements of recognition under different illumination and different camera models in real scenes.
Drawings
FIG. 1 is a flow chart of a vehicle re-identification method based on multi-modal information fusion provided by the present invention;
FIG. 2 is a schematic diagram of a vehicle re-identification device based on multi-modal information fusion provided by the invention.
Detailed Description
The structure and implementation of the method and device of the present invention are further described in detail below with reference to the accompanying drawings and embodiments.
The invention provides a vehicle re-identification method based on multi-modal information fusion, which specifically comprises the following steps:
101. Extracting the vehicle image to be queried from the monitoring video.
For the embodiment of the present invention, step 101 may specifically include: performing vehicle target detection on a to-be-queried vehicle picture acquired from the monitoring video, and acquiring a vehicle target image completely marked by a bounding box; and removing redundant background information from the vehicle target image to obtain the vehicle image to be queried. For the embodiment of the invention, the vehicle category in the original monitoring image is detected and instance-segmented, so that the interference of background noise is removed and all information of the vehicle can be fully extracted, thereby ensuring the clarity of the extracted image and further improving the vehicle re-identification precision.
Specifically, the process of detecting the vehicle target in the to-be-queried vehicle picture acquired from the monitoring video may include the following steps. When a monitoring camera captures a video frame, the geographical position information of the camera and the time at which the frame was shot are acquired simultaneously. A single-stage detection method is used as the target detection tool: vehicle target detection is performed on the vehicle picture acquired from the source monitoring video stream, vehicle targets completely marked by bounding boxes are acquired as the source data for vehicle instance segmentation, pictures with a resolution of less than 256 × 256 are ignored, and the identity ID corresponding to each vehicle is marked in the detected vehicle image. The geographic location includes, but is not limited to, country, city, and latitude and longitude coordinates. The time includes, but is not limited to, year, month, day, hour, minute, second, lunar calendar and solar calendar. The single-stage detection method includes, but is not limited to, regression-based target detection methods such as YOLO and SSD. The source vehicle pictures cover multiple images of the same vehicle captured by real road traffic scene monitoring under different viewing angles, backgrounds and illumination intensities, and the content covered by the vehicle images includes: license plate appearance information, license plate character information, color, vehicle type and vehicle marking information.
Further, the steps of acquiring a vehicle target image completely marked by a bounding box and removing redundant background information from it to obtain the vehicle image to be queried may specifically include: using a polygonal annotation tool to cut out the vehicle targets in the vehicle picture data set along the vehicle boundaries to prepare a vehicle instance segmentation data set; generating feature vectors with a convolutional neural network and training a vehicle instance segmentation network model; then performing preliminary instance segmentation on the vehicle picture; and then cleaning the preliminary segmentation result, for example by edge detection, hole filling, connected-domain detection and pixel-area comparison, to refine the vehicle instance segmentation and obtain the segmented vehicle data set. The instance segmentation method includes, but is not limited to, methods such as Mask R-CNN and SOLO. The convolutional neural network includes, but is not limited to, networks such as VGG, AlexNet and ResNet. The edge detection includes, but is not limited to, methods such as Gaussian filtering, Canny and Sobel. The hole filling includes, but is not limited to, pixel filling methods such as morphological opening, morphological closing and flood fill.
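The detection-and-filtering step described above (a single-stage detector, a 256 × 256 minimum resolution, and identity plus spatio-temporal tagging) can be sketched as follows. This is an illustrative sketch only, not the patent's implementation; the function names and record fields are hypothetical.

```python
# Illustrative sketch (not the patent's implementation): detected vehicle
# crops are kept only when the bounding box is at least 256x256, and each
# kept crop is tagged with a vehicle identity ID, the camera's geographic
# position, and the frame timestamp. All names here are hypothetical.

def filter_detections(detections, min_side=256):
    """Keep detections whose bounding box is at least min_side x min_side."""
    kept = []
    for det in detections:
        x1, y1, x2, y2 = det["bbox"]
        if (x2 - x1) >= min_side and (y2 - y1) >= min_side:
            kept.append(det)
    return kept

def tag_detection(det, vehicle_id, camera_geo, timestamp):
    """Attach identity and spatio-temporal metadata to one detection."""
    tagged = dict(det)
    tagged.update({"id": vehicle_id, "geo": camera_geo, "time": timestamp})
    return tagged

dets = [{"bbox": (0, 0, 300, 280)}, {"bbox": (0, 0, 100, 90)}]
kept = filter_detections(dets)
print(len(kept))  # 1: the 100x90 crop is ignored
```

A real pipeline would populate `detections` from a single-stage detector such as YOLO or SSD, as listed above.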
102. Taking the vehicle image to be queried as the input of a preset multi-branch monitoring mechanism network model, extracting the global feature and a plurality of local features of the vehicle, and extracting the license plate characters of the vehicle from the license plate region among the plurality of local features.
The global feature is the overall appearance feature of the vehicle, and the plurality of local features comprise the vehicle head appearance feature, the vehicle tail appearance feature and the license plate region appearance feature. The preset multi-branch monitoring mechanism network model comprises detection models such as SSD, YOLO, M2Det and Fast R-CNN, and the data set used by the detection models is the instance-segmented vehicle image data set with background noise removed. The license plate character recognition method includes, but is not limited to, character recognition methods such as CRNN + CTC and YOLOv3, so as to ensure the accuracy of the license plate recognition.
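The multi-branch structure described above (one shared backbone feeding a global branch plus head, tail, and license plate region branches) can be shown schematically. A real implementation would use CNN branches of a detection model; here plain lists stand in for feature maps, and every name is a hypothetical illustration.

```python
# Structural sketch of the multi-branch extraction (stand-in arithmetic,
# not a real CNN): a shared "backbone" output is split into a global
# feature and three local features. Hypothetical names throughout.

def extract_branches(pixels):
    backbone = [p / 255.0 for p in pixels]    # stand-in for backbone features
    third = len(backbone) // 3
    return {
        "global": backbone,                   # whole-vehicle appearance branch
        "head":   backbone[:third],           # vehicle head branch
        "tail":   backbone[third:2 * third],  # vehicle tail branch
        "plate":  backbone[2 * third:],       # license plate region branch
    }

feats = extract_branches([51, 102, 153, 204, 255, 0])
print(sorted(feats))  # ['global', 'head', 'plate', 'tail']
```

The license plate characters would then be read from the `plate` branch by a character recognizer such as those listed above.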
103. Obtaining, from a preset database, the similarity or joint matching probability between different vehicles and the global feature, the plurality of local features and the license plate characters of the vehicle.
The preset database stores the global features, the plurality of local features and the license plate characters corresponding to the different vehicles.
For the embodiment of the invention, the step of obtaining, from the preset database, the similarity between different vehicles and the global feature, the plurality of local features and the license plate characters of the vehicle comprises: constructing a triplet loss function and calculating the feature distances between the different vehicles and the vehicle according to their overall appearance features, vehicle head appearance features, vehicle tail appearance features, license plate region appearance features and license plate character features, so as to obtain the similarity between the different vehicles and the vehicle.
Specifically, the initial features of each vehicle image are first extracted with convolutional layers; the local features (vehicle head appearance, vehicle tail appearance, license plate appearance and license plate characters) are then extracted with the multi-branch detection module mechanism; and these four local features are fused with the global feature through feature fusion to obtain the re-identification model.
Further, the loss function is a triplet loss function:
Loss = max(0, ‖f(P) − f(N)‖² − ‖f(P) − f(A)‖² + α), with margin α (reconstructed in the standard triplet form from the equation image in the original publication, using the notation defined below)
where f(P) and f(N) are vehicle images of the same ID captured by different cameras, and f(A) is a vehicle image of a different ID. Extracting the plurality of local features and the global feature requires setting the initial parameters of the convolutional neural networks that extract features from the different regions. Specifically, the vehicle identity features are output by the vehicle feature extraction neural network; the feature fusion network adopts a 5-layer fully-connected neural network, and the output of the first fully-connected layer is taken as the fused feature of the vehicle. The feature extraction network includes, but is not limited to, models such as ResNet, VGG and AlexNet. The feature fusion network is trained as follows: metric learning with a cross-entropy loss function and a triplet loss function is adopted to train the feature learning process of the network. In training the 5-layer fully-connected network of the feature fusion network with the metric learning loss, the intra-class distance of the same vehicle ID is reduced, the inter-class distance of different vehicle IDs is enlarged, and the robustness of the fused vehicle feature is enhanced.
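The metric-learning objective above can be illustrated with a small self-contained computation. The triplet form below is the standard one; the margin value and variable names are illustrative assumptions, and f(P), f(N), f(A) follow the naming used in this paragraph (a same-ID pair versus a different-ID image).

```python
import math

# Hedged sketch of the triplet-style metric learning described above:
# the distance between same-ID features should be smaller, by a margin,
# than the distance to a different-ID feature. The margin is illustrative.

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(f_p, f_n, f_a, margin=0.3):
    """f_p, f_n: features of the same vehicle ID (different cameras);
    f_a: features of a different vehicle ID (naming as in the text)."""
    return max(0.0, euclidean(f_p, f_n) - euclidean(f_p, f_a) + margin)

same_cam1 = [1.0, 0.0]
same_cam2 = [1.1, 0.0]
other_car = [5.0, 5.0]
print(triplet_loss(same_cam1, same_cam2, other_car))  # 0.0
```

Minimizing this loss over many triplets is exactly what shrinks the intra-class distance and enlarges the inter-class distance described above.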
For the embodiment of the present invention, the step of obtaining the joint matching probability between different vehicles and the vehicle comprises: calculating, based on a Bayesian probability model, according to the formula Pa = PF × θ × Ptpo, where Pa is the joint probability that a candidate vehicle matches the vehicle to be queried, PF is the joint matching probability of the vehicle head appearance feature, vehicle tail appearance feature, license plate region appearance feature and overall vehicle appearance feature between the candidate vehicle and the vehicle to be queried, Ptpo is the license plate matching probability between the candidate vehicle and the vehicle to be queried, and θ is the confidence of the license plate recognition.
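As a small worked example of the formula above, Pa = PF × θ × Ptpo is a plain product of the three factors; the numeric values below are made up for illustration.

```python
# Worked example of Pa = PF * theta * Ptpo from the paragraph above.
# The numbers are illustrative, not from the patent.

def joint_match_probability(pf, theta, ptpo):
    """pf: joint appearance-feature matching probability;
    theta: confidence of the license plate recognition;
    ptpo: license plate matching probability."""
    return pf * theta * ptpo

pa = joint_match_probability(pf=0.9, theta=0.8, ptpo=0.95)
print(round(pa, 3))  # 0.684
```

Note how a low license plate recognition confidence θ down-weights the plate evidence, which matches the motivation given in the Background for not relying on the plate alone.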
104. Generating a vehicle re-identification ranking table according to the similarity or joint matching probability between different vehicles and the vehicle to be queried.
To facilitate candidate vehicle screening, the vehicle re-identification ranking table may be sorted in descending order of similarity or probability.
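The descending ranking table mentioned above can be produced with an ordinary sort; the candidate records below are hypothetical.

```python
# Illustration of the descending ranking table described above: candidates
# are sorted by joint matching probability (or similarity), highest first.
# The candidate vehicles and scores are hypothetical.

candidates = [
    {"id": "A", "score": 0.42},
    {"id": "B", "score": 0.91},
    {"id": "C", "score": 0.67},
]
ranking = sorted(candidates, key=lambda c: c["score"], reverse=True)
print([c["id"] for c in ranking])  # ['B', 'C', 'A']
```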
For the embodiment of the invention, in order to further improve the vehicle re-identification accuracy by integrating the spatio-temporal information of the vehicle, the specific method may comprise the following steps: extracting the spatio-temporal information of the vehicle image to be queried from the monitoring video; predicting the relative driving direction of the vehicle according to the joint matching probability between different vehicles and the vehicle to be queried and the spatio-temporal information of the vehicle image to be queried; and obtaining the spatio-temporal matching probability between the vehicle and the different vehicles according to the topological relation among different cameras, the relative driving direction of the vehicle and a preset vehicle spatio-temporal transfer model, wherein the preset vehicle spatio-temporal transfer model is established from historical vehicle driving spatio-temporal data corresponding to the different vehicles.
In this case, step 104 may specifically comprise: generating a vehicle re-identification ranking table according to the joint matching probability or similarity between different vehicles and the vehicle to be queried and the spatio-temporal matching probability between the vehicle and the different vehicles.
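The patent does not specify the form of the preset vehicle spatio-temporal transfer model; one common choice, sketched below under that assumption, scores the observed camera-to-camera time gap against historical travel-time statistics for that camera pair with a Gaussian-shaped transfer score. All names and numbers here are hypothetical.

```python
import math

# Hedged sketch: score how plausible an observed camera-to-camera time gap
# is, given the mean/std of historical travel times for that camera pair.
# This Gaussian form is an assumption, not specified by the patent.

def spatiotemporal_match_score(time_gap, mean_travel, std_travel):
    z = (time_gap - mean_travel) / std_travel
    return math.exp(-0.5 * z * z)   # unnormalised score in (0, 1]

plausible = spatiotemporal_match_score(time_gap=60, mean_travel=55, std_travel=10)
unlikely = spatiotemporal_match_score(time_gap=300, mean_travel=55, std_travel=10)
print(plausible > unlikely)  # True
```

Such a score could then be combined with the appearance-based joint probability when sorting the ranking table, as the paragraph above describes.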
According to the vehicle re-identification method based on multi-modal information fusion, a preset multi-branch monitoring mechanism network model is used to compare the global feature, the plurality of local features and the license plate characters of the vehicle respectively, realizing feature matching based on the fusion of global appearance features, local appearance features and license plate characters; compared with existing feature matching that relies only on a single feature or on license plate characters, this can effectively improve the accuracy of vehicle re-identification. Furthermore, the time and geographic position information corresponding to the vehicles is integrated to form a multi-modal fused probabilistic feature, which distinguishes inter-class and intra-class differences between vehicles well and can meet the application requirements of recognition under different illumination and different camera models in real scenes.
As a specific implementation of the method shown in fig. 1, an embodiment of the present invention provides a vehicle re-identification apparatus based on multi-modal information fusion. As shown in fig. 2, the apparatus comprises: an extraction module 21, used for extracting the vehicle image to be queried from the monitoring video;
the extraction module 21 is further configured to take the vehicle image to be queried as the input of a preset multi-branch monitoring mechanism network model, extract the global feature and a plurality of local features of the vehicle, and extract the license plate characters of the vehicle from the license plate region among the plurality of local features;
an acquisition module 22, configured to obtain, from a preset database, the similarity or joint matching probability between different vehicles and the global feature, the plurality of local features and the license plate characters of the vehicle, wherein the global features, local features and license plate characters corresponding to the different vehicles are stored in the preset database;
and a generating module 23, configured to generate a vehicle re-identification ranking table according to the similarity or joint matching probability between the different vehicles and the vehicle to be queried.
Further, the apparatus further comprises: a prediction module 24;
the extraction module 21 is further configured to extract spatiotemporal information of the vehicle image to be queried from the monitoring video;
the prediction module 24 is configured to predict the relative driving direction of the vehicle according to the joint probability of matching between different vehicles and the vehicle to be queried and the spatiotemporal information of the vehicle image to be queried;
the obtaining module 22 is further configured to obtain space-time matching probabilities between the vehicle and different vehicles according to topological relationships between different cameras, the relative driving directions of the vehicle, and a preset vehicle space-time transition model, where the preset vehicle space-time transition model is established according to historical vehicle driving space-time data respectively corresponding to different vehicles.
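The embodiment does not disclose the exact form of the preset vehicle space-time transfer model. One common choice in the re-identification literature, shown here purely as an assumed sketch, is a Gaussian over historical travel times between each connected camera pair in the topology; the names and the peak-normalised score are illustrative:

```python
import math


def spatiotemporal_match_prob(cam_a, cam_b, dt, transfer_model):
    """Score how plausible it is that one vehicle produced observations at
    cam_a and then cam_b separated by dt seconds.

    transfer_model maps a directed (cam_a, cam_b) edge of the camera
    topology to the (mean, std) of historical travel times on that edge,
    learned from historical vehicle driving spatio-temporal data.
    Camera pairs absent from the topology score 0.
    """
    if (cam_a, cam_b) not in transfer_model:
        return 0.0
    mu, sigma = transfer_model[(cam_a, cam_b)]
    # Gaussian density normalised by its peak, so the score lies in (0, 1]
    return math.exp(-0.5 * ((dt - mu) / sigma) ** 2)


model = {("cam1", "cam2"): (120.0, 30.0)}  # assumed: learned from history
p = spatiotemporal_match_prob("cam1", "cam2", 120.0, model)  # 1.0 at the mean
```

The relative driving direction predicted by the prediction module would select which directed edges of the topology are consulted.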
Further, the generating module 23 is specifically configured to generate a vehicle re-identification ranking table according to joint probabilities or similarities of matching between different vehicles and the vehicle to be queried and spatio-temporal matching probabilities between the vehicles and the different vehicles.
Further, the extraction module 21 is specifically configured to perform vehicle target detection on a to-be-queried vehicle picture obtained from the monitoring video and obtain a vehicle target image completely enclosed by the bounding box, and to remove redundant background information from the vehicle target image to obtain the vehicle image to be queried. For the embodiment of the invention, the vehicle category in the original surveillance image is detected and instance segmentation is performed, so that background-noise interference is removed while all the information of the vehicle is fully extracted, ensuring the clarity of the extracted image and further improving vehicle re-identification precision.
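The detect-then-strip-background step can be sketched as cropping to the detector's bounding box and, when an instance-segmentation mask is available, zeroing non-vehicle pixels first. This is an assumed minimal illustration (the patent does not specify the detector or mask format), with hypothetical names:

```python
import numpy as np


def crop_vehicle(frame, box, mask=None):
    """Extract the vehicle image to be queried from a surveillance frame.

    box  : (x1, y1, x2, y2) bounding box from the detector, fully
           enclosing the vehicle target.
    mask : optional H x W instance-segmentation mask (1 on the vehicle);
           pixels outside the vehicle are zeroed to drop background noise.
    """
    x1, y1, x2, y2 = box
    if mask is not None:
        frame = frame * mask[..., None]  # keep vehicle pixels only
    return frame[y1:y2, x1:x2]


frame = np.ones((100, 200, 3), dtype=np.uint8)   # stand-in surveillance frame
patch = crop_vehicle(frame, (10, 20, 60, 80))    # 60 x 50 vehicle crop
```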
Further, the global feature is an overall appearance feature of the vehicle, and the plurality of local features include a vehicle head appearance feature, a vehicle tail appearance feature and a vehicle license plate area appearance feature.
Further, the obtaining module 22 is specifically configured to construct a triplet loss function and calculate the feature distances between different vehicles and the vehicle to be queried according to the overall appearance features, vehicle head appearance features, vehicle tail appearance features, license plate region appearance features and license plate character features of the different vehicles and the vehicle to be queried, so as to obtain the similarities between the different vehicles and the vehicle to be queried.
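A minimal sketch of the two pieces just described, under the usual definitions: a triplet loss over embedding distances for training, and a fused distance turned into a similarity at query time. The distance-to-similarity mapping and all names are assumptions for illustration, not the patented formulation:

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Triplet loss: pull embeddings of the same vehicle together and push
    different vehicles at least `margin` apart in feature space."""
    d_pos = np.linalg.norm(anchor - positive)  # same-vehicle distance
    d_neg = np.linalg.norm(anchor - negative)  # different-vehicle distance
    return max(0.0, d_pos - d_neg + margin)


def similarity(feats_q, feats_c):
    """Fuse per-branch distances (overall, head, tail, plate region, plate
    characters) into one similarity score in (0, 1]."""
    d = sum(np.linalg.norm(feats_q[k] - feats_c[k]) for k in feats_q)
    return 1.0 / (1.0 + d)  # smaller fused distance -> higher similarity
```

With identical branch features the fused distance is 0 and the similarity is exactly 1.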
Further, the obtaining module 22 is specifically configured to calculate, based on a Bayesian probability model, according to the formula Pa = PF × θ × Ptp, where Pa is the joint probability that a candidate vehicle matches the vehicle to be queried, PF is the joint probability that the vehicle head appearance feature, vehicle tail appearance feature, license plate region appearance feature and overall vehicle appearance feature match between the candidate vehicle and the vehicle to be queried, Ptp is the license plate matching probability between the candidate vehicle and the vehicle to be queried, and θ is the confidence of license plate recognition.
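Taken literally, the stated formula is a straight product of the three factors; a one-line sketch (parameter names are illustrative, not from the patent):

```python
def joint_match_probability(p_appearance, plate_match_prob, plate_confidence):
    """Pa = PF * theta * Ptp: fuse the appearance joint probability PF with
    the license plate matching probability Ptp, weighted by the
    plate-recognition confidence theta."""
    return p_appearance * plate_confidence * plate_match_prob


pa = joint_match_probability(p_appearance=0.9,
                             plate_match_prob=0.8,
                             plate_confidence=0.95)
# pa ≈ 0.684 (= 0.9 × 0.95 × 0.8)
```

Note that under this product form a low plate-recognition confidence θ suppresses Pa even when the appearance match PF is strong, so in practice candidates without readable plates rely on the similarity-based ranking path instead.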
According to the vehicle re-identification device based on multi-modal information fusion, a preset multi-branch monitoring mechanism network model is used to compare the global feature, the local features and the license plate characters of the vehicle respectively, realizing feature matching based on the fusion of global appearance features, local appearance features and license plate characters; compared with existing methods that match only a single feature or only the license plate characters, this can effectively improve the accuracy of vehicle re-identification. Furthermore, the time and geographic position information corresponding to the vehicles is integrated in this manner to form a multi-modal fused probability feature, which distinguishes inter-class and intra-class differences between vehicles well and can meet the application requirements of recognition under different illumination and different camera models in real scenes.
It should be understood that the specific order or hierarchy of steps in the processes disclosed is an example of exemplary approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the processes may be rearranged without departing from the scope of the present disclosure. The accompanying method claims present elements of the various steps in a sample order, and are not intended to be limited to the specific order or hierarchy presented.
In the foregoing detailed description, various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments of the subject matter require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus, the following claims are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate preferred embodiment of the invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
What has been described above includes examples of one or more embodiments. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the aforementioned embodiments, but one of ordinary skill in the art may recognize that many further combinations and permutations of various embodiments are possible. Accordingly, the embodiments described herein are intended to embrace all such alterations, modifications and variations that fall within the scope of the appended claims. Furthermore, to the extent that the term "includes" is used in either the detailed description or the claims, such term is intended to be inclusive in a manner similar to the term "comprising" as "comprising" is interpreted when employed as a transitional word in a claim. Furthermore, any use of the term "or" in the specification of the claims is intended to mean a "non-exclusive or".
Those of skill in the art will further appreciate that the various illustrative logical blocks, units, and steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate the interchangeability of hardware and software, various illustrative components, elements, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design requirements of the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present embodiments.
The various illustrative logical blocks, or elements, described in connection with the embodiments disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor, an Application Specific Integrated Circuit (ASIC), a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a digital signal processor and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a digital signal processor core, or any other similar configuration.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC, which may be located in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
In one or more exemplary designs, the functions described above in connection with the embodiments of the invention may be implemented in hardware, software, firmware, or any combination of the three. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media include both computer storage media and communication media that facilitate transfer of a computer program from one place to another. Storage media may be any available media that can be accessed by a general-purpose or special-purpose computer. For example, such computer-readable media can include, but are not limited to, RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store program code in the form of instructions or data structures and which can be read by a general-purpose or special-purpose computer or processor. In addition, any connection is properly termed a computer-readable medium; thus, if the software is transmitted from a website, server, or other remote source via coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, those media are included in the definition of computer-readable medium. Disk and disc, as used herein, include compact disc, laser disc, optical disc, DVD, floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above may also be included within the scope of computer-readable media.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (14)

1. A vehicle re-recognition method based on multi-modal information fusion, characterized in that the method comprises:
extracting a vehicle image to be inquired from the monitoring video;
the vehicle image to be inquired is used as the input of a preset multi-branch monitoring mechanism network model, the global characteristic and the plurality of local characteristics of the vehicle are extracted, and the license plate characters of the vehicle are extracted according to the license plate area in the plurality of local characteristics;
acquiring, from a preset database, the similarity or the matching joint probability between different vehicles and the global feature, the plurality of local features and the license plate characters of the vehicle to be queried, wherein the global features, the local features and the license plate characters corresponding to the different vehicles are stored in the preset database;
and generating a vehicle re-identification sorting table according to the similarity or the matching joint probability between different vehicles and the vehicle to be inquired.
2. The method of claim 1, further comprising:
extracting the spatiotemporal information of the vehicle image to be inquired from the monitoring video;
predicting the relative driving direction of the vehicle according to the joint probability of matching between different vehicles and the vehicle to be inquired and the spatio-temporal information of the vehicle image to be inquired;
and obtaining the space-time matching probability between the vehicle and different vehicles according to the topological relation among different cameras, the relative driving direction of the vehicle and a preset vehicle space-time transfer model, wherein the preset vehicle space-time transfer model is established according to historical vehicle driving space-time data respectively corresponding to different vehicles.
3. The method for re-identifying the vehicle based on the multi-modal information fusion as claimed in claim 2, wherein the step of generating the vehicle re-identification ranking table according to the joint probability of similarity or matching between different vehicles and the vehicle to be queried comprises:
and generating a vehicle re-identification sorting table according to the joint probability or similarity of matching between different vehicles and the vehicle to be inquired and the space-time matching probability between the vehicle and the different vehicles.
4. The vehicle re-identification method based on multi-modal information fusion as claimed in claim 1, wherein the step of extracting the image of the vehicle to be queried from the surveillance video comprises:
detecting a vehicle target on a to-be-inquired vehicle picture acquired from the monitoring video, and acquiring a vehicle target image completely marked by the boundary frame;
and removing redundant background information in the vehicle target image to obtain the vehicle image to be inquired.
5. The method according to any one of claims 1 to 4, wherein the global feature is an overall appearance feature of the vehicle, and the local features include a vehicle head appearance feature, a vehicle tail appearance feature and a vehicle license plate region appearance feature.
6. The method as claimed in claim 5, wherein the step of obtaining the similarity between the global feature, the local features and the license plate characters of different vehicles and the vehicle from a preset database comprises:
constructing a triplet loss function and calculating the feature distances between different vehicles and the vehicle to be queried according to the overall appearance features, vehicle head appearance features, vehicle tail appearance features, license plate region appearance features and license plate character features of the different vehicles and the vehicle to be queried, so as to obtain the similarities between the different vehicles and the vehicle to be queried.
7. The method according to claim 5, wherein the step of obtaining the joint probability of matching different vehicles with the vehicle comprises:
calculating, based on a Bayesian probability model, according to the formula Pa = PF × θ × Ptp, wherein Pa is the joint probability of matching between the candidate vehicle and the vehicle to be queried; PF is the joint probability that the vehicle head appearance feature, the vehicle tail appearance feature, the license plate region appearance feature and the overall vehicle appearance feature match between the candidate vehicle and the vehicle to be queried; Ptp is the license plate matching probability between the candidate vehicle and the vehicle to be queried; and θ is the confidence of license plate recognition.
8. A vehicle re-recognition apparatus based on multimodal information fusion, the apparatus comprising:
the extraction module is used for extracting the vehicle image to be inquired from the monitoring video;
the extraction module is further used for taking the vehicle image to be inquired as the input of a preset multi-branch monitoring mechanism network model, extracting the global characteristic and the plurality of local characteristics of the vehicle, and extracting the license plate characters of the vehicle according to the license plate region in the plurality of local characteristics;
an acquisition module, configured to acquire, from a preset database, the similarity or the matching joint probability between different vehicles and the global feature, the plurality of local features and the license plate characters of the vehicle to be queried, wherein the global features, the local features and the license plate characters corresponding to the different vehicles are stored in the preset database;
and the generating module is used for generating a vehicle re-identification sorting table according to the similarity or the matching joint probability between different vehicles and the vehicle to be inquired.
9. The device for vehicle re-recognition based on multi-modal information fusion as claimed in claim 8, further comprising: a prediction module;
the extraction module is also used for extracting the spatiotemporal information of the vehicle image to be inquired from the monitoring video;
the prediction module is used for predicting the relative driving direction of the vehicle according to the joint probability of matching between different vehicles and the vehicle to be inquired and the space-time information of the vehicle image to be inquired;
the acquisition module is further used for obtaining space-time matching probabilities between the vehicles and the different vehicles according to topological relations among the different cameras, relative driving directions of the vehicles and a preset vehicle space-time transfer model, wherein the preset vehicle space-time transfer model is established according to historical vehicle driving space-time data respectively corresponding to the different vehicles.
10. The device for re-recognizing vehicles based on multi-modal fusion of information as claimed in claim 9,
the generating module is specifically used for generating a vehicle re-identification sequencing table according to the joint probability or similarity of matching between different vehicles and the vehicle to be inquired and the space-time matching probability between the vehicle and different vehicles.
11. The device for re-recognizing vehicles based on multi-modal fusion of information as claimed in claim 8,
the extraction module is specifically used for detecting a vehicle target on a to-be-inquired vehicle picture acquired from a monitoring video and acquiring a vehicle target image completely marked by a boundary frame; and removing redundant background information in the vehicle target image to obtain the vehicle image to be inquired.
12. The device according to any one of claims 8 to 11, wherein the global feature is an overall appearance feature of the vehicle, and the local features include a vehicle head appearance feature, a vehicle tail appearance feature, and a vehicle license plate region appearance feature.
13. The device for re-recognizing vehicles based on multi-modal fusion of information as claimed in claim 12,
the acquisition module is specifically configured to construct a triplet loss function and calculate the feature distances between different vehicles and the vehicle to be queried according to the overall appearance features, vehicle head appearance features, vehicle tail appearance features, license plate region appearance features and license plate character features of the different vehicles and the vehicle to be queried, so as to obtain the similarity between the different vehicles and the vehicle to be queried.
14. The device for re-recognizing vehicles based on multi-modal fusion of information as claimed in claim 12,
the obtaining module is specifically configured to calculate, based on a Bayesian probability model, according to the formula Pa = PF × θ × Ptp, wherein Pa is the joint probability of matching between the candidate vehicle and the vehicle to be queried; PF is the joint probability that the vehicle head appearance feature, the vehicle tail appearance feature, the license plate region appearance feature and the overall vehicle appearance feature match between the candidate vehicle and the vehicle to be queried; Ptp is the license plate matching probability between the candidate vehicle and the vehicle to be queried; and θ is the confidence of license plate recognition.
CN202010769769.5A 2020-08-05 2020-08-05 Vehicle re-identification method and device based on multi-mode information fusion Pending CN111931627A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010769769.5A CN111931627A (en) 2020-08-05 2020-08-05 Vehicle re-identification method and device based on multi-mode information fusion
PCT/CN2020/131777 WO2022027873A1 (en) 2020-08-05 2020-11-26 Vehicle reidentification method and device based on multimodal information fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010769769.5A CN111931627A (en) 2020-08-05 2020-08-05 Vehicle re-identification method and device based on multi-mode information fusion

Publications (1)

Publication Number Publication Date
CN111931627A true CN111931627A (en) 2020-11-13

Family

ID=73307698

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010769769.5A Pending CN111931627A (en) 2020-08-05 2020-08-05 Vehicle re-identification method and device based on multi-mode information fusion

Country Status (2)

Country Link
CN (1) CN111931627A (en)
WO (1) WO2022027873A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112949751A (en) * 2021-03-25 2021-06-11 深圳市商汤科技有限公司 Vehicle image clustering and track restoring method
CN112990217A (en) * 2021-03-24 2021-06-18 北京百度网讯科技有限公司 Image recognition method and device for vehicle, electronic equipment and medium
CN113128441A (en) * 2021-04-28 2021-07-16 安徽大学 System and method for identifying vehicle weight by embedding structure of attribute and state guidance
CN113408472A (en) * 2021-07-06 2021-09-17 京东数科海益信息科技有限公司 Training method of target re-recognition model, target re-recognition method and device
CN113569912A (en) * 2021-06-28 2021-10-29 北京百度网讯科技有限公司 Vehicle identification method and device, electronic equipment and storage medium
CN113569911A (en) * 2021-06-28 2021-10-29 北京百度网讯科技有限公司 Vehicle identification method and device, electronic equipment and storage medium
WO2022027873A1 (en) * 2020-08-05 2022-02-10 智慧互通科技有限公司 Vehicle reidentification method and device based on multimodal information fusion
CN116311214A (en) * 2023-05-22 2023-06-23 珠海亿智电子科技有限公司 License plate recognition method and device

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113486886B (en) * 2021-06-21 2023-06-23 华侨大学 License plate recognition method and device in natural scene
CN114530056B (en) * 2022-02-15 2023-05-02 超级视线科技有限公司 Parking management method and system based on positioning information and image information
CN115034257B (en) * 2022-05-09 2023-04-07 西北工业大学 Cross-modal information target identification method and device based on feature fusion
CN114926990B (en) * 2022-05-30 2023-06-09 深圳市双盈电子科技有限公司 Intelligent barrier gate license plate recognition equipment
CN116110222A (en) * 2022-11-29 2023-05-12 东风商用车有限公司 Vehicle application scene analysis method based on big data
CN116403171B (en) * 2023-06-08 2023-09-01 松立控股集团股份有限公司 Vehicle re-identification method, system and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107729818A (en) * 2017-09-21 2018-02-23 北京航空航天大学 A kind of multiple features fusion vehicle recognition methods again based on deep learning
US20180060684A1 (en) * 2016-08-31 2018-03-01 Beijing University Of Posts And Telecommunications Progressive vehicle searching method and device
CN110795580A (en) * 2019-10-23 2020-02-14 武汉理工大学 Vehicle weight recognition method based on space-time constraint model optimization

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108681707A (en) * 2018-05-15 2018-10-19 桂林电子科技大学 Wide-angle model recognizing method and system based on global and local Fusion Features
CN111931627A (en) * 2020-08-05 2020-11-13 智慧互通科技有限公司 Vehicle re-identification method and device based on multi-mode information fusion

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022027873A1 (en) * 2020-08-05 2022-02-10 智慧互通科技有限公司 Vehicle reidentification method and device based on multimodal information fusion
CN112990217A (en) * 2021-03-24 2021-06-18 北京百度网讯科技有限公司 Image recognition method and device for vehicle, electronic equipment and medium
CN112949751A (en) * 2021-03-25 2021-06-11 深圳市商汤科技有限公司 Vehicle image clustering and track restoring method
CN113128441A (en) * 2021-04-28 2021-07-16 安徽大学 System and method for identifying vehicle weight by embedding structure of attribute and state guidance
CN113128441B (en) * 2021-04-28 2022-10-14 安徽大学 System and method for identifying vehicle weight by embedding structure of attribute and state guidance
CN113569912A (en) * 2021-06-28 2021-10-29 北京百度网讯科技有限公司 Vehicle identification method and device, electronic equipment and storage medium
CN113569911A (en) * 2021-06-28 2021-10-29 北京百度网讯科技有限公司 Vehicle identification method and device, electronic equipment and storage medium
CN113408472A (en) * 2021-07-06 2021-09-17 京东数科海益信息科技有限公司 Training method of target re-recognition model, target re-recognition method and device
CN113408472B (en) * 2021-07-06 2023-09-26 京东科技信息技术有限公司 Training method of target re-identification model, target re-identification method and device
CN116311214A (en) * 2023-05-22 2023-06-23 珠海亿智电子科技有限公司 License plate recognition method and device
CN116311214B (en) * 2023-05-22 2023-08-22 珠海亿智电子科技有限公司 License plate recognition method and device

Also Published As

Publication number Publication date
WO2022027873A1 (en) 2022-02-10

Similar Documents

Publication Publication Date Title
CN111931627A (en) Vehicle re-identification method and device based on multi-mode information fusion
EP3806064B1 (en) Method and apparatus for detecting parking space usage condition, electronic device, and storage medium
Soheilian et al. Detection and 3D reconstruction of traffic signs from multiple view color images
CN109241349B (en) Monitoring video multi-target classification retrieval method and system based on deep learning
CN110119726B (en) Vehicle brand multi-angle identification method based on YOLOv3 model
CN110619279B (en) Road traffic sign instance segmentation method based on tracking
CN115717894B (en) Vehicle high-precision positioning method based on GPS and common navigation map
CN110348463B (en) Method and device for identifying vehicle
CN110688902B (en) Method and device for detecting vehicle area in parking space
CN111435421B (en) Traffic-target-oriented vehicle re-identification method and device
CN110647886A (en) Interest point marking method and device, computer equipment and storage medium
CN109657599B (en) Picture identification method of distance-adaptive vehicle appearance part
CN114663852A (en) Method and device for constructing lane line graph, electronic equipment and readable storage medium
CN111383286B (en) Positioning method, positioning device, electronic equipment and readable storage medium
CN112634368A (en) Method and device for generating space and OR graph model of scene target and electronic equipment
Xiao et al. Geo-spatial aerial video processing for scene understanding and object tracking
CN112836699A (en) Long-time multi-target tracking-based berth entrance and exit event analysis method
CN110399828B (en) Vehicle re-identification method based on multi-angle deep convolutional neural network
CN112699711A (en) Lane line detection method, lane line detection device, storage medium, and electronic apparatus
CN116704490B (en) License plate recognition method, license plate recognition device and computer equipment
Jain et al. Number plate detection using drone surveillance
CN113408514A (en) Method and device for detecting roadside parking lot berth based on deep learning
CN114220074A (en) Method and system for identifying abnormal behavior state of vehicle
CN113392837A (en) License plate recognition method and device based on deep learning
CN112069971A (en) Video-based highway sign line identification method and identification system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 075000 ten building, phase 1, airport economic and Technological Development Zone, Zhangjiakou, Hebei

Applicant after: Smart intercommunication Technology Co.,Ltd.

Address before: 075000 ten building, phase 1, airport economic and Technological Development Zone, Zhangjiakou, Hebei

Applicant before: INTELLIGENT INTERCONNECTION TECHNOLOGIES Co.,Ltd.