CN112069969B - Expressway monitoring video cross-mirror vehicle tracking method and system - Google Patents


Info

Publication number
CN112069969B
CN112069969B (application CN202010897531.0A)
Authority
CN
China
Prior art keywords
vehicle
image
target
tracking
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010897531.0A
Other languages
Chinese (zh)
Other versions
CN112069969A (en)
Inventor
李春杰
赵建东
韩明敏
郭玉彬
侯晓青
严华
高海涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jiaotong University
Hebei Communications Planning Design and Research Institute Co Ltd
Original Assignee
Beijing Jiaotong University
Hebei Communications Planning Design and Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jiaotong University, Hebei Communications Planning Design and Research Institute Co Ltd filed Critical Beijing Jiaotong University
Priority to CN202010897531.0A priority Critical patent/CN112069969B/en
Publication of CN112069969A publication Critical patent/CN112069969A/en
Application granted granted Critical
Publication of CN112069969B publication Critical patent/CN112069969B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/50: Context or environment of the image
    • G06V 20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/40: Scenes; Scene-specific elements in video content
    • G06V 20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/40: Scenes; Scene-specific elements in video content
    • G06V 20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method and a system for cross-mirror vehicle tracking in expressway monitoring video, belongs to the technical field of computer vision, and solves the problems that existing cross-mirror vehicle tracking methods are difficult to implement in actual scenes and have low algorithm applicability. Frame images are acquired from the video files of a plurality of cameras of the expressway to be monitored, and vehicle detection is carried out on each frame image based on an improved YOLO target detection model to obtain vehicle detection images containing complete vehicle rectangular frames; the vehicle detection images are input into a multi-target tracking model to obtain vehicle tracking results; a vehicle information database is established according to the vehicle detection images and the vehicle tracking results; a target vehicle image is cropped from a certain vehicle detection image corresponding to any camera number in the vehicle information database, and the motion trail of the target vehicle corresponding to the target vehicle image is matched according to the vehicle information database, so that cross-mirror tracking is realized, the implementation difficulty of the tracking method is reduced, and its applicability is improved.

Description

Expressway monitoring video cross-mirror vehicle tracking method and system
Technical Field
The invention relates to the technical field of computer vision images, in particular to a method and a system for tracking a highway monitoring video cross-mirror vehicle.
Background
At present, expressway monitoring systems are increasingly complete and cameras are ever more densely deployed, yet expressways remain the roads where accidents are most likely to occur. The reasons are that speeds on the expressway are high, vehicles are numerous, and the role any single monitoring camera can play is limited.
The existing cross-mirror vehicle re-identification technology is based on annotated data sets. Although vehicle re-identification has developed further with the update and release of high-quality data sets, the scenes covered by these data sets are limited, so the algorithm models have strong limitations: the algorithms are difficult to implement in actual scenes and have low applicability.
Disclosure of Invention
In view of the above analysis, the embodiment of the invention aims to provide a method and a system for tracking a highway monitoring video cross-mirror vehicle, which are used for solving the problems that the conventional cross-mirror vehicle tracking method is difficult to implement in an actual scene and the algorithm has low applicability.
In one aspect, an embodiment of the present invention provides a method for tracking a highway monitoring video cross-mirror vehicle, including the following steps:
acquiring frame images in video files of a plurality of cameras of a highway to be monitored, and carrying out vehicle detection on each frame image based on an improved YOLO target detection model to obtain a vehicle detection image containing a complete vehicle rectangular frame;
inputting the vehicle detection image into a multi-target tracking model to obtain a vehicle tracking result; the vehicle tracking result comprises a vehicle ID and a vehicle track;
establishing a vehicle information database according to the vehicle detection image and the vehicle tracking result;
intercepting a target vehicle image based on a certain vehicle detection image corresponding to any camera number in the vehicle information database, and matching a motion track of a target vehicle corresponding to the target vehicle image according to the vehicle information database to realize cross-mirror tracking.
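As an illustrative sketch (not part of the claims), the four steps above can be expressed as a minimal pipeline; `detect`, `track`, and the tuple-keyed dictionary are hypothetical stand-ins for the improved YOLO target detection model, the multi-target tracking model, and the vehicle information database:

```python
# Minimal sketch of the claimed pipeline. detect() and track() are
# hypothetical stand-ins for the improved YOLO target detection model
# and the multi-target tracking model described in the claims.

def build_vehicle_database(camera_frames, detect, track):
    """Store each vehicle trajectory keyed by (camera number, vehicle ID)."""
    database = {}
    for camera_no, frames in camera_frames.items():
        detections = [detect(frame) for frame in frames]    # step 1
        for vehicle_id, trajectory in track(detections):    # step 2
            database[(camera_no, vehicle_id)] = trajectory  # step 3
    return database

# Step 4 (cross-mirror tracking) would then query this database with a
# cropped target vehicle image via the re-identification network.
```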
Further, the improved YOLO target detection model includes a feature extraction network layer and a YOLO detection layer; wherein the feature extraction network layer comprises a stem unit and an OSA unit;
a stem unit, configured to downsample frame images in the video files of the plurality of cameras on the highway to obtain an image with a size of 304×304×128;
an OSA unit configured to convolve an image with an input size of 304×304×128 to obtain an image with a size of 19×19×512;
and the YOLO detection layer is used for obtaining a vehicle detection image containing a complete vehicle rectangular frame according to the image with the size of 19 x 512 output by the OSA unit.
Further, the multi-target tracking model includes a motion prediction unit and a depth appearance feature extraction unit:
the motion prediction unit is used for predicting and obtaining a current frame of vehicle prediction image according to a previous frame of vehicle detection image;
the depth appearance feature extraction unit comprises a re-identification network, obtains a vehicle track based on a vehicle detection image and a vehicle prediction image input into the re-identification network, and numbers the vehicle track to obtain a vehicle ID corresponding to the vehicle track.
Further, the building a vehicle information database according to the vehicle tracking result specifically includes: and storing the vehicle detection image and the vehicle track into a database based on the camera number and the vehicle ID to obtain a vehicle information database.
Further, intercepting a target vehicle image based on a certain vehicle detection image corresponding to any camera number in the vehicle information database, and matching a motion track of a target vehicle corresponding to the target vehicle image according to the vehicle information database to realize cross-mirror tracking, wherein the method comprises the following steps:
acquiring a certain vehicle detection image corresponding to any camera number in the vehicle information database, and intercepting a target vehicle image;
based on the target vehicle image and a certain vehicle detection image corresponding to the other camera numbers, respectively obtaining a depth feature matrix of the target vehicle image and a depth feature matrix of a certain vehicle detection image corresponding to the other camera numbers;
based on the depth feature matrix of the target vehicle image and the depth feature matrix of a certain vehicle detection image corresponding to the other camera numbers, acquiring, by using the re-identification network, the cosine similarity distances between the target vehicle and the vehicles in the certain vehicle detection images corresponding to the other camera numbers;
classifying and sequencing the cosine similarity distances according to the camera numbers to obtain the minimum cosine similarity distance corresponding to the same camera number;
judging whether the minimum cosine similarity distance is smaller than a similarity threshold value, if so, taking the vehicle in the corresponding vehicle detection image as a target vehicle, and if not, judging that the target vehicle drives away from the expressway to be monitored;
and matching the camera numbers and the vehicle IDs corresponding to the vehicle detection images with the vehicle information database to obtain the motion trail of the target vehicle, so as to realize cross-mirror tracking.
On the other hand, the embodiment of the invention provides a highway monitoring video cross-mirror vehicle tracking system, which comprises the following components:
the detection module is used for acquiring frame images in the video files of the cameras of the expressway to be monitored, and carrying out vehicle detection on each frame image based on the improved YOLO target detection model to obtain a vehicle detection image containing a complete vehicle rectangular frame;
the tracking module is used for inputting the vehicle detection image into a multi-target tracking model to obtain a vehicle tracking result; the vehicle tracking result comprises a vehicle ID and a vehicle track;
the vehicle information database obtaining module is used for establishing a vehicle information database according to the vehicle detection image and the vehicle tracking result;
and the motion track obtaining module is used for intercepting a target vehicle image according to a certain vehicle detection image corresponding to any camera number in the vehicle information database, and matching the motion track of the target vehicle corresponding to the target vehicle image according to the vehicle information database so as to realize cross-mirror tracking.
Further, the detection module comprises a feature extraction network layer and a YOLO detection layer; wherein the feature extraction network layer comprises a stem unit and an OSA unit;
a stem unit, configured to downsample frame images in the video files of the plurality of cameras on the highway to obtain an image with a size of 304×304×128;
an OSA unit configured to convolve an image with an input size of 304×304×128 to obtain an image with a size of 19×19×512;
and the YOLO detection layer is used for obtaining a vehicle detection image containing a complete vehicle rectangular frame according to the image with the size of 19 x 512 output by the OSA unit.
Further, the tracking module includes a motion prediction unit and a depth appearance feature extraction unit:
the motion prediction unit is used for predicting and obtaining a current frame of vehicle prediction image according to a previous frame of vehicle detection image;
the depth appearance feature extraction unit comprises a re-identification network, obtains a vehicle track based on a vehicle detection image and a vehicle prediction image input into the re-identification network, and numbers the vehicle track to obtain a vehicle ID corresponding to the vehicle track.
Further, the vehicle information database obtaining module stores the vehicle detection image and the vehicle track into a database according to the camera numbers and the vehicle IDs to obtain a vehicle information database.
Further, the motion trail obtaining module executes the following procedures:
acquiring a certain vehicle detection image corresponding to any camera number in the vehicle information database, and intercepting a target vehicle image;
based on the target vehicle image and a certain vehicle detection image corresponding to the other camera numbers, respectively obtaining a depth feature matrix of the target vehicle image and a depth feature matrix of a certain vehicle detection image corresponding to the other camera numbers;
based on the depth feature matrix of the target vehicle image and the depth feature matrix of a certain vehicle detection image corresponding to the other camera numbers, acquiring, by using the re-identification network, the cosine similarity distances between the target vehicle and the vehicles in the certain vehicle detection images corresponding to the other camera numbers;
classifying and sequencing the cosine similarity distances according to the camera numbers to obtain the minimum cosine similarity distance corresponding to the same camera number;
judging whether the minimum cosine similarity distance is smaller than a similarity threshold value, if so, taking the vehicle in the corresponding vehicle detection image as a target vehicle, and if not, judging that the target vehicle drives away from the expressway to be monitored;
and matching the camera numbers and the vehicle IDs corresponding to the vehicle detection images with the vehicle information database to obtain the motion trail of the target vehicle, so as to realize cross-mirror tracking.
Compared with the prior art, the invention has at least one of the following beneficial effects:
1. In the expressway monitoring video cross-mirror vehicle tracking method, vehicles are detected by the improved YOLO target detection model to obtain vehicle detection images, tracked by the multi-target tracking model to obtain vehicle IDs and vehicle tracks, and finally matched by cosine similarity computed with the re-identification network; the matched segments are then spliced into the complete motion trail of the target vehicle. This provides the expressway management department with an efficient, fast and high-precision video analysis technology for safety monitoring and target vehicle search.
2. The backbone network of YOLOv3 is replaced with VoVNet, a variant of DenseNet with better learning capability, to obtain the feature extraction network layer; meanwhile, in view of the size of vehicles captured in expressway monitoring video images, the three detection scales (large, medium and small) of the YOLOv3 detection layer are reduced to two (medium and small). The resulting improved YOLO target detection model is smaller and faster, the amount of computation is reduced, and vehicle detection images containing complete vehicle rectangular frames can be obtained more quickly.
3. The vehicle information database can be obtained by storing the vehicle detection image and the vehicle track into the database according to the naming rule of the camera number-vehicle ID, and data support and basis are provided for the re-identification of the vehicle track and the complete splicing of the target vehicle track.
4. The depth appearance feature extraction unit is used to crop a target vehicle image from a certain vehicle detection image corresponding to any camera in the vehicle information database, and the cosine similarities between the target vehicle image and the plurality of vehicles in certain vehicle detection images of the other cameras are calculated. The target vehicle is matched against those vehicles through the cosine similarity, and the motion trail of the target vehicle is then obtained by splicing, realizing cross-mirror tracking with high matching efficiency and high matching precision.
In the invention, the technical schemes can be mutually combined to realize more preferable combination schemes. Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
The drawings are only for purposes of illustrating particular embodiments and are not to be construed as limiting the invention, like reference numerals being used to refer to like parts throughout the several views.
FIG. 1 is an overall block diagram of a highway surveillance video cross-mirror vehicle tracking method;
FIG. 2 is a schematic flow chart of a method for tracking a highway monitoring video cross-mirror vehicle;
FIG. 3 is a schematic diagram of a modified YOLO target detection model;
FIG. 4 is a schematic structural diagram of a DeepSort multi-target tracking model;
FIG. 5 is a schematic diagram of a highway monitoring video cross-mirror vehicle tracking system;
reference numerals:
100-detection module, 200-tracking module, 300-vehicle information database obtaining module, 400-motion track obtaining module.
Detailed Description
Preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings, which form a part hereof, and together with the description serve to explain the principles of the invention, and are not intended to limit the scope of the invention.
Cross-mirror tracking refers to tracking the motion trail of a target vehicle based on video files shot by a plurality of cameras, and splicing the motion trails of the target vehicle corresponding to each camera to obtain its complete motion trail. The existing cross-mirror vehicle re-identification technology is based on annotated data sets; although vehicle re-identification has developed further with the update and release of high-quality data sets, the scenes covered by these data sets are limited, the algorithm models have strong limitations, and the algorithms are difficult to implement in actual scenes and have low applicability. Therefore, the application provides a method and a system for cross-mirror vehicle tracking in expressway monitoring video. As shown in fig. 1, each frame image in the video files of a plurality of cameras of the expressway to be monitored undergoes vehicle detection through an improved YOLO target detection model to obtain vehicle detection images; the vehicle detection images are input into a multi-target tracking model to obtain vehicle tracks and the vehicle IDs corresponding to the vehicle tracks, a vehicle ID being the number of a vehicle track; the vehicle detection images and vehicle tracks are stored into a database according to camera number and vehicle ID; finally, the motion trail of the target vehicle corresponding to a target vehicle image is matched according to the vehicle information database, realizing cross-mirror tracking.
According to the method, the vehicle track information corresponding to the video shot by each monitoring camera is extracted, and the tracks of the target vehicle are connected across cameras to obtain its complete motion trail on the expressway to be monitored, realizing cross-mirror vehicle tracking. This solves the problems that the conventional cross-mirror vehicle tracking method is difficult to implement in an actual scene and the algorithm has low applicability, and provides the expressway management department with an efficient, fast and high-precision video analysis technology for safety monitoring and target vehicle search, so the method has high practical value.
In one embodiment of the invention, a method for tracking a highway monitoring video cross-mirror vehicle is disclosed, as shown in fig. 2. The method comprises the following steps:
step S1, frame images in video files of a plurality of cameras of a highway to be monitored are obtained, and vehicle detection is carried out on each frame image based on an improved YOLO target detection model to obtain a vehicle detection image containing a complete vehicle rectangular frame. Considering that the characteristics of the expressway monitoring video image comprise that the expressway monitoring camera is about 10 meters away from the ground, the definition of a shot image is poor, the vehicle target is small, the angle is mostly inclined downwards, the vehicle target is mostly overlook lateral vehicle body, the number of obstacles in the image is less, and the like, the method and the device utilize an improved YOLO target detection model to respectively detect the vehicle of each frame image in the acquired video files of the plurality of cameras of the expressway to be monitored so as to obtain the vehicle detection image containing the complete rectangular frame of the vehicle. The improved YOLO target detection model replaces a backbone network in original YOLO v3 with a variant VoVNet (feature extraction network layer) of Densenet with better learning ability, is smaller in size and higher in speed, and meanwhile, the three large, middle and small scales of the YOLO 3 detection layer are reduced to two small scales aiming at the size of a highway monitoring video image shooting vehicle, so that the operation amount is further reduced.
Preferably, the improved YOLO target detection model comprises a feature extraction network layer and a YOLO detection layer; the feature extraction network layer comprises a stem unit and an OSA unit. The stem unit is used for downsampling frame images in the video files of the plurality of cameras of the expressway to obtain images with a size of 304×304×128; the OSA unit is configured to convolve an image with an input size of 304×304×128 to obtain an image with a size of 19×19×512; and the YOLO detection layer is used for obtaining a vehicle detection image containing a complete vehicle rectangular frame according to the 19×19×512 image output by the OSA unit, wherein the vehicle detection image contains the images of a plurality of vehicles framed by complete vehicle rectangular frames.
Specifically, as shown in FIG. 3, the improved YOLO target detection model includes a feature extraction network layer and a YOLO detection layer. The feature extraction network layer comprises a stem unit and an OSA unit; the stem unit downsamples frame images in the video files of the plurality of cameras of the expressway to obtain images with a size of 304×304×128. The OSA unit comprises four OSA subunits: the first OSA subunit convolves the 304×304×128 image output by the stem unit to obtain an image of size 152×152×128; the second OSA subunit convolves the 152×152×128 image output by the first OSA subunit to obtain an image of size 76×76×256; the third OSA subunit convolves the 76×76×256 image output by the second OSA subunit to obtain an image of size 38×38×384; and the fourth OSA subunit convolves the 38×38×384 image output by the third OSA subunit to obtain an image of size 19×19×512. Meanwhile, the YOLO detection layer in the improved YOLO target detection model reduces the three detection scales (large, medium and small) of the original YOLOv3 detection layer to two (medium and small); its function is to obtain a vehicle detection image containing a complete vehicle rectangular frame according to the 19×19×512 image output by the fourth OSA subunit.
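The spatial sizes quoted above follow a simple halving pattern. As a sketch, the shape chain can be computed as follows; the 608×608 network input is an assumption inferred from the stated 304×304 stem output (the patent does not state the input size):

```python
# Feature-map shape chain of the VoVNet-style backbone described above.
# Each unit halves the spatial resolution; channel counts are taken
# from the text. The 608x608 input size is an assumption.

def backbone_shapes(input_size=608):
    stages = [("stem", 128), ("OSA1", 128), ("OSA2", 256),
              ("OSA3", 384), ("OSA4", 512)]
    size, shapes = input_size, []
    for name, channels in stages:
        size //= 2  # every unit downsamples by a factor of two
        shapes.append((name, size, size, channels))
    return shapes
```

This reproduces the chain 304×304×128, 152×152×128, 76×76×256, 38×38×384, 19×19×512 described in the text.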
Based on the improved YOLO target detection model, the model is trained on a road monitoring data set using hyperparameter search: for each new generation of hyperparameters, the previous generation with the highest fitness (among all previous generations) is selected for mutation, and all parameters are mutated simultaneously according to a normal distribution with a 1-sigma of about 20%, so as to obtain a suitable learning rate, the weights of the loss-function components and other hyperparameters. Multi-scale training is performed, and finally the hyperparameters of the generation with the best network performance are saved as the hyperparameters for formal training.
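The evolution step described above can be sketched as follows; the particular hyperparameter names and fitness values are illustrative only, not taken from the patent:

```python
import random

# Sketch of one generation of hyperparameter evolution: pick the fittest
# previous generation and mutate every parameter with a multiplicative
# normal factor whose 1-sigma is about 20%, as described above.

def mutate_hyperparameters(history, sigma=0.2, rng=random):
    """history: list of (fitness, {name: value}) pairs from past generations."""
    _, best = max(history, key=lambda pair: pair[0])
    return {name: value * max(rng.gauss(1.0, sigma), 1e-6)
            for name, value in best.items()}
```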
The backbone network of YOLOv3 is replaced with VoVNet, a variant of DenseNet with better learning capability, to obtain the feature extraction network layer; meanwhile, in view of the size of vehicles captured in expressway monitoring video images, the three detection scales (large, medium and small) of the YOLOv3 detection layer are reduced to two (medium and small). The resulting improved YOLO target detection model is smaller and faster, the amount of computation is reduced, and vehicle detection images containing complete vehicle rectangular frames can be obtained more quickly.
S2, inputting a vehicle detection image into a multi-target tracking model to obtain a vehicle tracking result; the vehicle tracking results include the vehicle ID and the vehicle trajectory. Preferably, the multi-target tracking model includes a motion prediction unit and a depth appearance feature extraction unit: the motion prediction unit is used for predicting and obtaining a current frame of vehicle prediction image according to the previous frame of vehicle detection image; the depth appearance feature extraction unit comprises a re-identification network, obtains a vehicle track based on a vehicle detection image and a vehicle prediction image input into the re-identification network, and numbers the vehicle track to obtain a vehicle ID corresponding to the vehicle track.
Specifically, as shown in fig. 4, the DeepSort multi-target tracking model includes a motion prediction unit and a depth appearance feature extraction unit. The motion prediction unit predicts a current-frame vehicle prediction image according to the previous-frame vehicle detection image. The depth appearance feature extraction unit calculates the depth feature information of the vehicle detection image and the vehicle prediction image mainly through a re-identification network, takes the vehicle prediction image successfully matched in the association step as the position of that vehicle in the current frame, and connects the center-point coordinates of the vehicle positions over multiple frames to obtain the vehicle track; while the vehicle track is obtained, it is numbered, yielding the vehicle ID corresponding to the vehicle track.
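Connecting the center-point coordinates of the matched vehicle positions, as described above, amounts to the following sketch; the (x1, y1, x2, y2) pixel-rectangle box format is an illustrative assumption:

```python
# Build a vehicle track by connecting the center points of the matched
# bounding boxes over successive frames, as the tracking model does.

def trajectory_from_boxes(boxes_per_frame):
    """boxes_per_frame: matched (x1, y1, x2, y2) box for each frame, in order."""
    return [((x1 + x2) / 2.0, (y1 + y2) / 2.0)
            for x1, y1, x2, y2 in boxes_per_frame]
```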
And step S3, a vehicle information database is established according to the vehicle detection image and the vehicle tracking result.
Preferably, building the vehicle information database according to the vehicle tracking result specifically includes: and storing the vehicle detection image and the vehicle track into a database based on the camera number and the vehicle ID to obtain a vehicle information database. Specifically, based on the vehicle detection image obtained in step S1 and the vehicle ID corresponding to the vehicle track obtained in step S2, the vehicle information database is obtained by storing the vehicle detection image and the vehicle track in the database according to the naming rule of the camera number-the vehicle ID.
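A minimal sketch of the "camera number - vehicle ID" naming rule follows; the exact key format and record fields are illustrative assumptions, not specified by the patent:

```python
# Store a detection image and track under a "cameraNo-vehicleID" key,
# following the naming rule described above. The key format and record
# layout are illustrative assumptions.

def save_record(database, camera_no, vehicle_id, detection_image, trajectory):
    key = f"{camera_no}-{vehicle_id}"
    database[key] = {"image": detection_image, "trajectory": trajectory}
    return database
```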
The vehicle information database can be obtained by storing the vehicle detection image and the vehicle track into the database according to the naming rule of the camera number-vehicle ID, and data support and basis are provided for the re-identification of the vehicle track and the complete splicing of the target vehicle track.
And S4, intercepting a target vehicle image based on a certain vehicle detection image corresponding to any camera number in the vehicle information database, and matching a motion track of a target vehicle corresponding to the target vehicle image according to the vehicle information database to realize cross-mirror tracking. Specifically, in step S3, after the vehicle detection images and the vehicle trajectories are saved in the database by using the camera numbers and the vehicle IDs, a certain vehicle detection image corresponding to any camera number may be selected to intercept the target vehicle image, and then the target vehicle in a certain vehicle detection image corresponding to other cameras is searched according to the target vehicle image, and the motion trajectories of the target vehicles in each camera are spliced, so that the complete motion trajectories of the target vehicle can be obtained. Preferably, a target vehicle image is intercepted based on a vehicle detection image in a vehicle information database, and a motion track of a target vehicle corresponding to the target vehicle image is matched according to the vehicle information database, so as to realize cross-mirror tracking, and the method comprises the following steps:
step S401, a certain vehicle detection image corresponding to any camera in the vehicle information database is obtained, and a target vehicle image is intercepted. Specifically, after the vehicle detection image and the vehicle track are saved in the database based on the camera number-vehicle ID in step S3, the target vehicle image may be intercepted from a certain vehicle detection image corresponding to any camera number in the vehicle information database.
Step S402, based on the target vehicle image and a vehicle detection image corresponding to each of the other cameras, obtaining the depth feature matrix of the target vehicle image and the depth feature matrices of the vehicle detection images corresponding to the other camera numbers. Specifically, these matrices are obtained by inputting the target vehicle image and the vehicle detection images corresponding to the other camera numbers into the depth appearance feature extraction unit. Here, one depth feature matrix is obtained per selected vehicle detection image per camera; compared with processing every frame in real time while obtaining the vehicle trajectory in step S2, this reduces the amount of calculation while the accuracy is higher.
Step S403, based on the depth feature matrix of the target vehicle image and the depth feature matrices of the vehicle detection images corresponding to the other camera numbers, obtaining, by means of the re-identification network, the cosine similarity distances between the target vehicle and the plurality of vehicles in the vehicle detection images corresponding to the other cameras. Specifically, the cosine similarity distance is calculated by the following formula:
cos(θ) = a·b / (‖a‖ ‖b‖) = (Σ_{i=1}^{n} x_i · y_i) / ( √(Σ_{i=1}^{n} x_i²) · √(Σ_{i=1}^{n} y_i²) )

where cos(θ) is the cosine similarity distance, a is the depth feature matrix corresponding to the target vehicle image, b is the depth feature matrix corresponding to the vehicle detection image, x_i is the i-th dimension element of the depth feature matrix corresponding to the target vehicle image, y_i is the i-th dimension element of the depth feature matrix corresponding to the vehicle detection image, n is the dimension of the depth feature matrices, and 1 ≤ i ≤ n.
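The cosine similarity defined by this formula can be implemented directly. The sketch below is a pure-Python illustration (function and variable names are assumptions), treating each depth feature matrix as a flat n-dimensional vector:

```python
import math

def cosine_similarity(a, b):
    """cos(theta) of two equal-length depth feature vectors a and b."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension n")
    dot = sum(x * y for x, y in zip(a, b))          # sum of x_i * y_i
    norm_a = math.sqrt(sum(x * x for x in a))       # sqrt of sum of x_i^2
    norm_b = math.sqrt(sum(y * y for y in b))       # sqrt of sum of y_i^2
    return dot / (norm_a * norm_b)

# Identical (or proportional) features give 1.0; orthogonal features give 0.0.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

In practice the depth feature vectors would come from the depth appearance feature extraction unit; the toy vectors here only demonstrate the arithmetic.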
And step S404, classifying and sorting the cosine similarity distances according to the camera numbers to obtain the minimum cosine similarity distance corresponding to each camera number. Specifically, one vehicle detection image is selected for each camera number, and each vehicle detection image contains images of a plurality of vehicles. Based on step S403, the cosine similarity distances between the target vehicle and the plurality of vehicles in the vehicle detection image corresponding to each other camera are calculated; these distances are then classified and sorted by camera number to obtain the minimum cosine similarity distance for each camera number.
And step S405, judging whether the minimum cosine similarity distance is smaller than a similarity threshold; if so, the vehicle corresponding to the minimum cosine similarity distance in the corresponding vehicle detection image is taken as the target vehicle, and if not, the target vehicle is judged to have driven off the expressway to be monitored. Specifically, the vehicle detection image corresponding to each other camera yields one minimum cosine similarity distance, from which it can be judged whether the target vehicle appears in that image. The similarity threshold is obtained by averaging over a large number of experiments; different vehicle types correspond to different similarity thresholds under different external conditions affecting surveillance capture, such as illumination and overcast or rainy weather.
Step S406, matching the camera numbers and vehicle IDs corresponding to the vehicle detection images against the vehicle information database to obtain the motion trajectory of the target vehicle, realizing cross-mirror tracking. Based on the previous step, once the vehicle in the vehicle detection image corresponding to each camera is judged to be the target vehicle, the vehicle information database can be queried by camera number-vehicle ID to obtain the corresponding vehicle trajectory of the target vehicle under each camera, and these trajectories are spliced to obtain the complete motion trajectory of the target vehicle, thereby realizing cross-mirror tracking.
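Steps S404 to S406 amount to a group-by-camera, take-minimum, threshold, and splice loop. The sketch below illustrates that logic only; the data layout, the threshold value, and all names are assumptions, not the patent's implementation:

```python
from collections import defaultdict

SIMILARITY_THRESHOLD = 0.3  # assumed value; the patent derives it experimentally

def match_target(candidate_distances, trajectories):
    """candidate_distances: (camera_no, vehicle_id, distance) per candidate
    vehicle in each other camera's detection image; trajectories: dict mapping
    (camera_no, vehicle_id) to a list of track points. Returns the spliced
    motion trajectory of the target vehicle."""
    # Step S404: classify the distances by camera number.
    by_camera = defaultdict(list)
    for camera_no, vehicle_id, distance in candidate_distances:
        by_camera[camera_no].append((distance, vehicle_id))

    full_track = []
    for camera_no in sorted(by_camera):
        # Step S404: minimum distance within this camera.
        distance, vehicle_id = min(by_camera[camera_no])
        # Step S405: below the threshold means the same vehicle was found.
        if distance < SIMILARITY_THRESHOLD:
            # Step S406: query by camera number-vehicle ID and splice tracks.
            full_track.extend(trajectories[(camera_no, vehicle_id)])
    return full_track

candidates = [(1, 5, 0.12), (1, 6, 0.48), (2, 9, 0.55)]
tracks = {(1, 5): [(10, 20), (12, 21)], (1, 6): [(0, 0)], (2, 9): [(99, 99)]}
print(match_target(candidates, tracks))  # camera 2's best candidate fails the test
```

Note the convention assumed here: smaller distance means more similar, matching the patent's minimum-distance-below-threshold test.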
The depth appearance feature extraction unit is used to intercept a target vehicle image from a vehicle detection image corresponding to any camera in the vehicle information database; the cosine similarity between the target vehicle image and the plurality of vehicles in the vehicle detection images of the other cameras is calculated; the target vehicle is matched against those vehicles by cosine similarity; and the motion trajectory of the target vehicle is then obtained, realizing cross-mirror tracking with high matching efficiency and precision.
Compared with the prior art, the expressway monitoring video cross-mirror vehicle tracking method provided by this embodiment detects vehicles with the improved YOLO target detection model to obtain vehicle detection images, tracks the vehicles with the multi-target tracking model to obtain vehicle IDs and vehicle trajectories, and finally obtains cosine similarities with the re-identification network and splices the results into the complete motion trajectory of the target vehicle, providing expressway management departments with an efficient, fast and high-precision video analysis technology for safety monitoring and target vehicle search.
In another embodiment of the present invention, as shown in fig. 5, an expressway monitoring video cross-mirror vehicle tracking system is disclosed, which comprises: a detection module 100 for acquiring frame images from the video files of a plurality of cameras on the expressway to be monitored and performing vehicle detection on each frame image based on the improved YOLO target detection model to obtain vehicle detection images containing complete vehicle rectangular frames; a tracking module 200 for inputting the vehicle detection images into the multi-target tracking model to obtain vehicle tracking results, each comprising a vehicle ID and a vehicle trajectory; a vehicle information database obtaining module 300 for establishing a vehicle information database from the vehicle detection images and the vehicle tracking results; and a motion trajectory obtaining module 400 for intercepting the target vehicle image from a vehicle detection image in the vehicle information database and matching the motion trajectory of the corresponding target vehicle against the vehicle information database to realize cross-mirror tracking. Specifically, the system can select a video file, perform tracking and display the tracking results, namely the image of each vehicle; it displays the vehicle center-point coordinates in a table to give each vehicle trajectory with its corresponding vehicle ID, and it also supports pausing and resuming tracking.
The expressway monitoring video cross-mirror vehicle tracking system detects vehicles with the improved YOLO target detection model to obtain vehicle detection images, tracks the vehicles with the multi-target tracking model to obtain vehicle IDs and vehicle trajectories, and finally obtains cosine similarities with the re-identification network and splices the results into the complete motion trajectory of the target vehicle, providing expressway management departments with an efficient, fast and high-precision video analysis technology for safety monitoring and target vehicle search.
Preferably, the detection module comprises a feature extraction network layer and a YOLO detection layer; the feature extraction network layer comprises a stem unit and an OSA unit. The stem unit is used for downsampling the frame images in the video files of the plurality of expressway cameras to obtain images of size 304 × 304 × 128; the OSA unit is configured to convolve the input image of size 304 × 304 × 128 to obtain an image of size 19 × 19 × 512; and the YOLO detection layer is used for obtaining, from the 19 × 19 × 512 image output by the OSA unit, a vehicle detection image containing a complete vehicle rectangular frame.
The detection module replaces the backbone network of YOLOv3 with VoVNet, a variant of DenseNet with better learning capability, to obtain the feature extraction network layer, and, according to the size at which vehicles appear in expressway surveillance video images, reduces the three detection scales (large, medium and small) of the YOLOv3 detection layer to the two medium and small scales, yielding the improved YOLO target detection model. The model is therefore smaller, faster to compute and lower in operation amount, so a vehicle detection image containing a complete vehicle rectangular frame can be obtained more quickly.
Preferably, the tracking module comprises a motion prediction unit and a depth appearance feature extraction unit, wherein the motion prediction unit is used for predicting and obtaining a current frame of vehicle prediction image according to a previous frame of vehicle detection image; the depth appearance feature extraction unit comprises a re-identification network, obtains a vehicle track based on a vehicle detection image and a vehicle prediction image input into the re-identification network, and numbers the vehicle track to obtain a vehicle ID corresponding to the vehicle track.
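The motion prediction unit's role, predicting the current-frame box from the previous frame, can be illustrated with a minimal constant-velocity sketch. The patent does not specify the predictor, so the constant-velocity model and all names here are assumptions (trackers of this kind commonly use a Kalman filter instead):

```python
def predict_box(prev_box, prev_velocity):
    """Predict the current-frame bounding box (x, y, w, h) from the previous
    frame's box and its per-frame velocity (dx, dy); width and height held fixed."""
    x, y, w, h = prev_box
    dx, dy = prev_velocity
    return (x + dx, y + dy, w, h)

def update_velocity(prev_box, curr_box):
    """Re-estimate the per-frame velocity from two consecutive detections."""
    return (curr_box[0] - prev_box[0], curr_box[1] - prev_box[1])

box_t0 = (100, 50, 40, 30)
box_t1 = (108, 52, 40, 30)
v = update_velocity(box_t0, box_t1)
print(predict_box(box_t1, v))
```

The predicted box gives the depth appearance feature extraction unit a candidate region to compare against the current frame's detections when associating vehicle IDs.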
Preferably, the vehicle information database obtaining module stores the vehicle detection image and the vehicle track into a database according to the camera number and the vehicle ID to obtain the vehicle information database.
The vehicle information database is obtained by the vehicle information database obtaining module storing the vehicle detection images and vehicle trajectories in the database according to the "camera number-vehicle ID" naming rule, which provides data support and a basis for vehicle trajectory re-identification and for complete splicing of the target vehicle trajectory.
Preferably, the motion trajectory obtaining module executes the following procedure:
acquiring a certain vehicle detection image corresponding to any camera number in a vehicle information database, and intercepting a target vehicle image by using a depth appearance feature extraction unit;
based on a target vehicle image and a certain vehicle detection image corresponding to other camera numbers, respectively obtaining a depth feature matrix of the target vehicle image and a depth feature matrix of a certain vehicle detection image corresponding to other camera numbers;
based on the depth feature matrix of the target vehicle image and the depth feature matrices of a certain vehicle detection image corresponding to the other camera numbers, obtaining, by means of the re-identification network, the cosine similarity distances between the target vehicle and the vehicles in the certain vehicle detection image corresponding to the other camera numbers;
classifying and sorting the cosine similarity distances according to the camera numbers to obtain the minimum cosine similarity distance corresponding to each camera number;
judging whether the minimum cosine similarity distance is smaller than a similarity threshold value, if so, determining that the vehicle in the corresponding vehicle detection image is a target vehicle, and if not, judging that the target vehicle drives away from the expressway to be monitored;
and matching the camera numbers and the vehicle IDs corresponding to the vehicle detection images with the vehicle information database to obtain the motion trail of the target vehicle, so as to realize cross-mirror tracking.
The motion trajectory obtaining module can calculate the similarity between the target vehicle and all tracked vehicles and display them on the computer interface sorted by similarity, with the vehicles most similar to the target vehicle ranked first; after a successful match, the matched vehicle ID is output on the right of the interface and the trajectories are spliced to obtain the complete trajectory of the target vehicle. The depth appearance feature extraction unit is used to intercept a target vehicle image from a vehicle detection image corresponding to any camera in the vehicle information database; the cosine similarity between the target vehicle image and the plurality of vehicles in the vehicle detection images of the other cameras is calculated; the target vehicle is matched against those vehicles by cosine similarity; and the motion trajectory of the target vehicle is then obtained, realizing cross-mirror tracking with high matching efficiency and precision.
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention.

Claims (8)

1. The method for tracking the expressway monitoring video cross-mirror vehicle is characterized by comprising the following steps of:
acquiring frame images in video files of a plurality of cameras of a highway to be monitored, and carrying out vehicle detection on each frame image based on an improved YOLO target detection model to obtain a vehicle detection image containing a complete vehicle rectangular frame;
the improved YOLO target detection model replaces the backbone network of the original YOLOv3 with VoVNet, a variant of DenseNet with better learning ability, and reduces the three detection scales of the YOLOv3 detection layer to the two medium and small scales;
inputting the vehicle detection image into a multi-target tracking model to obtain a vehicle tracking result; the vehicle tracking result comprises a vehicle ID and a vehicle track;
establishing a vehicle information database according to the vehicle detection image and the vehicle tracking result;
intercepting a target vehicle image based on a certain vehicle detection image corresponding to any camera number in the vehicle information database, matching the target vehicle image according to the vehicle information database to obtain a corresponding target vehicle, and splicing to obtain a motion track based on the corresponding target vehicle to realize cross-mirror tracking;
intercepting a target vehicle image based on a certain vehicle detection image corresponding to any camera number in the vehicle information database, matching the target vehicle according to the vehicle information database, and splicing to obtain a motion track based on the corresponding target vehicle to realize cross-mirror tracking, wherein the method comprises the following steps:
acquiring a certain vehicle detection image corresponding to any camera number in the vehicle information database, and intercepting a target vehicle image;
based on the target vehicle image and a certain vehicle detection image corresponding to the other camera numbers, respectively obtaining a depth feature matrix of the target vehicle image and a depth feature matrix of a certain vehicle detection image corresponding to the other camera numbers;
based on the depth feature matrix of the target vehicle image and the depth feature matrix of a certain vehicle detection image corresponding to other camera numbers, obtaining cosine similarity distances of a plurality of vehicles in the certain vehicle detection image corresponding to the other camera numbers of the target vehicle by utilizing a re-identification network;
classifying and sorting the cosine similarity distances according to the camera numbers to obtain the minimum cosine similarity distance corresponding to each camera number;
judging whether the minimum cosine similarity distance corresponding to the numbers of other cameras is smaller than a similarity threshold value, if so, determining that the vehicle in the corresponding vehicle detection image is a target vehicle, and if not, judging that the target vehicle drives away from the expressway to be monitored;
and matching the camera numbers and the vehicle IDs corresponding to the vehicle detection images with the vehicle information database, and splicing to obtain the motion trail of the target vehicle so as to realize cross-mirror tracking.
2. The highway monitoring video cross-mirror vehicle tracking method according to claim 1, wherein the improved YOLO target detection model comprises a feature extraction network layer and a YOLO detection layer; wherein the feature extraction network layer comprises a stem unit and an OSA unit;
a stem unit for downsampling the frame images in the video files of the cameras on the highway to obtain an image of size 304 × 304 × 128;
an OSA unit for convolving the input image of size 304 × 304 × 128 to obtain an image of size 19 × 19 × 512;
a YOLO detection layer for obtaining, from the image of size 19 × 19 × 512 output by the OSA unit, a vehicle detection image containing a complete vehicle rectangular frame.
3. The method for tracking a highway monitoring video cross-mirror vehicle according to claim 1, wherein the multi-target tracking model comprises a motion prediction unit and a depth appearance feature extraction unit:
the motion prediction unit is used for predicting and obtaining a current frame of vehicle prediction image according to a previous frame of vehicle detection image;
the depth appearance feature extraction unit comprises a re-identification network, obtains a vehicle track based on a vehicle detection image and a vehicle prediction image input into the re-identification network, and numbers the vehicle track to obtain a vehicle ID corresponding to the vehicle track.
4. The method for highway monitoring video cross-mirror vehicle tracking according to claim 3, wherein establishing a vehicle information database according to the vehicle tracking result specifically comprises: and storing the vehicle detection image and the vehicle track into a database based on the camera number and the vehicle ID to obtain a vehicle information database.
5. A highway monitoring video cross-mirror vehicle tracking system, comprising:
the detection module is used for acquiring frame images in the video files of the cameras of the expressway to be monitored, and carrying out vehicle detection on each frame image based on the improved YOLO target detection model to obtain a vehicle detection image containing a complete vehicle rectangular frame;
the improved YOLO target detection model replaces the backbone network of the original YOLOv3 with VoVNet, a variant of DenseNet with better learning ability, and reduces the three detection scales of the YOLOv3 detection layer to the two medium and small scales;
the tracking module is used for inputting the vehicle detection image into a multi-target tracking model to obtain a vehicle tracking result; the vehicle tracking result comprises a vehicle ID and a vehicle track;
the vehicle information database obtaining module is used for establishing a vehicle information database according to the vehicle detection image and the vehicle tracking result;
the motion trail obtaining module is used for intercepting a target vehicle image according to a certain vehicle detection image corresponding to any camera number in the vehicle information database, matching the target vehicle according to the vehicle information database, and splicing to obtain a motion trail based on the obtained corresponding target vehicle so as to realize cross-mirror tracking;
the motion trail obtaining module executes the following procedures:
acquiring a certain vehicle detection image corresponding to the camera number in the vehicle information database, and intercepting a target vehicle image;
based on the target vehicle image and a certain vehicle detection image corresponding to the other camera numbers, respectively obtaining a depth feature matrix of the target vehicle image and a depth feature matrix of a certain vehicle detection image corresponding to the other camera numbers;
based on the depth feature matrix of the target vehicle image and the depth feature matrix of a certain vehicle detection image corresponding to the number of other cameras, obtaining the cosine similarity distance of the vehicle in the certain vehicle detection image corresponding to the number of other cameras by utilizing a re-identification network;
classifying and sorting the cosine similarity distances according to the camera numbers to obtain the minimum cosine similarity distance corresponding to each camera number;
judging whether the minimum cosine similarity distance corresponding to the numbers of other cameras is smaller than a similarity threshold value, if so, determining that the vehicle in the corresponding vehicle detection image is a target vehicle, and if not, judging that the target vehicle drives away from the expressway to be monitored;
and matching the camera numbers and the vehicle IDs corresponding to the vehicle detection images with the vehicle information database, and splicing to obtain the motion trail of the target vehicle so as to realize cross-mirror tracking.
6. The highway monitoring video cross-mirror vehicle tracking system of claim 5, wherein the detection module comprises a feature extraction network layer and a YOLO detection layer; wherein the feature extraction network layer comprises a stem unit and an OSA unit;
a stem unit for downsampling the frame images in the video files of the cameras on the highway to obtain an image of size 304 × 304 × 128;
an OSA unit for convolving the input image of size 304 × 304 × 128 to obtain an image of size 19 × 19 × 512;
a YOLO detection layer for obtaining, from the image of size 19 × 19 × 512 output by the OSA unit, a vehicle detection image containing a complete vehicle rectangular frame.
7. The highway monitoring video cross-mirror vehicle tracking system of claim 6, wherein the tracking module comprises a motion prediction unit and a depth appearance feature extraction unit:
the motion prediction unit is used for predicting and obtaining a current frame of vehicle prediction image according to a previous frame of vehicle detection image;
the depth appearance feature extraction unit comprises a re-identification network, obtains a vehicle track based on a vehicle detection image and a vehicle prediction image input into the re-identification network, and numbers the vehicle track to obtain a vehicle ID corresponding to the vehicle track.
8. The expressway surveillance video cross-mirror vehicle tracking system of claim 7, wherein said vehicle information database acquisition module stores vehicle detection images and vehicle trajectories to a database according to camera numbers and vehicle IDs to obtain a vehicle information database.
CN202010897531.0A 2020-08-31 2020-08-31 Expressway monitoring video cross-mirror vehicle tracking method and system Active CN112069969B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010897531.0A CN112069969B (en) 2020-08-31 2020-08-31 Expressway monitoring video cross-mirror vehicle tracking method and system


Publications (2)

Publication Number Publication Date
CN112069969A CN112069969A (en) 2020-12-11
CN112069969B true CN112069969B (en) 2023-07-25

Family

ID=73665803

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010897531.0A Active CN112069969B (en) 2020-08-31 2020-08-31 Expressway monitoring video cross-mirror vehicle tracking method and system

Country Status (1)

Country Link
CN (1) CN112069969B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112991134B (en) * 2021-05-11 2021-08-27 交通运输部科学研究院 Driving path reduction measuring and calculating method and device and electronic equipment
CN113516054A (en) * 2021-06-03 2021-10-19 三峡大学 Wood-carrying vehicle detection, identification and tracking method
CN113657378B (en) * 2021-07-28 2024-04-26 讯飞智元信息科技有限公司 Vehicle tracking method, vehicle tracking system and computing device
CN113673395A (en) * 2021-08-10 2021-11-19 深圳市捷顺科技实业股份有限公司 Vehicle track processing method and device
CN114067270B (en) * 2021-11-18 2022-09-09 华南理工大学 Vehicle tracking method and device, computer equipment and storage medium
CN114399714A (en) * 2022-01-12 2022-04-26 福州大学 Vehicle-mounted camera video-based vehicle illegal parking detection method
CN114092820B (en) * 2022-01-20 2022-04-22 城云科技(中国)有限公司 Target detection method and moving target tracking method applying same
CN115761616B (en) * 2022-10-13 2024-01-26 深圳市芯存科技有限公司 Control method and system based on storage space self-adaption

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106846374A (en) * 2016-12-21 2017-06-13 大连海事大学 The track calculating method of vehicle under multi-cam scene
CN109873990A (en) * 2019-03-13 2019-06-11 武汉大学 A kind of illegal mining method for early warning in mine based on computer vision
CN110866473A (en) * 2019-11-04 2020-03-06 浙江大华技术股份有限公司 Target object tracking detection method and device, storage medium and electronic device
CN111127520A (en) * 2019-12-26 2020-05-08 华中科技大学 Vehicle tracking method and system based on video analysis
CN111257957A (en) * 2020-02-25 2020-06-09 西安交通大学 Identification tracking system and method based on passive terahertz imaging


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Youngwan Lee et al., "An Energy and GPU-Computation Efficient Backbone Network for Real-Time Object Detection", 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), main text sections 1 and 3-4 *


Similar Documents

Publication Publication Date Title
CN112069969B (en) Expressway monitoring video cross-mirror vehicle tracking method and system
CN113269098B (en) Multi-target tracking positioning and motion state estimation method based on unmanned aerial vehicle
Tang et al. Cityflow: A city-scale benchmark for multi-target multi-camera vehicle tracking and re-identification
CN111553205B (en) Vehicle weight recognition method, system, medium and video monitoring system without license plate information
CN111932580A (en) Road 3D vehicle tracking method and system based on Kalman filtering and Hungary algorithm
Xue et al. BLVD: Building a large-scale 5D semantics benchmark for autonomous driving
CN112750150B (en) Vehicle flow statistical method based on vehicle detection and multi-target tracking
Chang et al. Deepvp: Deep learning for vanishing point detection on 1 million street view images
CN109099929B (en) Intelligent vehicle positioning device and method based on scene fingerprints
CN111860352B (en) Multi-lens vehicle track full tracking system and method
CN111274847B (en) Positioning method
CN111830953A (en) Vehicle self-positioning method, device and system
Chang et al. Video analytics in smart transportation for the AIC'18 challenge
CN110570453A (en) Visual odometer method based on binocular vision and closed-loop tracking characteristics
CN114155284A (en) Pedestrian tracking method, device, equipment and medium based on multi-target pedestrian scene
Zhang et al. Monocular visual traffic surveillance: A review
CN106780567B (en) Immune particle filter extension target tracking method fusing color histogram and gradient histogram
CN111767905A (en) Improved image method based on landmark-convolution characteristics
CN113223064B (en) Visual inertial odometer scale estimation method and device
CN116403139A (en) Visual tracking and positioning method based on target detection
Fleck et al. Robust tracking of reference trajectories for autonomous driving in intelligent roadside infrastructure
CN112560617A (en) Large-scene pedestrian trajectory tracking method based on array camera
CN112365527A (en) Method and system for tracking vehicles across mirrors in park
Wu et al. Vehicle re-id for surround-view camera system
CN115908508A (en) Coastline ship real-time tracking method based on array camera

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 050011 No. 36, Jianshe South Street, Qiaoxi District, Shijiazhuang City, Hebei Province

Applicant after: Hebei transportation planning and Design Institute Co.,Ltd.

Applicant after: Beijing Jiaotong University

Address before: 050011 No. 36, Jianshe South Street, Shijiazhuang City, Hebei Province

Applicant before: HEBEI PROVINCIAL COMMUNICATIONS PLANNING AND DESIGN INSTITUTE

Applicant before: Beijing Jiaotong University

GR01 Patent grant