CN112101433B - Automatic lane-dividing vehicle counting method based on YOLO V4 and DeepSORT - Google Patents

Automatic lane-dividing vehicle counting method based on YOLO V4 and DeepSORT

Info

Publication number
CN112101433B
Authority
CN
China
Prior art keywords
vehicle
yolo
track
data
deepsort
Prior art date
Legal status
Active
Application number
CN202010924261.8A
Other languages
Chinese (zh)
Other versions
CN112101433A (en)
Inventor
王晨
周威
陆振波
夏井新
Current Assignee
Southeast University
Original Assignee
Southeast University
Priority date
Filing date
Publication date
Application filed by Southeast University
Priority to CN202010924261.8A
Publication of CN112101433A
Application granted
Publication of CN112101433B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G06F18/232 - Non-hierarchical techniques
    • G06F18/2321 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G08 - SIGNALLING
    • G08G - TRAFFIC CONTROL SYSTEMS
    • G08G1/00 - Traffic control systems for road vehicles
    • G08G1/065 - Traffic control systems for road vehicles by counting the vehicles in a section of the road or in a parking area, i.e. comparing incoming count with outgoing count
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 - Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08 - Detecting or categorising vehicles
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 - Road transport of goods or passengers
    • Y02T10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T10/40 - Engine management systems


Abstract

The invention discloses an automatic lane-dividing vehicle counting method based on YOLO V4 and DeepSORT, which comprises the following steps: collecting a YOLO V4 training data set and a vehicle re-identification data set and performing data enhancement; building and training a YOLO V4 model; building a DeepSORT target tracking model; tracking vehicles and extracting the running track of each vehicle; building a track record file and storing the running track information of each vehicle; clustering the endpoint coordinates of the track data with the DBSCAN clustering algorithm and associating each cluster with lane information; and realizing the lane-dividing vehicle counting function according to the change rule of the track data and the correspondence between tracks and clusters. The method adopts a YOLO V4 + DeepSORT vehicle detection and tracking model, which ensures the real-time performance of vehicle detection and tracking and greatly improves accuracy.

Description

Automatic lane-dividing vehicle counting method based on YOLO V4 and DeepSORT
Technical Field
The invention relates to the field of traffic big data, and in particular to an automatic lane-dividing vehicle counting method based on YOLO V4 and DeepSORT.
Background
Traffic flow parameter extraction is a basic and important task in traffic management and control, supporting the decisions of traffic managers. Typical traffic flow parameter extraction methods currently fall into (1) methods based on induction-coil detection; (2) methods based on infrared sensor detection; and (3) methods based on microwave technology. Coil-based detection is a contact detection mode; it is troublesome to install and remove, extracts vehicle track data poorly, and is unsuitable for heavily congested road sections. Detection based on infrared or microwave technology is contactless and therefore convenient to install and remove, but it cannot distinguish vehicles in different lanes and is likewise limited on congested road sections.
In recent years, with the wide coverage of traffic monitoring, applications such as vehicle track extraction and lane-dividing vehicle counting based on traffic monitoring have attracted attention. Compared with traditional traffic flow parameter extraction methods, video-based detection has the following advantages:
(1) It is a contactless detection mode that is convenient to install and remove;
(2) It can distinguish the vehicle type and the lane in which a vehicle travels;
(3) It is less limited by conditions such as traffic congestion, and vehicles in congested traffic can still be detected and tracked well.
Paper "Real-Time Traffic Flow Parameter Estimation From UAV Video Based on Ensemble Classifier and Optical Flow" is based on unmanned aerial vehicle monitoring videos, HAAR CASCADE and a convolutional neural network are adopted to detect vehicles in video monitoring, then the optical flow theory is used for capturing motion information of the vehicles in the time dimension, finally traffic flow parameters (track, speed and traffic flow) of a monitored road section are extracted according to the motion information, and the method is suitable for specific monitoring videos (unmanned aerial vehicle monitoring videos) but is limited by other monitoring types (such as bayonet monitoring, high-altitude cameras and the like). The paper Vision-based vehicle detection and counting system using DEEP LEARNING IN HIGHWAY SCENES is based on expressway monitoring video, adopts a deep learning YOLO V3 target detection model to detect vehicles, then uses an ORB algorithm to acquire vehicle tracks, and realizes the counting function of different vehicles. The paper Vehicle Count System based on TIME INTERVAL IMAGE Capture Method AND DEEP LEARNING MASK R-CNN adopts a MaskRCNN target detection and segmentation model to Capture and count vehicles on a specific road, but has poor recognition capability on small objects such as non-motor vehicles and the like and cannot guarantee detection speed.
In summary, the main disadvantages of current research methods are:
(1) Regarding the vehicle position detection algorithm/model: most current methods detect vehicles through background subtraction or optical flow, whose detection accuracy is low, affecting the accuracy and efficiency of traffic flow parameter extraction; a few studies detect vehicles with convolutional neural network models of good accuracy, but the models are large and cannot meet real-time detection requirements well, affecting the speed and efficiency of traffic flow parameter extraction.
(2) Regarding the extraction of vehicle motion information: most current methods track vehicles by optical flow, image feature matching or Kalman filtering over the detection area, which suits sparsely trafficked road sections but is limited on complex or congested ones.
(3) Most current methods lack track analysis, automatic lane division and automatic lane-dividing vehicle counting; lane judgment relies on manually set rules, and manual judgment is time-consuming and labor-intensive for surveillance views with many lanes.
Disclosure of Invention
To overcome the defects in the background art, the invention provides an automatic lane-dividing vehicle counting method based on YOLO V4 and DeepSORT. The object detection model YOLO V4 detects vehicle positions in surveillance video with high detection accuracy and high detection speed; on the basis of the positions given by YOLO V4, the target tracking model DeepSORT tracks each vehicle in the time dimension. Because DeepSORT predicts positions with Kalman filtering and extracts features with a re-identification model, its accuracy is greatly improved over traditional target tracking algorithms/models.
Meanwhile, a CSV file for vehicle track data is maintained in the background, recording for each frame the center coordinates of every vehicle detection box and the corresponding vehicle ID. Finally, the DBSCAN clustering algorithm performs cluster analysis on the track data, realizing track endpoint clustering, automatic lane division, and automatic vehicle counting per lane.
The aim of the invention can be achieved by the following technical scheme:
An automatic lane-dividing vehicle counting method based on YOLO V4 and DeepSORT comprises the following steps:
S1, collecting a YOLO V4 training data set and a vehicle re-identification data set and performing data enhancement;
S2, building and training a YOLO V4 model with the PyTorch deep learning framework;
S3, building a DeepSORT target tracking model, training its vehicle feature extraction network with the vehicle re-identification data, and completing the YOLO V4 + DeepSORT vehicle tracking model by taking the YOLO V4 detection boxes in each frame as input;
S4, tracking vehicles with the YOLO V4 + DeepSORT model, extracting the running track of each vehicle, building a track record file and storing the running track information of each vehicle;
S5, clustering the endpoint coordinates of the track data with the DBSCAN clustering algorithm and associating the resulting clusters with lane information;
S6, realizing the lane-dividing vehicle counting function according to the change rule of the track data and the correspondence between tracks and clusters.
Further, the data sets and data enhancement in S1 specifically include:
S11, collection and enhancement of the YOLO V4 training data set: collecting the annotated pictures and annotation information of cars, trucks, buses and non-motor vehicles in the PASCAL VOC and COCO data sets; manually annotating the vehicle types and positions in 2000 surveillance video frames of different viewing angles; adopting random cropping, random flipping, random adjustment of the picture parameters saturation, hue and brightness, mosaic data enhancement and mixed-crop data enhancement; and uniformly scaling the picture data to 608x608 resolution;
S12, collection and enhancement of the vehicle re-identification data set: collecting the VeRi data set to train the vehicle feature extraction model in DeepSORT, with random cropping as the data enhancement.
Further, the process of building and training YOLO V4 in S2 is as follows:
S21, YOLO V4 consists of: 1. the feature extraction network CSPDarknet53; 2. the multi-scale feature fusion network PAN and spatial pyramid pooling SPP; 3. a head network similar to that of the YOLO V3 model for classification and detection-box regression; the CSPDarknet53 feature extraction network is built with the PyTorch deep learning framework, the three feature maps of different widths and heights output by the feature extraction network are fused through SPP+PAN, and the three fused feature maps each pass once through a 1x1 convolutional neural network to give the output of YOLO V4;
S22, on the basis of the YOLO V4 built in S21, the loss function is set from the network output and the real labels of the data set; the object classification loss inherits the cross-entropy loss of YOLO V3; after the loss function is set, the YOLO V4 network parameters are updated with the back-propagation algorithm;
S23, the YOLO V4 hyper-parameters during training are set as follows: the Adam optimizer is selected, the initial learning rate is set to 1e-5, the number of training epochs is set to 50, and the batch size is set to 16.
Further, the building and training of the DeepSORT target tracking model in S3 specifically includes:
S31, taking the size and position information of the candidate boxes output by YOLO V4 as input, and building the three components of DeepSORT: (1) the Kalman filtering algorithm as position predictor, comprising two stages:
(1.1) prediction stage: as the target moves, its speed and position in the current frame are predicted from its speed and position in the previous frame;
(1.2) update stage: from the predicted value and the observed value captured by the algorithm, the current system state is obtained by linearly weighting the two normal distributions;
(2) a small residual network trained and tested as feature extractor: the small residual network is trained on the ReID data set with cross entropy as the training loss, the number of training epochs set to 50, the Adam optimizer selected and the initial learning rate set to 0.0001; after training, the vehicle picture in a YOLO V4 detection box is scaled to 112x112 pixels and fed in to obtain a 128-dimensional vector for the subsequent similarity calculation;
(3) after the cosine distance computes the similarity of the vectorized detection boxes, the Hungarian algorithm matches the vehicles in the detection boxes of consecutive frames; vehicles with high matching degree are identified as the same vehicle and assigned a unified ID number.
Further, the specific steps of S4 are:
S41, obtaining the position information of each vehicle in each frame, and assigning each vehicle a unique identifier;
S42, representing each vehicle by the center of its detection box, and drawing the track of the vehicle with the same ID over time;
S43, creating a CSV file as the track record file, and writing the ID information and track information of all vehicles into the CSV file in real time.
Further, the specific steps of S5 are:
S51, selecting the track endpoint coordinates for track clustering;
S52, importing the track data in the CSV file and obtaining the endpoint coordinates of all track data;
S53, clustering the endpoint coordinates of the track data with the DBSCAN clustering algorithm;
S54, analyzing the tracks whose endpoints fall into the same cluster and connecting the center of gravity of the track start positions with the center of gravity of the end positions, thereby identifying and dividing the lane corresponding to the cluster; if the start points of the tracks corresponding to a cluster show several clearly distinct groups, connecting the center of gravity of each distinct start group to the center of gravity of the endpoints, thereby associating the cluster with lane information;
S55, after clusters and lane information are associated, assigning each newly generated complete track to the lane corresponding to the cluster to which its endpoint coordinates belong.
Further, the lane-dividing vehicle counting function of S6 specifically includes:
S61, if the track data of a vehicle is not updated within the next 10 frames of the video, storing the track data into the CSV file;
S62, assigning the newly added track data in the CSV file to the nearest cluster according to its endpoint coordinates;
S63, whenever an endpoint coordinate is newly added to a cluster, increasing the traffic flow of the lane corresponding to that cluster by the standard passenger-car-equivalent number of the vehicle that generated the track, realizing the lane-dividing vehicle counting function.
The invention has the following beneficial effects:
1. The method adopts the YOLO V4 + DeepSORT vehicle detection and tracking model, which ensures the real-time performance of vehicle detection and tracking and greatly improves accuracy;
2. The method clusters track data by their end positions: the track end positions stand in for the whole tracks and are clustered with DBSCAN; after clustering, clusters are matched to lanes according to the cluster into which each track falls and the track start positions; finally, automatic lane-dividing vehicle counting is completed according to the matched lanes and track analysis.
Drawings
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a general flow chart of the present invention;
FIG. 2 is a representation of a portion of a vehicle re-identification dataset of the present invention;
FIG. 3 is a diagram of the construction of YOLO V4 of the present invention;
FIG. 4 is a schematic representation of the location and ID of the vehicle in each frame of the present invention;
FIG. 5 is a schematic representation of a vehicle trajectory of the present invention;
FIG. 6 is a graph of the detection effect of the YOLO V4 vehicle of the present invention;
FIG. 7 is a vehicle tracking effect diagram of YOLO V4 + DeepSORT of the present invention;
FIG. 8 is a vehicle track extraction effect diagram of the present invention;
FIG. 9 is a vehicle track endpoint distribution diagram of the present invention;
FIG. 10 is a cluster distribution diagram after DBSCAN clustering in accordance with the present invention;
FIG. 11 is a cluster-to-lane correspondence map of the present invention;
FIG. 12 is a diagram of the lane-dividing vehicle counting function implementation of the present invention.
Detailed Description
The embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Evidently, the described embodiments are only some, not all, embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort fall within the scope of the invention.
In the description of the present invention, it should be understood that the terms "open," "upper," "lower," "thickness," "top," "middle," "length," "inner," "peripheral," and the like indicate orientation or positional relationships, merely for convenience in describing the present invention and to simplify the description, and do not indicate or imply that the components or elements referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus should not be construed as limiting the present invention.
An automatic lane-dividing vehicle counting method based on YOLO V4 and DeepSORT, as shown in FIG. 1, comprises the following steps:
S1, collecting a YOLO V4 training data set and a vehicle re-identification data set and performing data enhancement;
S11, collection and enhancement of the YOLO V4 training data set: the annotated pictures and annotation information of all cars, trucks, buses and non-motor vehicles in the PASCAL VOC and COCO data sets are collected, and the vehicle types and positions in 2000 surveillance video frames of different viewing angles are annotated manually. The data enhancement mainly comprises random cropping, random flipping, random adjustment of picture parameters such as saturation, hue and brightness, mosaic data enhancement and mixed-crop data enhancement; the picture data is uniformly scaled to 608x608 resolution;
S12, collection and enhancement of the vehicle re-identification data set: the VeRi data set (37,778 training pictures and 11,579 test pictures) is collected to train the DeepSORT vehicle feature extraction model, with random cropping as the main data enhancement; part of the data set is shown in FIG. 2 (pictures of the same vehicle photographed from different angles). A minimal sketch of such an augmentation pipeline is given below;
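The augmentations named in S11/S12 can be approximated with standard torchvision transforms. This is only a sketch under stated assumptions: parameter values and the synthetic input image are illustrative, and the mosaic and mixed-crop enhancements, which need custom multi-image code, are omitted.

```python
# Sketch (assumption: torchvision-style pipeline) of the S11/S12 augmentations:
# photometric jitter, random flip, random crop, and uniform 608x608 rescaling.
from torchvision import transforms
from PIL import Image

detector_aug = transforms.Compose([
    transforms.ColorJitter(brightness=0.3, saturation=0.3, hue=0.1),  # saturation/hue/brightness jitter
    transforms.RandomHorizontalFlip(p=0.5),                           # random flip
    transforms.RandomResizedCrop(608, scale=(0.6, 1.0)),              # random crop + scale to 608x608
    transforms.ToTensor(),
])

reid_aug = transforms.Compose([
    transforms.RandomCrop((112, 112), pad_if_needed=True),  # random crop for VeRi re-ID images
    transforms.ToTensor(),
])

img = Image.new("RGB", (1280, 720))   # stand-in for a surveillance frame
x = detector_aug(img)                 # 3x608x608 tensor ready for the detector
```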
S2, building and training the YOLO V4 model with the PyTorch deep learning framework;
S21, YOLO V4 consists of: (1) the feature extraction network CSPDarknet53; (2) the multi-scale feature fusion network PAN and spatial pyramid pooling SPP; (3) a head network similar to that of the YOLO V3 model for classification and detection-box regression; the specific structure is shown in FIG. 3. The CSPDarknet53 feature extraction network is built with the PyTorch deep learning framework, the three feature maps of different widths and heights output by the feature extraction network (19x19, 38x38 and 76x76) are fused through SPP+PAN, and the three fused feature maps each pass once through a 1x1 convolutional neural network to give the output of YOLO V4, as sketched below;
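For illustration, the following sketch shows only the final 1x1 detection heads of S21. The input channel widths (1024/512/256) and 3 anchors per scale are assumptions for the 4-class vehicle set of S11, not values stated in the patent.

```python
# Sketch of the 1x1 detection heads: each head outputs 3*(5+4)=27 channels
# (x, y, w, h, confidence + 4 class scores per anchor) on its grid.
import torch
import torch.nn as nn

num_classes, num_anchors = 4, 3
out_ch = num_anchors * (5 + num_classes)

heads = nn.ModuleList([
    nn.Conv2d(c, out_ch, kernel_size=1) for c in (1024, 512, 256)  # assumed input widths
])

# Stand-ins for the three SPP+PAN fused feature maps of S21.
feats = [torch.randn(1, 1024, 19, 19),
         torch.randn(1, 512, 38, 38),
         torch.randn(1, 256, 76, 76)]

outputs = [head(f) for head, f in zip(heads, feats)]
for o in outputs:
    print(o.shape)  # (1, 27, 19, 19), (1, 27, 38, 38), (1, 27, 76, 76)
```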
S22, on the basis of the YOLO V4 built in S21, the loss function is set from the network output and the real labels of the data set. The YOLO V4 model learns the position and size of the detection box with the CIoU loss function, which converges faster and more accurately than the MSE loss adopted in YOLO V3; a sketch of the CIoU loss is given below. The object classification loss still inherits the cross-entropy loss of YOLO V3. After the loss function is set, the back-propagation algorithm updates the YOLO V4 network parameters.
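The CIoU loss mentioned in S22 follows the published definition L = 1 - IoU + rho^2(b, b_gt)/c^2 + alpha*v. A self-contained sketch for corner-format boxes (x1, y1, x2, y2) is:

```python
import math
import torch

def ciou_loss(pred, target, eps=1e-7):
    # Intersection and union areas.
    ix1 = torch.max(pred[:, 0], target[:, 0]); iy1 = torch.max(pred[:, 1], target[:, 1])
    ix2 = torch.min(pred[:, 2], target[:, 2]); iy2 = torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(0) * (iy2 - iy1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Squared center distance over squared enclosing-box diagonal.
    cpx = (pred[:, 0] + pred[:, 2]) / 2; cpy = (pred[:, 1] + pred[:, 3]) / 2
    ctx = (target[:, 0] + target[:, 2]) / 2; cty = (target[:, 1] + target[:, 3]) / 2
    ex1 = torch.min(pred[:, 0], target[:, 0]); ey1 = torch.min(pred[:, 1], target[:, 1])
    ex2 = torch.max(pred[:, 2], target[:, 2]); ey2 = torch.max(pred[:, 3], target[:, 3])
    rho2 = (cpx - ctx) ** 2 + (cpy - cty) ** 2
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + eps

    # Aspect-ratio consistency term.
    wp = pred[:, 2] - pred[:, 0]; hp = (pred[:, 3] - pred[:, 1]).clamp(min=eps)
    wt = target[:, 2] - target[:, 0]; ht = (target[:, 3] - target[:, 1]).clamp(min=eps)
    v = (4 / math.pi ** 2) * (torch.atan(wt / ht) - torch.atan(wp / hp)) ** 2
    alpha = v / (1 - iou + v + eps)

    return (1 - iou + rho2 / c2 + alpha * v).mean()

p = torch.tensor([[10., 10., 50., 60.]]); t = torch.tensor([[12., 14., 48., 58.]])
print(ciou_loss(p, t))   # small positive value for well-overlapping boxes
```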
S23, some YOLO V4 hyper-parameters during training are set as follows: the Adam optimizer is selected and the initial learning rate is set to 1e-5; the number of training epochs is set to 50; the batch size is set to 16; a sketch of this setup follows below;
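A minimal sketch of this training setup is shown below; the model and data objects are placeholders standing in for the YOLO V4 network and the S11 data set, not the patent's code.

```python
# Sketch of the S23 setup: Adam, initial lr 1e-5, 50 epochs, batch size 16.
import torch
from torch.utils.data import DataLoader, TensorDataset

model = torch.nn.Conv2d(3, 27, 1)                      # placeholder for YOLO V4
dataset = TensorDataset(torch.randn(32, 3, 608, 608))  # placeholder for the S11 data
loader = DataLoader(dataset, batch_size=16, shuffle=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

for epoch in range(50):
    for (images,) in loader:
        optimizer.zero_grad()
        loss = model(images).mean()   # placeholder for CIoU + cross-entropy loss
        loss.backward()
        optimizer.step()
```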
S3, building the DeepSORT target tracking model, training the vehicle feature extraction network with the vehicle re-identification data, and completing the YOLO V4 + DeepSORT vehicle tracking model by taking the YOLO V4 detection boxes in each frame as input;
S31, taking the size and position information of the candidate boxes output by YOLO V4 as input, the three components of DeepSORT are built: (1) the Kalman filtering algorithm as position predictor, using the Kalman filter's constant-velocity motion and linear observation model; it divides into two stages:
(1.1) prediction stage: as the target moves, its speed and position in the current frame are predicted from its speed and position in the previous frame;
(1.2) update stage: from the predicted value and the observed value captured by the algorithm, the current system state is obtained by linearly weighting the two normal distributions; a simplified numerical sketch of this predict-update cycle follows below;
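The predict-update cycle of (1.1)/(1.2) can be illustrated with a reduced constant-velocity Kalman filter over the box center. DeepSORT's actual filter tracks a larger state (center, aspect ratio, height and their velocities), so this 4-dimensional version is a simplification, with noise covariances chosen arbitrarily.

```python
import numpy as np

dt = 1.0                                   # one video frame per step
F = np.array([[1, 0, dt, 0],               # state transition: x += vx*dt, y += vy*dt
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
H = np.array([[1, 0, 0, 0],                # we observe only the center position
              [0, 1, 0, 0]], dtype=float)
Q = np.eye(4) * 1e-2                       # process noise (assumed)
R = np.eye(2) * 1.0                        # measurement noise (assumed)

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    S = H @ P @ H.T + R                    # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain: weights prediction vs. observation
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P

x, P = np.zeros(4), np.eye(4)
x, P = predict(x, P)                                   # (1.1) prediction stage
x, P = update(x, P, np.array([320.0, 240.0]))          # (1.2) update with YOLO V4 box center
print(x[:2])                                           # fused center estimate
```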
(2) a small residual network is trained and tested as the feature extractor. The small residual network is trained on the ReID data set with cross entropy as the training loss function; the number of training epochs is set to 50, the Adam optimizer is selected, and the initial learning rate is set to 0.0001. After training, the vehicle picture in a YOLO V4 detection box is scaled to 112x112 pixels and fed in to obtain a 128-dimensional vector for the subsequent similarity calculation, as sketched below.
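A sketch of such an embedding network is given below; the exact layer counts and widths are assumptions, and only the 112x112 input and the normalized 128-dimensional output follow the text.

```python
# Sketch of the S31(2) appearance-feature extractor: a small residual network
# mapping a 112x112 vehicle crop to a unit-norm 128-dim embedding.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.c1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.c2 = nn.Conv2d(ch, ch, 3, padding=1)
    def forward(self, x):
        return F.relu(x + self.c2(F.relu(self.c1(x))))  # identity shortcut

class ReIDNet(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU())
        self.blocks = nn.Sequential(ResBlock(32), nn.MaxPool2d(2), ResBlock(32))
        self.head = nn.Linear(32 * 28 * 28, dim)        # 112 -> 56 -> 28 spatial size
    def forward(self, x):
        x = self.blocks(self.stem(x)).flatten(1)
        return F.normalize(self.head(x), dim=1)         # L2-normalized 128-dim vector

emb = ReIDNet()(torch.randn(1, 3, 112, 112))            # crop from a YOLO V4 box
print(emb.shape)                                        # torch.Size([1, 128])
```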
(3) the Hungarian algorithm serves as the feature matcher: after the cosine distance computes the similarity of the vectorized detection boxes, the Hungarian algorithm matches the vehicles in the detection boxes of consecutive frames; vehicles with high matching degree are identified as the same vehicle and assigned a unified ID number, as sketched below;
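The association step can be sketched with a cosine-distance cost matrix solved by scipy's Hungarian-algorithm implementation (linear_sum_assignment); the gating and matching-cascade logic of full DeepSORT is omitted here.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def cosine_cost(a, b):
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - a @ b.T                      # low cost = similar appearance

prev = np.random.rand(3, 128)                 # embeddings of tracked vehicles (previous frame)
curr = np.random.rand(4, 128)                 # embeddings of current detections
rows, cols = linear_sum_assignment(cosine_cost(prev, curr))
for r, c in zip(rows, cols):
    print(f"track {r} -> detection {c}")      # matched pairs keep the same vehicle ID
```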
S4, tracking vehicles with the YOLO V4 + DeepSORT model, extracting the running track of each vehicle, building a track record file and storing the running track information of each vehicle;
S41, the position information of each vehicle in each frame (the position of its detection box) is obtained, and each vehicle is assigned a unique identifier, as shown in FIG. 4;
S42, each vehicle is represented by the center of its detection box, and the track of the vehicle with the same ID is drawn over time, as shown in FIG. 5;
S43, a CSV file is created as the track record file, and the ID information and track information of all vehicles are written into the CSV file in real time, as sketched below;
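A minimal sketch of the track record file, assuming one CSV row per frame and vehicle holding the frame index, vehicle ID and detection-box center (the file name and values are illustrative):

```python
import csv

def append_track_rows(path, frame_idx, tracks):
    """tracks: iterable of (vehicle_id, (x1, y1, x2, y2)) for one frame."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for vid, (x1, y1, x2, y2) in tracks:
            cx, cy = (x1 + x2) / 2, (y1 + y2) / 2   # box center stands in for the vehicle
            writer.writerow([frame_idx, vid, cx, cy])

append_track_rows("tracks.csv", 17, [(3, (100, 200, 180, 260))])  # illustrative values
```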
S5, clustering the endpoint coordinates of the track data with the DBSCAN clustering algorithm and associating the resulting clusters with lane information;
S51, different tracks differ in length, and clustering the raw track data directly raises problems such as the difficulty of computing distances between tracks; therefore the track endpoint coordinates (in the image pixel coordinate system) are selected to stand in for the track data in track clustering;
S52, the track data in the CSV file is imported and the endpoint coordinates of all track data are obtained;
S53, the endpoint coordinates of the track data are clustered with the DBSCAN clustering algorithm, as sketched below;
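A sketch of the endpoint clustering with scikit-learn's DBSCAN; the eps and min_samples values, like the pixel coordinates, are illustrative and would be tuned to the image resolution.

```python
import numpy as np
from sklearn.cluster import DBSCAN

endpoints = np.array([[105, 410], [110, 405],      # endpoints near lane 1's stop area
                      [300, 412], [295, 418]])     # endpoints near lane 2's stop area
labels = DBSCAN(eps=30, min_samples=2).fit_predict(endpoints)
print(labels)   # same label = same cluster = candidate lane; -1 marks noise
```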
S54, the tracks whose endpoints fall into the same cluster are analyzed, and the center of gravity of the track start positions is connected with the center of gravity of the end positions, thereby identifying and dividing the lane corresponding to the cluster; if the start points of the tracks corresponding to a cluster show several clearly distinct groups, the center of gravity of each distinct start group is connected to the center of gravity of the endpoints, thereby associating the cluster with lane information;
S55, after clusters and lane information are associated, each newly generated complete track is assigned to the lane corresponding to the cluster to which its endpoint coordinates belong;
S6, according to the change rule of the track data (whether updating has stopped for a period of time) and the correspondence between tracks and clusters, the lane-dividing vehicle counting function is realized:
S61, if the track data of a vehicle is not updated within the next 10 frames of the video, the track data is stored into the CSV file;
S62, the newly added track data in the CSV file is assigned to the nearest cluster according to its endpoint coordinates;
S63, whenever an endpoint coordinate is newly added to a cluster, the traffic flow of the lane corresponding to that cluster is increased by the standard passenger-car-equivalent number of the vehicle that generated the track, realizing the lane-dividing vehicle counting function; a sketch of this counting rule follows below.
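The S61-S63 counting rule can be sketched as follows; the cluster centers, lane names and passenger-car-equivalent factors here are illustrative assumptions, not values given in the patent.

```python
import numpy as np

cluster_centers = {0: np.array([107.5, 407.5]), 1: np.array([297.5, 415.0])}
lane_of_cluster = {0: "lane 1", 1: "lane 2"}
pce = {"car": 1.0, "bus": 2.0, "truck": 2.5}     # assumed standard-vehicle equivalents
lane_flow = {"lane 1": 0.0, "lane 2": 0.0}

def count_track(endpoint, vehicle_type):
    """Assign a finished track's endpoint to the nearest cluster and add its PCE."""
    dists = {c: np.linalg.norm(endpoint - p) for c, p in cluster_centers.items()}
    lane = lane_of_cluster[min(dists, key=dists.get)]   # nearest cluster wins (S62)
    lane_flow[lane] += pce[vehicle_type]                # weighted lane count (S63)
    return lane

count_track(np.array([112.0, 401.0]), "bus")
print(lane_flow)   # {'lane 1': 2.0, 'lane 2': 0.0}
```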
FIG. 6 shows YOLO V4 detection effect diagrams under four different road/weather environments. The upper-left picture (heavy fog) shows that the model still detects vehicles well in a foggy environment; the upper-right picture (night) shows the detection effect at night, where the YOLO V4 model still guarantees vehicle detection accuracy under low-light conditions. The two lower pictures represent two common road environments, congested traffic and normal traffic; the model maintains detection accuracy in these road environments, and repeated detections, missed detections and the like rarely occur.
FIG. 7 shows that, on the basis of YOLO V4 vehicle detection, DeepSORT assigns a unique ID to each vehicle detection box, and the ID of the same vehicle does not change in the time dimension as the video plays, realizing vehicle tracking. FIG. 8 shows vehicle track extraction: on the basis of vehicle tracking, the center position of the vehicle detection box is selected to stand in for the detected vehicle, and the detection center positions are connected in the time dimension.
FIG. 9 (right) plots the endpoints of the completed (no longer updated) vehicle tracks; the endpoint distribution clearly exhibits small intra-class distances and large inter-class distances. Clustering the endpoint distribution with the DBSCAN clustering algorithm yields the cluster distribution of FIG. 10. The tracks whose endpoints fall into the same cluster are analyzed, and the center of gravity of the track start positions is connected with the center of gravity of the end positions to identify and divide the lane corresponding to the cluster; if the start points of the tracks corresponding to a cluster show several clearly distinct groups, the center of gravity of each distinct start group is connected to the center of gravity of the endpoints, giving the cluster-lane correspondence shown in FIG. 11. Finally, a track that is no longer updated is stored into the CSV file and assigned to the lane corresponding to the cluster into which its endpoint falls, and the equivalent standard-vehicle flow for the vehicle type that generated the track is added to that lane, realizing the automatic lane-dividing traffic counting function, as shown in FIG. 12.
In the description of the present specification, the descriptions of the terms "one embodiment," "example," "specific example," and the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, and that the above embodiments and descriptions are merely illustrative of the principles of the present invention, and various changes and modifications may be made without departing from the spirit and scope of the invention, which is defined in the appended claims.

Claims (6)

1. An automatic lane-dividing vehicle counting method based on YOLO V4 and DeepSORT, characterized by comprising the following steps:
S1, collecting a YOLO V4 training data set and a vehicle re-identification data set and performing data enhancement;
S2, building and training a YOLO V4 model with the PyTorch deep learning framework;
S3, building a DeepSORT target tracking model, training its vehicle feature extraction network with the vehicle re-identification data, and completing the YOLO V4 + DeepSORT vehicle tracking model by taking the YOLO V4 detection boxes in each frame as input;
S4, tracking vehicles with the YOLO V4 + DeepSORT model, extracting the running track of each vehicle, building a track record file and storing the running track information of each vehicle;
S5, clustering the endpoint coordinates of the track data with the DBSCAN clustering algorithm and associating the resulting clusters with lane information;
S6, realizing the lane-dividing vehicle counting function according to the change rule of the track data and the correspondence between tracks and clusters;
wherein the specific steps of S5 are:
S51, selecting the track endpoint coordinates for track clustering;
S52, importing the track data in the CSV file and obtaining the endpoint coordinates of all track data;
S53, clustering the endpoint coordinates of the track data with the DBSCAN clustering algorithm;
S54, analyzing the tracks whose endpoints fall into the same cluster and connecting the center of gravity of the track start positions with the center of gravity of the end positions, thereby identifying and dividing the lane corresponding to the cluster; if the start points of the tracks corresponding to a cluster show several clearly distinct groups, connecting the center of gravity of each distinct start group to the center of gravity of the endpoints, thereby associating the cluster with lane information;
S55, after clusters and lane information are associated, assigning each newly generated complete track to the lane corresponding to the cluster to which its endpoint coordinates belong.
2. The automatic lane-dividing vehicle counting method based on YOLO V4 and DeepSORT according to claim 1, wherein the data sets and data enhancement in S1 include:
S11, collection and enhancement of the YOLO V4 training data set: collecting the annotated pictures and annotation information of cars, trucks, buses and non-motor vehicles in the PASCAL VOC and COCO data sets; manually annotating the vehicle types and positions in 2000 surveillance video frames of different viewing angles; adopting random cropping, random flipping, random adjustment of the picture parameters saturation, hue and brightness, mosaic data enhancement and mixed-crop data enhancement; and uniformly scaling the picture data to 608x608 resolution;
S12, collection and enhancement of the vehicle re-identification data set: collecting the VeRi data set to train the vehicle feature extraction model in DeepSORT, with random cropping as the data enhancement.
3. The automatic lane-dividing vehicle counting method based on YOLO V4 and DeepSORT according to claim 1, wherein the process of building and training YOLO V4 in S2 is as follows:
S21, YOLO V4 consists of: 1. the feature extraction network CSPDarknet53; 2. the multi-scale feature fusion network PAN and spatial pyramid pooling SPP; 3. a head network similar to that of the YOLO V3 model for classification and detection-box regression; the CSPDarknet53 feature extraction network is built with the PyTorch deep learning framework, the three feature maps of different widths and heights output by the feature extraction network are fused through SPP+PAN, and the three fused feature maps each pass once through a 1x1 convolutional neural network to give the output of YOLO V4;
S22, on the basis of the YOLO V4 built in S21, setting the loss function from the network output and the real labels, and after the loss function is set, updating the YOLO V4 network parameters with the back-propagation algorithm;
S23, the YOLO V4 hyper-parameters during training are set as follows: the Adam optimizer is selected, the initial learning rate is set to 1e-5, the number of training epochs is set to 50, and the batch size is set to 16.
4. The automatic lane-dividing vehicle counting method based on YOLO V4 and DeepSORT according to claim 1, wherein the building and training of the DeepSORT target tracking model in S3 specifically includes:
S31, taking the size and position information of the candidate boxes output by YOLO V4 as input, and building the three components of DeepSORT:
(1) the Kalman filtering algorithm as position predictor, comprising two stages:
(1.1) prediction stage: as the target moves, its speed and position in the current frame are predicted from its speed and position in the previous frame;
(1.2) update stage: from the predicted value and the observed value captured by the algorithm, the current system state is obtained by linearly weighting the two normal distributions;
(2) a small residual network trained and tested as feature extractor: the small residual network is trained on the ReID data set with cross entropy as the training loss, the number of training epochs set to 50, the Adam optimizer selected and the initial learning rate set to 0.0001; after training, the vehicle picture in a YOLO V4 detection box is scaled to 112x112 pixels and fed in to obtain a 128-dimensional vector for the subsequent similarity calculation;
(3) after the cosine distance computes the similarity of the vectorized detection boxes, the Hungarian algorithm matches the vehicles in the detection boxes of consecutive frames; vehicles with high matching degree are identified as the same vehicle and assigned a unified ID number.
5. The automatic lane-dividing vehicle counting method based on YOLO V4 and DeepSORT according to claim 1, wherein the specific steps of S4 are:
S41, obtaining the position information of each vehicle in each frame, and assigning each vehicle a unique identifier;
S42, representing each vehicle by the center of its detection box, and drawing the track of the vehicle with the same ID over time;
S43, creating a CSV file as the track record file, and writing the ID information and track information of all vehicles into the CSV file in real time.
6. The automatic lane-dividing vehicle counting method based on YOLO V4 and DeepSORT according to claim 1, wherein the lane-dividing vehicle counting function of S6 specifically includes:
S61, if the track data of a vehicle is not updated within the next 10 frames of the video, storing the track data into the CSV file;
S62, assigning the newly added track data in the CSV file to the nearest cluster according to its endpoint coordinates;
S63, whenever an endpoint coordinate is newly added to a cluster, increasing the traffic flow of the lane corresponding to that cluster by the standard passenger-car-equivalent number of the vehicle that generated the track, realizing the lane-dividing vehicle counting function.
CN202010924261.8A 2020-09-04 2020-09-04 Automatic lane-dividing vehicle counting method based on YOLO V4 and DeepSORT Active CN112101433B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010924261.8A CN112101433B (en) 2020-09-04 2020-09-04 Automatic lane-dividing vehicle counting method based on YOLO V4 and DeepSORT


Publications (2)

Publication Number Publication Date
CN112101433A (en) 2020-12-18
CN112101433B (en) 2024-04-30

Family

ID=73757391

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010924261.8A Active CN112101433B (en) 2020-09-04 2020-09-04 Automatic lane-dividing vehicle counting method based on YOLO V4 and DeepSORT

Country Status (1)

Country Link
CN (1) CN112101433B (en)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112634332A (en) * 2020-12-21 2021-04-09 合肥讯图信息科技有限公司 Tracking method based on YOLOv4 model and DeepsORT model
CN112668432A (en) * 2020-12-22 2021-04-16 上海幻维数码创意科技股份有限公司 Human body detection tracking method in ground interactive projection system based on YoloV5 and Deepsort
CN112528963A (en) * 2021-01-09 2021-03-19 江苏拓邮信息智能技术研究院有限公司 Intelligent arithmetic question reading system based on MixNet-YOLOv3 and convolutional recurrent neural network CRNN
CN112836641A (en) * 2021-02-04 2021-05-25 浙江工业大学 Hand hygiene monitoring method based on machine vision
CN113052011A (en) * 2021-03-05 2021-06-29 浙江科技学院 Road target flow monitoring system based on computer vision
CN113160283B (en) * 2021-03-23 2024-04-16 河海大学 Target tracking method under multi-camera scene based on SIFT
CN113033449A (en) * 2021-04-02 2021-06-25 上海国际汽车城(集团)有限公司 Vehicle detection and marking method and system and electronic equipment
CN113139442A (en) * 2021-04-07 2021-07-20 青岛以萨数据技术有限公司 Image tracking method and device, storage medium and electronic equipment
CN113327416B (en) * 2021-04-14 2022-09-16 北京交通大学 Urban area traffic signal control method based on short-term traffic flow prediction
CN113112489B (en) * 2021-04-22 2022-11-15 池州学院 Insulator string-dropping fault detection method based on cascade detection model
CN113257003A (en) * 2021-05-12 2021-08-13 上海天壤智能科技有限公司 Traffic lane-level traffic flow counting system, method, device and medium thereof
CN113822844A (en) * 2021-05-21 2021-12-21 国电电力宁夏新能源开发有限公司 Unmanned aerial vehicle inspection defect detection method and device for blades of wind turbine generator system and storage medium
CN113256690B (en) * 2021-06-16 2021-09-17 中国人民解放军国防科技大学 Pedestrian multi-target tracking method based on video monitoring
CN113781521B (en) * 2021-07-12 2023-08-08 山东建筑大学 Bionic robot fish detection tracking method based on improved YOLO-deep
CN114187489B (en) * 2021-12-14 2024-04-30 中国平安财产保险股份有限公司 Method and device for detecting abnormal driving risk of vehicle, electronic equipment and storage medium
CN114399714A (en) * 2022-01-12 2022-04-26 福州大学 Vehicle-mounted camera video-based vehicle illegal parking detection method
CN114882393B (en) * 2022-03-29 2023-04-07 华南理工大学 Road reverse running and traffic accident event detection method based on target detection
CN114566052B (en) * 2022-04-27 2022-08-12 华南理工大学 Method for judging rotation of highway traffic flow monitoring equipment based on traffic flow direction
CN115620518B (en) * 2022-10-11 2023-10-13 东南大学 Intersection traffic conflict judging method based on deep learning
CN115620042B (en) * 2022-12-20 2023-03-10 菲特(天津)检测技术有限公司 Gear model determination method and system based on target detection and clustering
CN116758259A (en) * 2023-04-26 2023-09-15 中国公路工程咨询集团有限公司 Highway asset information identification method and system
CN116778224A (en) * 2023-05-09 2023-09-19 广州华南路桥实业有限公司 Vehicle tracking method based on video stream deep learning
CN116504068A (en) * 2023-06-26 2023-07-28 创辉达设计股份有限公司江苏分公司 Statistical method, device, computer equipment and storage medium for lane-level traffic flow
CN116863711B (en) * 2023-07-29 2024-03-29 广东省交通运输规划研究中心 Lane flow detection method, device, equipment and medium based on highway monitoring
CN117455957B (en) * 2023-12-25 2024-04-02 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) Vehicle track positioning and tracking method and system based on deep learning
CN117671972B (en) * 2024-02-01 2024-05-14 北京交通发展研究院 Vehicle speed detection method and device for slow traffic system
CN117994987B (en) * 2024-04-07 2024-06-11 东南大学 Traffic parameter extraction method and related device based on target detection technology


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109766902A (en) * 2017-11-09 2019-05-17 杭州海康威视系统技术有限公司 To the method, apparatus and equipment of the vehicle cluster in same region
CN109871763A (en) * 2019-01-16 2019-06-11 清华大学 A kind of specific objective tracking based on YOLO
CN110176139A (en) * 2019-02-21 2019-08-27 淮阴工学院 A kind of congestion in road identification method for visualizing based on DBSCAN+

Also Published As

Publication number Publication date
CN112101433A (en) 2020-12-18


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant