CN115170611A - Complex intersection vehicle driving track analysis method, system and application - Google Patents

Complex intersection vehicle driving track analysis method, system and application

Info

Publication number
CN115170611A
CN115170611A (application CN202210808478.1A)
Authority
CN
China
Prior art keywords
vehicle
model
image
target
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210808478.1A
Other languages
Chinese (zh)
Inventor
严忠贞
周辉
王相龙
朱信远
郭峰
丁静文
严赛男
陈豪
周可薇
刘春�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hubei University of Technology
Original Assignee
Hubei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2022-07-11
Publication date: 2022-10-11
Application filed by Hubei University of Technology
Priority to CN202210808478.1A
Publication of CN115170611A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 Traffic data processing
    • G08G1/0129 Traffic data processing for creating historical data or processing based on historical data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30241 Trajectory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08 Detecting or categorising vehicles

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of computer vision and discloses a method, a system and an application for analyzing vehicle driving trajectories at complex intersections. The method first applies transfer learning to YOLOv5 to realize vehicle perception at the complex intersection; it then establishes the origin of a reference coordinate system at the center point of a virtual coil, builds the reference coordinate system adaptively, and tracks the vehicle driving trajectory with DeepSORT; finally, taking the change of the same vehicle in pixel position as the classification basis, it analyzes the vehicle driving trajectory with a KNN algorithm. Training the YOLOv5 and DeepSORT models by transfer learning achieves accurate vehicle perception and target tracking in complex environments. Compared with traditional threshold-based judgment, realizing complex intersection vehicle perception and trajectory analysis with transfer learning and KNN achieves higher precision and robustness.

Description

Complex intersection vehicle driving track analysis method, system and application
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a method and a system for analyzing vehicle running tracks at complex intersections and application of the method and the system.
Background
At present, with the acceleration of urbanization in China in recent years, the urban population and the number of traffic vehicles keep increasing, and urban traffic in China faces serious challenges, such as frequent traffic accidents and congestion caused by the rapid growth of traffic flow on key roads. The analysis of vehicle driving trajectories and data statistics at complex intersections is therefore increasingly important, and the analysis results help the traffic department manage and schedule vehicle passage at each intersection. Traditional traffic flow trajectory analysis and data statistics mainly rely on geomagnetic sensors and RFID electronic tags. For example, granted patent CN108831163A, granted June 25, 2021 and entitled "a geomagnetism-based arterial road cooperative signal control method", discloses a method for detecting vehicle positions and trajectory information at multiple traffic intersections by deploying geomagnetic sensors and formulating an optimal cooperative signal control scheme, effectively alleviating road congestion. Likewise, granted patent CN109215350A, granted May 25, 2021 and entitled "a short-term traffic state prediction method based on RFID electronic license plate data", discloses a method for detecting vehicles by deploying an RFID system, calculating the traffic flow between road sections within a certain time interval, and obtaining the current traffic state. These traditional approaches are not only expensive to build and maintain but also complex to install, which is unfavorable for field deployment.
With the development of deep learning in the field of computer vision, the technology has been widely applied to traffic, mainly in image-based vehicle detection, vehicle tracking, license plate recognition and vehicle trajectory analysis. It is low in cost and offers high accuracy and real-time performance for identifying and tracking vehicle targets. Vehicle trajectory analysis is a higher-level visual task built on vehicle detection and tracking: it analyzes the change of the position coordinates of multiple moving vehicle targets across image frames over a continuous time interval, judges their motion accurately, and counts the traffic flow at each complex intersection. Most existing methods set a virtual coil and apply a threshold to the position change of a tracked target vehicle at the intersection to judge its movement direction; such methods cannot judge the movement direction accurately in complex traffic scenes.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a method, a system and application for analyzing vehicle running tracks at complex intersections.
The invention is realized in such a way that the method for analyzing the vehicle running track at the complex intersection comprises the following steps:
firstly, carrying out transfer learning on YOLOv5 to realize vehicle perception at the complex intersection;
then, establishing the origin of a reference coordinate system at the center point of the virtual coil, establishing the reference coordinate system adaptively, and tracking the vehicle driving trajectory with DeepSORT;
and finally, taking the change of the same vehicle in pixel position as the classification basis and analyzing the vehicle driving trajectory with the KNN algorithm.
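As an illustration of how these three stages fit together, the following is a minimal sketch of the pipeline, assuming YOLOv5 loaded from the public ultralytics hub and DeepSORT provided by the third-party deep-sort-realtime package; the video path and weights are placeholders, and the patent's own transfer-learned models would be substituted in practice.

```python
# Minimal pipeline sketch: YOLOv5 detection -> DeepSORT tracking -> per-track
# displacement (dx, dy), which is later classified by KNN. Assumptions: the
# ultralytics hub model and the deep-sort-realtime package stand in for the
# patent's transfer-learned detector and tracker.
import cv2
import torch
from deep_sort_realtime.deepsort_tracker import DeepSort

detector = torch.hub.load('ultralytics/yolov5', 'yolov5s')  # placeholder weights
tracker = DeepSort(max_age=30)
first_seen, last_seen = {}, {}  # track id -> box center (cx, cy)

cap = cv2.VideoCapture('intersection.mp4')  # placeholder video source
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # YOLOv5 output rows are (x1, y1, x2, y2, conf, cls); DeepSORT expects
    # ([left, top, w, h], confidence, class) per detection.
    dets = [([x1, y1, x2 - x1, y2 - y1], conf, int(cls))
            for x1, y1, x2, y2, conf, cls in detector(frame).xyxy[0].tolist()]
    for track in tracker.update_tracks(dets, frame=frame):
        if not track.is_confirmed():
            continue
        l, t, r, b = track.to_ltrb()
        center = ((l + r) / 2, (t + b) / 2)
        first_seen.setdefault(track.track_id, center)  # first appearance
        last_seen[track.track_id] = center             # latest position
cap.release()

# Each track contributes one displacement sample for the KNN classifier.
offsets = {tid: (last_seen[tid][0] - x0, last_seen[tid][1] - y0)
           for tid, (x0, y0) in first_seen.items()}
```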
Further, the complex intersection vehicle driving track analysis method specifically comprises the following steps:
step1, introducing a key-road traffic monitoring camera to collect several video image data sets over continuous time; manually labeling every vehicle in each image of the data set with a circumscribed rectangular frame, and marking the same target vehicle in the continuous video with the same ID, so as to construct the deep learning model training set;
preferably, the traffic vehicle image data set in step1 is:
{data_k(x, y), k ∈ [1, K], x ∈ [1, W], y ∈ [1, H]}
wherein data_k(x, y) represents the pixel information at column x, row y of the k-th frame image in the target vehicle image training set, K represents the number of images in the vehicle image training set, W is the image width, and H is the image height;
in step1, the vehicle circumscribed rectangular frame label data set is defined as:
{box_{k,n} = (c_x_{k,n}, c_y_{k,n}, w_{k,n}, h_{k,n}, id_{k,n}), n ∈ [1, N_k], k ∈ [1, K]}
wherein box_{k,n} is the labeling information of the n-th vehicle rectangular frame in the k-th frame image of the traffic vehicle image training set; c_x_{k,n} represents the center abscissa of that frame, c_y_{k,n} its center ordinate, w_{k,n} its width, h_{k,n} its height, and id_{k,n} the identity label of the n-th vehicle rectangular frame in the k-th frame image; N_k represents the number of vehicle targets, i.e. the number of vehicle circumscribed rectangular frames, in the k-th frame image of the traffic vehicle image training set;
step1, the classes in the traffic vehicle image training set cover the single category "vehicle":
{type_{k,n} = vehicle, n ∈ [1, N_k], k ∈ [1, K]}
the vehicle re-identification data set in step1 is defined as:
{Reid_{(c,i)}(x, y), c ∈ [1, C], i ∈ [1, I], x ∈ [1, W_1], y ∈ [1, H_1]}
wherein Reid_{(c,i)}(x, y) represents the pixel information at column x, row y of the i-th image of the vehicle belonging to category c in the re-identification training set, C represents the number of vehicle categories in the re-identification training set, I represents the number of images per category, W_1 is the image width in the re-identification training set, and H_1 is the image height in the re-identification training set;
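For illustration only, the three definitions above can be mirrored as plain records; the field names follow the patent's notation, while the concrete values and file paths below are hypothetical.

```python
# Illustrative containers mirroring the training-set definitions above.
# Nothing here prescribes an on-disk format; the values are made up.
from typing import NamedTuple

class VehicleBox(NamedTuple):
    c_x: float  # center abscissa of the n-th vehicle frame in image k
    c_y: float  # center ordinate
    w: float    # frame width
    h: float    # frame height
    id: int     # identity kept constant for the same vehicle across frames

# frame index k -> the N_k vehicle circumscribed rectangular frames of that image
labels: dict[int, list[VehicleBox]] = {
    1: [VehicleBox(0.52, 0.48, 0.10, 0.07, id=3)],
}

# re-identification set: vehicle category c -> its I cropped W_1 x H_1 images
reid_set: dict[int, list[str]] = {
    3: ['crops/vehicle3_001.jpg', 'crops/vehicle3_002.jpg'],
}
```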
step2, introducing the YOLOv5 target detection network, constructing the deep learning target detection network loss function model, inputting the traffic vehicle image training set and the vehicle circumscribed rectangular frame label data set of step1 into the YOLOv5 network model, and optimizing the model by training in a transfer learning manner to obtain the trained YOLOv5 network model;
preferably, the YOLOv5 deep learning network structure in step2 is as follows:
YOLOv5 is composed of three modules, namely a feature extraction backbone network layer, a feature fusion network layer and an output network layer;
each module of the YOLOv5 network model comprises a plurality of convolutional layers, and the parameters to be optimized are {θ_i^e, e ∈ [1, L_i]}; that is, θ_i^e denotes the optimization parameters of the e-th convolutional layer in the i-th module, and L_i is the number of convolutional layers in the i-th module;
the feature extraction backbone network is mainly used for extracting image semantic features and position features, and the layer inputs the training set images in the step1 and outputs feature maps with different scales. The feature fusion network mainly fuses high-dimensional features and low-dimensional features, detection accuracy can be effectively improved, and feature graphs of different scales are input on the layer. The output is a fixed size profile. And the output network layer is used for final prediction output, the input of the output network layer is the output characteristic diagram of the last module, and the output is the predicted vehicle target rectangular frame and the confidence coefficient in the image.
The output network layer predicts five items of target data for each vehicle target rectangular frame, namely the center abscissa, center ordinate, width, height and confidence, together with the target class information. For the n-th vehicle target frame predicted by the i-th grid of the k-th frame image, one obtains the predicted center abscissa c_x′_{k,n,i}, center ordinate c_y′_{k,n,i}, width w′_{k,n,i} and height h′_{k,n,i}; the target class information is the predicted class type′_{k,n,i}, with corresponding confidence conf′_{k,n,i}, of the i-th grid of the k-th frame image of the image training set, a prime denoting a predicted quantity.
In step2, the YOLOv5 deep learning network loss function model is constructed as follows:
the deep learning network loss function comprises: class loss, target bounding box loss and confidence loss;
the class loss is defined as:
LOSS(class) = -Σ_{i=0}^{G×G} Σ_{j=0}^{B} 1_{i,j}^{obj} [ γ · type_{k,n,i} · log(type′_{k,n,i}) + (1 - γ) · (1 - type_{k,n,i}) · log(1 - type′_{k,n,i}) ]

wherein G×G is the number of grids the image is divided into, B is the number of bounding frames predicted in each grid, i indexes the unit grid and j the anchor frame; 1_{i,j}^{obj} indicates that a target vehicle falls into the j-th prior frame of the i-th grid, and 1_{i,j}^{noobj} indicates that the j-th prior frame of the i-th grid contains no target vehicle; type_{k,n,i} represents the n-th object class of the i-th grid of the k-th image of the training set, and type′_{k,n,i} the n-th vehicle object class predicted by the i-th grid of the k-th frame image; λ_noobj represents the confidence penalty weight coefficient in the absence of an object. The number of target vehicles at a complex traffic intersection is large, and the number of positive samples is far larger than that of negative samples; this imbalance of positive and negative samples makes the model difficult to converge during training, so the balance factor γ is introduced on the basis of the binary cross-entropy function to optimize the training loss.
The target bounding box loss is defined as:
LOSS(box) = 1 - IOU + ρ²(b, b^gt) / c² + α · v

v = (4 / π²) · (arctan(w_{k,n,i} / h_{k,n,i}) - arctan(w′_{k,n,i} / h′_{k,n,i}))²

α = v / ((1 - IOU) + v)
wherein c_x′_{k,n,i} represents the predicted center abscissa of the n-th vehicle target frame of the i-th grid of the k-th frame image and c_y′_{k,n,i} the predicted center ordinate; c_x_{k,n,i} and c_y_{k,n,i} represent the center abscissa and ordinate of the n-th vehicle target frame of the i-th grid of the k-th image in the image training set; ρ represents the Euclidean distance between the two center points b and b^gt, and c represents the diagonal distance of the smallest closure area containing both the predicted and the real vehicle target frame; w′_{k,n,i} and h′_{k,n,i} respectively represent the predicted width and height of the n-th vehicle target frame of the i-th grid of the k-th frame image, w_{k,n,i} and h_{k,n,i} the width and height of the n-th vehicle target frame of the i-th grid of the k-th frame image in the image training set, and IOU is the ratio of the intersection to the union of the predicted frame and the real frame.
The confidence loss is defined as:
LOSS(conf) = -Σ_{i=0}^{G×G} Σ_{j=0}^{B} 1_{i,j}^{obj} [ conf_{k,n,i} · log(conf′_{k,n,i}) + (1 - conf_{k,n,i}) · log(1 - conf′_{k,n,i}) ] - λ_noobj · Σ_{i=0}^{G×G} Σ_{j=0}^{B} 1_{i,j}^{noobj} [ conf_{k,n,i} · log(conf′_{k,n,i}) + (1 - conf_{k,n,i}) · log(1 - conf′_{k,n,i}) ]

wherein conf′_{k,n,i} represents the confidence of the n-th vehicle object class predicted by the i-th grid of the k-th frame image of the model network, and conf_{k,n,i} represents the confidence of the n-th vehicle object class of the i-th grid of the k-th frame image in the image training set.
The YOLOv5 loss function is:
LOSS=LOSS(class)+LOSS(conf)+LOSS(box)
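Under the CIoU reading of the bounding-box term reconstructed above, LOSS(box) can be sketched in PyTorch as follows; the (cx, cy, w, h) box layout and the mean reduction are assumptions of this sketch.

```python
# Sketch of the target bounding-box loss: 1 - IOU + rho^2/c^2 + alpha*v,
# with boxes given as (cx, cy, w, h) tensors of shape (N, 4).
import math
import torch

def ciou_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    # corner coordinates of predicted and real frames
    p1, p2 = pred[:, :2] - pred[:, 2:] / 2, pred[:, :2] + pred[:, 2:] / 2
    t1, t2 = target[:, :2] - target[:, 2:] / 2, target[:, :2] + target[:, 2:] / 2
    inter = (torch.min(p2, t2) - torch.max(p1, t1)).clamp(min=0).prod(dim=1)
    union = pred[:, 2] * pred[:, 3] + target[:, 2] * target[:, 3] - inter + eps
    iou = inter / union
    rho2 = ((pred[:, :2] - target[:, :2]) ** 2).sum(dim=1)  # squared center distance
    c2 = ((torch.max(p2, t2) - torch.min(p1, t1)) ** 2).sum(dim=1) + eps  # closure diagonal^2
    v = (4 / math.pi ** 2) * (torch.atan(target[:, 2] / target[:, 3])
                              - torch.atan(pred[:, 2] / pred[:, 3])) ** 2
    alpha = v / ((1 - iou) + v + eps)
    return (1 - iou + rho2 / c2 + alpha * v).mean()

loss = ciou_loss(torch.rand(8, 4) + 0.1, torch.rand(8, 4) + 0.1)
```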
further, the training of the YOLOv5 model based on the transfer learning manner includes:
taking the deep learning model image training set in the step1 as input data, loading pre-training weights by using a YOLOv5 model, modifying classification categories of the pre-training weights, and performing transfer training on the network by using the constructed data set;
further, freezing the YOLOv5 feature extraction backbone network layer L_1 and the feature fusion network layer L_2; since it is only necessary to identify whether a target is a vehicle, the number of classification categories in the output network layer L_3 is modified to 1, and the SGD optimizer is used to update the output network layer L_3 parameters {θ_3^e, e ∈ [1, L_3]}.
Setting the training epoch of the data set to 50, the batch size to 8 and the initial learning rate a to 1e-5, the learning rate is adjusted with the cosine annealing formula:
η_t = η_min^i + (1/2) · (η_max^i - η_min^i) · (1 + cos((T_cur / T_i) · π))

wherein η_max^i and η_min^i represent the range of variation of the learning rate, T_cur represents how many epochs have passed since the last restart, and T_i represents how many epochs in total need to be trained for the i-th restart. Through this training, the optimized parameters θ_i^e, e ∈ [1, L_i], of the e-th convolutional layer in the i-th module are obtained, where L_i is the number of convolutional layers in the i-th module;
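A sketch of this transfer-training recipe in PyTorch is shown below; identifying the detection head by the substring 'model.24' is an assumption about the YOLOv5 module layout, and the data loop is elided.

```python
# Freeze backbone and fusion layers, retrain only the output head with SGD
# at 1e-5 and cosine annealing, as described above.
import torch

model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
for name, p in model.named_parameters():
    p.requires_grad = 'model.24' in name  # assumed name of the Detect head

optimizer = torch.optim.SGD((p for p in model.parameters() if p.requires_grad), lr=1e-5)
# T_max plays the role of T_i in the annealing formula; eta_min is the floor.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50, eta_min=1e-7)

for epoch in range(50):      # epoch = 50, batch size 8 as specified above
    # ... forward/backward passes over the 8-image batches go here ...
    optimizer.step()
    scheduler.step()
```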
step3, introducing the DeepSORT target tracking network model, constructing the deep learning target tracking network loss function model, and inputting the vehicle re-identification data set of step1 into the DeepSORT network model to optimize and train it, obtaining the trained DeepSORT network model;
preferably, the deep learning network structure of DeepSORT in step3 is:
the DeepSORT comprises a target vehicle image feature extraction module, a target vehicle position prediction module and a target vehicle feature matching module;
further, the vehicle feature extraction module consists of a small ResNet network comprising several convolutional layers, the parameters to be optimized of each being defined as η_f; that is, η_f is the optimization parameter of the f-th convolutional layer, f ∈ [1, L], where L is the number of convolutional layers. Its input is the image data set of step1 and its output is the 128-dimensional feature vector X = (x_1, x_2, …, x_128). This layer mainly extracts the image features corresponding to a vehicle target rectangular frame for the subsequent similarity calculation used in target matching and next-frame position prediction. The target vehicle position prediction module mainly adopts the Kalman filtering algorithm to predict, within the current frame, the position coordinates of the target vehicle rectangular frame in the next frame. The target vehicle feature matching module adopts the Hungarian algorithm to match the rectangular vehicle region of the current frame with the predicted next-frame rectangular frame coordinates, and allocates a unique id number for target tracking.
Further, the ResNet feature extraction network consists of several convolutional layers, pooling layers and BN layers. Several convolutional layers are stacked to form a residual module, whose residual structure effectively alleviates the vanishing and exploding gradient problems during back propagation. The network output layer uses an average pooling layer instead of a fully connected layer, which removes the constraint of a fixed input image size. The final output is a 128-dimensional feature vector.
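A minimal embedding network in the spirit of this description might look as follows; only the residual structure, the average-pooling output layer and the 128-dimensional feature are taken from the text, while the channel widths and block count are assumptions.

```python
# Small residual embedding network: conv + BN residual blocks, global average
# pooling instead of a fully connected layer, 128-d normalized output.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))

    def forward(self, x):
        return torch.relu(x + self.body(x))  # identity shortcut eases gradients

class ReidNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(3, 64, 3, stride=2, padding=1),
                                  nn.BatchNorm2d(64), nn.ReLU())
        self.blocks = nn.Sequential(ResidualBlock(64), ResidualBlock(64))
        self.head = nn.Conv2d(64, 128, 1)    # project to the 128-d feature space
        self.pool = nn.AdaptiveAvgPool2d(1)  # average pooling: any input size works

    def forward(self, x):
        x = self.pool(self.head(self.blocks(self.stem(x))))
        return nn.functional.normalize(x.flatten(1), dim=1)  # unit-norm 128-d vector

feat = ReidNet()(torch.randn(1, 3, 128, 64))  # e.g. one cropped vehicle patch
```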
Further, a ResNet feature extraction network is defined, and the loss function of the ResNet feature extraction network is as follows:
LOSS(feature) = Σ_{i=1}^{128} f(x_i - x_i^gt)

f(z) = z²

wherein x_i represents the i-th feature value extracted from the predicted target frame and x_i^gt the i-th feature value extracted from the real target frame; their difference is mapped through the function f, here taken as the squared error, to obtain the loss value, and the parameters are updated by a back-propagation-based algorithm.
Further, ResNet is combined to extract the appearance features of vehicle targets, the motion features are calculated by the Kalman filtering algorithm to obtain a cost matrix, and the Hungarian algorithm assigns ids to the target vehicles, realizing the target vehicle tracking function.
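A sketch of this matching stage is given below: an appearance cost (cosine distance of the 128-dimensional features) is mixed with a motion cost (distance to the Kalman-predicted center) and the assignment is solved with the Hungarian algorithm via SciPy; the equal 0.5/0.5 weighting is an assumption.

```python
# Build a cost matrix from appearance and motion terms and assign detections
# to tracks with the Hungarian algorithm.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match(track_feats, det_feats, track_pred_xy, det_xy, img_diag):
    app = 1.0 - track_feats @ det_feats.T  # 1 - cosine similarity (unit vectors)
    mot = np.linalg.norm(track_pred_xy[:, None, :] - det_xy[None, :, :],
                         axis=2) / img_diag  # scaled center distance
    cost = 0.5 * app + 0.5 * mot            # assumed equal weighting
    rows, cols = linear_sum_assignment(cost)
    return list(zip(rows, cols))            # (track index, detection index)

tracks = np.random.randn(3, 128); tracks /= np.linalg.norm(tracks, axis=1, keepdims=True)
dets = np.random.randn(4, 128); dets /= np.linalg.norm(dets, axis=1, keepdims=True)
pairs = match(tracks, dets, np.random.rand(3, 2), np.random.rand(4, 2), img_diag=1.0)
```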
Further, training the DeepSORT target tracking model comprises: setting the training epoch to 100 rounds, selecting the SGD optimizer, and setting the initial learning rate to 1e-5; after training, the optimized parameters η_f, f ∈ [1, L], of the f-th convolutional layer are obtained, where L is the number of convolutional layers;
and step4: acquiring initial images in real time through the road monitoring camera and transmitting them to the computing host; the YOLO target detection model detects the target vehicles in each frame image, the output target vehicle position rectangular frames serve as the input of the DeepSORT target tracking model to predict the trajectory of each vehicle in the next frame, and IDs are allocated to the vehicles for real-time tracking;
preferably, the video image data captured in step4 is defined as:

{data_k(x, y), k ∈ [1, K′], x ∈ [1, W], y ∈ [1, H]}

wherein data_k(x, y) represents the pixel information at column x, row y of the k-th frame image acquired by the camera in real time, K′ represents the number of frames acquired, W is the image width, and H is the image height;
further, the trained YOLOv5 model identifies, for each frame image of the input test continuous video, the rectangular frames:

{(type_{k,n}, c_x_{k,n}, c_y_{k,n}, w_{k,n}, h_{k,n}, conf_{k,n}), n ∈ [1, N_k]}

wherein type_{k,n} is the vehicle class label in the n-th vehicle circumscribed rectangular frame of the k-th frame image output by the YOLOv5 model, equal to 1; c_x_{k,n} and c_y_{k,n} represent the center abscissa and ordinate of the n-th vehicle rectangular frame in the k-th frame image output by the YOLOv5 model, w_{k,n} its width, h_{k,n} its height and conf_{k,n} its confidence; N_k represents the number of vehicle targets, i.e. vehicle circumscribed rectangular frames, in the k-th frame image output by the YOLOv5 model;
further, the circumscribed rectangular frames of the target vehicles predicted by YOLOv5 serve as the DeepSORT input, and the DeepSORT output is defined as:

{(type_{k+1,n}, c_x_{k+1,n}, c_y_{k+1,n}, w_{k+1,n}, h_{k+1,n}, conf_{k+1,n}, id_{k+1,n}), n ∈ [1, N_{k+1}]}

wherein type_{k+1,n} is the vehicle class label in the n-th vehicle circumscribed rectangular frame of the (k+1)-th frame image output by the DeepSORT model, equal to 1; c_x_{k+1,n} and c_y_{k+1,n} represent the center abscissa and ordinate of the n-th vehicle rectangular frame in the (k+1)-th frame image output by the DeepSORT model, w_{k+1,n} its width, h_{k+1,n} its height, conf_{k+1,n} its confidence, and id_{k+1,n} its id identifier; N_{k+1} represents the number of vehicle targets, i.e. vehicle circumscribed rectangular frames, in the (k+1)-th frame image output by the DeepSORT model;
and step5: the target vehicle circumscribed rectangles predicted by the YOLOv5 model are used as the input of the DeepSORT algorithm to track the same target vehicle; taking the coordinate variation of the center point of the same target vehicle's circumscribed rectangle across video frames as data set samples, the driving direction of the same target vehicle is manually marked to construct the vehicle trajectory direction prediction data set, and the KNN algorithm calculates the coordinate transformation of the target vehicle across image frames of different time sequences to judge the target vehicle trajectory.
Preferably, the vehicle coordinate transformation data set in step5 is used by the KNN algorithm to judge vehicle trajectories: the target tracking model DeepSORT tracks different vehicles in the video stream to obtain their coordinate transformations (dx, dy) within time t, the coordinates taking the virtual coil as the coordinate reference system; straight or turning trajectories are marked manually, and the coordinate transformations of 1000 target vehicles are selected as samples for testing the KNN algorithm.
Further, defining the data set of the direction of the artificial labeling track as follows:
{data(dx_i, dy_i, dire_i), dx ∈ [0, 1], dy ∈ [0, 1], dire ∈ {0, 1}}
wherein data(dx_i, dy_i, dire_i) represents the variation of the i-th tracked target vehicle in the x-axis and y-axis directions within time t, marked as straight or turning, 0 representing straight travel and 1 representing turning.
Further, by identifying and tracking the vehicles in the video images, a unique ID number is allocated to each target vehicle, and the transformation quantity (dx, dy) of its coordinates is calculated as the difference between the initial coordinates where it appears in the video frames and the final coordinates where it disappears. The data are marked manually by observation as straight or turning and used as the sample information for prediction and judgment by the KNN algorithm.
Further, the marked trajectory direction data set is divided into a training set and a validation set at a ratio of 7:3, and the value of k giving the best vehicle target trajectory classification accuracy is obtained through extensive cross-validation training. After the binary classification of the vehicle target trajectories, turning trajectories are classified further: whether the vehicle travels left or right is determined directly from the sign of its coordinates.
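As a sketch of this two-stage decision, the snippet below fits a cosine-metric KNN on the labeled offsets of Table 1 (reproduced later in the description) and then resolves turns by the sign of dx; the convention that dx < 0 means a left turn is an assumption about the coordinate frame.

```python
# Stage 1: KNN separates straight from turning on (dx, dy) offsets.
# Stage 2: the sign of dx distinguishes left from right turns.
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X = [(-0.33, 0.30), (0.07, -0.44), (-0.27, -0.18),            # straight samples
     (0.28, 0.03), (0.50, 0.06), (0.51, 0.05), (0.09, 0.02)]  # turning samples
y = [0, 0, 0, 1, 1, 1, 1]                                     # 0 = straight, 1 = turn

X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)
knn = KNeighborsClassifier(n_neighbors=3, metric='cosine').fit(X_tr, y_tr)

def classify(dx: float, dy: float) -> str:
    if knn.predict([(dx, dy)])[0] == 0:
        return 'straight'
    return 'left turn' if dx < 0 else 'right turn'  # assumed sign convention

print(classify(0.5, 0.06))
```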
Another object of the present invention is to provide a complex intersection vehicle driving trajectory analysis system.
Another object of the present invention is to provide a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to execute the complex intersection vehicle travel track analysis method.
Another object of the present invention is to provide an information data processing terminal for implementing the complex intersection vehicle driving trajectory analysis method.
By combining all the above technical schemes, the invention has the following advantages and positive effects:
the target vehicle track analysis and judgment method based on the KNN is applied to vehicle track classification, is more accurate than the traditional method and is more convenient to implement.
The method trains the YOLOv5 and DeepSORT models in a transfer learning manner, realizing accurate perception and target tracking of vehicles in complex environments; taking the change of a vehicle's image pixel position as the classification basis, the driving trajectory of the vehicle is analyzed with the KNN algorithm. Compared with the traditional method of judging the vehicle trajectory by a threshold, realizing complex intersection vehicle perception and trajectory analysis with transfer learning and KNN achieves higher precision and robustness.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the embodiments are briefly described below. Obviously, the drawings described below are only some embodiments of the present application; for those skilled in the art, other drawings can be obtained from them without creative effort.
FIG. 1 is a flow chart of a method for analyzing a vehicle driving track at a complex intersection according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a method for analyzing a vehicle driving track at a complex intersection according to an embodiment of the present invention;
FIG. 3 is a diagram of the effect of training the YOLOv5 model provided by the embodiment of the present invention;
FIG. 4 is a graph of the effect of training the feature extraction network of the DeepSORT model according to the embodiment of the present invention;
FIG. 5 is a diagram of the detection effect of the tested YOLOv5 model identifying vehicles, provided by the embodiment of the invention;
FIG. 6 is a diagram of the effect of the tested YOLOv5 and DeepSORT models tracking target vehicles in different trajectory states across multi-frame images according to the embodiment of the present invention (columns from left to right: left turn, straight, right turn; rows from top to bottom: time axis);
FIG. 7 shows the distribution of the training data set used to train the KNN algorithm to classify vehicle trajectory states according to the embodiment of the present invention;
FIG. 8 shows the classification decisions of the trained KNN algorithm on the vehicle trajectories of the test set according to the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The problems and defects of the prior art are as follows: traditional approaches based on geomagnetic sensors and the like are difficult to deploy, costly, and consume huge manpower and material resources, and the sensors are troublesome to install and remove; under specific conditions they also cannot reach high precision. Meanwhile, the downstream model frameworks in the deep learning vision processing field have matured, and further validation and practical application in related fields are needed.
The significance of solving these problems is as follows: deep learning vision processing has huge application prospects in related fields, and the technology can accurately acquire vehicle trajectory direction data in real time and then further analyze and process the data.
Aiming at the problems in the prior art, the invention provides a method, a system and an application for analyzing a vehicle running track at a complex intersection, and the invention is described in detail with reference to the accompanying drawings.
As shown in fig. 1, the method for analyzing a vehicle driving track at a complex intersection according to an embodiment of the present invention includes:
s101, carrying out transfer learning on YOLOv5 to realize vehicle perception at a complex intersection;
s102, establishing an origin of a reference coordinate system by using the central point of the virtual coil, establishing the reference coordinate system in a self-adaptive mode, and tracking a vehicle running track by using DeepsORT;
and S103, carrying out vehicle running track analysis by using the KNN algorithm with the change of the same vehicle in the pixel position as a classification basis.
In a preferred embodiment of the present invention, fig. 2 is a schematic diagram illustrating a method for analyzing a vehicle driving track at a complex intersection according to an embodiment of the present invention, where the method includes:
and Step1, shooting the traffic road image at the current time point by the intersection camera.
And Step2, based on the transfer-learning-trained YOLOv5 model, taking the image shot at the current time as input, detecting the position information of the target vehicles, and outputting the cut-out images of the recognized vehicles together with their position coordinate information.
Step3, using the transfer-learning-trained DeepSORT model, taking the vehicle positions recognized in the previous step and the cut vehicle images as input, and tracking multiple vehicles; the position of each vehicle in the image is marked with a circumscribed rectangular frame, and a unique ID number is assigned to the same target vehicle.
Step4: establishing a rectangular coordinate system with the image center point as the coordinate origin and the center point of the circumscribed rectangular frame of the tracked vehicle as the reference coordinate; acquiring the coordinates (x_0, y_0) where the tracked target vehicle appears in the video frames and the last position coordinates (x_t, y_t) where the target vehicle disappears from the video frames, and calculating the trajectory offset (dx, dy).
Step5: using the vehicle offsets (dx, dy) collected in the previous step, part of the data is marked manually in advance as straight or turning to serve as reference samples; the KNN algorithm calculates the distances between a trajectory and these samples and classifies whether the vehicle is going straight or turning.
Step6: if it is a turning trajectory, judging whether it is a left or right turn from the sign of the x coordinate of the disappearance point.
Embodiment 1. Target vehicles are identified based on transfer-learning-trained YOLOv5.
The method comprises the following steps: 1) training the YOLOv5 model in a transfer learning manner; 2) traffic intersection recognition.
Specifically, the method comprises the following steps:
(1) Training YOLOv5 model by adopting transfer learning mode
This part mainly identifies vehicles at the complex traffic intersection for subsequent processing; the accuracy of the target vehicle is very important for subsequent tasks, so the YOLOv5 model is trained to position target vehicles accurately and in real time.
In the YOLOv5 model, after a series of operations such as feature extraction and feature fusion, three tensors of different scales are finally output, each representing a different receptive field; that is, prediction is completed on feature maps of different scales, which helps improve the accuracy of small target detection. The model ultimately performs both regression and classification of the target vehicle. Since the target detection part of the invention only needs to detect whether a target is a vehicle, a binary classification task (distinguishing background from target vehicle) is performed on the target recommendation regions (anchors) generated in YOLOv5. The activation function of the network classification therefore needs to be modified, adopting the sigmoid activation function:
σ(z) = 1 / (1 + e^(-z))
where z is the output value of the current neuron node; after activation by the sigmoid function, the value follows a probability distribution between 0 and 1, and the prediction is judged a target if the probability value exceeds 0.5, otherwise background. The classification loss function is also redesigned: since target vehicles occupy many regions under complex traffic conditions and the positive and negative samples are unbalanced, the loss function is designed as an improved binary cross-entropy loss:
LOSS = -[ a · y^gt · log(y) + (1 - a) · (1 - y^gt) · log(1 - y) ]
where y is the predicted probability and y^gt is the label; when y^gt is 0, no target vehicle exists in the target suggestion area, otherwise a target vehicle exists; a is a hyperparameter balancing positive and negative samples. Because each layer in the YOLOv5 model is elaborately designed and inconvenient to modify directly, only the activation function and classification loss function of the YOLOv5 model are modified, the additional hyperparameters are adjusted to a certain extent, and the network is then trained by transfer learning: a pre-training file is loaded in the transfer learning manner, the hyperparameters are adjusted and set, and the model is trained on self-shot pictures of the complex traffic intersection. Specifically, the feature extraction and feature fusion regions of the network model are frozen and only the classifier is trained, which achieves a good result within a very short time. With this, the whole YOLOv5 model training part is completed.
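The improved binary cross-entropy above can be sketched in PyTorch as follows; the value a = 0.25 is illustrative, as the patent does not fix it.

```python
# Balanced binary cross-entropy: the hyperparameter a reweights positive
# (target-vehicle) against negative (background) anchors.
import torch

def balanced_bce(y_pred: torch.Tensor, y_true: torch.Tensor, a: float = 0.25) -> torch.Tensor:
    y_pred = y_pred.clamp(1e-7, 1 - 1e-7)  # numerical safety for log()
    pos = a * y_true * torch.log(y_pred)
    neg = (1 - a) * (1 - y_true) * torch.log(1 - y_pred)
    return -(pos + neg).mean()

loss = balanced_bce(torch.sigmoid(torch.randn(8)), torch.randint(0, 2, (8,)).float())
```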
(2) A camera is deployed at the detected intersection so that it covers the whole intersection for monitoring, and the obtained monitoring video frames are sampled by a frame-differencing method, for example extracting one video image every 3 frames; the frame extraction keeps the video stream processing speed as close to real time as possible. The sampled image is input into the YOLOv5 network for prediction, yielding as output the center coordinate values x and y and the width w and height h of each vehicle target in the picture. After these data are obtained, the next target tracking step is performed.
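A sketch of this frame-sampling step with OpenCV, assuming a stride of 3 and a placeholder stream address:

```python
# Keep one frame in every three so the stream stays close to real time;
# kept frames are the ones forwarded to the YOLOv5 detector.
import cv2

cap = cv2.VideoCapture('rtsp://camera/intersection')  # placeholder source
kept, idx = [], 0
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    if idx % 3 == 0:        # sample every 3rd frame
        kept.append(frame)
    idx += 1
cap.release()
```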
Embodiment 2. Vehicles are tracked in real time based on DeepSORT target tracking.
The method comprises the following steps: 1) training the DeepSORT model based on transfer learning; 2) obtaining target vehicle trajectories.
Specifically, the method comprises the following steps:
(1) The DeepSORT model is trained with a small amount of data from the constructed re-identification data set through the transfer learning method to achieve the desired effect.
(2) DeepSORT takes the position coordinates of the multiple vehicles output by the target detector YOLOv5 as input, and outputs the trajectories of the target vehicles, the same id being assigned to the same target vehicle. On the basis of tracking the target vehicle, the coordinate position where it appears in the first frame image of the video picture is saved; with the artificial virtual coil as the origin of the coordinate axes, the coordinates (x_0, y_0) are obtained, together with the final position coordinates (x_t, y_t) where the target vehicle disappears from the video frames; the change of coordinates (dx, dy) is calculated, and this set of coordinates is stored as a sample for classification by the KNN algorithm.
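The sample-collection step can be sketched as follows; the virtual-coil center and image size below are illustrative values.

```python
# Re-express pixel coordinates with the virtual coil center as origin and
# record one (dx, dy) displacement per finished track as a KNN sample.
coil_cx, coil_cy = 960, 540  # assumed virtual-coil center in pixels

def to_coil_frame(cx, cy, width=1920, height=1080):
    # shift to the coil origin and scale by the image size
    return (cx - coil_cx) / width, (cy - coil_cy) / height

samples = []

def on_track_finished(first_xy, last_xy):
    x0, y0 = to_coil_frame(*first_xy)
    xt, yt = to_coil_frame(*last_xy)
    samples.append((xt - x0, yt - y0))  # (dx, dy) for KNN classification

on_track_finished((900, 1000), (1500, 520))
```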
Embodiment 3. Training and target vehicle direction judgment based on the KNN algorithm: the coordinate transformation quantities dx and dy of multiple target vehicles between their start frames and end frames in the video are acquired as samples through the target vehicle identification and tracking of the previous steps, and KNN classification is performed on them; the KNN distance adopts the cosine distance of formula (3):
d = (dx_0 · dx_i + dy_0 · dy_i) / (√(dx_0² + dy_0²) · √(dx_i² + dy_i²))

wherein (dx_0, dy_0) is the coordinate transformation quantity of the data to be classified, (dx_i, dy_i) is the coordinate transformation of the i-th sample, and there are n data samples in total;
In the training stage, cross-validation is adopted to obtain the most appropriate k value; in the test classification stage, after the binary classification, a turning trajectory is classified further, and whether the vehicle travels left or right is determined directly from the sign of its coordinates.
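The cross-validated choice of k can be sketched with scikit-learn as follows; the 1000 offsets and labels below are synthetic placeholders standing in for the manually labeled data set.

```python
# Grid-search the neighbor count under the cosine metric and keep the most
# accurate setting, mirroring the cross-test training described above.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.uniform(-0.5, 0.5, size=(1000, 2))           # placeholder (dx, dy)
y = (np.abs(X[:, 0]) > np.abs(X[:, 1])).astype(int)  # placeholder labels

search = GridSearchCV(KNeighborsClassifier(metric='cosine'),
                      {'n_neighbors': list(range(1, 31, 2))}, cv=5)
search.fit(X, y)
print(search.best_params_['n_neighbors'], search.best_score_)
```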
Referring to FIG. 6, with the image center as the origin of the coordinate axes, the initial coordinates (x_1, y_1) where each of several tracked target vehicles (with unique id identifiers) appears in the video frames and the final coordinates (x_2, y_2) where it disappears are obtained, and the offset (dx, dy) is calculated. The coordinates are scaled to between 0 and 1 according to the width and height of the video, and the target is marked as turning or going straight. The coordinates when the target vehicle appears, the coordinates when it disappears and its trajectory direction are collected, and the coordinate offset of the target vehicle is calculated as the KNN classification samples shown in Table 1;
TABLE 1

No. | Initial coordinates | Final coordinates | Offset coordinates | Direction
1 | (-0.06, 0.08) | (-0.39, 0.27) | (-0.33, 0.3) | Straight
2 | (-0.06, 0.24) | (0.02, -0.28) | (0.07, -0.44) | Straight
3 | (-0.05, -0.31) | (-0.42, 0.11) | (-0.27, -0.18) | Straight
4 | (0.31, -0.27) | (0.25, -0.25) | (0.28, 0.03) | Turn
5 | (-0.08, -0.28) | (0.42, -0.22) | (0.5, 0.06) | Turn
6 | (-0.05, -0.29) | (0.46, -0.24) | (0.51, 0.05) | Turn
7 | (-0.04, -0.3) | (0.05, -0.28) | (0.09, 0.02) | Turn
FIG. 7 shows the distribution of the collected sample points.
It should be noted that the embodiments of the present invention can be realized by hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portion may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor, or by specially designed hardware. Those skilled in the art will appreciate that the apparatus and methods described above may be implemented using computer-executable instructions and/or embodied in processor control code, such code being provided on a carrier medium such as a disk, CD- or DVD-ROM, programmable memory such as read-only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The apparatus and its modules of the present invention may be implemented by hardware circuits such as very-large-scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, programmable hardware devices such as field-programmable gate arrays and programmable logic devices, by software executed by various types of processors, or by a combination of hardware circuits and software, e.g. firmware.
The above description is only for the purpose of illustrating specific embodiments of the present invention and is not intended to limit its protection scope; all modifications, equivalents and improvements within the spirit and scope of the invention as defined by the appended claims are intended to be covered.

Claims (10)

1. A method for analyzing vehicle running tracks at complex intersections, characterized by comprising the following steps:
carrying out complex intersection vehicle perception by using a YOLOv5 transfer learning method;
constructing an origin of a reference coordinate system by using the central point of the virtual coil, constructing the reference coordinate system in a self-adaptive mode, and tracking the vehicle running track by using DeepSORT;
and analyzing the vehicle running track by taking the change of the same vehicle in the pixel position as a classification basis.
2. The complex intersection vehicle travel track analysis method of claim 1, wherein the complex intersection vehicle travel track analysis method specifically comprises:
step one, introducing a traffic vehicle image data set acquired from continuous video frames; manually marking all vehicles in each image of the data set to obtain a vehicle circumscribed rectangular frame label data set, cutting out the same target vehicle in different frame images and marking the same target to obtain a vehicle re-identification data set, thereby constructing a deep learning model training set;
step two, introducing the YOLOv5 target detection model network, constructing its model loss function, inputting the traffic vehicle image data set and the vehicle circumscribed rectangular frame label data set of the deep learning model training set into YOLOv5, and training in a transfer learning manner to obtain the trained YOLOv5 network model;
step three, introducing the DeepSORT target tracking model network, constructing its model loss function, inputting the re-identification data set of the deep learning model training set into DeepSORT for training in a transfer learning manner, and obtaining the trained DeepSORT network model;
step four, acquiring initial images in real time through a road monitoring camera and transmitting them to the computing host; the rectangular frames of the target vehicles detected by the YOLO target detection model serve as the DeepSORT input, and the model output is allocated IDs for real-time tracking;
and step five, classifying the trajectory direction with the KNN algorithm based on the coordinate variation of the same target vehicle in different image frames, and judging the target vehicle trajectory by calculating with the KNN algorithm the coordinate transformation of the target vehicle in image frames of different time sequences.
3. The complex intersection vehicle driving trajectory analysis method of claim 1, wherein in step one, the deep learning model training set comprises a traffic vehicle image dataset, a vehicle bounding rectangle label dataset, and a vehicle re-identification dataset;
the traffic vehicle image data set is derived from various types of vehicle information of a traffic intersection shot in different scenes, and is marked by adopting a YOLO format to construct a vehicle external rectangular frame label data set;
the vehicle re-identification data set is obtained by cutting out the vehicle targets in multiple frame images of the shot traffic intersection video and manually marking them so as to classify the same target vehicle.
4. The complex intersection vehicle travel track analysis method of claim 1, characterized in that in step one, the target vehicle data set is divided into a training set and a validation set at a ratio of 7:3; image enhancement techniques such as random cropping, flipping and erasing are applied to the images to expand the image data set so that the generalization ability of the model is stronger, and the same image enhancement processing is adopted to expand the vehicle re-identification data set.
5. The complex intersection vehicle driving trajectory analysis method of claim 1, characterized in that in step two, the training of YOLOv5 model based on the transfer learning manner comprises:
loading pre-training weights by using a YOLOv5 model, and modifying the number of classification categories; carrying out migration training on the network by the constructed data set; by freezing the feature extraction backbone network and the feature fusion network part, only the final classification network needs to be trained.
6. The complex intersection vehicle driving trajectory analysis method of claim 1, wherein in step three, said constructing and training a DeepsORT model comprises:
introducing the ResNet feature extraction network in the DeepSORT model, loading pre-training weights and removing the final fully connected layer of the network, and carrying out migration training of the ResNet network with the constructed data set; only the last layer is trained, by freezing the preceding convolutional layers, and a smaller learning rate is set for training the last layer;
and taking the trained ResNet network as a feature extraction network of the DeepsORT model.
7. The complex intersection vehicle driving trajectory analysis method of claim 1, wherein in step four, training the DeepSORT target tracking framework and building the combined YOLOv5 and DeepSORT model comprises: YOLOv5 performs vehicle detection on the input video frame images and outputs the position information of the detection frames; the DeepSORT input is the position information of the detected target vehicles together with the cut-out images, and its output is the predicted position information of the next frame and the ID identification of the same target.
8. A complex intersection vehicle travel track analysis system that performs the complex intersection vehicle travel track analysis method of any one of claims 1-7.
9. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to execute the complex intersection vehicle travel track analysis method according to any one of claims 1 to 7.
10. An information data processing terminal characterized by being used for implementing the complex intersection vehicle travel track analysis method of any one of claims 1 to 7.
CN202210808478.1A 2022-07-11 2022-07-11 Complex intersection vehicle driving track analysis method, system and application Pending CN115170611A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210808478.1A CN115170611A (en) 2022-07-11 2022-07-11 Complex intersection vehicle driving track analysis method, system and application

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210808478.1A CN115170611A (en) 2022-07-11 2022-07-11 Complex intersection vehicle driving track analysis method, system and application

Publications (1)

Publication Number Publication Date
CN115170611A 2022-10-11

Family

ID=83492320

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210808478.1A Pending CN115170611A (en) 2022-07-11 2022-07-11 Complex intersection vehicle driving track analysis method, system and application

Country Status (1)

Country Link
CN (1) CN115170611A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116758732A (en) * 2023-05-18 2023-09-15 内蒙古工业大学 Intersection vehicle detection and bus priority passing method under fog computing environment
CN117077042A (en) * 2023-10-17 2023-11-17 北京鑫贝诚科技有限公司 Rural level crossing safety early warning method and system
CN117077042B (en) * 2023-10-17 2024-01-09 北京鑫贝诚科技有限公司 Rural level crossing safety early warning method and system
CN117437792A (en) * 2023-12-20 2024-01-23 中交第一公路勘察设计研究院有限公司 Real-time road traffic state monitoring method, device and system based on edge calculation
CN117437792B (en) * 2023-12-20 2024-04-09 中交第一公路勘察设计研究院有限公司 Real-time road traffic state monitoring method, device and system based on edge calculation
CN117994987A (en) * 2024-04-07 2024-05-07 东南大学 Traffic parameter extraction method and related device based on target detection technology
CN117994987B (en) * 2024-04-07 2024-06-11 东南大学 Traffic parameter extraction method and related device based on target detection technology

Similar Documents

Publication Publication Date Title
CN110059554B (en) Multi-branch target detection method based on traffic scene
CN108830188B (en) Vehicle detection method based on deep learning
CN111062413B (en) Road target detection method and device, electronic equipment and storage medium
CN115170611A (en) Complex intersection vehicle driving track analysis method, system and application
Nie et al. Pavement Crack Detection based on yolo v3
CN108171136B (en) System and method for searching images by images for vehicles at multi-task gate
CN112069944B (en) Road congestion level determining method
Nie et al. Pavement distress detection based on transfer learning
CN109508715A (en) A kind of License Plate and recognition methods based on deep learning
CN110363122A (en) A kind of cross-domain object detection method based on multilayer feature alignment
CN105512640A (en) Method for acquiring people flow on the basis of video sequence
CN112016605B (en) Target detection method based on corner alignment and boundary matching of bounding box
CN102902983B (en) A kind of taxi identification method based on support vector machine
CN104978567A (en) Vehicle detection method based on scenario classification
CN114648665A (en) Weak supervision target detection method and system
CN114998748B (en) Remote sensing image target fine identification method, electronic equipment and storage medium
Xiang et al. Lightweight fully convolutional network for license plate detection
CN110008899B (en) Method for extracting and classifying candidate targets of visible light remote sensing image
CN112084890A (en) Multi-scale traffic signal sign identification method based on GMM and CQFL
CN112738470A (en) Method for detecting parking in expressway tunnel
CN115376101A (en) Incremental learning method and system for automatic driving environment perception
CN103679214A (en) Vehicle detection method based on online area estimation and multi-feature decision fusion
Gu et al. Local Fast R-CNN flow for object-centric event recognition in complex traffic scenes
CN102708384A (en) Bootstrapping weak learning method based on random fern and classifier thereof
Wang Vehicle image detection method using deep learning in UAV video

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination