CN112800879B - Vehicle-mounted video-based front vehicle position prediction method and prediction system - Google Patents


Info

Publication number
CN112800879B
Authority
CN
China
Prior art keywords
vehicle
sequence
optical flow
front vehicle
vector
Prior art date
Legal status
Active
Application number
CN202110051940.3A
Other languages
Chinese (zh)
Other versions
CN112800879A
Inventor
宋建新 (Song Jianxin)
苏万亮 (Su Wanliang)
Current Assignee
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202110051940.3A
Publication of CN112800879A
Application granted
Publication of CN112800879B
Status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/50: Context or environment of the image
    • G06V20/56: Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411: Classification techniques based on the proximity to a decision surface, e.g. support vector machines
    • G06F18/25: Fusion techniques
    • G06F18/253: Fusion techniques of extracted features
    • G06F18/29: Graphical models, e.g. Bayesian networks
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/08: Learning methods


Abstract

The invention discloses a method for predicting the position of the vehicle ahead based on in-vehicle video, comprising: constructing a vehicle position prediction model based on an encoder-decoder framework, which predicts the position and scale of the vehicle ahead from historical data of its bounding box and of the optical flow inside that bounding box, together with predicted motion information of the ego vehicle; constructing a sample set and training the model; acquiring in-vehicle video; performing vehicle detection and tracking on the video frames and computing optical flow to obtain the bounding-box sequence and optical-flow sequence of the vehicle ahead; predicting the ego vehicle's motion to form a motion prediction sequence; and extracting the bounding boxes of the vehicle ahead and the optical flow inside them from the T video frames before the current time t, together with the predicted ego-motion for the △ video frames after t, feeding these into the model to obtain the bounding-box sequence of the vehicle ahead in the △ frames after t, from which its position and scale are predicted. Relying only on video captured by a dash cam, the method predicts the position and scale of the vehicle ahead in real time.

Description

A method and system for predicting the position of the vehicle ahead based on in-vehicle video

Technical Field

The invention belongs to the technical field of driver assistance, and in particular relates to a method and system for predicting the position of the vehicle ahead based on in-vehicle video.

Background

With the continuous development of society, family cars have become widespread. Along with the convenience they bring come many problems, such as frequent traffic accidents, harsh road driving conditions, and environmental pollution. These problems threaten people's lives and property, traffic accidents above all, so safe driving has become an urgent public need. Accidents often happen because the driver cannot react in time to the behavior of other road users. Dash cams, now used by a large number of car owners, record video and sound throughout a journey. If the position of the vehicle ahead could be predicted in real time from dash-cam video, the driver would have enough time to avoid accidents, but current dash cams do not offer such a function.

Existing vehicle position prediction methods, proposed both domestically and abroad, can be roughly divided into two categories: traditional methods and deep-learning-based methods.

Traditional vehicle position prediction methods such as Bayesian filtering have structures too simple to capture complex vehicle motion patterns, and they often perform poorly at long-term prediction. Dynamic Bayesian networks describe the latent factors that determine a vehicle's trajectory with a graphical model and explicitly model the physical process that generates the trajectory. Although this addresses the problems above, a model structure fixed by designer intuition is insufficient to capture the variety of dynamic traffic scenes, so performance in real traffic is limited; moreover, the computational complexity is high and cannot meet the requirements of real-time prediction.

In recent years, deep-learning methods have shown strong capability in image processing, and many researchers have applied recurrent neural network structures and their variants to the vehicle position prediction task. These methods train on a vehicle's past driving data and achieve good prediction results in their respective application scenarios. But these studies have two problems: first, the past driving data must be captured by various sensors installed on the vehicle, which is uncommon in today's production vehicles; second, they predict only the pixel position of the vehicle ahead, not its scale.

The present invention, by contrast, predicts the position and scale of the vehicle ahead in real time from nothing more than the image information captured by a dash cam, giving the driver enough time to avoid traffic accidents, and can therefore be readily applied in practical scenarios.

Summary of the Invention

Purpose of the invention: in view of the problems in the prior art, the present invention provides a method for predicting the position of the vehicle ahead based on in-vehicle video. Using only the video information captured by a dash cam, the method predicts the position and scale of the vehicle ahead in real time, gives the driver enough time to avoid traffic accidents, and can be readily applied in practical scenarios.

Technical solution: in one aspect, the invention discloses a method for predicting the position of the vehicle ahead based on in-vehicle video, comprising a training stage and a prediction stage, wherein the training stage includes:

S1. Construct a vehicle position prediction model based on an encoder-decoder framework. The model predicts the bounding box of the vehicle ahead at times t+1, t+2, …, t+△ after the current time t from the bounding boxes of the vehicle ahead at times t-0, t-1, …, t-(T-1) before t, the optical flow inside those bounding boxes, and the motion information of the ego vehicle at times t+1, t+2, …, t+△ after t;

The inputs of the vehicle position prediction model are: the bounding-box sequence B of the vehicle ahead and the optical-flow sequence F inside its bounding boxes, taken from the video frames at the T times before the current time t, together with the motion prediction sequence M of the ego vehicle for the video frames at the △ times after t;

The output of the vehicle position prediction model is the predicted bounding-box sequence Y of the vehicle ahead in the video frames at the △ times after the current time t;

The vehicle position prediction model comprises: a preceding-vehicle bounding-box encoder, a preceding-vehicle optical-flow encoder, a feature fusion unit, and a preceding-vehicle position-prediction decoder;

The preceding-vehicle bounding-box encoder encodes the bounding-box sequence B of the vehicle ahead to obtain its temporal feature vector $h^b_t$;

The preceding-vehicle optical-flow encoder encodes the optical-flow sequence F inside the bounding box of the vehicle ahead to obtain its motion feature vector $h^f_t$;

The feature fusion unit concatenates the temporal feature vector $h^b_t$ and the motion feature vector $h^f_t$ of the vehicle ahead into its fused feature vector $H_t$;

The preceding-vehicle position-prediction decoder decodes the feature vector $H_t$ according to the ego vehicle's motion prediction sequence M, obtaining the predicted bounding boxes of the vehicle ahead in the video frames at the △ times after the current time t;

S2. Construct a sample set and train the vehicle position prediction model, including:

S2-1. Collect multiple in-vehicle video clips of duration s in which a preceding vehicle is visible, sample the video frames of each clip, and determine the bounding-box sequence B_tr of the preceding vehicle in the sampled frames, the optical-flow sequence F_tr inside the bounding boxes, and the ego vehicle's motion prediction sequence M_tr at the corresponding times, forming the sample set;

S2-2. Divide the sample set into a training set and a validation set; set the learning rate σ and the batch size N;

S2-3. Train with the Adam optimizer, determining the number of training batches N′ from the number of training samples and N. For each clip in the training samples, take the B_tr and F_tr corresponding to the frames in the first s′ of the clip and the M_tr corresponding to the frames in the last s″ as the model input, and the B_tr corresponding to the frames in the last s″ as the output; train the model, save the model parameters, and verify the model's prediction accuracy on the validation set; s′ + s″ = s;

S2-4. Select the model parameters with the highest prediction accuracy over the N′ training batches as the parameters of the vehicle position prediction model;

The prediction stage includes:

A camera capable of capturing the vehicle ahead is mounted on the vehicle, and the video data collected by the camera while the vehicle is driving is acquired;

Vehicle detection and tracking are performed on every frame of the video to obtain the bounding-box sequence of each preceding vehicle, stored in B_test(i), where i is the index of the preceding vehicle; meanwhile the optical flow inside each bounding box is computed and stored in F_test(i); the motion information of the ego vehicle in future frames is obtained and stored in the sequence M_test;

A first sliding window of length T is applied to the sequences B_test(i) and F_test(i), and a second sliding window of length △ to the sequence M_test, extracting respectively the bounding boxes of vehicle i and the optical flow inside them from the T video frames before the current time t, and the predicted ego-motion for the △ video frames after t. These are fed into the trained vehicle position prediction model, giving the bounding-box sequence of preceding vehicle i in the △ video frames after t: $Y'(i) = [Y'_{t+1}(i), Y'_{t+2}(i), \ldots, Y'_{t+\delta}(i), \ldots, Y'_{t+△}(i)]$. The position of each predicted bounding box relative to the video frame at the current time is then computed from $B_{test,t+0}(i)$, the bounding box of preceding vehicle i at the current time t, with 1 ≤ δ ≤ △;

The predicted trajectory of preceding vehicle i is obtained from the centers of the bounding boxes in Y′(i), and its scale from their widths and heights.

The bounding-box sequence of the vehicle ahead is computed as follows:

A.1. Perform vehicle detection on the video frames of T consecutive times to obtain the bounding boxes of all vehicles in each frame;

A.2. Track the vehicle bounding boxes obtained in step A.1 with a multi-object tracking algorithm, assigning the same index to the same vehicle across frames, and arrange them in time order to form the bounding-box sequence B of the vehicle ahead over the T times.

The optical-flow sequence inside the bounding box of the vehicle ahead is computed as follows:

B.1. For the video frames of T consecutive times, compute the optical flow between each frame and its preceding frame, obtaining the optical-flow map of each frame; the two-dimensional optical-flow vector at the j-th pixel of the map is $I_j = (u_j, v_j)$, where $u_j$ and $v_j$ are the vertical and horizontal components of the flow vector, respectively;

B.2. From the optical-flow map of the frame at time t-τ, crop the region covered by the bounding box of the vehicle ahead in that frame and scale it to a preset uniform size, obtaining the in-box optical-flow map at time t-τ; arranged in time order these form the optical-flow sequence F inside the bounding box of the vehicle ahead over the T times, where t-τ denotes the τ-th time before t, 0 ≤ τ < T.

The motion prediction sequence of the ego vehicle is computed as follows:

C.1. For the video frames at times t-0, t-1, …, t-(T-1) before the current time t, compute the camera rotation matrix $R_{t-\tau}$ and translation vector $V_{t-\tau}$ between adjacent frames $P_{t-\tau-1}$ and $P_{t-\tau}$, forming the rotation-matrix sequence RS and translation-vector sequence VS, 0 ≤ τ < T; this comprises steps C.1-1 to C.1-2:

C.1-1. Compute the essential matrix E with the eight-point method, as follows:

C.1-1-1. Using the Surf algorithm, extract feature points of $P_{t-\tau-1}$ and $P_{t-\tau}$ and select the 8 best-matching pairs $(a_l, a'_l)$, l = 1, 2, …, 8, where $a_l$ and $a'_l$ denote the coordinates on the normalized plane of the pixel positions of the l-th matched feature-point pair in frames $P_{t-\tau-1}$ and $P_{t-\tau}$: $a_l = [x_l, y_l, 1]^T$, $a'_l = [x'_l, y'_l, 1]^T$; both are 3×1 matrices, with T denoting the matrix transpose;

C.1-1-2. Combine the 8 matched feature-point pairs into the 3×8 matrices $a = [a_1, a_2, \ldots, a_8]$ and $a' = [a'_1, a'_2, \ldots, a'_8]$, and establish the epipolar constraint from a and a′:

$a^T E a' = 0$

Solving the above system of equations yields the essential matrix E, a 3×3 matrix;

C.1-2. Perform singular value decomposition on E to obtain the camera's rotation matrix $R_{t-\tau}$ and translation vector $V_{t-\tau}$, where $R_{t-\tau}$ is a 3×3 matrix and $V_{t-\tau}$ a 3-dimensional column vector;

Finally, the rotation-matrix sequence of the T video frames before time t is obtained, $RS = \{R_{t-(T-1)}, \ldots, R_{t-\tau}, \ldots, R_{t-1}, R_{t-0}\}$, together with the translation-vector sequence $VS = \{V_{t-(T-1)}, \ldots, V_{t-\tau}, \ldots, V_{t-1}, V_{t-0}\}$;

C.2. For the camera rotation matrices and translation vectors in the RS and VS obtained in C.1, compute for each $R_{t-\tau}$ and $V_{t-\tau}$ the value accumulated with the previous time, the accumulated values being denoted $R'_{t-\tau}$ and $V'_{t-\tau}$:

$R'_{t-\tau} = R_{t-\tau} \cdot R'_{t-(\tau+1)}$

$V'_{t-\tau} = V_{t-\tau} + V'_{t-(\tau+1)}$

C.3. Pass the final accumulated values $R'_{t-0}$ and $V'_{t-0}$ from C.2 on as the camera's rotation matrix and translation vector at the next time:

$R_{t+1} = R'_{t-0}$

$V_{t+1} = V'_{t-0}$

C.4. Append the $R_{t+1}$ and $V_{t+1}$ obtained in C.3 to the end of the rotation-matrix sequence RS and translation-vector sequence VS obtained in C.1, and keep executing C.2 and C.3 until all rotation matrices $\{R_{t+1}, R_{t+2}, \ldots, R_{t+\delta}, \ldots, R_{t+△}\}$ and all translation vectors $\{V_{t+1}, V_{t+2}, \ldots, V_{t+\delta}, \ldots, V_{t+△}\}$ of the △ video frames after time t have been obtained, 1 ≤ δ ≤ △;

C.5. Compute the motion vectors of the ego vehicle at the △ times after the current time t, forming the ego vehicle's motion prediction sequence $M = \{M_{t+1}, M_{t+2}, \ldots, M_{t+\delta}, \ldots, M_{t+△}\}$; this comprises steps C.5-1 to C.5-2:

C.5-1. Extract the camera's rotation angles about the x, y, z axes from the rotation matrix $R_{t+\delta}$ and represent them as a 3-dimensional row vector $\psi_{t+\delta} = (\psi_x, \psi_y, \psi_z)$, where:

$\psi_x = \mathrm{atan2}(r_{32}, r_{33})$

$\psi_y = \mathrm{atan2}(-r_{31}, \sqrt{r_{32}^2 + r_{33}^2})$

$\psi_z = \mathrm{atan2}(r_{21}, r_{11})$

In the formulas above, $r_{jk}$ denotes the entry in row j, column k of the rotation matrix $R_{t+\delta}$, j, k ∈ {1, 2, 3}; atan2() and atan() both denote the arctangent, but atan() returns values only within a half-period of (-π/2, π/2), whereas atan2() takes the signs of both arguments into account and returns values in (-π, π];

C.5-2. Concatenate the vector $\psi_{t+\delta}$ with the translation vector $V_{t+\delta}^T$, converted to a 3-dimensional row vector, to form a 6-dimensional row vector $M_{t+\delta} = [\psi_{t+\delta}, V_{t+\delta}^T]$;

finally obtaining the ego vehicle's motion prediction sequence $M = \{M_{t+1}, M_{t+2}, \ldots, M_{t+\delta}, \ldots, M_{t+△}\}$;

C.6. Pass M through a fully connected layer FC_4 to transform the dimension of all its motion vectors.

The preceding-vehicle bounding-box encoder comprises an encoding gated recurrent unit network GRU_b and a first fully connected layer FC_1. The input of GRU_b is the bounding box $B_{t-\tau}$ at each time of the bounding-box sequence B of the vehicle ahead, together with the hidden-state vector $h^b_{t-\tau-1}$ passed down by GRU_b from the previous time; the output is the bounding-box encoding $h^b_{t-\tau}$ of the vehicle ahead at the current time. FC_1 applies a dimension transform to the final output $h^b_{t-0}$ of GRU_b, giving the temporal feature vector $h^b_t$ of the vehicle ahead at the current time t.

The preceding-vehicle optical-flow encoder comprises a CNN-based motion-feature extraction network FEN and a second fully connected layer FC_2. The input of FEN is the optical-flow sequence F inside the bounding box of the vehicle ahead, and its output is the in-box optical-flow encoding at the current time. FEN is based on the ResNet50 architecture and consists of, connected in sequence, a convolutional layer conv1, a ReLU layer, a max-pooling layer maxPool, and 4 residual learning blocks; the number of input channels of conv1 is 2m, where m is the number of optical-flow maps sampled from the sequence F, i.e. m maps are sampled uniformly from F. Each of the 4 residual learning blocks has a three-layer structure, i.e. 3 convolutional layers with ReLU concatenated together.

m optical-flow maps are sampled uniformly from the in-box optical-flow sequence F; their vertical and horizontal components form 2m optical-flow channels that are fed into FEN, whose output is the motion features of the in-box optical-flow maps of the vehicle ahead at the current time.

FC_2 applies a dimension transform to the motion features output by FEN, giving the motion feature vector $h^f_t$ of the vehicle ahead at the current time t.

The preceding-vehicle position-prediction decoder comprises a decoding gated recurrent unit network GRU_d and a third fully connected layer FC_3. The input of GRU_d is the fusion vector $Mh_{t+\delta}$ of the predicted ego-motion $M_{t+\delta}$ at time t+δ with the hidden-state vector $h^d_{t+\delta-1}$ passed down by GRU_d from the previous time, together with $h^d_{t+\delta-1}$ itself, where 1 ≤ δ ≤ △ and the initial hidden state is $h^d_t = H_t$; the output is the bounding-box decoding $h^d_{t+\delta}$ at time t+δ. FC_3 applies a dimension transform to $h^d_{t+\delta}$ to obtain the bounding box of the vehicle ahead at time t+δ.

In another aspect, the invention also discloses a prediction system implementing the above method for predicting the position of the vehicle ahead based on in-vehicle video, comprising:

a vehicle position prediction model based on an encoder-decoder framework, which predicts the bounding box of the vehicle ahead at times t+1, t+2, …, t+△ after the current time t from the bounding boxes of the vehicle ahead at times t-0, t-1, …, t-(T-1) before t, the optical flow inside those bounding boxes, and the motion information of the ego vehicle at times t+1, t+2, …, t+△ after t;

The vehicle position prediction model comprises: a preceding-vehicle bounding-box encoder, a preceding-vehicle optical-flow encoder, a feature fusion unit, and a preceding-vehicle position-prediction decoder;

The preceding-vehicle bounding-box encoder encodes the bounding-box sequence B of the vehicle ahead to obtain its temporal feature vector $h^b_t$;

The preceding-vehicle optical-flow encoder encodes the optical-flow sequence F inside the bounding box of the vehicle ahead to obtain its motion feature vector $h^f_t$;

The feature fusion unit concatenates the temporal feature vector $h^b_t$ and the motion feature vector $h^f_t$ of the vehicle ahead into its fused feature vector $H_t$;

The preceding-vehicle position-prediction decoder decodes the feature vector $H_t$ according to the ego vehicle's motion prediction sequence M, obtaining the predicted bounding boxes of the vehicle ahead in the video frames at the △ times after the current time t;

a vehicle bounding-box acquisition module for obtaining the bounding-box sequence B of the vehicle ahead in the in-vehicle video;

a bounding-box optical-flow acquisition module for obtaining the optical-flow sequence F inside the bounding box of the vehicle ahead in the in-vehicle video;

an ego-motion prediction module for predicting the motion information of the ego vehicle at future times, forming the ego vehicle's motion prediction sequence M.

Beneficial effects: the method for predicting the position of the vehicle ahead disclosed by the invention has the following advantages: 1. it relies only on the video image information captured by a dash cam, effectively avoiding the dependence of other prior-art methods on multiple sensors, which limits their applicability to today's production vehicles; 2. it adopts a deep-learning network model based on an encoder-decoder framework that predicts not only the position of the vehicle ahead but also its scale, significantly improving prediction performance.

Brief Description of the Drawings

Figure 1 is a flow chart of the method for predicting the position of the vehicle ahead based on in-vehicle video disclosed by the invention;

Figure 2 is a schematic diagram of vehicle detection and tracking in video frames;

Figure 3 is a schematic diagram of the optical-flow extraction method for adjacent frames;

Figure 4 is a schematic diagram of the structure of the vehicle position prediction model;

Figure 5 is a schematic diagram of the structure of a GRU;

Figure 6 is a schematic diagram of the structure of the motion-feature extraction network;

Figure 7 is a schematic diagram of the sliding windows;

Figure 8 is a schematic diagram of the prediction results in the embodiment;

Figure 9 is a schematic diagram of the structure of the system for predicting the position of the vehicle ahead based on in-vehicle video disclosed by the invention.

Detailed Description of Embodiments

The invention is further explained below with reference to the drawings and specific embodiments.

As shown in Figure 1, the invention discloses a method for predicting the position of the vehicle ahead based on in-vehicle video, comprising a training stage and a prediction stage, wherein the training stage includes:

S1. Construct a vehicle position prediction model based on an encoder-decoder framework. The model predicts the bounding box of the vehicle ahead at times t+1, t+2, …, t+△ after the current time t from the bounding boxes of the vehicle ahead at times t-0, t-1, …, t-(T-1) before t, the optical flow inside those bounding boxes, and the motion information of the ego vehicle at times t+1, t+2, …, t+△ after t;

In this embodiment, T = 20 and △ = 40;

The inputs of the vehicle position prediction model are: the bounding-box sequence B of the vehicle ahead and the optical-flow sequence F inside its bounding boxes, taken from the video frames at the T times before the current time t, together with the motion prediction sequence M of the ego vehicle for the video frames at the △ times after t;

where $B = [B_{t-0}, B_{t-1}, \ldots, B_{t-\tau}, \ldots, B_{t-(T-1)}]$ and $B_{t-\tau}$ denotes the bounding box of the vehicle ahead in the video frame at the τ-th time before t; the bounding box is represented by the coordinates $x_{t-\tau}, y_{t-\tau}$ of its center point and its width $w_{t-\tau}$ and height $h_{t-\tau}$, i.e. $B_{t-\tau} = (x_{t-\tau}, y_{t-\tau}, w_{t-\tau}, h_{t-\tau})$; 0 ≤ τ < T;

In the invention, the bounding-box sequence of the vehicle ahead is computed as follows:

A.1. Perform vehicle detection on the video frames of T consecutive times to obtain the bounding boxes of all vehicles in each frame;

This embodiment performs vehicle detection with a detection model built on Mask-RCNN and trained on the COCO dataset; its output is the vehicle bounding boxes in the image, each represented by a 4-dimensional vector. The images in the video are uniformly scaled to 1024×1024 before being fed into Mask-RCNN.

A.2. Track the vehicle bounding boxes obtained in step A.1 with a multi-object tracking algorithm, assigning the same index to the same vehicle across frames, and arrange them in time order to form the bounding-box sequence B of the vehicle ahead over the T times. This embodiment performs multi-object tracking with the Sort algorithm, an online real-time multi-object tracker well suited to tracking vehicles in in-vehicle video. Figure 2 is a schematic diagram of vehicle detection and tracking in video frames: 3 vehicles are detected in two video frames taken at different times, and the same vehicles are given the same indices, 1, 2 and 3.
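As a rough, self-contained illustration of steps A.1-A.2, the sketch below pairs torchvision's Mask-RCNN (trained on COCO, as in this embodiment; the `weights="DEFAULT"` argument assumes torchvision ≥ 0.13) with a deliberately simplified greedy IoU matcher standing in for the Sort algorithm; the score and IoU thresholds are assumptions, not values from the patent:

```python
# Sketch of steps A.1-A.2: detect vehicles with Mask-RCNN, then link boxes
# across frames with a minimal greedy IoU matcher (a stand-in for Sort).
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

CAR_LABEL = 3          # COCO class id for "car"
IOU_MIN = 0.3          # assumed association threshold

def iou(b1, b2):
    x1, y1 = max(b1[0], b2[0]), max(b1[1], b2[1])
    x2, y2 = min(b1[2], b2[2]), min(b1[3], b2[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    a1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    a2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    return inter / (a1 + a2 - inter + 1e-9)

model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()

def track(frames):
    """frames: list of 3xHxW float tensors; returns {vehicle_id: [(t, box)]}."""
    tracks, last_box, next_id = {}, {}, 0
    for t, frame in enumerate(frames):
        with torch.no_grad():
            out = model([frame])[0]
        boxes = [b.tolist() for b, l, s in
                 zip(out["boxes"], out["labels"], out["scores"])
                 if l == CAR_LABEL and s > 0.5]
        for box in boxes:
            # greedily reuse the id of the best-overlapping earlier box
            best = max(last_box.items(),
                       key=lambda kv: iou(kv[1], box), default=None)
            if best and iou(best[1], box) > IOU_MIN:
                vid = best[0]
            else:
                vid, next_id = next_id, next_id + 1
            tracks.setdefault(vid, []).append((t, box))
            last_box[vid] = box
    return tracks
```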

$F = [F_{t-0}, F_{t-1}, \ldots, F_{t-\tau}, \ldots, F_{t-(T-1)}]$, where $F_{t-\tau}$ denotes the optical-flow map inside the bounding box of the vehicle ahead in the video frame at the τ-th time before t, $F_{t-\tau} = \{(u_{t-\tau}(p), v_{t-\tau}(p))\}$, with $(u_{t-\tau}(p), v_{t-\tau}(p))$ the two-dimensional optical-flow vector at the p-th pixel of the map;

The optical-flow sequence inside the bounding box of the vehicle ahead is computed as follows:

B.1. For the video frames of T consecutive times, compute the optical flow between each frame and its preceding frame, obtaining the optical-flow map of each frame; this embodiment computes the optical flow of adjacent frames with the FlowNet2 algorithm. The two-dimensional optical-flow vector at the j-th pixel of the map is $I_j = (u_j, v_j)$, where $u_j$ and $v_j$ are the vertical and horizontal components of the flow vector, respectively, as shown in Figure 3.

B.2. From the optical-flow map of the frame at time t-τ, crop the region covered by the bounding box of the vehicle ahead in that frame and scale it to a preset uniform size, obtaining the in-box optical-flow map at time t-τ; arranged in time order these form the optical-flow sequence F inside the bounding box of the vehicle ahead over the T times, where t-τ denotes the τ-th time before t, 0 ≤ τ < T. In this embodiment the in-box optical-flow maps are uniformly scaled to 224×224.
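A minimal sketch of steps B.1-B.2, with OpenCV's Farneback dense flow swapped in for FlowNet2 purely for availability; the 224×224 target size follows this embodiment, and the helper name is hypothetical:

```python
# Sketch of steps B.1-B.2: dense flow between consecutive frames, then the
# region under the vehicle's bounding box is cropped and resized to 224x224.
# cv2.calcOpticalFlowFarneback replaces FlowNet2 purely for illustration.
import cv2
import numpy as np

def box_flow_sequence(gray_frames, boxes, size=224):
    """gray_frames: list of HxW uint8 images; boxes: per-frame (x, y, w, h)
    with (x, y) the box centre, as in the patent. Returns list of flow crops."""
    crops = []
    for prev, cur, (x, y, w, h) in zip(gray_frames, gray_frames[1:], boxes[1:]):
        flow = cv2.calcOpticalFlowFarneback(prev, cur, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)  # HxWx2
        x0, y0 = int(x - w / 2), int(y - h / 2)
        crop = flow[max(y0, 0):y0 + int(h), max(x0, 0):x0 + int(w)]
        crops.append(cv2.resize(crop, (size, size)))  # uniform 224x224
    return crops
```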

While driving, besides the motion of vehicles in the scene ahead, the ego vehicle itself is also moving; to predict the motion of the vehicle ahead, the ego vehicle's own motion must also be predicted.

The motion information prediction sequence of the ego vehicle is computed as follows:

C.1. For the video frames at times t-0, t-1, …, t-(T-1) before the current time t, compute the camera rotation matrix $R_{t-\tau}$ and translation vector $V_{t-\tau}$ between adjacent frames $P_{t-\tau-1}$ and $P_{t-\tau}$, forming the rotation-matrix sequence RS and translation-vector sequence VS, 0 ≤ τ < T; this comprises steps C.1-1 to C.1-2:

C.1-1. Compute the essential matrix E with the eight-point method, as follows:

C.1-1-1. Using the Surf algorithm, extract feature points of $P_{t-\tau-1}$ and $P_{t-\tau}$ and select the 8 best-matching pairs $(a_l, a'_l)$, l = 1, 2, …, 8, where $a_l$ and $a'_l$ denote the coordinates on the normalized plane of the pixel positions of the l-th matched feature-point pair in frames $P_{t-\tau-1}$ and $P_{t-\tau}$: $a_l = [x_l, y_l, 1]^T$, $a'_l = [x'_l, y'_l, 1]^T$; both are 3×1 matrices, with T denoting the matrix transpose;

C.1-1-2. Combine the 8 matched feature-point pairs into the 3×8 matrices $a = [a_1, a_2, \ldots, a_8]$ and $a' = [a'_1, a'_2, \ldots, a'_8]$, and establish the epipolar constraint from a and a′:

$a^T E a' = 0$

Solving the above system of equations yields the essential matrix E, a 3×3 matrix;

C.1-2. Perform singular value decomposition on E to obtain the camera's rotation matrix $R_{t-\tau}$ and translation vector $V_{t-\tau}$, where $R_{t-\tau}$ is a 3×3 matrix and $V_{t-\tau}$ a 3-dimensional column vector;

Finally, the rotation-matrix sequence of the T video frames before time t is obtained, $RS = \{R_{t-(T-1)}, \ldots, R_{t-\tau}, \ldots, R_{t-1}, R_{t-0}\}$, together with the translation-vector sequence $VS = \{V_{t-(T-1)}, \ldots, V_{t-\tau}, \ldots, V_{t-1}, V_{t-0}\}$;
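The sketch below runs step C.1 with OpenCV, with SIFT standing in for Surf (which sits in OpenCV's non-free module): cv2.findEssentialMat handles the epipolar-constraint step and cv2.recoverPose the SVD-based decomposition. The intrinsics K and the 50-match cutoff are assumed placeholders:

```python
# Sketch of step C.1: relative camera rotation R and translation V between
# two consecutive frames via the essential matrix. SIFT stands in for Surf.
import cv2
import numpy as np

K = np.array([[700., 0., 512.],      # assumed pinhole intrinsics
              [0., 700., 512.],
              [0., 0., 1.]])

def relative_pose(img_prev, img_cur):
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img_prev, None)
    k2, d2 = sift.detectAndCompute(img_cur, None)
    matches = sorted(cv2.BFMatcher().match(d1, d2), key=lambda m: m.distance)
    pts1 = np.float32([k1[m.queryIdx].pt for m in matches[:50]])
    pts2 = np.float32([k2[m.trainIdx].pt for m in matches[:50]])
    E, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
    _, R, V, _ = cv2.recoverPose(E, pts1, pts2, K)  # SVD-based decomposition
    return R, V   # R: 3x3 rotation, V: 3x1 unit translation
```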

C.2. For the camera rotation matrices and translation vectors in the RS and VS obtained in C.1, compute for each $R_{t-\tau}$ and $V_{t-\tau}$ the value accumulated with the previous time, the accumulated values being denoted $R'_{t-\tau}$ and $V'_{t-\tau}$:

$R'_{t-\tau} = R_{t-\tau} \cdot R'_{t-(\tau+1)}$

$V'_{t-\tau} = V_{t-\tau} + V'_{t-(\tau+1)}$

C.3. Pass the final accumulated values $R'_{t-0}$ and $V'_{t-0}$ from C.2 on as the camera's rotation matrix and translation vector at the next time:

$R_{t+1} = R'_{t-0}$

$V_{t+1} = V'_{t-0}$

C.4. Append the $R_{t+1}$ and $V_{t+1}$ obtained in C.3 to the end of the rotation-matrix sequence RS and translation-vector sequence VS obtained in C.1, and keep executing C.2 and C.3 until all rotation matrices $\{R_{t+1}, R_{t+2}, \ldots, R_{t+\delta}, \ldots, R_{t+△}\}$ and all translation vectors $\{V_{t+1}, V_{t+2}, \ldots, V_{t+\delta}, \ldots, V_{t+△}\}$ of the △ video frames after time t have been obtained, 1 ≤ δ ≤ △;
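A NumPy sketch of the propagation loop in C.2-C.4, under one literal reading of the accumulation (rotations compose by matrix product, translations add, and each accumulated value is handed on as the next frame's motion); the patent's equation images may differ in detail:

```python
# Sketch of steps C.2-C.4 under an assumed reading of the equation images.
import numpy as np

def extrapolate_poses(RS, VS, delta):
    """RS: T past 3x3 rotations (oldest first); VS: T past 3x1 translations."""
    R_acc, V_acc = np.eye(3), np.zeros((3, 1))
    for R, V in zip(RS, VS):                # C.2: accumulate over the past
        R_acc, V_acc = R @ R_acc, V + V_acc
    future = []
    for _ in range(delta):                  # C.3: hand the accumulated value
        R_next, V_next = R_acc, V_acc       # on as R_{t+1}, V_{t+1}, ...
        future.append((R_next, V_next))
        R_acc, V_acc = R_next @ R_acc, V_next + V_acc   # C.4: re-run C.2
    return future
```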

C.5. Compute the motion vectors of the ego vehicle at the △ times after the current time t, forming the ego vehicle's motion prediction sequence $M = \{M_{t+1}, M_{t+2}, \ldots, M_{t+\delta}, \ldots, M_{t+△}\}$; this comprises steps C.5-1 to C.5-2:

C.5-1. Extract the camera's rotation angles about the x, y, z axes from the rotation matrix $R_{t+\delta}$ and represent them as a 3-dimensional row vector $\psi_{t+\delta} = (\psi_x, \psi_y, \psi_z)$, where:

$\psi_x = \mathrm{atan2}(r_{32}, r_{33})$

$\psi_y = \mathrm{atan2}(-r_{31}, \sqrt{r_{32}^2 + r_{33}^2})$

$\psi_z = \mathrm{atan2}(r_{21}, r_{11})$

In the formulas above, $r_{jk}$ denotes the entry in row j, column k of the rotation matrix $R_{t+\delta}$, j, k ∈ {1, 2, 3}; atan2() and atan() both denote the arctangent, but atan() returns values only within a half-period of (-π/2, π/2), whereas atan2() takes the signs of both arguments into account and returns values in (-π, π];

C.5-2. Concatenate the vector $\psi_{t+\delta}$ with the translation vector $V_{t+\delta}^T$, converted to a 3-dimensional row vector, to form a 6-dimensional row vector $M_{t+\delta} = [\psi_{t+\delta}, V_{t+\delta}^T]$;

finally obtaining the ego vehicle's motion prediction sequence $M = \{M_{t+1}, M_{t+2}, \ldots, M_{t+\delta}, \ldots, M_{t+△}\}$;
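A short sketch of step C.5, assuming the standard x-y-z Euler decomposition used in the reconstructed formulas above (the original equation images were unreadable); the helper name is hypothetical:

```python
# Sketch of step C.5: Euler angles extracted from R_{t+delta}, concatenated
# with the translation into the 6-D motion vector M_{t+delta}.
import numpy as np

def motion_vector(R, V):
    """R: 3x3 rotation matrix; V: 3x1 translation -> 6-D row vector."""
    psi_x = np.arctan2(R[2, 1], R[2, 2])                      # about x
    psi_y = np.arctan2(-R[2, 0], np.hypot(R[2, 1], R[2, 2]))  # about y
    psi_z = np.arctan2(R[1, 0], R[0, 0])                      # about z
    return np.concatenate([[psi_x, psi_y, psi_z], V.ravel()])  # [psi, V^T]
```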

C.6. Pass M through a fully connected layer FC_4 to transform the dimension of all its motion vectors so that it matches the dimension of the hidden-state vector $h^d$ passed down by the decoding gated recurrent unit network GRU_d from the previous time. In this embodiment the fully connected output dimension is 512.

The output of the vehicle position prediction model is the predicted bounding-box sequence Y of the vehicle ahead in the video frames at the △ times after the current time t, $Y = [Y_{t+1}, Y_{t+2}, \ldots, Y_{t+\delta}, \ldots, Y_{t+△}]$, where $Y_{t+\delta}$ denotes the predicted bounding box of the vehicle ahead in the video frame at the δ-th time after t; the bounding box is represented by the coordinates of its center point and its width and height, i.e. $Y_{t+\delta} = (x_{t+\delta}, y_{t+\delta}, w_{t+\delta}, h_{t+\delta})$;

As shown in Figure 4, the vehicle position prediction model comprises: a preceding-vehicle bounding-box encoder 1-1, a preceding-vehicle optical-flow encoder 1-2, a feature fusion unit 1-3, and a preceding-vehicle position-prediction decoder 1-4;

The preceding-vehicle bounding-box encoder 1-1 encodes the bounding-box sequence B of the vehicle ahead to obtain its temporal feature vector $h^b_t$;

The bounding-box encoder is built mainly on a gated recurrent unit network (Gated Recurrent Unit, GRU). A GRU can retain only the information relevant for prediction and forget irrelevant data. Its structure is shown in Figure 5: the inputs are the current input $In_t$ and the hidden-state vector $h_{t-1}$ passed down by the GRU from the previous time, where $h_{t-1}$ represents the useful information that the GRU's internal gate structure has retained from the input sequence at past times; in the invention this hidden-state vector represents the position and scale information of the vehicle ahead over the past period. Combining $In_t$ and $h_{t-1}$, the GRU outputs the hidden-state vector $h_t$ of the current time. The whole forward propagation is computed as:

$z_t = \sigma(W_z \cdot [h_{t-1}, In_t])$

$r_t = \sigma(W_r \cdot [h_{t-1}, In_t])$

$\tilde h_t = \tanh(W_{\tilde h} \cdot [r_t \odot h_{t-1}, In_t])$

$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde h_t$

where $z_t$ denotes the output of the update gate, σ() the sigmoid function, $W_z$ the weight parameter of the update gate, $r_t$ the output of the reset gate, $W_r$ the weight parameter of the reset gate, $\tilde h_t$ the candidate output at the current time, tanh() the hyperbolic tangent function, $W_{\tilde h}$ the weight parameter of the candidate value, and [,] the concatenation of two vectors. The above group of formulas is abbreviated as:

$h_t = GRU_c(U, h_{t-1}, V)$

where c is the specific application, U is the input value of $GRU_c$ at the current time, and V is the weight parameter of $GRU_c$.
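For reference, the recurrence above corresponds to PyTorch's built-in GRU cell (up to PyTorch's own weight parameterization); a minimal sketch of unrolling it over T = 20 steps, with an assumed 4-D input:

```python
# Minimal illustration of the GRU recurrence h_t = GRU(In_t, h_{t-1}):
# torch.nn.GRUCell implements the update/reset-gate equations given above.
import torch
import torch.nn as nn

cell = nn.GRUCell(input_size=4, hidden_size=512)  # In_t: e.g. a 4-D box
h = torch.zeros(1, 512)                           # initial h_{t-1}
for In_t in torch.randn(20, 1, 4):                # T = 20 past time steps
    h = cell(In_t, h)                             # h_t from In_t and h_{t-1}
print(h.shape)                                    # torch.Size([1, 512])
```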

The preceding-vehicle bounding-box encoder comprises an encoding gated recurrent unit network GRU_b and a first fully connected layer FC_1. The input of GRU_b is the bounding box $B_{t-\tau}$ at each time of the bounding-box sequence B of the vehicle ahead, together with the hidden-state vector $h^b_{t-\tau-1}$ passed down by GRU_b from the previous time; the output is the bounding-box encoding $h^b_{t-\tau}$ of the vehicle ahead at the current time. FC_1 applies a dimension transform to the final output $h^b_{t-0}$ of GRU_b, giving the temporal feature vector $h^b_t$ of the vehicle ahead at the current time t.

The structure of the encoding gated recurrent unit network GRU_b is:

$h^b_{t-\tau} = GRU_b(\phi(B_{t-\tau}), h^b_{t-\tau-1}, \theta_b)$

where φ() denotes a linear mapping with the ReLU activation function and $\theta_b$ denotes the weight parameter V of GRU_b. In this embodiment, the dimension of $h^b_{t-0}$ is 512, and FC_1 transforms its dimension to 256, i.e. the dimension of $h^b_t$ is 256.
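A sketch of the bounding-box encoder as just described, assuming a ReLU-activated linear layer for the embedding φ(); the 4 → 512 → 256 dimensions follow this embodiment, and the class name is hypothetical:

```python
# Sketch of the bounding-box encoder: GRU_b over the T past boxes, then FC_1
# maps the final 512-D hidden state to the 256-D temporal feature vector.
import torch
import torch.nn as nn

class BoxEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Sequential(nn.Linear(4, 512), nn.ReLU())  # phi()
        self.gru_b = nn.GRUCell(512, 512)
        self.fc1 = nn.Linear(512, 256)

    def forward(self, B):                 # B: (T, 4) box sequence
        h = torch.zeros(1, 512)
        for box in B:                     # oldest to newest
            h = self.gru_b(self.embed(box).unsqueeze(0), h)
        return self.fc1(h)                # h_t^b: (1, 256)

print(BoxEncoder()(torch.randn(20, 4)).shape)     # torch.Size([1, 256])
```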

The preceding-vehicle optical-flow encoder 1-2 encodes the optical-flow sequence F inside the bounding box of the vehicle ahead to obtain its motion feature vector $h^f_t$;

The optical-flow encoder comprises a CNN-based motion-feature extraction network FEN and a second fully connected layer FC_2. The input of FEN is the optical-flow sequence F inside the bounding box of the vehicle ahead, and its output is the in-box optical-flow encoding at the current time. As shown in Figure 6, FEN is based on the ResNet50 architecture and consists of, connected in sequence, a convolutional layer conv1, a ReLU layer, a max-pooling layer maxPool, and 4 residual learning blocks, as shown in Figure 6-(a); the number of input channels of conv1 is 2m, where m is the number of optical-flow maps sampled from the sequence F, i.e. m maps are sampled uniformly from F; in this embodiment m = 10. Each of the 4 residual learning blocks has a three-layer structure, i.e. 3 convolutional layers Conv2 and ReLU layers concatenated together, as shown in Figure 6-(b).

m optical-flow maps are sampled uniformly from the in-box optical-flow sequence F, the vertical and horizontal components of each map being treated as its two channels. The vertical and horizontal components of the m maps thus form 2m optical-flow channels that are fed into FEN, whose output is the motion features of the in-box optical-flow maps at the current time. In this embodiment the motion features extracted by FEN are 2048-dimensional; FC_2 transforms them to 256 dimensions, giving the 256-dimensional motion feature vector $h^f_t$ of the vehicle ahead at the current time t.
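A sketch of FEN, obtained by widening a stock torchvision ResNet50's first convolution to 2m = 20 input channels and exposing its 2048-D pooled feature for FC_2; whether the patent reuses the remaining ResNet50 stages unchanged is an assumption:

```python
# Sketch of FEN: ResNet50 backbone with conv1 widened to 2m flow channels
# (m = 10 sampled flow maps x 2 components), then FC_2 maps 2048 -> 256.
import torch
import torch.nn as nn
from torchvision.models import resnet50

m = 10
fen = resnet50()
fen.conv1 = nn.Conv2d(2 * m, 64, kernel_size=7, stride=2, padding=3,
                      bias=False)        # conv1 widened to 2m input channels
fen.fc = nn.Identity()                   # expose the 2048-D pooled feature
fc2 = nn.Linear(2048, 256)

flows = torch.randn(1, 2 * m, 224, 224)  # sampled 224x224 flow crops
h_f = fc2(fen(flows))                    # motion feature vector h_t^f
print(h_f.shape)                         # torch.Size([1, 256])
```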

The feature fusion unit 1-3 concatenates the temporal feature vector $h^b_t$ and the motion feature vector $h^f_t$ of the vehicle ahead into its fused feature vector $H_t$, which represents the historical bounding-box and optical-flow information, i.e. the position, scale, appearance, and motion of the vehicle ahead at different points in the past period; in this embodiment $H_t$ is a 512-dimensional vector.

The preceding-vehicle position-prediction decoder 1-4 decodes the feature vector $H_t$ according to the ego vehicle's motion prediction sequence M, obtaining the predicted bounding boxes of the vehicle ahead in the video frames at the △ times after the current time t;

The position-prediction decoder comprises a decoding gated recurrent unit network GRU_d and a third fully connected layer FC_3. The input of GRU_d is the fusion vector $Mh_{t+\delta}$ of the predicted ego-motion $M_{t+\delta}$ at time t+δ with the hidden-state vector $h^d_{t+\delta-1}$ passed down by GRU_d from the previous time, together with $h^d_{t+\delta-1}$ itself, where 1 ≤ δ ≤ △ and the initial hidden state is $h^d_t = H_t$; the output is the bounding-box decoding $h^d_{t+\delta}$ at time t+δ. FC_3 applies a dimension transform to $h^d_{t+\delta}$, converting it to a 4-dimensional vector, which gives the bounding box of the vehicle ahead at time t+δ.

The structure of the decoding gated recurrent unit network GRU_d is:

$h^d_{t+\delta} = GRU_d(Mh_{t+\delta}, h^d_{t+\delta-1}, \theta_d)$

where $\theta_d$ is the weight parameter V of GRU_d.

In this embodiment, the fusion vector Mh_{t+δ} is computed as follows: the 6-dimensional vector M_{t+δ} is transformed by the fourth fully connected layer FC_4 into a 512-dimensional vector M̃_{t+δ}; M̃_{t+δ} is passed through the ReLU activation function, and the resulting vector is added to h^d_{t+δ-1} and averaged, yielding the 512-dimensional fusion vector

Mh_{t+δ} = Average(ReLU(FC_4(M_{t+δ})), h^d_{t+δ-1})

where Average() denotes adding the two vectors and averaging them.

S2. Construct a sample set and train the vehicle position prediction model, including:

S2-1. Collect a plurality of in-vehicle video clips of duration s in which the preceding vehicle is visible, sample the video frames in each clip, and determine the bounding box sequence B_tr of the preceding vehicle in the sampled video frames, the optical flow sequence F_tr within the bounding boxes, and the motion information sequence M_tr of the own vehicle at the times corresponding to the video frames; these constitute the sample set;

S2-2. Divide the sample set into a training set and a validation set; set the learning rate σ and the batch size N;

S2-3. Use the Adam optimizer during training, and determine the number of training batches N′ from the number of training-set samples and N. For each video clip in the training samples, take B_tr and F_tr corresponding to the video frames of the first s′ duration, together with M_tr corresponding to the video frames of the last s″ duration, as the input of the vehicle position prediction model, and B_tr corresponding to the video frames of the last s″ duration as the output; train the model, save the model parameters, and verify the prediction accuracy of the model with the validation set; s′ + s″ = s;

S2-4. Select the model parameters with the highest prediction accuracy among the N′ training batches as the parameters of the vehicle position prediction model.

In this embodiment, 1000 video clips are collected, each 3 seconds long at 20 frames per second; the bounding boxes of a vehicle in the following 2 seconds are predicted from its bounding boxes in the preceding 1 second. The training set accounts for 70% of the sample set and the validation set for 30%. Training uses the Adam optimizer with a fixed learning rate of 0.0005 and a batch size of 64, for 40 training batches in total. During training, the difference between the actual bounding box sequence Ŷ of the vehicle and the predicted bounding box sequence Y is computed with the smooth L1 loss function, the error is fed back, and the final network weight parameters are optimized and saved. The loss function is:

L(Y, Ŷ) = (1/Δ) Σ_{δ=1}^{Δ} smooth_L1(|Y_{t+δ} − Ŷ_{t+δ}|),  with smooth_L1(x) = 0.5x² if |x| < 1 and |x| − 0.5 otherwise

where |·| denotes the norm of a vector.
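Under these embodiment settings, the training objective can be sketched in PyTorch as follows; box_prediction_loss is a hypothetical helper name, and F.smooth_l1_loss plays the role of the smooth L1 loss described above.

import torch
import torch.nn.functional as F

def box_prediction_loss(Y_pred: torch.Tensor, Y_true: torch.Tensor) -> torch.Tensor:
    # Y_pred, Y_true: (batch, Delta, 4) predicted and ground-truth bounding box sequences
    return F.smooth_l1_loss(Y_pred, Y_true)

# Optimizer settings from this embodiment (model is assumed defined elsewhere):
# optimizer = torch.optim.Adam(model.parameters(), lr=0.0005)  # fixed learning rate, batch size 64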

The prediction phase includes:

A camera capable of capturing the vehicle ahead is mounted on the vehicle, and the video data collected by the camera while the vehicle is driving is acquired;

Vehicle detection and tracking are performed on each frame of the video to obtain the bounding box sequence of each preceding vehicle, which is stored in B_test(i), where i is the index of the preceding vehicle; at the same time, the optical flow within the bounding boxes is computed and stored in F_test(i); the motion information of the own vehicle in future frames is obtained and stored in the sequence M_test;

A first sliding window SW-1 of length T is applied to the sequences B_test(i) and F_test(i), and a second sliding window SW-2 of length Δ is applied to the sequence M_test, extracting, respectively, the bounding boxes of vehicle i in the T video frames before the current time t, the optical flow within those bounding boxes, and the predicted motion information of the own vehicle in the Δ video frames after the current time t. These are input into the trained vehicle position prediction model to obtain the bounding box sequence Y′(i) = [Y′_{t+1}(i), Y′_{t+2}(i), …, Y′_{t+δ}(i), …, Y′_{t+Δ}(i)] of the preceding vehicle i in the Δ video frames after the current time t, and the position of the bounding box of the preceding vehicle i relative to the current video frame is calculated as

Y_{t+δ}(i) = Y′_{t+δ}(i) + B_{test,t+0}(i),  1 ≤ δ ≤ Δ

where B_{test,t+0}(i) is the bounding box of the preceding vehicle i at the current time t. The sliding windows are shown in FIG. 7. As time advances, both sliding windows move forward by one frame, and the position of the preceding vehicle at the next time is predicted.

The predicted trajectory of the preceding vehicle i is obtained from the centers of the bounding boxes in Y′(i); the scale of the preceding vehicle i is obtained from the widths and heights of the bounding boxes in Y′(i).
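A minimal sketch of this sliding-window inference loop follows, assuming a trained model with the combined interface of the sketches in this description (past box sequence, sampled flow stack and future ego-motion in; predicted box sequence out); the list-based indexing of B_test(i), F_test(i) and M_test, and the function name, are illustrative assumptions.

import torch

def predict_vehicle(model, B_test_i, F_test_i, M_test, t, T, delta, m):
    # SW-1: bounding boxes and flow graphs of vehicle i in the T frames up to time t
    boxes_window = torch.stack(B_test_i[t - T + 1 : t + 1]).unsqueeze(0)      # (1, T, 4)
    flow_maps = F_test_i[t - T + 1 : t + 1]
    idx = torch.linspace(0, T - 1, steps=m).long()                            # uniform sampling of m flow graphs
    flow_stack = torch.cat([flow_maps[i] for i in idx.tolist()], dim=0)       # (2m, H, W): 2 channels per graph
    # SW-2: predicted ego-motion for the Delta frames after time t
    motion_window = torch.stack(M_test[t + 1 : t + 1 + delta]).unsqueeze(0)   # (1, Delta, 6)
    with torch.no_grad():
        Y_prime = model(boxes_window, flow_stack.unsqueeze(0), motion_window).squeeze(0)  # (Delta, 4)
    # position relative to the current frame: offset by the current bounding box B_test,t+0(i)
    return Y_prime + B_test_i[t]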

In this embodiment, the prediction results are displayed in the video frame at the current time, as shown in FIG. 8.

As shown in FIG. 9, the present invention also discloses a prediction system implementing the above vehicle-mounted video-based front vehicle position prediction method, including:

The vehicle position prediction model 1 based on the encoder-decoder framework, used to predict the bounding boxes of the preceding vehicle at times t+1, t+2, …, t+Δ after the current time t according to the bounding boxes of the preceding vehicle at times t-0, t-1, …, t-(T-1) before the current time t, the optical flow within those bounding boxes, and the motion information of the own vehicle at times t+1, t+2, …, t+Δ after the current time t;

The vehicle position prediction model includes: a preceding vehicle bounding box encoder 1-1, a preceding vehicle optical flow encoder 1-2, a feature fusion unit 1-3, and a preceding vehicle position prediction decoder 1-4;

The preceding vehicle bounding box encoder is used to encode the bounding box sequence B of the preceding vehicle to obtain the time-series feature vector h^B_t of the preceding vehicle;

The preceding vehicle optical flow encoder is used to encode the optical flow sequence F within the bounding boxes of the preceding vehicle to obtain the motion feature vector h^F_t of the preceding vehicle;

The feature fusion unit concatenates the time-series feature vector h^B_t and the motion feature vector h^F_t of the preceding vehicle into the fused feature vector H_t of the preceding vehicle;

The preceding vehicle position prediction decoder decodes the fused feature vector H_t according to the motion information prediction sequence M of the own vehicle, obtaining the predicted bounding boxes of the preceding vehicle in the video frames at the Δ times after the current time t;

The vehicle bounding box acquisition module 2, used to acquire the bounding box sequence B of the preceding vehicle in the in-vehicle video;

The vehicle bounding box optical flow acquisition module 3, used to acquire the optical flow sequence F within the bounding boxes of the preceding vehicle in the in-vehicle video;

The own vehicle motion information prediction module 4, used to predict the motion information of the own vehicle at future times, forming the own vehicle motion prediction sequence M.
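Putting the modules together, a minimal end-to-end sketch in PyTorch follows, reusing the OpticalFlowEncoder and BoxDecoder sketched earlier; the layer sizes follow this embodiment (256-dimensional encoder outputs, 512-dimensional fused vector), while the class name and the single-layer GRU_b are illustrative assumptions.

import torch
import torch.nn as nn

class VehiclePositionPredictor(nn.Module):
    def __init__(self, m: int = 5, hidden_dim: int = 512, feat_dim: int = 256):
        super().__init__()
        self.gru_b = nn.GRU(input_size=4, hidden_size=hidden_dim, batch_first=True)
        self.fc1 = nn.Linear(hidden_dim, feat_dim)        # FC1: time-series feature h^B_t
        self.flow_enc = OpticalFlowEncoder(m, feat_dim)   # FEN + FC2: motion feature h^F_t
        self.decoder = BoxDecoder(2 * feat_dim)           # decodes the 512-d fused vector H_t

    def forward(self, B, flow_stack, M_future):
        # B: (batch, T, 4) box sequence; flow_stack: (batch, 2m, H, W); M_future: (batch, Delta, 6)
        _, h_last = self.gru_b(B)                         # final hidden state of GRU_b
        h_B = self.fc1(h_last.squeeze(0))                 # h^B_t, 256-d
        h_F = self.flow_enc(flow_stack)                   # h^F_t, 256-d
        H_t = torch.cat([h_B, h_F], dim=-1)               # fused feature H_t, 512-d
        return self.decoder(M_future, H_t)                # (batch, Delta, 4) predicted boxes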

Claims (10)

1. A vehicle-mounted video-based front vehicle position prediction method comprising a training phase and a prediction phase, characterized in that the training phase comprises the following steps:
S1, constructing a vehicle position prediction model based on an encoder-decoder framework, wherein the vehicle position prediction model is used for predicting the bounding boxes of the preceding vehicle at times t+1, t+2, …, t+Δ after the current time t according to the bounding boxes of the preceding vehicle at times t-0, t-1, …, t-(T-1) before the current time t, the optical flow within those bounding boxes, and the motion information of the own vehicle at times t+1, t+2, …, t+Δ after the current time t;
the input of the vehicle position prediction model comprises: the bounding box sequence B of the preceding vehicle in the video frames at the T times before the current time t, the optical flow sequence F within the bounding boxes of the preceding vehicle, and the motion prediction sequence M of the own vehicle in the video frames at the Δ times after the current time t;
the output of the vehicle position prediction model is the predicted bounding box sequence Y of the preceding vehicle in the video frame images at the Δ times after the current time t;
the vehicle position prediction model comprises: a preceding vehicle bounding box encoder, a preceding vehicle optical flow encoder, a feature fusion unit and a preceding vehicle position prediction decoder;
the preceding vehicle bounding box encoder is used for encoding the bounding box sequence B of the preceding vehicle to obtain the time-series feature vector h^B_t of the preceding vehicle;
the preceding vehicle optical flow encoder is used for encoding the optical flow sequence F within the bounding boxes of the preceding vehicle to obtain the motion feature vector h^F_t of the preceding vehicle;
the feature fusion unit concatenates the time-series feature vector h^B_t and the motion feature vector h^F_t of the preceding vehicle into the fused feature vector H_t of the preceding vehicle;
the preceding vehicle position prediction decoder decodes the fused feature vector H_t according to the motion prediction sequence M of the own vehicle, obtaining the predicted bounding boxes of the preceding vehicle in the video frames at the Δ times after the current time t;
S2, constructing a sample set and training the vehicle position prediction model, comprising:
S2-1, collecting a plurality of vehicle-mounted video clips of duration s in which the preceding vehicle is visible, sampling the video frames in each clip, and determining the bounding box sequence B_tr of the preceding vehicle in the sampled video frames, the optical flow sequence F_tr within the bounding boxes, and the motion information sequence M_tr of the own vehicle at the times corresponding to the video frames, to form the sample set;
S2-2, dividing the sample set into a training set and a validation set; setting the learning rate σ and the batch size N;
S2-3, adopting the Adam optimizer in the training process, and determining the number of training batches N′ according to the number of training-set samples and N; taking B_tr and F_tr corresponding to the video frames of the first s′ duration of each video clip in the training samples, together with M_tr corresponding to the video frames of the last s″ duration, as the input of the vehicle position prediction model, and B_tr corresponding to the video frames of the last s″ duration as the output; training the model, storing the model parameters, and verifying the prediction accuracy of the model with the validation set; s′ + s″ = s;
S2-4, selecting the model parameters with the highest prediction accuracy in the N′ batches of training as the parameters of the vehicle position prediction model;
the prediction phase comprises:
mounting, on the vehicle, a camera capable of capturing the preceding vehicle, and acquiring the video data collected by the camera while the vehicle is driving;
carrying out vehicle detection and tracking on each frame of the video to obtain the bounding box sequence of each preceding vehicle, which is stored in B_test(i), where i is the index of the preceding vehicle; calculating at the same time the optical flow within the bounding boxes and storing it in F_test(i); obtaining the motion information of the own vehicle in future frames and storing it in the sequence M_test;
applying a first sliding window of length T to the sequences B_test(i) and F_test(i) and a second sliding window of length Δ to the sequence M_test, to extract, respectively, the bounding boxes of vehicle i in the T video frames before the current time t, the optical flow within those bounding boxes, and the predicted motion information of the own vehicle in the Δ video frames after the current time t; inputting these into the trained vehicle position prediction model to obtain the bounding box sequence Y′(i) = [Y′_{t+1}(i), Y′_{t+2}(i), …, Y′_{t+δ}(i), …, Y′_{t+Δ}(i)] of the preceding vehicle i in the Δ video frames after the current time t, and calculating the position of the bounding box of the preceding vehicle i relative to the current video frame: Y_{t+δ}(i) = Y′_{t+δ}(i) + B_{test,t+0}(i), wherein B_{test,t+0}(i) is the bounding box of the preceding vehicle i at the current time t, and 1 ≤ δ ≤ Δ;
obtaining the predicted trajectory of the preceding vehicle i from the centers of the bounding boxes in Y′(i); and obtaining the scale of the preceding vehicle i from the widths and heights of the bounding boxes in Y′(i).
2. The preceding vehicle position prediction method according to claim 1, characterized in that the bounding box sequence of the preceding vehicle is calculated by the following steps:
A.1, performing vehicle detection on the video frame images at T consecutive times to obtain the bounding boxes of all vehicles in each frame image;
A.2, tracking the vehicle bounding boxes obtained in step A.1 with a multi-target tracking algorithm, assigning the same number to the same vehicle in different frames, and forming the bounding box sequence B of the preceding vehicle at the T times in time order.
3. The preceding vehicle position prediction method according to claim 1, characterized in that the optical flow sequence within the bounding boxes of the preceding vehicle is calculated by the following steps:
B.1, for the video images at T consecutive times, computing the optical flow between each frame and the frame preceding it to obtain the optical flow graph corresponding to each frame image; the two-dimensional optical flow vector of the j-th pixel in the optical flow graph is I_j = (u_j, v_j), where u_j and v_j are the vertical and horizontal components of the optical flow vector, respectively;
B.2, cropping, from the optical flow graph corresponding to the image at time t-τ, the part covered by the bounding box of the preceding vehicle in the image at time t-τ, and scaling it to a preset uniform size to obtain the optical flow graph within the bounding box at time t-τ; forming, in time order, the optical flow sequence F within the bounding boxes of the preceding vehicle at the T times, wherein t-τ denotes the τ-th time before time t, 0 ≤ τ < T.
4. The preceding vehicle position prediction method according to claim 1, characterized in that the motion prediction sequence of the own vehicle is calculated by the following steps:
C.1, for the video frames at times t-0, t-1, …, t-(T-1) before the current time t, computing the camera rotation matrix R_{t-τ} and translation vector V_{t-τ} between adjacent video frames P_{t-τ-1} and P_{t-τ}, forming a rotation matrix sequence RS and a translation vector sequence VS, 0 ≤ τ < T, specifically comprising steps C.1-1 to C.1-2:
C.1-1, computing the essential matrix E by the eight-point method, comprising:
C.1-1-1, extracting the matched feature points of P_{t-τ-1} and P_{t-τ} with the Surf algorithm, and selecting the 8 best-matched pairs of feature points (a_l, a′_l), l = 1, 2, …, 8; wherein a_l, a′_l respectively denote the pixel coordinates, on the normalized plane, of the l-th pair of matched feature points in video frames P_{t-τ-1} and P_{t-τ}, a_l = [x_l, y_l, 1]^T, a′_l = [x′_l, y′_l, 1]^T; a_l, a′_l are each 3×1 matrices, where T denotes the transpose of a matrix;
C.1-1-2, combining the 8 pairs of matched feature points into the 3×8 matrices a and a′:
a = [a_1, a_2, …, a_8], a′ = [a′_1, a′_2, …, a′_8]
establishing the epipolar constraint from a and a′:
a^T E a′ = 0
and solving this system of equations to obtain the essential matrix E, where E is a 3×3 matrix;
C.1-2, performing singular value decomposition on E to obtain the camera rotation matrix R_{t-τ} and translation vector V_{t-τ}, where R_{t-τ} is a 3×3 matrix and V_{t-τ} is a 3-dimensional column vector;
finally obtaining the rotation matrix sequence RS = {R_{t-(T-1)}, …, R_{t-τ}, …, R_{t-1}, R_{t-0}} and the translation vector sequence VS = {V_{t-(T-1)}, …, V_{t-τ}, …, V_{t-1}, V_{t-0}} of the T video frames before time t;
C.2, for the camera rotation matrices and translation vectors in RS and VS obtained in C.1, computing for each R_{t-τ} and V_{t-τ} the cumulative value with the previous time, the cumulative values being R′_{t-τ} and V′_{t-τ}:
R′_{t-τ} = R_{t-τ} R′_{t-τ-1}
V′_{t-τ} = V_{t-τ} + V′_{t-τ-1}
C.3, taking the R′_{t-0} and V′_{t-0} finally obtained in C.2 as the rotation matrix and translation vector of the camera at the next time:
R_{t+1} = R′_{t-0}
V_{t+1} = V′_{t-0}
C.4, appending the R_{t+1} and V_{t+1} obtained in C.3 to the end of the rotation matrix sequence RS and the translation vector sequence VS obtained in C.1, respectively, and continuing to execute C.2 and C.3 until all rotation matrices {R_{t+1}, R_{t+2}, …, R_{t+δ}, …, R_{t+Δ}} and all translation vectors {V_{t+1}, V_{t+2}, …, V_{t+δ}, …, V_{t+Δ}} of the Δ video frames after time t are obtained, 1 ≤ δ ≤ Δ;
C.5, computing the motion vector of the own vehicle at each of the Δ times after the current time t to form the motion prediction sequence M = {M_{t+1}, M_{t+2}, …, M_{t+δ}, …, M_{t+Δ}} of the own vehicle, specifically comprising steps C.5-1 to C.5-2:
C.5-1, extracting from the rotation matrix R_{t+δ} the rotation angle information of the camera about the x, y and z axes, expressed as the 3-dimensional row vector ψ_{t+δ} = [ψ_x, ψ_y, ψ_z], wherein:
ψ_x = atan2(r_{32}, r_{33})
ψ_y = atan(−r_{31} / √(r_{32}² + r_{33}²))
ψ_z = atan2(r_{21}, r_{11})
in the above formulas, r_{jk} denotes the value in the j-th row and k-th column of the rotation matrix R_{t+δ}, j, k ∈ {1, 2, 3}; atan2() and atan() both denote arctangent functions, atan() taking values in (−π/2, π/2) and atan2() taking values in (−π, π];
C.5-2, connecting the vector ψ_{t+δ} and the translation vector converted into the three-dimensional row vector V_{t+δ}^T into the 6-dimensional row vector M_{t+δ}: M_{t+δ} = [ψ_{t+δ}, V_{t+δ}^T];
finally obtaining the motion prediction sequence M = {M_{t+1}, M_{t+2}, …, M_{t+δ}, …, M_{t+Δ}} of the own vehicle;
C.6, passing M through the fully connected layer FC_4 to transform the dimensions of all its motion vectors.
5. The preceding vehicle position prediction method according to claim 1, characterized in that the preceding vehicle bounding box encoder comprises a coding gated recurrent neural network GRU_b and a first fully connected layer FC_1; the input of GRU_b is the bounding box B_{t-τ} at each time in the bounding box sequence B of the preceding vehicle and the hidden state vector h^b_{t-τ-1} passed down by GRU_b from the previous time, and the output is the encoding result h^b_{t-τ} of the bounding box of the preceding vehicle at the current time; FC_1 performs a dimension transformation on the final output h^b_{t-0} of GRU_b to obtain the time-series feature vector h^B_t of the preceding vehicle at the current time t.
6. The preceding vehicle position prediction method according to claim 1, characterized in that the preceding vehicle optical flow encoder comprises a CNN-based motion feature extraction network FEN and a second fully connected layer FC_2; the input of the FEN is the optical flow sequence F within the bounding boxes of the preceding vehicle, and the output is the optical flow encoding result within the bounding box of the preceding vehicle at the current time; the FEN is based on the ResNet50 framework and comprises, connected in sequence, a convolution layer conv1, a ReLU layer, a max pooling layer maxPool and 4 residual learning blocks; the number of input channels of conv1 is 2m, where m is the number of optical flow graphs sampled from the optical flow sequence F, i.e., m optical flow graphs are uniformly sampled from F; each of the 4 residual learning blocks has a three-layer structure, i.e., each residual learning block consists of convolution network layers and ReLU layers connected in series;
m optical flow graphs are uniformly sampled from the optical flow sequence F within the bounding boxes of the preceding vehicle; the vertical and horizontal components of the m optical flow graphs form 2m optical flow components, which are input into the FEN, and the output of the FEN is the motion feature of the optical flow within the bounding box of the preceding vehicle at the current time;
FC_2 performs a dimension transformation on the motion feature output by the FEN to obtain the motion feature vector h^F_t of the preceding vehicle at the current time t.
7. The preceding vehicle position prediction method according to claim 1, characterized in that the preceding vehicle position prediction decoder comprises a decoding gated recurrent neural network GRU_d and a third fully connected layer FC_3; the input of GRU_d is the fusion vector Mh_{t+δ} of the predicted own-vehicle motion information M_{t+δ} at time t+δ and the hidden state vector h^d_{t+δ-1} passed down by GRU_d from the previous time, together with the hidden state vector h^d_{t+δ-1} passed down by GRU_d from the previous time, 1 ≤ δ ≤ Δ, h^d_{t+0} = H_t; the output is the decoding result Ŷ^d_{t+δ} of the bounding box of the preceding vehicle at time t+δ; FC_3 performs a dimension transformation on Ŷ^d_{t+δ} to obtain the bounding box of the preceding vehicle at time t+δ.
8. A preceding vehicle position prediction system based on an in-vehicle video, characterized by comprising:
a vehicle position prediction model based on an encoder-decoder framework, used for predicting the bounding boxes of the preceding vehicle at times t+1, t+2, …, t+Δ after the current time t according to the bounding boxes of the preceding vehicle at times t-0, t-1, …, t-(T-1) before the current time t, the optical flow within those bounding boxes, and the motion information of the own vehicle at times t+1, t+2, …, t+Δ after the current time t;
the vehicle position prediction model comprises: a preceding vehicle bounding box encoder, a preceding vehicle optical flow encoder, a feature fusion unit and a preceding vehicle position prediction decoder;
the preceding vehicle bounding box encoder is used for encoding the bounding box sequence B of the preceding vehicle to obtain the time-series feature vector h^B_t of the preceding vehicle;
the preceding vehicle optical flow encoder is used for encoding the optical flow sequence F within the bounding boxes of the preceding vehicle to obtain the motion feature vector h^F_t of the preceding vehicle;
the feature fusion unit concatenates the time-series feature vector h^B_t and the motion feature vector h^F_t of the preceding vehicle into the fused feature vector H_t of the preceding vehicle;
the preceding vehicle position prediction decoder decodes the fused feature vector H_t according to the motion prediction sequence M of the own vehicle, obtaining the predicted bounding boxes of the preceding vehicle in the video frames at the Δ times after the current time t;
a vehicle bounding box acquisition module, used for acquiring the bounding box sequence B of the preceding vehicle in the in-vehicle video;
a vehicle bounding box optical flow acquisition module, used for acquiring the optical flow sequence F within the bounding boxes of the preceding vehicle in the in-vehicle video;
and an own vehicle motion information prediction module, used for predicting the motion information of the own vehicle at future times to form the own vehicle motion prediction sequence M.
9. The preceding vehicle position prediction system according to claim 8, characterized in that the preceding vehicle bounding box encoder comprises a coding gated recurrent neural network GRU_b and a first fully connected layer FC_1; the input of GRU_b is the bounding box B_{t-τ} at each time in the bounding box sequence B of the preceding vehicle and the hidden state vector h^b_{t-τ-1} passed down by GRU_b from the previous time, and the output is the encoding result h^b_{t-τ} of the bounding box of the preceding vehicle at the current time; FC_1 performs a dimension transformation on the final output h^b_{t-0} of GRU_b to obtain the time-series feature vector h^B_t of the preceding vehicle at the current time t.
10. The preceding vehicle position prediction system according to claim 8, characterized in that the preceding vehicle optical flow encoder comprises a CNN-based motion feature extraction network FEN and a second fully connected layer FC_2; the input of the FEN is the optical flow sequence F within the bounding boxes of the preceding vehicle, and the output is the optical flow encoding result within the bounding box of the preceding vehicle at the current time; the FEN is based on the ResNet50 framework and comprises, connected in sequence, a convolution layer conv1, a ReLU layer, a max pooling layer maxPool and 4 residual learning blocks; the number of input channels of conv1 is 2m, where m is the number of optical flow graphs sampled from the optical flow sequence F, i.e., m optical flow graphs are uniformly sampled from F; each of the 4 residual learning blocks has a three-layer structure, i.e., each residual learning block consists of convolution network layers and ReLU layers connected in series;
m optical flow graphs are uniformly sampled from the optical flow sequence F within the bounding boxes of the preceding vehicle; the vertical and horizontal components of the m optical flow graphs form 2m optical flow components, which are input into the FEN, and the output of the FEN is the motion feature of the optical flow within the bounding box of the preceding vehicle at the current time;
FC_2 performs a dimension transformation on the motion feature output by the FEN to obtain the motion feature vector h^F_t of the preceding vehicle at the current time t.
CN202110051940.3A 2021-01-15 2021-01-15 Vehicle-mounted video-based front vehicle position prediction method and prediction system Active CN112800879B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110051940.3A CN112800879B (en) 2021-01-15 2021-01-15 Vehicle-mounted video-based front vehicle position prediction method and prediction system

Publications (2)

Publication Number Publication Date
CN112800879A CN112800879A (en) 2021-05-14
CN112800879B true CN112800879B (en) 2022-08-26

Family

ID=75811025

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110051940.3A Active CN112800879B (en) 2021-01-15 2021-01-15 Vehicle-mounted video-based front vehicle position prediction method and prediction system

Country Status (1)

Country Link
CN (1) CN112800879B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113610900B (en) * 2021-10-11 2022-02-15 深圳佑驾创新科技有限公司 Method and device for predicting scale change of vehicle tail sequence and computer equipment
CN114255450A (en) * 2022-01-01 2022-03-29 南昌智能新能源汽车研究院 A near-field vehicle jamming behavior prediction method based on forward panoramic images
CN114445606A (en) * 2022-01-29 2022-05-06 北京精英路通科技有限公司 Method and device for capturing license plate image, electronic equipment and storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108846854A (en) * 2018-05-07 2018-11-20 中国科学院声学研究所 A kind of wireless vehicle tracking based on motion prediction and multiple features fusion
CN111914664A (en) * 2020-07-06 2020-11-10 同济大学 Vehicle multi-target detection and trajectory tracking method based on re-identification
CN111931905A (en) * 2020-07-13 2020-11-13 江苏大学 Graph convolution neural network model and vehicle track prediction method using same

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Vehicle behavior detection method based on a hybrid CNN-LSTM model; Wang Shuo et al.; Intelligent Computer and Applications; 2020-02-01 (No. 02); full text *

Also Published As

Publication number Publication date
CN112800879A (en) 2021-05-14

Similar Documents

Publication Publication Date Title
CN112800879B (en) Vehicle-mounted video-based front vehicle position prediction method and prediction system
Huang et al. End-to-end autonomous driving decision based on deep reinforcement learning
CN109740419A (en) A Video Action Recognition Method Based on Attention-LSTM Network
Bai et al. Deep learning based motion planning for autonomous vehicle using spatiotemporal LSTM network
CN111027461B (en) Vehicle track prediction method based on multi-dimensional single-step LSTM network
CN104506800B (en) The alert camera scene synthesis of the multi-direction electricity of one kind and comprehensive monitoring and controlling method and device
CN110516633B (en) Lane line detection method and system based on deep learning
CN110599521B (en) Method and prediction method for generating trajectory prediction model of vulnerable road users
CN111292366A (en) Visual driving ranging algorithm based on deep learning and edge calculation
CN115829171A (en) Pedestrian trajectory prediction method combining space information and social interaction characteristics
Wang et al. Adversarial learning for joint optimization of depth and ego-motion
CN116740424A (en) Transformer-based time series point cloud 3D target detection
CN112818935B (en) Multi-lane congestion detection and duration prediction method and system based on deep learning
CN109919107B (en) Traffic police gesture recognition method based on deep learning and unmanned vehicle
Lee et al. Ev-reconnet: Visual place recognition using event camera with spiking neural networks
Wen-juan et al. Application of vision sensing technology in urban intelligent traffic control system
CN118314530B (en) Video anti-tailing method based on abnormal event detection
CN117058474B (en) Depth estimation method and system based on multi-sensor fusion
Lee et al. Low computational vehicle lane changing prediction using drone traffic dataset
CN117649491A (en) Real test scene virtual reconstruction method for ice and snow aerial photographing driving data
CN114998402B (en) Monocular depth estimation method and device for pulse camera
CN116797640A (en) A depth and 3D key point estimation method for intelligent accompanying patrol vehicles
CN116934977A (en) A visual three-dimensional perception method and system based on three-dimensional occupancy prediction and neural rendering
Ayalew et al. Self-Supervised Representation Learning for Motion Control of Autonomous Vehicles
Lyu et al. Sensor Fusion and Motion Planning with Unified Bird’s-Eye View Representation for End-to-end Autonomous Driving

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant