CN107240120B - Method and device for tracking moving target in video - Google Patents
Method and device for tracking moving target in video
- Publication number
- CN107240120B (application CN201710254328.XA)
- Authority
- CN
- China
- Prior art keywords
- tracking
- moving target
- moving
- target
- video
- Legal status
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to a method and device for tracking a moving target in a video. The tracking method includes the following steps: calculating the occlusion rate of the moving target in the current frame of a video captured from a first viewing angle; calculating the learning rate of a spatio-temporal context model according to the occlusion rate of the moving target, and updating the spatio-temporal context model of the moving target according to the learning rate; obtaining image feature values of the moving target in the current frame, and updating a context prior model of the moving target according to the image feature values; and performing a convolution operation on the updated spatio-temporal context model and context prior model to obtain the tracking position of the moving target in the next frame of the video from the first viewing angle. The method has low computational complexity, high tracking efficiency, and high tracking accuracy. Correspondingly, the invention also provides a device for tracking a moving target in a video.
Description
Technical Field
The present invention relates to the technical field of video tracking, and in particular to a method and device for tracking a moving target in a video.
Background Art
With the vigorous development of information technology, computer vision is used more and more widely in the field of video tracking, especially in the video analysis of sports events, where tracking moving targets by computer vision can greatly reduce labor costs and improve analysis accuracy. In recent years, tracking algorithms based on online machine learning have developed rapidly, such as the online Boosting algorithm, the "tracking-detection-learning" algorithm, and tracking algorithms based on compressed sensing. However, because these online-machine-learning-based tracking methods need to keep learning new models, their computational complexity is high, which hurts tracking efficiency; they are also prone to tracking drift, so their tracking accuracy is low.
Summary of the Invention
In view of this, it is necessary to provide a fast, accurate, and effective method and device for tracking a moving target in a video, addressing the low tracking efficiency and low tracking accuracy of traditional moving-target tracking methods.
A method for tracking a moving target in a video comprises the following steps:
calculating the occlusion rate of the moving target in the current frame of a video captured from a first viewing angle;
calculating the learning rate of a spatio-temporal context model according to the occlusion rate of the moving target, and updating the spatio-temporal context model of the moving target according to the learning rate;
obtaining image feature values of the moving target in the current frame, and updating a context prior model of the moving target according to the image feature values;
performing a convolution operation on the updated spatio-temporal context model and context prior model to obtain the tracking position of the moving target in the next frame of the video from the first viewing angle.
In the above method, the learning rate of the spatio-temporal context model is calculated from the occlusion rate of the moving target in the current frame of the video from the first viewing angle, and the spatio-temporal context model of the moving target is updated according to that learning rate; the context prior model of the moving target is then updated according to the image feature values; and a convolution operation on the updated spatio-temporal context model and context prior model yields the tracking position of the moving target in the next frame of the video from the first viewing angle. Tracking the moving target in the next frame therefore only requires updating its spatio-temporal context model and context prior model, rather than continually learning new models, which effectively reduces computational complexity and improves tracking efficiency. Moreover, because the learning rate of the spatio-temporal context model is determined dynamically from the occlusion of the moving target, the method avoids learning an erroneous model when the moving target is occluded by other objects, effectively prevents tracking drift, and greatly improves tracking accuracy.
In one embodiment, the step of calculating the occlusion rate of the moving target in the current frame of the video from the first viewing angle includes:
detecting whether the tracking frames of different moving targets in the current frame intersect;
when the tracking frames of different moving targets intersect, calculating the length and width of the overlapping part between the tracking frames, and calculating the occlusion area of the moving target according to the length and width;
obtaining the pre-stored tracking frame area of the moving target, and calculating the occlusion rate of the moving target as the ratio of the occlusion area to the tracking frame area.
In one embodiment, the learning rate is calculated using a formula in which:
e is the base of the natural logarithm;
ΔS is the occlusion rate of the moving target;
k and a second quantity are constant parameters.
In one embodiment, the step of obtaining the image feature values of the moving target in the current frame includes:
obtaining the color intensity of the moving target in the current frame on the red channel, on the green channel, and on the blue channel;
assigning corresponding color-intensity weight values to the color intensities of the moving target on the red, green, and blue channels;
performing a weighted summation of the color intensities on the channels to obtain the image feature value of the moving target in the current frame.
In one embodiment, the method for tracking a moving target in a video further includes:
extracting the sideline area of the tracking field, establishing a top-view two-dimensional model of the tracking field, and projecting the tracking position to first projection coordinates in the top-view two-dimensional model.
In one embodiment, the method further includes:
obtaining a video from a second viewing angle, and calculating second projection coordinates, in the top-view two-dimensional model of the tracking field, of the tracking position of the moving target in the video frame that corresponds to the next frame in the video from the second viewing angle;
comparing the occlusion rate of the moving target in the current frame of the video from the first viewing angle and the occlusion rate of the moving target in the current frame of the video from the second viewing angle with a preset occlusion-rate threshold;
when both occlusion rates are less than or equal to the preset occlusion-rate threshold, calculating the target projection coordinates of the moving target in the top-view two-dimensional model of the tracking field from the first projection coordinates and the second projection coordinates;
when the occlusion rate of the moving target in the current frame of the video from the first viewing angle is greater than the preset occlusion-rate threshold, selecting the second projection coordinates as the target projection coordinates of the moving target in the top-view two-dimensional model of the tracking field; and when the occlusion rate in the current frame of the video from the second viewing angle is greater than the preset occlusion-rate threshold, selecting the first projection coordinates as the target projection coordinates.
In one embodiment, after the step of selecting the second projection coordinates as the target projection coordinates of the moving target in the top-view two-dimensional model of the tracking field, the method further includes: correcting the first projection coordinates according to the second projection coordinates;
and after the step of selecting the first projection coordinates as the target projection coordinates of the moving target in the top-view two-dimensional model of the tracking field, the method further includes: correcting the second projection coordinates according to the first projection coordinates.
A device for tracking a moving target in a video comprises:
an occlusion rate calculation module, configured to calculate the occlusion rate of the moving target in the current frame of the video from the first viewing angle;
a spatio-temporal context model updating module, configured to calculate the learning rate of the spatio-temporal context model according to the occlusion rate of the moving target, and to update the spatio-temporal context model of the moving target according to the learning rate;
a context prior model updating module, configured to obtain the image feature values of the moving target in the current frame, and to update the context prior model of the moving target according to the image feature values;
a tracking module, configured to perform a convolution operation on the updated spatio-temporal context model and context prior model to obtain the tracking position of the moving target in the next frame of the video from the first viewing angle.
In one embodiment, the spatio-temporal context model updating module includes:
an intersection detection submodule, configured to detect whether the tracking frames of different moving targets in the current frame intersect;
an occlusion area calculation submodule, configured to calculate, when the tracking frames of different moving targets intersect, the length and width of the overlapping part between the tracking frames, and to calculate the occlusion area of the moving target according to the length and width;
an occlusion rate calculation submodule, configured to obtain the pre-stored tracking frame area of the moving target, and to calculate the occlusion rate of the moving target as the ratio of the occlusion area to the tracking frame area.
In one embodiment, the learning rate calculation module calculates the learning rate using a formula in which:
e is the base of the natural logarithm;
ΔS is the occlusion rate of the moving target;
k and a second quantity are constant parameters.
Brief Description of the Drawings
FIG. 1 is a flowchart of a method for tracking a moving target in a video in one embodiment;
FIG. 2 is a flowchart of calculating the occlusion rate of a moving target in one embodiment;
FIG. 3 is a schematic diagram of the principle of calculating the occlusion area of a moving target in one embodiment;
FIG. 4 is a flowchart of a method for tracking a moving target in a video in another embodiment;
FIG. 5 is a schematic diagram of the display of the spatio-temporal context information of a moving target in one embodiment;
FIG. 6 is a schematic diagram of the display of the top-view two-dimensional model of the tracking field in one embodiment;
FIG. 7 is a schematic structural diagram of a device for tracking a moving target in a video in one embodiment;
FIG. 8 is a schematic structural diagram of a spatio-temporal context model updating module in one embodiment;
FIG. 9 is a schematic structural diagram of a context prior model updating module in one embodiment;
FIG. 10 is a schematic structural diagram of a spatio-temporal context model updating module in another embodiment.
Detailed Description
In order to make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it.
Referring to FIG. 1, a method for tracking a moving target in a video includes the following steps:
Step 102: calculate the occlusion rate of the moving target in the current frame of the video from the first viewing angle.
Specifically, the occlusion rate of the moving target indicates the degree to which the moving target is occluded, and is calculated from the occluded area of the moving target. The terminal detects whether the moving target is occluded; when it is, the terminal calculates the occlusion rate of the moving target; otherwise, the occlusion rate is 0.
Step 104: calculate the learning rate of the spatio-temporal context model according to the occlusion rate of the moving target, and update the spatio-temporal context model of the moving target according to the learning rate.
Specifically, the spatial context model concerns the spatial position relationship between the moving target and its local context, including distance and direction. Since a video sequence is continuous, temporal context correlation is also very important for the tracking result: the spatio-temporal context model of the moving target in each frame is obtained by learning, at a given learning rate, from the spatio-temporal context model and the spatial context model of the tracked target in the previous frame. When the moving target is occluded by other objects, its appearance model changes and the reliability of its spatial context model decreases; in this embodiment, the learning rate of the spatio-temporal context model is adjusted to prevent an erroneous model from being learned. Specifically, the learning rate is determined dynamically from the occlusion of the moving target, and the spatio-temporal context model of the moving target for the next frame of the video from the first viewing angle is updated according to this learning rate.
Step 106: obtain the image feature values of the moving target in the current frame, and update the context prior model of the moving target according to the image feature values.
Specifically, the context prior model reflects the spatial composition of the current local context of the moving target itself, and is related to the image features of the context region and their spatial positions. In this embodiment, the terminal obtains the image feature values of the moving target in the current frame and updates the context prior model of the moving target according to these values.
Step 108: perform a convolution operation on the updated spatio-temporal context model and context prior model to obtain the tracking position of the moving target in the next frame of the video from the first viewing angle.
In the above method, the learning rate of the spatio-temporal context model is calculated from the occlusion rate of the moving target in the current frame of the video from the first viewing angle, and the spatio-temporal context model of the moving target is updated according to that learning rate; the context prior model of the moving target is then updated according to the image feature values; and a convolution operation on the updated spatio-temporal context model and context prior model yields the tracking position of the moving target in the next frame of the video from the first viewing angle. Tracking the moving target in the next frame therefore only requires updating its spatio-temporal context model and context prior model, rather than continually learning new models, which effectively reduces computational complexity and improves tracking efficiency. Moreover, because the learning rate of the spatio-temporal context model is determined dynamically from the occlusion of the moving target, the method avoids learning an erroneous model when the moving target is occluded by other objects, effectively prevents tracking drift, and greatly improves tracking accuracy.
As shown in FIG. 2, in one embodiment, step 102 includes:
Step 1022: detect whether the tracking frames of different moving targets in the current frame intersect.
To ensure the accuracy of the initial position of each moving target and lay a good foundation for subsequent tracking, in this embodiment the initial positions of the moving targets in the first frame of the video from the first viewing angle are calibrated manually through human-computer interaction: tracking frames are selected by hand to determine the initial position of each moving target. Specifically, in this embodiment the tracking frame is a rectangle. During tracking, the terminal detects in real time whether the tracking frames of different moving targets intersect. If they do, occlusion occurs between the moving targets and step 1024 is executed; otherwise no occlusion occurs, and the occlusion rate of the moving target is directly set to 0.
Step 1024: when the tracking frames of different moving targets intersect, calculate the length and width of the overlapping part between the tracking frames, and calculate the occlusion area of the moving target according to the length and width.
As shown in FIG. 3, in this embodiment a coordinate system is established with the upper-left corner of the tracking frame as the origin, the X axis pointing right, and the Y axis pointing down. Once tracking of the current frame yields the tracking position of a moving target, the position coordinates of the vertices of its tracking frame are known. For ease of calculation, this embodiment uses the upper-left and lower-right vertex coordinates of each tracking frame: the upper-left vertex of tracking frame K1 is (minX1, minY1) and its lower-right vertex is (maxX1, maxY1); the upper-left vertex of tracking frame K2 is (minX2, minY2) and its lower-right vertex is (maxX2, maxY2). Tracking frames K1 and K2 intersect at two points, E and F. From the abscissa of the lower-right vertex of K1 and the ordinate of the upper-left vertex of K2, the coordinates of intersection E are (maxX1, minY2); similarly, from the abscissa of the upper-left vertex of K2 and the ordinate of the lower-right vertex of K1, the coordinates of intersection F are (minX2, maxY1). With E and F known, the length and width of the overlapping part of K1 and K2 can be calculated: the difference between the abscissa of E and the abscissa of the upper-left vertex of K2 gives the width of the overlap, and the difference between the ordinate of F and the ordinate of the upper-left vertex of K2 gives the length of the overlap. The product of the length and the width is then the occlusion area of the moving target: Soverlap = (maxX1 − minX2) * (maxY1 − minY2).
In this embodiment, the occlusion area is obtained by computing the intersection coordinates and then the length and width of the overlap between the tracking frames. It should be noted, however, that the above embodiment does not limit how the occlusion area is calculated. In other embodiments, the occlusion area can be computed directly from the coordinates of the upper-left and lower-right vertices of tracking frames K1 and K2. For ease of explanation, still taking FIG. 3 as an example: in one embodiment, define minX = max(minX1, minX2), i.e. minX is the larger of minX1 and minX2; likewise define maxX = min(maxX1, maxX2), the smaller of maxX1 and maxX2; minY = max(minY1, minY2), the larger of minY1 and minY2; and maxY = min(maxY1, maxY2), the smaller of maxY1 and maxY2. During tracking, the terminal compares minX with maxX and minY with maxY in real time, judges from the comparison whether the tracking frames overlap, and calculates the occlusion area when they do. Specifically, if minX < maxX and minY < maxY, tracking frames K1 and K2 overlap and the moving target is occluded, with occlusion area Soverlap = (maxX − minX) * (maxY − minY). As shown in FIG. 3, in this embodiment minX = minX2, minY = minY2, maxX = maxX1, maxY = maxY1, and the occlusion area is Soverlap = (maxX1 − minX2) * (maxY1 − minY2).
Step 1026: obtain the pre-stored tracking frame area of the moving target, and calculate the occlusion rate of the moving target as the ratio of the occlusion area to the tracking frame area.
Specifically, in step 1022 the tracking frame area is calculated and stored once the tracking frame of the moving target has been calibrated. After the occlusion area of the moving target is obtained in step 1024, the terminal reads the tracking frame area and calculates the ratio of the occlusion area to the tracking frame area, giving the occlusion rate of the moving target as:

ΔS = Soverlap / S0

where Soverlap is the occlusion area of the moving target and S0 is the tracking frame area.
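As a minimal illustration of the computation just described, the Python sketch below derives the overlap area of two axis-aligned tracking frames and the resulting occlusion rate; the function and variable names are illustrative, not taken from the patent:

```python
def occlusion_rate(box1, box2):
    """Occlusion rate of the target in box1 caused by box2.

    Boxes are (minX, minY, maxX, maxY) in image coordinates
    (origin at top-left, X to the right, Y downward), matching
    the coordinate convention of the embodiment.
    """
    minX1, minY1, maxX1, maxY1 = box1
    minX2, minY2, maxX2, maxY2 = box2
    # Overlap interval on each axis, per the minX/maxX, minY/maxY rule.
    minX, maxX = max(minX1, minX2), min(maxX1, maxX2)
    minY, maxY = max(minY1, minY2), min(maxY1, maxY2)
    if minX >= maxX or minY >= maxY:
        return 0.0  # frames do not overlap: occlusion rate is 0
    s_overlap = (maxX - minX) * (maxY - minY)
    s0 = (maxX1 - minX1) * (maxY1 - minY1)  # tracking-frame area (pre-stored in the patent)
    return s_overlap / s0
```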
In one embodiment, in step 104, the learning rate is calculated using formula (3), in which e is the base of the natural logarithm, ΔS is the occlusion rate of the moving target, and k and a second constant are constant parameters. Specifically, k takes values from 2 to 4, and the second constant from 1.5 to 2.5; in one embodiment, k = 3.
In one embodiment, in step 106, the step of obtaining the image feature values of the moving target in the current frame includes: obtaining the color intensity of the moving target in the current frame on the red, green, and blue channels; assigning corresponding color-intensity weight values to the color intensities on the three channels; and performing a weighted summation of the color intensities on the channels to obtain the image feature value of the moving target in the current frame.
Specifically, the color-intensity weight values assigned to the red, green, and blue channels are determined by how much the color intensities of different moving targets differ on each channel: the larger the difference in color intensity on a channel, the larger the weight value for that channel. In this embodiment, the color feature value of each moving target, determined from the color differences between the moving targets, is used to update the context prior model, which ensures that the context prior model tracks accurately and further improves tracking accuracy.
In one embodiment, the method further includes: extracting the sideline area of the tracking field, establishing a top-view two-dimensional model of the tracking field, and projecting the tracking position to first projection coordinates in the top-view two-dimensional model.
The coordinates and positional relationships of the moving targets obtained by tracking through steps 102 to 108 are positions in the original view captured by the camera at the first viewing angle. To visualize the tracking results for easier analysis, this embodiment builds a top-view two-dimensional model of the tracking field that displays the tracking position of each moving target synchronously. In this model each moving target has a target identifier; after each frame is tracked, the target identifier is moved from the first projection coordinates of the previous frame to the first projection coordinates corresponding to the currently determined tracking position.
Generally, the two-dimensional model of the tracking field is a top view, while the original video is typically shot as a side view at a certain angle. In this embodiment, the viewing angle and data scale are converted according to the position and angle of the camera, so that the tracking position of the moving target is displayed synchronously on the top-view two-dimensional model of the tracking field. Specifically, this embodiment establishes the conversion relationship between the original video image and the two-dimensional model through a homography. First, the projective transformation of the two-dimensional plane is expressed as the product of a vector in homogeneous coordinates and a 3x3 matrix, i.e. x' = Hx, where the homography matrix has the form:

H = | h11  h12  h13 |
    | h21  h22  h23 |
    | h31  h32   1  |
From the homography matrix above, a planar homography has eight degrees of freedom, so solving for the eight unknowns in the matrix yields the homography and completes the target projection transformation. Each pair of corresponding points contributes two equations through the matrix product above, so solving for all eight unknowns requires four pairs of points; hence the homography can be determined from four pairs of corresponding point coordinates alone. Specifically, in this embodiment, four vertex coordinates of the tracking field are determined by extracting the sideline area of the tracking field, the transformation matrix is solved, and the two-dimensional projective transformation is carried out. Because the two-dimensional projection of the three-dimensional video image is computed through the homography matrix, no parameter information of the camera is needed; the video analysis system is simple to use, and the conversion is highly flexible.
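Under the assumptions above, the projection step can be sketched in Python as follows: the eight unknowns of H (with the lower-right entry fixed to 1) are solved from four pairs of corresponding points, after which any image point can be mapped into the top-view two-dimensional model. The names and the use of NumPy are illustrative choices, not part of the patent:

```python
import numpy as np

def fit_homography(src_pts, dst_pts):
    """Solve the 3x3 homography H (h33 = 1) from four point pairs.

    Each correspondence (x, y) -> (u, v) contributes two linear
    equations in the eight unknowns, so four pairs determine H.
    """
    A, rhs = [], []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); rhs.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); rhs.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(rhs, float))
    return np.append(h, 1.0).reshape(3, 3)

def project(H, x, y):
    """Map an image point to the top-view model via x' = Hx."""
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w
```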
In one embodiment, the method further includes: obtaining a video from a second viewing angle, and calculating second projection coordinates, in the top-view two-dimensional model of the tracking field, of the tracking position of the moving target in the video frame corresponding to the next frame in the video from the second viewing angle; comparing the occlusion rate of the moving target in the current frame of the video from the first viewing angle and that in the current frame of the video from the second viewing angle with a preset occlusion-rate threshold; when both occlusion rates are less than or equal to the preset threshold, calculating the target projection coordinates of the moving target in the top-view two-dimensional model from the first and second projection coordinates; when the occlusion rate in the video from the first viewing angle exceeds the preset threshold, selecting the second projection coordinates as the target projection coordinates; and when the occlusion rate in the video from the second viewing angle exceeds the preset threshold, selecting the first projection coordinates as the target projection coordinates.
Specifically, the determination of the tracking position of the moving target in the next frame of the video from the second viewing angle, and the conversion of that tracking position to the second projection coordinates in the top-view two-dimensional model of the tracking field, follow the same process and principle as for the video from the first viewing angle, and are not repeated here.
In multi-target tracking scenes, because the targets move in complex ways, large-area or even complete occlusion occurs very easily. If two tracking frames are superimposed, a drift jump may occur; moreover, even when no drift jump occurs during heavy occlusion, the coordinate information obtained for the occluded object is inaccurate, because the distance between mutually occluding objects cannot be judged from that viewing angle. This embodiment therefore uses the occlusion of the moving target to judge whether the first projection coordinates obtained from the first viewing angle or the second projection coordinates obtained from the second viewing angle are erroneous. If the moving target is heavily occluded in one viewing angle, i.e. its occlusion rate exceeds the preset occlusion-rate threshold, the projection coordinates obtained from that viewing angle are deemed erroneous, and the projection coordinates obtained from the other viewing angle are selected as the final target projection coordinates. If the occlusion rates in both viewing angles are less than or equal to the preset threshold, the projection coordinates obtained from both viewing angles are correct; in that case weight values are assigned to the first and second projection coordinates, and a weighted sum of the two gives the target projection coordinates of the moving target in the top-view two-dimensional model of the tracking field, while the tracking results are also optimized according to both sets of projection coordinates to ensure accurate tracking.
Specifically, because the size of the tracking frame is fixed during tracking while the tracked moving target appears larger near the camera and smaller far from it, in one embodiment the preset occlusion-rate threshold that defines the occlusion condition is related to the distance of the target from the camera in the two-dimensional model: the preset threshold is calculated from the distance of the moving target from the camera in the top-view two-dimensional model of the tracking field, and the weights of the tracking positions of the moving target in the next frames of the first-viewing-angle and second-viewing-angle videos are calculated from the distances of the moving target from the respective cameras.
In this embodiment, videos from different viewing angles are tracked simultaneously, the tracking positions obtained from each viewing angle are all projected into the top-view two-dimensional model of the tracking field, and the tracking results for the same moving target from the two viewing angles are unified according to the occlusion of the moving target. This dual-view video tracking optimizes the tracking results, ensures accurate tracking, and greatly improves tracking accuracy.
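A minimal sketch of the dual-view selection and fusion rule described above, assuming the projections and occlusion rates have already been computed. The equal default weights are placeholders: the embodiment derives the weights from the distance of the target to each camera:

```python
def fuse_projections(p1, occ1, p2, occ2, thr, w1=0.5, w2=0.5):
    """Select or fuse the projected coordinates of one target.

    p1, p2: (x, y) projections from the first and second viewing angles;
    occ1, occ2: occlusion rates of the target in the current frames;
    thr: preset occlusion-rate threshold; w1, w2: view weights.
    """
    if occ1 > thr >= occ2:
        return p2  # first view heavily occluded: trust the second view
    if occ2 > thr >= occ1:
        return p1  # second view heavily occluded: trust the first view
    # Both views usable (the case where both exceed the threshold is not
    # spelled out in the embodiment): weighted sum of the two projections.
    return (w1 * p1[0] + w2 * p2[0], w1 * p1[1] + w2 * p2[1])
```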
In one embodiment, after the step of selecting the second projection coordinates as the target projection coordinates of the moving target in the top-view two-dimensional model of the tracking field, the method further includes: correcting the first projection coordinates according to the second projection coordinates; and after the step of selecting the first projection coordinates as the target projection coordinates, it further includes: correcting the second projection coordinates according to the first projection coordinates. In this embodiment, when the tracking result from one viewing angle is erroneous, it is corrected using the tracking result from the other viewing angle, and the spatio-temporal context model is updated according to the corrected tracking result, so that subsequent tracking results remain accurate, further improving tracking accuracy.
Further, to facilitate understanding of the technical solution of the present invention, the above tracking method is described in detail below with reference to FIG. 4 to FIG. 6, taking football video tracking as an example. For ease of explanation, the two football teams are defined as team A and team B; the players of team A are identified by rectangles in the top-view two-dimensional model of the pitch, and the players of team B by circles.
A method for tracking a moving target in a video includes the following steps:
1) Determine the initial positions of the moving targets, and calibrate the moving-target tracking frames.
First, the t-th frame image is read, and the initial position of each player (i.e. each moving target) in the t-th frame is determined by manually calibrating a tracking frame. Specifically, when the player tracking frames are calibrated manually, the tracking frames can be selected with the mouse to calibrate the initial positions of the players in the first frames of the videos from the first and second viewing angles, thereby determining the players' initial positions in both videos. After the initial position calibration is completed, the terminal further calculates and stores the tracking frame area of each player's tracking frame.
2) Calculate the occlusion rate of each moving target in the current frame.
Specifically, the occlusion rate of each player in the corresponding current frames of the videos from the first and second viewing angles is calculated from the tracking frame area of each player's tracking frame and the occluded area of that player in the current frame. The calculation principle and process of the occlusion rate of a moving target have been described in detail in the foregoing embodiments and are not repeated here.
3) Calculate the learning rate of the spatio-temporal context model, and update the spatio-temporal context model.
Temporal context information is the temporal correlation between consecutive frames; spatial context information is the combination of the tracked target and the background image within a definable range around it. To track a target using spatio-temporal context information, a tracking model must first be established. Specifically, the target tracking problem is the problem of the probability of a target appearing at a position. Let o be the target to be tracked and x a two-dimensional coordinate point in the image, and let P(x|o) denote the probability that coordinate x lies in target o; this converts target tracking into a maximum-confidence calculation problem.
Let:

M(x) = P(x|o);  formula (4)
Then the coordinate x at which the confidence map M(x) attains its maximum can be regarded as the most likely position of target o. As shown in FIG. 5, the area inside the solid-line box is the target region, and the area inside the outer dashed box is the local context region. Let the target center coordinate x* represent the position of the target, and let z be a point in the local context region. Define the local context region of target x* as Ωc(x*), and define the context feature set of this local region as Xc = {c(z) = (I(z), z) | z ∈ Ωc(x*)}, where I(z) is the image feature value at coordinate z. Using the total probability formula, with the local context features as the intermediate quantity, formula (4) expands to:

M(x) = Σ_{c(z)∈Xc} P(x | c(z), o) · P(c(z) | o);  formula (5)
where P(x | c(z), o) is the probability that the target appears at point x given the target o and its local context feature c(z); it models the spatial relationship between the position of the tracked target and its context information, i.e. it establishes the spatial context model. P(c(z) | o) is the probability that a context feature c(z) appears in target o; it is the context prior probability of target o, i.e. the appearance prior model of the current local context. The context prior model ensures that, when the target position is predicted by computing the confidence map M(x), the selected context is similar in appearance to the context at the target position in the previous frame, while the spatial context model ensures that the newly selected target position is not only similar to the original target in appearance but also reasonable in spatial position, thereby avoiding, to a certain extent, interference from other similar-looking objects and the resulting drift in tracking.
Based on the above, in this embodiment a specific mathematical model is established in advance for each part of formula (5), including the confidence map, the spatial context model, and the context prior model.
First, the confidence map is modeled as follows. Since the target position in the first frame of the video is known (from the tracking-frame calibration of the initial frame), the confidence map M(x) should have the property that the closer a position is to the target position x*, the higher its confidence. Therefore, let:

M(x) = b·e^(−|(x − x*)/α|^β);  formula (6)
where b is a normalization constant; α is a scale constant; and β is a constant controlling the shape of the function curve. α is related to the size of the tracked target and ranges from 1.75 to 2.75; β ranges from 0.5 to 1.5. In one embodiment, α = 2.25 and β = 1.
Second, the spatial context model P(x | c(z), o) is modeled as follows. Since the spatial context model concerns the spatial position relationship between the tracked target and its local context, including both distance and direction, P(x | c(z), o) is defined as a non-radially-symmetric function:
P(x | c(z), o) = h^sc(x − z);  formula (7)
where x is the position of the target and z is a position in its local context. Even when two points z1 and z2 are at the same distance from the target center x*, because their locations differ, h^sc(x* − z1) ≠ h^sc(x* − z2), meaning that they represent different contexts for x*; this effectively distinguishes different spatial relationships and prevents ambiguity.
Finally, the context prior model P(c(z) | o) is modeled as follows. The context prior model reflects the spatial composition of the current local context itself and, intuitively, should be related to the image features of the context region and their spatial positions. Therefore, let:
P(c(z) | o) = I(z)·ωσ(z − x*);  formula (8)
where I(z) is the image feature value at point z in the local context region, and ωσ(·) is a weight function.
Specifically, the tracking process can be compared to the way the human eye tracks an object: context regions closer to the tracked target can be regarded as more relevant to it and therefore more important, while context regions farther from the target can be regarded as largely irrelevant to it and therefore less important. Accordingly, define:

ωσ(Δ) = λ·e^(−|Δ|²/σ²);  formula (9)
where Δ is the distance between two points; λ is a normalization constant used to keep the value of P(c(z) | o) between 0 and 1, in accordance with the definition of a probability function; and σ is a scale parameter related to the size of the tracked target.
Substituting formula (9) into formula (8) gives the context prior model:

P(c(z) | o) = I(z)·λ·e^(−|z − x*|²/σ²);  formula (10)
That is, the spatial composition of the local context is modeled as a Gaussian-weighted sum of the image feature values of the points in this region.
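For illustration, the confidence map of formula (6) and the Gaussian weight of formula (9) can be evaluated over a local context window as in the sketch below; the default values of sigma, b, and lam are arbitrary placeholders, while the alpha and beta defaults follow the embodiment:

```python
import numpy as np

def context_windows(shape, center, alpha=2.25, beta=1.0, sigma=10.0, b=1.0, lam=1.0):
    """Confidence map M(x) (formula 6) and weight window w_sigma (formula 9)
    on a grid of the given shape, centred on the target position."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    dist = np.hypot(xs - center[0], ys - center[1])    # distance |x - x*|
    M = b * np.exp(-(dist / alpha) ** beta)            # formula (6)
    w_sigma = lam * np.exp(-(dist ** 2) / sigma ** 2)  # formula (9)
    return M, w_sigma
```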
Further, in this embodiment, after the confidence map, the spatial context model, and the context prior model have been modeled as above, the spatio-temporal context model is updated according to them:
First, combining formula (5) with formulas (6), (7), and (10) yields:
m(x) = b·e^(−|(x − x*)/α|^β) = Σ_{z∈Ω_c(x*)} h^sc(x − z)·I(z)·ω_σ(z − x*);  formula (11)
where h^sc(x − z) is the spatial context model, i.e., the quantity to be computed and learned for each frame of the image.
By the definition of convolution, (f ⊗ g)(x) = Σ_z f(z)·g(x − z), formula (11) can be rewritten as:
m(x) = h^sc(x) ⊗ (I(x)·ω_σ(x − x*));  formula (12)
where ⊗ denotes convolution.
By the convolution theorem:
F(m(x)) = F(h^sc(x)) ⊙ F(I(x)·ω_σ(x − x*));  formula (13)
where ⊙ denotes element-wise multiplication.
Hence:
h^sc(x) = F⁻¹( F(b·e^(−|(x − x*)/α|^β)) / F(I(x)·ω_σ(x − x*)) );  formula (14)
where F and F⁻¹ denote the Fourier transform and the inverse Fourier transform, respectively.
Suppose that at frame t the target center position and the target's local context region Ω_c(x*) are known; the spatial context model of the tracked target and its local context in frame t can then be computed, denoted h^sc_t. Because a continuous video sequence is being processed, temporal context correlation is also crucial to the tracking result. To take this dimension into account, a spatio-temporal context model learning rate ρ is introduced, and the spatio-temporal context model of the tracked target in each frame is expressed as two parts, the historical spatio-temporal context model and the newly learned spatial context model, as follows:
H^stc_{t+1} = (1 − ρ)·H^stc_t + ρ·h^sc_t
where h^sc_t is the spatial context model of frame t; H^stc_t is the spatio-temporal context model of frame t; and H^stc_{t+1} is the spatio-temporal context model of frame t+1.
In general, in scenarios with multiple similar-looking tracked targets, once occlusion occurs the target's appearance model has already changed substantially; if the spatio-temporal context model keeps learning and updating at the same rate, it will keep absorbing a wrong model and eventually lose the target. In this embodiment, the learning rate is determined dynamically from the occlusion state of the moving target: ρ is an automatically updated dynamic value, which effectively prevents the historical model information from being wiped out by over-fast updates. Concretely, when the tracked target is occluded by other objects, its appearance model changes and the spatial context model becomes less trustworthy, so the learning rate must be reduced to avoid learning a wrong model and to keep tracking accurate. In this embodiment, the learning rate of the spatio-temporal context model is computed from the occlusion rate of the moving target and used to update the spatio-temporal context model obtained above. The learning rate is specifically computed by formula (3), which is not repeated here.
4) Obtain the image feature values of the moving target and update the context prior model of the moving target.
Specifically, the image feature value of the moving target is computed by the following formula:
I(x) = w1·I_R(x) + w2·I_G(x) + w3·I_B(x);  formula (17)
where I_R(x), I_G(x), and I_B(x) are the color intensities at x on the red, green, and blue channels, respectively, and w1, w2, w3 are weights with w1 + w2 + w3 = 1. In one embodiment, the two teams' jersey colors differ most clearly in the R channel, so I_R(x) is given the largest weight: w1 = 0.4, w2 = 0.3, w3 = 0.3.
5) Perform a convolution of the updated spatio-temporal context model with the updated context prior model to obtain the tracking position of the moving target in the next frame.
Specifically, at frame t+1 the spatio-temporal context model is known to be the updated H^stc_{t+1}; after the context prior model of frame t+1 is computed, the confidence map of frame t+1 is obtained by the convolution of formulas (7) to (11), as follows:
m_{t+1}(x) = F⁻¹( F(H^stc_{t+1}(x)) ⊙ F(I_{t+1}(x)·ω_{σ_t}(x − x*_t)) );  formula (18)
Then, when the confidence map m_{t+1}(x) of frame t+1 attains its maximum, the corresponding x is taken as the center position x*_{t+1} of the moving target in frame t+1, which determines the tracking position of the moving target; that is, the tracking position of the tracked player in the next frame of the first-view video and the tracking position of the tracked player in the next frame of the second-view video are determined respectively.
6) Build a top-view two-dimensional model of the pitch; project the tracking position of the moving target in the next frame of the first-view video to first projected coordinates in the top-view model, and project the tracking position of the moving target in the next frame of the second-view video to second projected coordinates in the top-view model.
Specifically, in this embodiment the four corner points of one half of the football pitch are selected as the four reference points for computing the planar homography matrix. First, the sideline regions of the pitch are extracted using thresholding techniques from digital image processing together with Hough-transform line detection; the scattered line segments are then merged to obtain the straight-line equations of the sidelines and four sets of calibration-point coordinates; finally, the transformation matrices of the two views are computed from the four sets of calibration points. The top-view two-dimensional model of the pitch is shown in Figure 6.
7) Detect whether the first projected coordinates and the second projected coordinates are erroneous.
Specifically, the occlusion rate of the player in the current frame of the first-view video and the occlusion rate of the player in the current frame of the second-view video are each compared with a preset occlusion-rate threshold to decide whether the first and second projected coordinates are erroneous. If the occlusion rate of the player in the current frame of the first-view video exceeds the threshold, the first projected coordinates are erroneous; if the occlusion rate in the second-view video exceeds the threshold, the second projected coordinates are erroneous. If both occlusion rates are less than or equal to the threshold, both the first and the second projected coordinates are correct. In this embodiment, the preset occlusion-rate threshold is computed from the player's distance to the camera in the top-view model of the pitch, where that distance is:
where [x, y] are the player's current-frame coordinates on the top-view model of the pitch, and height and width are the height and width of the top-view model; Δd denotes the resulting player-to-camera distance in the model (formula (19)).
The preset occlusion-rate threshold is then:
threshold = γ·e^(−μ·Δd);  formula (20)
where threshold is the preset occlusion-rate threshold, and γ and μ are constant parameters: γ adjusts the range of the threshold, and μ adjusts how quickly the threshold varies.
8) When the first projected coordinates or the second projected coordinates are erroneous, take the projected coordinates from the other view as the player's target projected coordinates.
Specifically, the first-view video and the second-view video are shot from different angles: when a player is occluded in the video shot from the first view, the same player is not occluded in the video shot from the second view at that moment, so the first and second projected coordinates cannot both be erroneous at the same time. Hence, when the first projected coordinates are erroneous, the second projected coordinates are taken as the player's target projected coordinates, and tracking of frame t ends; when the second projected coordinates are erroneous, the first projected coordinates are taken as the player's target projected coordinates, and tracking of frame t ends.
Further, in one embodiment, when the tracking result in one view is erroneous, the erroneous result is also corrected from the tracking result in the other view, ensuring that subsequent tracking stays accurate. Suppose the first projected coordinates are erroneous. In the first view, the maximum-likelihood position given by the confidence map for the tracked player undergoing tracking drift is P1, and the projection matrix from the first-view video to the top-view pitch model is H1. Meanwhile, in the second view the maximum-likelihood position of the tracked player is P2, and the projection matrix from the second-view video to the top-view model is H2, so the second projected coordinates of P2 on the top-view model are H2·P2. Since P2 is a correct tracking result, the erroneous tracking position P1 is updated to the correct tracking position, whose first projected coordinates in the first view are:
P1 = H1⁻¹·H2·P2;  formula (21)
Similarly, if the second projected coordinates are erroneous, they are corrected from the first projected coordinates; the correction principle is the same as that for the first projected coordinates described above and is not repeated. After the first or second projected coordinates have been corrected, the spatio-temporal context model of the corresponding view is further updated from the corrected tracking result, ensuring that subsequent tracking results stay accurate.
9) When both the first projected coordinates and the second projected coordinates are correct, compute the player's target projected coordinates from the first and second projected coordinates.
When both projected coordinates are correct, the player's target projected coordinates are determined by letting the two sets of coordinates adjust each other; once the target projected coordinates are determined, tracking of frame t ends. Specifically, in the projectively transformed image, a player's position is sharp at locations close to the camera, while far from the camera the player is stretched by the deformation and the exact position becomes blurred. Therefore, the closer the target is to the camera of a given view, the more reliable the tracking result from that camera's video is considered to be, and the larger the weight that view's result receives when the final target position is determined. The weight values of the first and second projected coordinates are therefore determined from the target's distance to each camera.
Suppose that in the first view the camera is at the position shown in Figure 6. Define the position of the first-view camera as the origin, and let the coordinates of player M of team B on the top-view pitch model be pos_model1 = [x1, y1]; the player's distance to the first-view camera is then d1 = √(x1² + y1²).
As shown in Figure 6, the second-view camera faces the first-view camera from the opposite side of the pitch. With the tracking result of player M from the second-view camera converted to top-view model coordinates pos_model2 = [x2, y2], the player's distance d2 to the second-view camera is computed analogously from pos_model2 and the model's width and height,
where width and height are the width and height of the top-view model of the pitch.
Then, after the first and second projected coordinates are fused, the final position of player M on the two-dimensional pitch model is pos_final = [x, y], obtained as a weighted combination of pos_model1 and pos_model2 in which each view's weight decreases with the player's distance to that view's camera.
Further, in one embodiment, football videos from two views were tracked according to steps (1) to (10) above. Tracking ran on a PC with an Intel Core i5 CPU clocked at 2.5 GHz and 8 GB of memory; the programming environment was Matlab 2014a. The original videos from the two views were in AVI format, with a frame size of 1696×1080 and a file size of about 20 MB each; both videos were about 18 seconds long at 30 frames per second, roughly 540 frames in total. In this embodiment, the tracking rate reached 1 s per frame and the tracking accuracy reached 100%.
Referring to Figure 7, a device 700 for tracking a moving target in video includes:
an occlusion rate calculation module 702, configured to calculate the occlusion rate of the moving target in the current frame of the video under the first view;
a spatio-temporal context model update module 704, configured to calculate the learning rate of the spatio-temporal context model from the occlusion rate of the moving target, and to update the spatio-temporal context model of the moving target according to the learning rate;
a context prior model update module 706, configured to acquire the image feature values of the moving target in the current frame, and to update the context prior model of the moving target according to the image feature values;
a tracking module 708, configured to convolve the updated spatio-temporal context model with the updated context prior model to obtain the tracking position of the moving target in the next frame of the video under the first view.
As shown in Figure 8, in one embodiment the spatio-temporal context model update module 704 includes:
an intersection detection sub-module 7042, configured to detect whether the tracking boxes of different moving targets in the current frame intersect;
an occlusion area calculation sub-module 7044, configured to, when the tracking boxes of different moving targets intersect, calculate the length and width of the overlap between the tracking boxes and compute the occluded area of the moving target from that length and width;
an occlusion rate calculation sub-module 7046, configured to obtain the pre-stored tracking-box area of the moving target and compute the occlusion rate of the moving target as the ratio of the occluded area to the tracking-box area.
In one of these embodiments, the spatio-temporal context model update module 704 calculates the learning rate using the following formula:
where e is the base of the natural logarithm; ΔS is the occlusion rate of the moving target; and k, together with the other symbols in the formula, are constant parameters.
As shown in Figure 9, in one embodiment the context prior model update module 706 includes:
a color intensity acquisition sub-module 7062, configured to acquire the color intensities of the moving target in the current frame on the red channel, the green channel, and the blue channel;
a color intensity weight selection sub-module 7064, configured to assign corresponding color-intensity weight values to the moving target's color intensities on the red, green, and blue channels;
an image feature value calculation sub-module 7066, configured to compute the weighted sum of the color intensities over the channels to obtain the image feature values of the moving target in the current frame.
As shown in Figure 10, in one embodiment the device 700 for tracking a moving target in video further includes:
a two-dimensional model projection module 710, configured to extract the sideline regions of the tracking venue, build a top-view two-dimensional model of the venue, and project the tracking position to first projected coordinates in the top-view model.
In one embodiment, the device 700 for tracking a moving target in video is configured to acquire the video under the second view and to compute the second projected coordinates, in the top-view model of the venue, of the tracking position of the moving target in the video frame of the second view that corresponds to the next frame. As shown in Figure 10, the device 700 further includes:
an occlusion rate comparison module 712, configured to compare the occlusion rate of the moving target in the current frame of the first-view video and the occlusion rate of the moving target in the current frame of the second-view video with the preset occlusion-rate threshold;
a first target projection coordinate selection module 714, configured to, when both occlusion rates are less than or equal to the preset occlusion-rate threshold, compute the target projected coordinates of the moving target in the top-view model of the venue from the first and second projected coordinates;
a second target projection coordinate selection module 716, configured to select the second projected coordinates as the target projected coordinates of the moving target in the top-view model when the occlusion rate of the moving target in the current frame of the first-view video exceeds the preset threshold, and to select the first projected coordinates as the target projected coordinates when the occlusion rate in the second-view video exceeds the preset threshold.
As shown in Figure 10, in one embodiment the device 700 for tracking a moving target in video further includes:
a projected coordinate correction module 718, configured to correct the first projected coordinates from the second projected coordinates when the occlusion rate of the moving target in the current frame of the first-view video exceeds the preset occlusion-rate threshold, and to correct the second projected coordinates from the first projected coordinates when the occlusion rate of the moving target in the current frame of the second-view video exceeds the preset threshold.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not every possible combination of these features has been described; nevertheless, any combination of them that involves no contradiction should be regarded as falling within the scope of this specification.
The embodiments above express only several implementations of the invention; although their description is specific and detailed, it must not be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art may make various modifications and improvements without departing from the concept of the invention, all of which fall within its scope of protection. The scope of protection of this patent is therefore defined by the appended claims.