WO2022217840A1 - A high-precision multi-target tracking method under a complex background - Google Patents

A high-precision multi-target tracking method under a complex background Download PDF

Info

Publication number
WO2022217840A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
matching
tracking
result
trajectory
Prior art date
Application number
PCT/CN2021/119796
Other languages
English (en)
French (fr)
Inventor
辛付豪
朱伟
董小舒
刘羽
张典
陆园
Original Assignee
南京莱斯电子设备有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 南京莱斯电子设备有限公司
Publication of WO2022217840A1 publication Critical patent/WO2022217840A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/40Image enhancement or restoration using histogram techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/48Matching video sequences
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Definitions

  • the invention relates to the technical field of target tracking, and in particular to a high-precision multi-target tracking method under a complex background.
  • Visual object tracking technology is an important means of processing the massive amount of video data that is now being acquired, transmitted, and analyzed.
  • Visual target tracking is a basic research problem in computer vision, and has broad application prospects in many aspects such as video surveillance, unmanned driving, human-computer interaction, planetary detection, and military applications.
  • the problem to be solved by visual target tracking can be expressed as: in a video sequence, given the position and size of the target in the first frame (usually a rectangular bounding box), the position and size of the target need to be predicted in subsequent frames.
  • Target tracking algorithms can be divided into those based on generative models and those based on discriminative models. Generative methods use the results of historical frames to build a statistical model describing the target's features, which can effectively handle target loss during tracking, but they usually ignore the background information around the target and easily lose the target when the background is cluttered. Most traditional correlation-filter-based tracking methods extract features only with hand-designed descriptors, so their ability to represent the target is limited, the target position determined from the response map is not accurate enough, and satisfactory performance usually cannot be obtained under interference such as occlusion and background clutter. Before 2010, target tracking generally relied on classical algorithms such as mean shift, particle filtering, Kalman filtering, subspace learning, sparse representation, and kernel density estimation.
  • Target tracking algorithms based on deep learning can be divided into target tracking algorithms based on deep features, target tracking algorithms based on Siamese networks, target tracking algorithms based on recurrent neural networks, target tracking algorithms based on generative adversarial networks, and target tracking algorithms based on other specific networks.
  • the present invention proposes a high-precision multi-target tracking method under a complex background, which overcomes the poor tracking performance of traditional target tracking algorithms in complex scenes, and includes the following steps:
  • Step 1 Input the acquired video data into a residual network, perform target resolution feature extraction, and output an extraction result at an output end, where the extraction result includes target resolution features of different dimensions.
  • the residual network may adopt ResNet.
  • the target resolution features of different dimensions in the extraction result have different characteristics, and the feature expression ability can be enhanced according to the different characteristics.
  • the problem of scale change that often occurs in the target tracking process is solved.
  • Step 2 calculating the correlation filter response map of the target resolution features;
  • Step 3 using the target detection network to obtain the detection result of the target; the detection result defines the motion state of the target as an 8-dimensional space, which represents the state of the trajectory at a certain moment;
  • Step 4 matching the detection result of the target with the predicted trajectory to obtain a matching result, and the matching result includes the value of two metrics of fusion motion information and appearance information;
  • Step 5 Compare the fused value of the two metrics with a preset matching threshold to obtain a target tracking result.
  • the step 2 includes:
  • Step 2-1 perform an interpolation operation on the target resolution features of the different dimensions, and convert the features of the different resolutions into a continuous space domain, and the interpolation operator J d is expressed as:
  • each sample contains a D-dimensional feature channel
  • N d represents the number of spatial sampling points in the feature channel
  • d ∈ {0,1,2,…}; the features of different resolutions are converted to the continuous spatial domain [0, T) ⊂ R, where T represents the size of the support region, t represents the position of the tracking target in the image, t ∈ [0, T), and n represents the discrete space variable n ∈ {0, …, N d-1};
  • Step 2-2 obtain the correlation filter by minimizing the loss function
  • the corresponding loss function in the Fourier domain can be derived as:
  • f is the filter
  • P is the feature matrix
  • z is the interpolation feature map
  • the penalty function w ⁇ L 2 (T) is a spatial regularization term
  • C is the C-dimensional feature map
  • λ is the weight parameter
  • F is the result of the Fourier transform of the filter f;
  • Step 2-3 perform the convolution operation of factorization to obtain the response of the correlation filter.
  • the correlation is used to describe the relationship between two signals and is divided into cross-correlation and positive correlation;
  • the correlation here refers to positive correlation;
  • the new filter response R c is expressed as the matrix-vector product Pf, and the factorized convolution operator for the filter response R c is expressed as:
  • the feature vector J{x}(t) of each position t is first multiplied by the matrix P T, and the generated feature map is then convolved with the filter; P dc represents the learning coefficients, which can be compactly expressed as a D×C matrix P = (P dc); in the formula, the feature vector J{x}(t) of each position t is written as J{x};
  • Step 2-4 using visual saliency detection on the tracking target; in the present invention, by using the visual saliency detection on the tracking target, the tracking target can be quickly positioned and the accuracy of positioning can be improved;
  • the steps 2-4 include:
  • Step 2-4-1 assuming the input image is I, if the target area of a tracking target (that is, the rectangular box area) and its surrounding area are known, the probability that a pixel in the image belongs to the target is:
  • m represents the separated target pixel
  • O represents the target area
  • S represents the surrounding area
  • b m represents the color component assigned to the input image I;
  • Step 2-4-2 the maximum-entropy value assigned to background pixels is 0.5; in the target tracking process, given the target position in the first frame, a rectangular-area search is performed around the previous frame's position in each subsequent frame, and the saliency R S of the current frame is calculated as:
  • s v (O t ) represents the probability score based on the object model
  • s d (O t ) represents the distance score of the Euclidean distance between the target and the target center c t-1 of the previous frame
  • P 1:t-1 represents the probability score from the first frame to the previous frame, and σ is the standard deviation of the normal distribution.
  • the step 3 includes: using the target detection network to obtain the detection result of the target, and defining the motion state of the target as an 8-dimensional space (x t, y t, r t, h t, x *, y *, r *, h *), which represents the state of the trajectory at a certain moment, where x t, y t represent the coordinates of the center of the detection frame in the image coordinate system, r t represents the aspect ratio of the detection frame, and h t represents the height of the detection frame; x *, y *, r *, h * represent the corresponding velocity information in image coordinates.
  • the target detection network may use yolov4.
  • the step 4 includes:
  • Step 4-1 use the distance between the detection result of the target and the predicted trajectory to indicate the degree of motion matching:
  • d jk represents the k-th state of the j-th target
  • y ik represents the k-th state of the i-th trajectory
  • the degree of motion matching represents the degree of matching between the detection result of the j-th target and the i-th track
  • S i is the covariance matrix of the observation space at the current moment obtained by trajectory prediction
  • y i is the predicted observation of the trajectory at the current moment
  • d j is the state of the jth target.
  • Step 4-2 use the minimum cosine distance between the detection result of the target and the feature vector of the target included in the trajectory as the apparent matching degree between the target and the trajectory;
  • in the prior art, using motion information alone as the matching metric causes excessive ID changes of the tracked target; the present invention therefore performs tracking by jointly using the appearance matching degree, which effectively reduces ID changes of the tracked target compared with the prior art.
  • Step 4-3 using the weighted average method to fuse the two measurement methods, that is, the motion distance matching degree and the apparent information, to obtain the value ⁇ i,j of the fusion of the two measurement methods:
  • μ is a hyperparameter used to adjust the weights of the different terms.
  • the motion distance matching metric is very effective for short-term prediction and matching, while the appearance information is more effective for trajectories that have been lost for a long time.
  • the choice of the hyperparameter depends on the specific data set; if the two terms are to be given the same importance, μ should be about 0.1.
  • the step 5 includes:
  • Step 5-1 if the fused value ω i,j of the two metrics is greater than or equal to the preset matching threshold T hres, the target tracking result is a successful match;
  • if the fused value ω i,j of the two metrics is less than the preset matching threshold T hres, the target tracking result is a failed match;
  • Step 5-2 the initial state of a known track is T ini; if matching succeeds for n consecutive frames during video processing, the track is transferred from the initial state T ini to the confirmed state T cofr, which is regarded as successful tracking;
  • if matching fails for n consecutive frames, the track is changed from the initial state T ini to the deleted state T dele, which is regarded as a tracking failure, and the current track is deleted from the video.
  • the invention proposes a high-precision multi-target tracking method under a complex background, which improves the traditional tracking algorithm.
  • when the traditional method matches detected targets to trajectories, the lack of sufficient feature information easily causes ID switches, that is, the ID of the detection frame is constantly replaced, which lacks accuracy and robustness.
  • a residual network for extracting features is added to extract the multi-resolution features of the target, and the matching process is combined with motion information and appearance information to maximize the accuracy of the matching process.
  • FIG. 1 is a schematic diagram of a basic flow frame of a high-precision multi-target tracking method under a complex background provided by an embodiment of the present invention
  • FIG. 2 is a schematic diagram of the target area and the surrounding area in a high-precision multi-target tracking method under a complex background provided by an embodiment of the present invention.
  • an embodiment of the present invention discloses a high-precision multi-target tracking method in a complex background, which is applied to the tracking of multi-target tasks in a complex background, and includes the following steps:
  • before step 1, video data is first acquired; in this embodiment, a camera can be used to capture video in real time and send it to the computer, or the computer can directly read a local video.
  • the camera and computer can be of any type.
  • Step 1 Input the acquired video data into the residual network, extract the target resolution feature, and output the extraction result at the output end, and the extraction result includes the target resolution feature of different dimensions.
  • the residual network can use ResNet.
  • the target resolution features of different dimensions in the extraction result have different characteristics, and the feature expression ability can be enhanced according to the different characteristics.
  • the problem of scale change that often occurs in the target tracking process is solved.
  • Step 2 calculating the correlation filter response map of the target resolution features;
  • Step 3 using the target detection network to obtain the detection result of the target; the detection result defines the motion state of the target as an 8-dimensional space, which represents the state of the trajectory at a certain moment;
  • Step 4 matching the detection result of the target with the predicted trajectory to obtain a matching result, and the matching result includes the value of two metrics of fusion motion information and appearance information;
  • Step 5 Compare the fused value of the two metrics with a preset matching threshold to obtain a target tracking result.
  • the step 2 includes:
  • Step 2-1 perform an interpolation operation on the target resolution features of the different dimensions, and convert the features of the different resolutions into a continuous space domain, and the interpolation operator J d is expressed as:
  • each sample contains a D-dimensional feature channel
  • N d represents the number of spatial sampling points in the feature channel
  • d ∈ {0,1,2,…}; the features of different resolutions are converted to the continuous spatial domain [0, T) ⊂ R, where T represents the size of the support region, t represents the position of the tracking target in the image, t ∈ [0, T), and n represents the discrete space variable n ∈ {0, …, N d-1};
  • Step 2-2 obtain the correlation filter by minimizing the loss function
  • the corresponding loss function in the Fourier domain can be derived as:
  • f is the filter
  • P is the feature matrix
  • z is the interpolation feature map
  • the penalty function w ⁇ L 2 (T) is a spatial regularization term
  • C is the C-dimensional feature map
  • λ is the weight parameter
  • F is the result of the Fourier transform of the filter f;
  • Step 2-3 perform the convolution operation of factorization to obtain the response of the correlation filter.
  • the correlation is used to describe the relationship between two signals and is divided into cross-correlation and positive correlation;
  • the correlation here refers to positive correlation;
  • the new filter response R c is expressed as the matrix-vector product Pf, and the factorized convolution operator for the filter response R c is expressed as:
  • the feature vector J{x}(t) of each position t is first multiplied by the matrix P T, and the generated feature map is then convolved with the filter; P dc represents the learning coefficients, which can be compactly expressed as a D×C matrix P = (P dc); in the formula, the feature vector J{x}(t) of each position t is written as J{x};
  • Step 2-4 using visual saliency detection on the tracking target; in this embodiment, by using the visual saliency detection on the tracking target, the tracking target can be quickly located, and the accuracy of positioning can be improved;
  • the steps 2-4 include:
  • Step 2-4-1 as shown in Figure 2, assuming the input image is I, if the target area of a tracking target (that is, the rectangular box area) and its surrounding area are known, the probability that a pixel in the image belongs to the target is:
  • m represents the separated target pixel
  • O represents the target area
  • S represents the surrounding area
  • b m represents the color component assigned to the input image I;
  • Step 2-4-2 in the target tracking process, given the target position in the first frame, a rectangular-area search is performed around the previous frame's position in each subsequent frame, and the saliency R S of the current frame is calculated by the following formula:
  • s v (O t ) represents the probability score based on the object model
  • s d (O t ) represents the distance score of the Euclidean distance between the target and the target center c t-1 of the previous frame
  • P 1:t-1 represents the probability score from the first frame to the previous frame, and σ is the standard deviation of the normal distribution.
  • the step 3 includes: using a target detection network to obtain a detection result of the target, and defining the motion state of the target as an 8-dimensional space (x t, y t, r t, h t, x *, y *, r *, h *), which represents the state of the trajectory at a certain moment, where x t, y t denote the coordinates of the center of the detection frame in the image coordinate system, r t denotes the aspect ratio of the detection frame, and h t denotes the height of the detection frame; x *, y *, r *, h * denote the corresponding velocity information in image coordinates.
  • the target detection network may use yolov4.
  • the step 4 includes:
  • Step 4-1 use the distance between the detection result of the target and the predicted trajectory to indicate the degree of motion matching:
  • the degree of motion matching represents the degree of matching between the detection result of the j-th target and the i-th track
  • S i is the covariance matrix of the observation space at the current moment obtained by trajectory prediction
  • y i is the predicted observation of the trajectory at the current moment
  • d j is the state of the jth target.
  • Step 4-2 use the minimum cosine distance between the detection result of the target and the feature vector of the target included in the trajectory as the apparent matching degree between the target and the trajectory;
  • d jk represents the k-th state of the j-th target
  • y ik represents the k-th state of the i-th trajectory
  • in the prior art, using motion information alone as the matching metric causes excessive ID changes of the tracked target; the present invention therefore performs tracking by jointly using the appearance matching degree, which effectively reduces ID changes of the tracked target compared with the prior art.
  • Step 4-3 using the weighted average method to fuse the two measurement methods, that is, the motion distance matching degree and the apparent information, to obtain the value ⁇ i,j of the fusion of the two measurement methods:
  • μ is a hyperparameter used to adjust the weights of the different terms.
  • the motion distance matching metric is very effective for short-term prediction and matching, while the appearance information is more effective for trajectories that have been lost for a long time.
  • the choice of the hyperparameter depends on the specific data set; if the two terms are to be given the same importance, μ should be about 0.1.
  • the step 5 includes:
  • Step 5-1 if the fused value ω i,j of the two metrics is greater than or equal to the preset matching threshold T hres, the target tracking result is a successful match;
  • if the fused value ω i,j of the two metrics is less than the preset matching threshold T hres, the target tracking result is a failed match;
  • Step 5-2 the initial state of a known track is T ini; if matching succeeds for n consecutive frames during video processing, the track is transferred from the initial state T ini to the confirmed state T cofr, which is regarded as successful tracking;
  • if matching fails for n consecutive frames, the track is changed from the initial state T ini to the deleted state T dele, which is regarded as a tracking failure, and the current track is deleted from the video.
  • the invention proposes a high-precision multi-target tracking method under a complex background, which improves the traditional tracking algorithm.
  • when the traditional method matches detected targets to trajectories, the lack of sufficient feature information easily causes ID switches, that is, the ID of the detection frame is constantly replaced, which lacks accuracy and robustness.
  • a residual network for extracting features is added to extract the multi-resolution features of the target, and the matching process is combined with motion information and appearance information to maximize the accuracy of the matching process.
  • the present invention also provides a computer storage medium, wherein the computer storage medium can store a program, and when the program is executed, it can include some or all of the steps in the embodiments of the high-precision multi-target tracking method under a complex background provided by the present invention.
  • the storage medium can be a magnetic disk, an optical disk, a read-only memory (ROM) or a random access memory (RAM), and the like.
  • the technology in the embodiments of the present invention can be implemented by means of software plus a necessary general hardware platform.
  • the technical solutions in the embodiments of the present invention, in essence or the parts that contribute to the prior art, may be embodied in the form of a software product; the computer software product may be stored in a storage medium, such as a ROM/RAM, magnetic disk, or optical disk, and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the various embodiments or some parts of the embodiments of the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present invention provides a high-precision multi-target tracking method under a complex background, comprising: inputting acquired video data into a residual network, performing target resolution feature extraction, and outputting an extraction result at the output end, the extraction result including target resolution features of different dimensions; calculating a correlation filter response map of the target resolution features; obtaining a detection result of the target with a target detection network; matching the detection result of the target with the predicted trajectory to obtain a matching result, the matching result including a value fusing the two metrics of motion information and appearance information; and comparing the fused value of the two metrics with a preset matching threshold to obtain a target tracking result. Compared with the prior art, a residual network for feature extraction is added to extract multi-resolution features of the target, and the matching process combines motion information and appearance information, which maximizes the accuracy of the matching process.

Description

A high-precision multi-target tracking method under a complex background
Technical Field
The present invention relates to the technical field of target tracking, and in particular to a high-precision multi-target tracking method under a complex background.
Background Art
At present, with the development of computer vision technology, massive amounts of visual information are being acquired, transmitted, and analyzed, so how to let computers process these video data has become a current research hotspot. Visual target tracking technology is an important means of processing these video data. Visual target tracking is a basic research problem in computer vision and has broad application prospects in many areas such as video surveillance, unmanned driving, human-computer interaction, planetary exploration, and military applications. The problem to be solved by visual target tracking can be stated as follows: in a video sequence, given the position and size of the target in the first frame (usually a rectangular bounding box), the position and size of the target need to be predicted in subsequent frames.
Traditional target tracking algorithms can be divided into those based on generative models and those based on discriminative models. Generative methods use the results of historical frames to build a statistical model describing the target's features, which can effectively handle target loss during tracking, but methods based on generative models usually ignore the background information around the target and easily lose the target when the background is cluttered. Most traditional correlation-filter-based tracking methods extract features only with hand-designed descriptors, so their ability to represent the target is limited, the target position determined from the response map is not precise enough, and satisfactory performance usually cannot be obtained under interference such as occlusion and background clutter. Before 2010, target tracking generally relied on classical algorithms such as mean shift, particle filtering, Kalman filtering, subspace learning, sparse representation, and kernel density estimation.
Target tracking algorithms based on deep learning can be divided into algorithms based on deep features, algorithms based on Siamese networks, algorithms based on recurrent neural networks, algorithms based on generative adversarial networks, and algorithms based on other specific networks.
Although target tracking has been studied for many years and certain progress has been made, it is still difficult to meet practical requirements under complex backgrounds. In tracking tasks, when the ambient brightness decreases or many similar targets are present, the tracking algorithm's ability to distinguish the target region from the background region weakens and the tracking effect deteriorates; when the target is occluded, the target's feature information is lost, and the larger the occlusion ratio, the more information is lost. Therefore, how to design a real-time and robust tracking algorithm is the current research focus in the field of target tracking.
Summary of the Invention
Aiming at the problems in target tracking, the present invention proposes a high-precision multi-target tracking method under a complex background, which overcomes the poor tracking performance of traditional target tracking algorithms in complex scenes and includes the following steps:
Step 1: input the acquired video data into a residual network, perform target resolution feature extraction, and output an extraction result at the output end, the extraction result including target resolution features of different dimensions. Specifically, in the present invention, the residual network can use ResNet.
In the present invention, the target resolution features of different dimensions in the extraction result have different characteristics, and the feature expression ability can be enhanced according to these different characteristics. This step solves the scale change problem that often occurs during target tracking.
Step 2: calculate the correlation filter response map of the target resolution features;
Step 3: obtain the detection result of the target using a target detection network; the detection result of the target defines the motion state of the target as an 8-dimensional space that represents the state of the trajectory at a certain moment;
Step 4: match the detection result of the target with the predicted trajectory to obtain a matching result, the matching result including a value fusing the two metrics of motion information and appearance information;
Step 5: compare the fused value of the two metrics with a preset matching threshold to obtain the target tracking result.
Further, in one implementation, step 2 includes:
Step 2-1: perform an interpolation operation on the target resolution features of different dimensions, and convert the features of different resolutions into a continuous spatial domain, the interpolation operator J d being expressed as:
Figure PCTCN2021119796-appb-000001
where b d ∈ L 2(T) is the interpolation function; each sample contains D feature channels; N d denotes the number of spatial sampling points in a feature channel; d ∈ {0,1,2,…}; the features of different resolutions are converted to the continuous spatial domain [0, T) ⊂ R, where T denotes the size of the support region, t denotes the position of the tracking target in the image, t ∈ [0, T), and n denotes the discrete spatial variable, n ∈ {0, …, N d-1};
Step 2-2: obtain the correlation filter by minimizing a loss function;
The corresponding loss function in the Fourier domain can be derived as:
Figure PCTCN2021119796-appb-000002
where f is the filter, P is the feature matrix, z denotes the interpolated feature map, the penalty function w ∈ L 2(T) is a spatial regularization term, C denotes the C-dimensional feature map, λ denotes the weight parameter, and F denotes the result of applying the Fourier transform to the filter f;
Step 2-3: perform a factorized convolution operation to obtain the response of the correlation filter. Correlation is used to describe the relationship between two signals and is divided into cross-correlation and positive correlation; in this embodiment, the correlation refers to positive correlation;
The new filter response R c is expressed as the matrix-vector product Pf, and the factorized convolution operator for the filter response R c is expressed as:
Figure PCTCN2021119796-appb-000003
where the feature vector J{x}(t) at each position t is first multiplied by the matrix P T, and the generated feature map is then convolved with the filter; P dc denotes the learning coefficients, which can be compactly expressed as a D×C matrix P=(P dc); in the formula, the feature vector J{x}(t) at each position t is written as J{x};
Step 2-4: apply visual saliency detection to the tracking target; in the present invention, applying visual saliency detection to the tracking target allows the tracking target to be located quickly and improves the positioning accuracy;
Step 2-5: multiply the obtained filter response R c by the saliency R S of the current frame to obtain the final response map R f = R c · R S; when the final response map R f reaches its maximum value, the position with the maximum response is mapped back to the original image to obtain the position of the target in the subsequent frame, i.e., the predicted trajectory.
Further, in one implementation, step 2-4 includes:
Step 2-4-1: assume the input image is I; if the target area of a tracking target (i.e., the rectangular box area) and its surrounding area are known, the probability that a pixel in the image belongs to the target is:
Figure PCTCN2021119796-appb-000004
where m denotes the separated target pixel, O denotes the target area, S denotes the surrounding area, and b m denotes the color component assigned to the input image I;
The probabilities that the color component b m assigned to the input image I belongs to the target area O and to the surrounding area S are respectively expressed as:
Figure PCTCN2021119796-appb-000005
where
Figure PCTCN2021119796-appb-000006
denotes the b m-th bin of the non-normalized histogram H computed over the target area O ∈ I,
Figure PCTCN2021119796-appb-000007
denotes the b m-th bin of the non-normalized histogram H computed over the surrounding area S ∈ I;
Step 2-4-2: the maximum-entropy value assigned to background pixels is 0.5; in the target tracking process, given the target position in the first frame, a rectangular-area search is performed around the previous frame's position in each subsequent frame, and the saliency R S of the current frame is calculated as:
R S=s v(O t)s d(O t),
Figure PCTCN2021119796-appb-000008
where s v(O t) denotes the probability score based on the object model, s d(O t) denotes the distance score of the Euclidean distance between the target and the target center c t-1 of the previous frame, P 1:t-1 denotes the probability score from the first frame to the previous frame, and σ denotes the standard deviation of the normal distribution.
Further, in one implementation, step 3 includes: obtaining the detection result of the target with a target detection network, and defining the motion state of the target as an 8-dimensional space (x t, y t, r t, h t, x *, y *, r *, h *), which represents the state of the trajectory at a certain moment, where x t, y t denote the coordinates of the center of the detection frame in the image coordinate system, r t denotes the aspect ratio of the detection frame, h t denotes the height of the detection frame, and x *, y *, r *, h * denote the corresponding velocity information in image coordinates. Specifically, in this embodiment, the target detection network can use yolov4.
Further, in one implementation, step 4 includes:
Step 4-1: use the distance between the detection result of the target and the predicted trajectory to represent the degree of motion matching:
Figure PCTCN2021119796-appb-000009
where d jk denotes the k-th state of the j-th target and y ik denotes the k-th state of the i-th trajectory;
The degree of motion matching represents the degree of matching between the detection result of the j-th target and the i-th trajectory;
where S i is the covariance matrix of the observation space at the current moment obtained from trajectory prediction, y i is the predicted observation of the trajectory at the current moment, and d j is the state of the j-th target.
Step 4-2: use the minimum cosine distance between the detection result of the target and the feature vectors of the targets contained in the trajectory as the appearance matching degree between the target and the trajectory;
The cosine similarity between the detection result of the j-th target and the i-th trajectory is:
Figure PCTCN2021119796-appb-000010
Cosine distance = 1 - cosine similarity; the appearance matching degree between the target and the trajectory is:
Figure PCTCN2021119796-appb-000011
In the prior art, using motion information alone as the matching metric causes excessive ID changes of the tracked target; the present invention therefore tracks by jointly using the appearance matching degree, which, compared with the prior art, effectively reduces ID changes of the tracked target.
Step 4-3: fuse the two metrics, i.e., the motion distance matching degree and the appearance information, by weighted averaging, to obtain the fused value ω i,j of the two metrics:
Figure PCTCN2021119796-appb-000012
that is,
Figure PCTCN2021119796-appb-000013
where μ is a hyperparameter used to adjust the weights of the different terms.
Specifically, in this embodiment, the motion distance matching metric works well for short-term prediction and matching, while the appearance matching metric is more effective for trajectories that have been lost for a long time. The choice of the hyperparameter depends on the specific data set; if the two terms are to be given the same importance, μ should be set to about 0.1.
Further, in one implementation, step 5 includes:
Step 5-1: if the fused value ω i,j of the two metrics is greater than or equal to the preset matching threshold T hres, the target tracking result is a successful match;
If the fused value ω i,j of the two metrics is less than the preset matching threshold T hres, the target tracking result is a failed match;
Step 5-2: the initial state of a known track is T ini; if matching succeeds for n consecutive frames during video processing, the track is transferred from the initial state T ini to the confirmed state T cofr, which is regarded as successful tracking;
If the number of consecutively matched frames is less than n, denote the current frame count as z and set z = z + 1; return to step 1 and perform matching again;
If matching fails for n consecutive frames, the track is transferred from the initial state T ini to the deleted state T dele, which is regarded as a tracking failure, and the current track is deleted from the video.
The present invention proposes a high-precision multi-target tracking method under a complex background, which improves the traditional tracking algorithm. When the traditional method matches detected targets to trajectories, the lack of sufficient feature information easily causes ID switches, that is, the ID of the detection frame is constantly replaced, which lacks accuracy and robustness. In this method, a residual network for feature extraction is added to extract multi-resolution features of the target, and the matching process combines motion information and appearance information, which maximizes the accuracy of the matching process.
Brief Description of the Drawings
In order to explain the technical solution of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below; obviously, for those of ordinary skill in the art, other drawings can also be obtained from these drawings without creative effort.
Fig. 1 is a schematic diagram of the basic processing flow of a high-precision multi-target tracking method under a complex background provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the target area and the surrounding area in a high-precision multi-target tracking method under a complex background provided by an embodiment of the present invention.
Detailed Description of the Embodiments
In order to make the above objects, features, and advantages of the present invention more comprehensible, the present invention is further described in detail below with reference to the drawings and specific embodiments.
As shown in Fig. 1, an embodiment of the present invention discloses a high-precision multi-target tracking method under a complex background, which is applied to tracking multi-target tasks under a complex background and includes the following steps:
Before step 1, video data is first acquired; in this embodiment, a camera can capture video in real time and send it to a computer, or the computer can directly read a local video. Specifically, the camera and computer can be of any model.
Step 1: input the acquired video data into a residual network, perform target resolution feature extraction, and output an extraction result at the output end, the extraction result including target resolution features of different dimensions. Specifically, in this embodiment, the residual network can use ResNet.
In this embodiment, the target resolution features of different dimensions in the extraction result have different characteristics, and the feature expression ability can be enhanced according to these different characteristics. This step solves the scale change problem that often occurs during target tracking.
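The publication gives no code for this feature-extraction step; the following is a minimal sketch, assuming a torchvision ResNet-18 backbone and the layer1/layer2/layer3 stage outputs as the features of "different dimensions" (the patent only states that ResNet may be used, so the specific backbone and layers are assumptions).

```python
import torch
import torchvision

# Hypothetical choices: ResNet-18 and the layer1/layer2/layer3 outputs are
# illustrative; the patent only says a residual network such as ResNet is used.
backbone = torchvision.models.resnet18(weights=None).eval()

def extract_multiresolution_features(frame: torch.Tensor):
    """Return feature maps of different spatial resolutions and channel dimensions.

    frame: float tensor of shape (1, 3, H, W), already normalized.
    """
    x = backbone.conv1(frame)
    x = backbone.bn1(x)
    x = backbone.relu(x)
    x = backbone.maxpool(x)
    c2 = backbone.layer1(x)   # stride 4,  64 channels
    c3 = backbone.layer2(c2)  # stride 8,  128 channels
    c4 = backbone.layer3(c3)  # stride 16, 256 channels
    return {"low": c2, "mid": c3, "high": c4}

if __name__ == "__main__":
    dummy = torch.randn(1, 3, 224, 224)
    with torch.no_grad():
        feats = extract_multiresolution_features(dummy)
    for name, f in feats.items():
        print(name, tuple(f.shape))
```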
Step 2: calculate the correlation filter response map of the target resolution features;
Step 3: obtain the detection result of the target using a target detection network; the detection result of the target defines the motion state of the target as an 8-dimensional space that represents the state of the trajectory at a certain moment;
Step 4: match the detection result of the target with the predicted trajectory to obtain a matching result, the matching result including a value fusing the two metrics of motion information and appearance information;
Step 5: compare the fused value of the two metrics with a preset matching threshold to obtain the target tracking result.
In the high-precision multi-target tracking method under a complex background described in this embodiment of the present invention, step 2 includes:
Step 2-1: perform an interpolation operation on the target resolution features of different dimensions, and convert the features of different resolutions into a continuous spatial domain, the interpolation operator J d being expressed as:
Figure PCTCN2021119796-appb-000014
where b d ∈ L 2(T) is the interpolation function; each sample contains D feature channels; N d denotes the number of spatial sampling points in a feature channel; d ∈ {0,1,2,…}; the features of different resolutions are converted to the continuous spatial domain [0, T) ⊂ R, where T denotes the size of the support region, t denotes the position of the tracking target in the image, t ∈ [0, T), and n denotes the discrete spatial variable, n ∈ {0, …, N d-1};
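The interpolation operator J d itself appears only as a formula image in the publication; the sketch below is a hedged one-dimensional illustration that assumes the common form J d{x}(t) = Σ n x[n]·b d(t - nT/N d), with a Gaussian kernel standing in for the unspecified interpolation function b d.

```python
import numpy as np

def interp_operator(x: np.ndarray, T: float, t: np.ndarray, sigma: float = 1.0):
    """Map a discrete feature channel x[0..N_d-1] to the continuous domain [0, T).

    Assumes J_d{x}(t) = sum_n x[n] * b_d(t - n*T/N_d); the Gaussian kernel b_d
    used here is an illustrative assumption, not the patent's exact kernel.
    """
    N_d = len(x)
    centers = np.arange(N_d) * (T / N_d)              # sample positions in [0, T)
    diff = t[:, None] - centers[None, :]              # (len(t), N_d)
    b_d = np.exp(-0.5 * (diff / (sigma * T / N_d)) ** 2)
    return (b_d * x[None, :]).sum(axis=1)

if __name__ == "__main__":
    x = np.array([0.0, 1.0, 0.5, 0.2])                # one feature channel, N_d = 4
    t = np.linspace(0.0, 1.0, 9, endpoint=False)      # continuous positions, T = 1
    print(interp_operator(x, T=1.0, t=t))
```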
Step 2-2: obtain the correlation filter by minimizing a loss function;
The corresponding loss function in the Fourier domain can be derived as:
Figure PCTCN2021119796-appb-000015
where f is the filter, P is the feature matrix, z denotes the interpolated feature map, the penalty function w ∈ L 2(T) is a spatial regularization term, C denotes the C-dimensional feature map, λ denotes the weight parameter, and F denotes the result of applying the Fourier transform to the filter f;
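The full loss, including the spatial regularization term w and the multi-channel coupling, is shown only as an image; as a simplified, hedged illustration of obtaining a correlation filter by minimization in the Fourier domain, the sketch below solves the single-channel ridge-regression special case in closed form (the Gaussian label and λ value are illustrative assumptions).

```python
import numpy as np

def train_filter_fourier(z: np.ndarray, y: np.ndarray, lam: float = 1e-2):
    """Single-channel correlation filter trained in the Fourier domain.

    z   : interpolated feature map (2-D array)
    y   : desired Gaussian response of the same shape
    lam : regularization weight (a simple stand-in for the patent's spatial
          penalty w, which is omitted in this simplified sketch)
    """
    Z = np.fft.fft2(z)
    Y = np.fft.fft2(y)
    return np.conj(Z) * Y / (np.conj(Z) * Z + lam)    # closed-form minimizer

def filter_response(F: np.ndarray, z: np.ndarray):
    """Correlation response of the trained filter on a feature map z."""
    return np.real(np.fft.ifft2(F * np.fft.fft2(z)))

if __name__ == "__main__":
    h, w = 32, 32
    yy, xx = np.mgrid[0:h, 0:w]
    y = np.exp(-((yy - h // 2) ** 2 + (xx - w // 2) ** 2) / (2 * 2.0 ** 2))
    z = np.random.rand(h, w)
    F = train_filter_fourier(z, y)
    print(filter_response(F, z).shape)
```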
Step 2-3: perform a factorized convolution operation to obtain the response of the correlation filter. Correlation is used to describe the relationship between two signals and is divided into cross-correlation and positive correlation; in this embodiment, the correlation refers to positive correlation;
The new filter response R c is expressed as the matrix-vector product Pf, and the factorized convolution operator for the filter response R c is expressed as:
Figure PCTCN2021119796-appb-000016
where the feature vector J{x}(t) at each position t is first multiplied by the matrix P T, and the generated feature map is then convolved with the filter; P dc denotes the learning coefficients, which can be compactly expressed as a D×C matrix P=(P dc); in the formula, the feature vector J{x}(t) at each position t is written as J{x};
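The factorized convolution operator is likewise shown only as an image; the sketch below illustrates the stated idea of multiplying each position's D-dimensional feature vector J{x}(t) by P T to reduce it to C dimensions and then convolving the projected maps with the filter. The shapes, random data, and the use of an FFT-based convolution are assumptions for illustration.

```python
import numpy as np
from scipy.signal import fftconvolve

def factorized_response(Jx: np.ndarray, P: np.ndarray, f: np.ndarray):
    """Factorized convolution: project features with P^T, then convolve and sum.

    Jx : interpolated feature map of shape (H, W, D)
    P  : D x C matrix of learning coefficients P = (P_dc)
    f  : filter of shape (h, w, C)
    """
    projected = Jx @ P                        # (H, W, C): P^T applied at each position
    response = np.zeros(Jx.shape[:2])
    for c in range(P.shape[1]):               # convolve each projected channel, then sum
        response += fftconvolve(projected[:, :, c], f[:, :, c], mode="same")
    return response

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Jx = rng.standard_normal((64, 64, 16))    # D = 16 feature channels
    P = rng.standard_normal((16, 4))          # compress to C = 4 channels
    f = rng.standard_normal((15, 15, 4))
    print(factorized_response(Jx, P, f).shape)  # (64, 64)
```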
Step 2-4: apply visual saliency detection to the tracking target; in this embodiment, applying visual saliency detection to the tracking target allows the tracking target to be located quickly and improves the positioning accuracy;
In the high-precision multi-target tracking method under a complex background described in this embodiment of the present invention, step 2-4 includes:
Step 2-4-1: as shown in Fig. 2, assume the input image is I; if the target area of a tracking target (i.e., the rectangular box area) and its surrounding area are known, the probability that a pixel in the image belongs to the target is:
Figure PCTCN2021119796-appb-000017
where m denotes the separated target pixel, O denotes the target area, S denotes the surrounding area, and b m denotes the color component assigned to the input image I;
The probabilities that the color component b m assigned to the input image I belongs to the target area O and to the surrounding area S are respectively expressed as:
Figure PCTCN2021119796-appb-000018
where
Figure PCTCN2021119796-appb-000019
denotes the b m-th bin of the non-normalized histogram H computed over the target area O ∈ I,
Figure PCTCN2021119796-appb-000020
denotes the b m-th bin of the non-normalized histogram H computed over the surrounding area S ∈ I;
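As a hedged sketch of the object-versus-surround color model just described, the code below builds non-normalized histograms H over the target area O and the surrounding area S and turns them into a per-pixel target probability. The grayscale quantization, bin count, and the 0.5 fallback for unseen bins (echoing the maximum-entropy value mentioned in step 2-4-2) are assumptions, since the exact probability formula is only given as an image.

```python
import numpy as np

def object_likelihood(image: np.ndarray, obj_box, surround_box, n_bins: int = 32):
    """Per-pixel probability that a pixel belongs to the target, from the
    non-normalized histograms over the object area O and the surrounding area S.
    Boxes are (x0, y0, x1, y1); the object box is assumed to lie inside the
    surround box, and a grayscale image is assumed for simplicity.
    """
    bins = np.clip((image.astype(np.float64) / 256.0 * n_bins).astype(int), 0, n_bins - 1)

    def region_hist(box):
        x0, y0, x1, y1 = box
        region = bins[y0:y1, x0:x1]
        return np.bincount(region.ravel(), minlength=n_bins).astype(np.float64)

    H_O = region_hist(obj_box)
    H_S = np.clip(region_hist(surround_box) - H_O, 0, None)  # surround excludes object

    denom = H_O[bins] + H_S[bins]
    # 0.5 = max-entropy value used when a color never appears in either region
    return np.where(denom > 0, H_O[bins] / np.maximum(denom, 1e-12), 0.5)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    img = rng.integers(0, 256, size=(120, 160))
    print(object_likelihood(img, (60, 40, 100, 80), (40, 20, 120, 100)).shape)
```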
Step 2-4-2: in the target tracking process, given the target position in the first frame, a rectangular-area search is performed around the previous frame's position in each subsequent frame, and the saliency R S of the current frame is calculated by the following formula:
R S=s v(O t)s d(O t),
Figure PCTCN2021119796-appb-000021
where s v(O t) denotes the probability score based on the object model, s d(O t) denotes the distance score of the Euclidean distance between the target and the target center c t-1 of the previous frame, P 1:t-1 denotes the probability score from the first frame to the previous frame, and σ denotes the standard deviation of the normal distribution.
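The exact saliency expression, including the role of P 1:t-1, is only given as an image; the sketch below assumes the stated product form R S = s v(O t)·s d(O t), with s v taken as the mean object-model likelihood inside the candidate box and s d as a Gaussian of the Euclidean distance to the previous center c t-1 with standard deviation σ.

```python
import numpy as np

def saliency_score(prob_map: np.ndarray, candidate_center, prev_center,
                   sigma: float = 10.0, half_size: int = 16):
    """Saliency R_S = s_v(O_t) * s_d(O_t) for one candidate position.

    s_v : probability score of the candidate area under the object model
          (here: mean per-pixel object likelihood inside the box, an assumption)
    s_d : Gaussian of the Euclidean distance to the previous target center c_{t-1}
    """
    cx, cy = candidate_center
    y0, y1 = max(cy - half_size, 0), cy + half_size
    x0, x1 = max(cx - half_size, 0), cx + half_size
    s_v = float(prob_map[y0:y1, x0:x1].mean())

    dist2 = (cx - prev_center[0]) ** 2 + (cy - prev_center[1]) ** 2
    s_d = float(np.exp(-dist2 / (2.0 * sigma ** 2)))
    return s_v * s_d

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    prob_map = rng.random((120, 160))
    print(saliency_score(prob_map, candidate_center=(80, 60), prev_center=(78, 58)))
```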
Step 2-5: multiply the obtained filter response R c by the saliency R S of the current frame to obtain the final response map R f = R c · R S; when the final response map R f reaches its maximum value, the position with the maximum response is mapped back to the original image to obtain the position of the target in the subsequent frame, i.e., the predicted trajectory.
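A minimal sketch of step 2-5: fuse the correlation response R c with the saliency R S element-wise and map the maximum of R f back to image coordinates. The assumption that the response map and the image share the same resolution (scale = 1) is illustrative.

```python
import numpy as np

def locate_target(R_c: np.ndarray, R_S: np.ndarray, scale: float = 1.0):
    """Fuse the two maps, R_f = R_c * R_S, and return the image-coordinate
    position of the maximum response together with its value."""
    R_f = R_c * R_S
    row, col = np.unravel_index(np.argmax(R_f), R_f.shape)
    return int(col * scale), int(row * scale), float(R_f[row, col])

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    print(locate_target(rng.random((60, 80)), rng.random((60, 80))))
```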
In the high-precision multi-target tracking method under a complex background described in this embodiment of the present invention, step 3 includes: obtaining the detection result of the target with a target detection network, and defining the motion state of the target as an 8-dimensional space (x t, y t, r t, h t, x *, y *, r *, h *), which represents the state of the trajectory at a certain moment, where x t, y t denote the coordinates of the center of the detection frame in the image coordinate system, r t denotes the aspect ratio of the detection frame, h t denotes the height of the detection frame, and x *, y *, r *, h * denote the corresponding velocity information in image coordinates. Specifically, in this embodiment, the target detection network can use yolov4.
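A hedged sketch of building the 8-dimensional state (x t, y t, r t, h t, x *, y *, r *, h *) from a detection box; the box format (left, top, width, height) and the finite-difference velocity estimate are assumptions, since the publication does not specify how the velocity components are obtained (a Kalman filter would be a typical alternative).

```python
import numpy as np

def detection_to_state(box_xywh, prev_state=None, dt: float = 1.0):
    """Build the 8-dimensional state (x, y, r, h, x*, y*, r*, h*) from a detection.

    box_xywh : (left, top, width, height) of the detection frame (assumed format).
    Velocities are finite differences against the previous state (zero on the
    first frame); this bookkeeping is illustrative, not the patent's exact scheme.
    """
    x0, y0, w, h = box_xywh
    pos = np.array([x0 + w / 2.0, y0 + h / 2.0, w / float(h), float(h)])
    vel = np.zeros(4) if prev_state is None else (pos - prev_state[:4]) / dt
    return np.concatenate([pos, vel])

if __name__ == "__main__":
    s1 = detection_to_state((100, 50, 40, 80))
    s2 = detection_to_state((104, 52, 40, 80), prev_state=s1)
    print(s2)
```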
In the high-precision multi-target tracking method under a complex background described in this embodiment of the present invention, step 4 includes:
Step 4-1: use the distance between the detection result of the target and the predicted trajectory to represent the degree of motion matching:
Figure PCTCN2021119796-appb-000022
The degree of motion matching represents the degree of matching between the detection result of the j-th target and the i-th trajectory;
where S i is the covariance matrix of the observation space at the current moment obtained from trajectory prediction, y i is the predicted observation of the trajectory at the current moment, and d j is the state of the j-th target.
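The motion-matching formula is shown only as an image; given the definitions of S i, y i, and d j above, the squared Mahalanobis distance is assumed here as a hedged sketch of the motion matching degree.

```python
import numpy as np

def motion_match_distance(d_j: np.ndarray, y_i: np.ndarray, S_i: np.ndarray):
    """Assumed motion matching degree: (d_j - y_i)^T S_i^{-1} (d_j - y_i)."""
    diff = d_j - y_i
    return float(diff @ np.linalg.solve(S_i, diff))

if __name__ == "__main__":
    d_j = np.array([101.0, 51.0, 0.5, 80.0])    # detected (x, y, r, h)
    y_i = np.array([100.0, 50.0, 0.5, 79.0])    # predicted observation of track i
    S_i = np.diag([4.0, 4.0, 0.01, 9.0])        # predicted observation covariance
    print(motion_match_distance(d_j, y_i, S_i))
```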
Step 4-2: use the minimum cosine distance between the detection result of the target and the feature vectors of the targets contained in the trajectory as the appearance matching degree between the target and the trajectory;
The cosine similarity between the detection result of the j-th target and the i-th trajectory is:
Figure PCTCN2021119796-appb-000023
where d jk denotes the k-th state of the j-th target and y ik denotes the k-th state of the i-th trajectory;
Cosine distance = 1 - cosine similarity; the appearance matching degree between the target and the trajectory is:
Figure PCTCN2021119796-appb-000024
In the prior art, using motion information alone as the matching metric causes excessive ID changes of the tracked target; the present invention therefore tracks by jointly using the appearance matching degree, which, compared with the prior art, effectively reduces ID changes of the tracked target.
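A minimal sketch of the appearance matching degree of step 4-2: the minimum cosine distance (1 - cosine similarity) between the detection's feature vector and the feature vectors stored along trajectory i; the embedding dimension and random data are illustrative.

```python
import numpy as np

def appearance_match_distance(det_feature: np.ndarray, track_features: np.ndarray):
    """Minimum cosine distance between one detection embedding and the
    embeddings collected along a trajectory (rows of track_features)."""
    det = det_feature / np.linalg.norm(det_feature)
    gallery = track_features / np.linalg.norm(track_features, axis=1, keepdims=True)
    cos_sim = gallery @ det
    return float(np.min(1.0 - cos_sim))

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    det = rng.standard_normal(128)              # appearance embedding of the detection
    gallery = rng.standard_normal((10, 128))    # embeddings collected along track i
    print(appearance_match_distance(det, gallery))
```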
Step 4-3: fuse the two metrics, i.e., the motion distance matching degree and the appearance information, by weighted averaging, to obtain the fused value ω i,j of the two metrics:
Figure PCTCN2021119796-appb-000025
that is,
Figure PCTCN2021119796-appb-000026
where μ is a hyperparameter used to adjust the weights of the different terms.
Specifically, in this embodiment, the motion distance matching metric works well for short-term prediction and matching, while the appearance matching metric is more effective for trajectories that have been lost for a long time. The choice of the hyperparameter depends on the specific data set; if the two terms are to be given the same importance, μ should be set to about 0.1.
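The exact expression for ω i,j is only given as an image; as a hedged sketch of the weighted-average fusion, the convex combination μ·(motion term) + (1 - μ)·(appearance term) is assumed below, with μ ≈ 0.1 as suggested in the text. Since the publication treats larger ω i,j as a better match (ω i,j ≥ T hres means success), the inputs are assumed to be match scores rather than raw distances.

```python
def fused_match_value(motion_score: float, appearance_score: float, mu: float = 0.1):
    """Assumed weighted-average fusion omega_{i,j} = mu*motion + (1-mu)*appearance."""
    return mu * motion_score + (1.0 - mu) * appearance_score

if __name__ == "__main__":
    print(fused_match_value(motion_score=0.8, appearance_score=0.9))
```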
In the high-precision multi-target tracking method under a complex background described in this embodiment of the present invention, step 5 includes:
Step 5-1: if the fused value ω i,j of the two metrics is greater than or equal to the preset matching threshold T hres, the target tracking result is a successful match;
If the fused value ω i,j of the two metrics is less than the preset matching threshold T hres, the target tracking result is a failed match;
Step 5-2: the initial state of a known track is T ini; if matching succeeds for n consecutive frames during video processing, the track is transferred from the initial state T ini to the confirmed state T cofr, which is regarded as successful tracking;
If the number of consecutively matched frames is less than n, denote the current frame count as z and set z = z + 1; return to step 1 and perform matching again;
If matching fails for n consecutive frames, the track is transferred from the initial state T ini to the deleted state T dele, which is regarded as a tracking failure, and the current track is deleted from the video.
Specifically, in this embodiment, n = 3; after matching for the current frame ends, z = z + 1, and the process returns to step 1 so that the video proceeds to target matching and tracking for the next image frame.
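A minimal sketch of the track state machine of step 5: a track starts in T ini, is confirmed as T cofr after n consecutive successful matches, and is deleted as T dele after n consecutive failures (n = 3 as in this embodiment). The per-track counters and bookkeeping details are illustrative assumptions.

```python
class Track:
    """Track state machine: T_ini -> T_cofr after n consecutive successful matches,
    T_ini -> T_dele after n consecutive failed matches."""

    T_INI, T_COFR, T_DELE = "T_ini", "T_cofr", "T_dele"

    def __init__(self, n: int = 3):
        self.state = Track.T_INI
        self.n = n
        self.hits = 0      # consecutive successful matches
        self.misses = 0    # consecutive failed matches

    def update(self, fused_value: float, thres: float):
        if self.state == Track.T_DELE:
            return self.state
        if fused_value >= thres:       # step 5-1: match succeeds
            self.hits += 1
            self.misses = 0
            if self.hits >= self.n:
                self.state = Track.T_COFR
        else:                          # match fails
            self.misses += 1
            self.hits = 0
            if self.misses >= self.n:
                self.state = Track.T_DELE
        return self.state

if __name__ == "__main__":
    track = Track()
    for value in (0.9, 0.8, 0.95):     # three consecutive successful matches
        print(track.update(value, thres=0.7))
```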
The present invention proposes a high-precision multi-target tracking method under a complex background, which improves the traditional tracking algorithm. When the traditional method matches detected targets to trajectories, the lack of sufficient feature information easily causes ID switches, that is, the ID of the detection frame is constantly replaced, which lacks accuracy and robustness. In this method, a residual network for feature extraction is added to extract multi-resolution features of the target, and the matching process combines motion information and appearance information, which maximizes the accuracy of the matching process.
In a specific implementation, the present invention also provides a computer storage medium, wherein the computer storage medium can store a program, and when the program is executed, it can include some or all of the steps in the embodiments of the high-precision multi-target tracking method under a complex background provided by the present invention. The storage medium can be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
Those skilled in the art can clearly understand that the technology in the embodiments of the present invention can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions in the embodiments of the present invention, in essence or in the parts that contribute to the prior art, can be embodied in the form of a software product; the computer software product can be stored in a storage medium such as a ROM/RAM, magnetic disk, or optical disk, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the various embodiments or some parts of the embodiments of the present invention.
The same or similar parts among the various embodiments in this specification can be referred to one another. The embodiments of the present invention described above do not limit the protection scope of the present invention.

Claims (6)

  1. A high-precision multi-target tracking method under a complex background, characterized by comprising the following steps:
    Step 1: inputting the acquired video data into a residual network, performing target resolution feature extraction, and outputting an extraction result at the output end, the extraction result including target resolution features of different dimensions;
    Step 2: calculating the correlation filter response map of the target resolution features;
    Step 3: obtaining the detection result of the target using a target detection network, the detection result of the target defining the motion state of the target as an 8-dimensional space that represents the state of the trajectory at a certain moment;
    Step 4: matching the detection result of the target with the predicted trajectory to obtain a matching result, the matching result including a value fusing the two metrics of motion information and appearance information;
    Step 5: comparing the fused value of the two metrics with a preset matching threshold to obtain a target tracking result.
  2. The high-precision multi-target tracking method under a complex background according to claim 1, characterized in that step 2 comprises:
    Step 2-1: performing an interpolation operation on the target resolution features of different dimensions, and converting the features of different resolutions into a continuous spatial domain, the interpolation operator J d being expressed as:
    Figure PCTCN2021119796-appb-100001
    where b d ∈ L 2(T) is the interpolation function; each sample contains D feature channels; N d denotes the number of spatial sampling points in a feature channel; d ∈ {0,1,2,…}; the features of different resolutions are converted to the continuous spatial domain [0, T) ⊂ R, where T denotes the size of the support region, t denotes the position of the tracking target in the image, t ∈ [0, T), and n denotes the discrete spatial variable, n ∈ {0, …, N d-1};
    Step 2-2: obtaining the correlation filter by minimizing a loss function;
    the corresponding loss function in the Fourier domain can be derived as:
    Figure PCTCN2021119796-appb-100002
    where f is the filter, P is the feature matrix, z denotes the interpolated feature map, the penalty function w ∈ L 2(T) is a spatial regularization term, C denotes the C-dimensional feature map, λ denotes the weight parameter, and F denotes the result of applying the Fourier transform to the filter f;
    Step 2-3: performing a factorized convolution operation to obtain the response of the correlation filter;
    the new filter response R c is expressed as the matrix-vector product Pf, and the factorized convolution operator for the filter response R c is expressed as:
    Figure PCTCN2021119796-appb-100003
    where the feature vector J{x}(t) at each position t is first multiplied by the matrix P T, and the generated feature map is then convolved with the filter; P dc denotes the learning coefficients, which can be compactly expressed as a D×C matrix P=(P dc); in the formula, the feature vector J{x}(t) at each position t is written as J{x};
    Step 2-4: applying visual saliency detection to the tracking target;
    Step 2-5: multiplying the obtained filter response R c by the saliency R S of the current frame to obtain the final response map R f = R c · R S; when the final response map R f reaches its maximum value, the position with the maximum response is mapped back to the original image to obtain the position of the target in the subsequent frame, i.e., the predicted trajectory.
  3. The high-precision multi-target tracking method under a complex background according to claim 2, characterized in that step 2-4 comprises:
    Step 2-4-1: assuming the input image is I, if the target area of a tracking target (i.e., the rectangular box area) and its surrounding area are known, the probability that a pixel in the image belongs to the target is:
    Figure PCTCN2021119796-appb-100004
    where m denotes the separated target pixel, O denotes the target area, S denotes the surrounding area, and b m denotes the color component assigned to the input image I;
    the probabilities that the color component b m assigned to the input image I belongs to the target area O and to the surrounding area S are respectively expressed as:
    Figure PCTCN2021119796-appb-100005
    where
    Figure PCTCN2021119796-appb-100006
    denotes the b m-th bin of the non-normalized histogram H computed over the target area O ∈ I,
    Figure PCTCN2021119796-appb-100007
    denotes the b m-th bin of the non-normalized histogram H computed over the surrounding area S ∈ I;
    Step 2-4-2: in the target tracking process, given the target position in the first frame, a rectangular-area search is performed around the previous frame's position in each subsequent frame, and the saliency R S of the current frame is calculated as:
    R S=s v(O t)s d(O t),
    Figure PCTCN2021119796-appb-100008
    where s v(O t) denotes the probability score based on the object model, s d(O t) denotes the distance score of the Euclidean distance between the target and the target center c t-1 of the previous frame, P 1:t-1 denotes the probability score from the first frame to the previous frame, and σ denotes the standard deviation of the normal distribution.
  4. The high-precision multi-target tracking method under a complex background according to claim 1, characterized in that step 3 comprises: obtaining the detection result of the target with a target detection network, and defining the motion state of the target as an 8-dimensional space (x t, y t, r t, h t, x *, y *, r *, h *), which represents the state of the trajectory at a certain moment, where x t, y t denote the coordinates of the center of the detection frame in the image coordinate system, r t denotes the aspect ratio of the detection frame, h t denotes the height of the detection frame, and x *, y *, r *, h * denote the corresponding velocity information in image coordinates.
  5. The high-precision multi-target tracking method under a complex background according to claim 1, characterized in that step 4 comprises:
    Step 4-1: using the distance between the detection result of the target and the predicted trajectory to represent the degree of motion matching:
    Figure PCTCN2021119796-appb-100009
    where d jk denotes the k-th state of the j-th target and y ik denotes the k-th state of the i-th trajectory;
    the degree of motion matching represents the degree of matching between the detection result of the j-th target and the i-th trajectory;
    where S i is the covariance matrix of the observation space at the current moment obtained from trajectory prediction, y i is the predicted observation of the trajectory at the current moment, and d j is the state of the j-th target.
    Step 4-2: using the minimum cosine distance between the detection result of the target and the feature vectors of the targets contained in the trajectory as the appearance matching degree between the target and the trajectory;
    the cosine similarity between the detection result of the j-th target and the i-th trajectory is:
    Figure PCTCN2021119796-appb-100010
    cosine distance = 1 - cosine similarity; the appearance matching degree between the target and the trajectory is:
    Figure PCTCN2021119796-appb-100011
    Step 4-3: fusing the two metrics, i.e., the motion distance matching degree and the appearance information, by weighted averaging, to obtain the fused value ω i,j of the two metrics:
    Figure PCTCN2021119796-appb-100012
    that is,
    Figure PCTCN2021119796-appb-100013
    where μ is a hyperparameter used to adjust the weights of the different terms.
  6. The high-precision multi-target tracking method under a complex background according to claim 1, characterized in that step 5 comprises:
    Step 5-1: if the fused value ω i,j of the two metrics is greater than or equal to the preset matching threshold T hres, the target tracking result is a successful match;
    if the fused value ω i,j of the two metrics is less than the preset matching threshold T hres, the target tracking result is a failed match;
    Step 5-2: the initial state of a known track is T ini; if matching succeeds for n consecutive frames during video processing, the track is transferred from the initial state T ini to the confirmed state T cofr, which is regarded as successful tracking;
    if the number of consecutively matched frames is less than n, denote the current frame count as z and set z = z + 1; return to step 1 and perform matching again;
    if matching fails for n consecutive frames, the track is transferred from the initial state T ini to the deleted state T dele, which is regarded as a tracking failure, and the current track is deleted from the video.
PCT/CN2021/119796 2021-04-15 2021-09-23 A high-precision multi-target tracking method under a complex background WO2022217840A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110404599.5 2021-04-15
CN202110404599.5A CN113012203B (zh) 2021-04-15 2021-04-15 一种复杂背景下高精度多目标跟踪方法

Publications (1)

Publication Number Publication Date
WO2022217840A1 true WO2022217840A1 (zh) 2022-10-20

Family

ID=76389386

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/119796 WO2022217840A1 (zh) 2021-04-15 2021-09-23 一种复杂背景下高精度多目标跟踪方法

Country Status (2)

Country Link
CN (1) CN113012203B (zh)
WO (1) WO2022217840A1 (zh)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115272420A (zh) * 2022-09-28 2022-11-01 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) 一种长时目标跟踪方法、系统及存储介质
CN115984846A (zh) * 2023-02-06 2023-04-18 山东省人工智能研究院 一种基于深度学习的高分辨率图像中小目标的智能识别方法
CN116129332A (zh) * 2023-04-12 2023-05-16 武汉理工大学 多船舶目标的跟踪识别方法、装置、电子设备及存储介质
CN116343125A (zh) * 2023-03-30 2023-06-27 北京国泰星云科技有限公司 一种基于计算机视觉的集装箱箱底锁头检测方法
CN116452791A (zh) * 2023-03-27 2023-07-18 广州市斯睿特智能科技有限公司 多相机点位的缺陷区域定位方法、系统、装置及存储介质
CN116563348A (zh) * 2023-07-06 2023-08-08 中国科学院国家空间科学中心 基于双特征模板的红外弱小目标多模态跟踪方法及系统
CN116596958A (zh) * 2023-07-18 2023-08-15 四川迪晟新达类脑智能技术有限公司 一种基于在线样本增广的目标跟踪方法及装置
CN116681660A (zh) * 2023-05-18 2023-09-01 中国长江三峡集团有限公司 一种目标对象缺陷检测方法、装置、电子设备及存储介质
CN116721132A (zh) * 2023-06-20 2023-09-08 中国农业大学 一种工厂化养殖的鱼类多目标跟踪方法、系统及设备
CN116758119A (zh) * 2023-06-27 2023-09-15 重庆比特数图科技有限公司 基于运动补偿、联动的多目标循环检测跟踪方法及系统
CN116758110A (zh) * 2023-08-15 2023-09-15 中国科学技术大学 复杂运动场景下的鲁棒多目标跟踪方法
CN116993779A (zh) * 2023-08-03 2023-11-03 重庆大学 一种适于监控视频下的车辆目标跟踪方法
CN117214881A (zh) * 2023-07-21 2023-12-12 哈尔滨工程大学 一种复杂场景下基于Transformer网络的多目标跟踪方法
CN117455955A (zh) * 2023-12-14 2024-01-26 武汉纺织大学 一种基于无人机视角下的行人多目标跟踪方法
CN117746304A (zh) * 2024-02-21 2024-03-22 浪潮软件科技有限公司 基于计算机视觉的冰箱食材识别定位方法及系统
CN117809054A (zh) * 2024-02-29 2024-04-02 南京邮电大学 一种基于特征解耦融合网络的多目标跟踪方法
CN117853759A (zh) * 2024-03-08 2024-04-09 山东海润数聚科技有限公司 一种多目标跟踪方法、系统、设备和存储介质
CN117437261B (zh) * 2023-10-08 2024-05-31 南京威翔科技有限公司 一种适用于边缘端远距离目标的跟踪方法

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113012203B (zh) * 2021-04-15 2023-10-20 南京莱斯电子设备有限公司 一种复杂背景下高精度多目标跟踪方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110197502A (zh) * 2019-06-06 2019-09-03 山东工商学院 一种基于身份再识别的多目标跟踪方法及系统
CN110490901A (zh) * 2019-07-15 2019-11-22 武汉大学 抗姿态变化的行人检测跟踪方法
CN111476826A (zh) * 2020-04-10 2020-07-31 电子科技大学 一种基于ssd目标检测的多目标车辆跟踪方法
CN111652909A (zh) * 2020-04-21 2020-09-11 南京理工大学 一种基于深度哈希特征的行人多目标追踪方法
CN113012203A (zh) * 2021-04-15 2021-06-22 南京莱斯电子设备有限公司 一种复杂背景下高精度多目标跟踪方法

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019006632A1 (zh) * 2017-07-04 2019-01-10 深圳大学 一种视频多目标跟踪方法及装置
CN107818571B (zh) * 2017-12-11 2018-07-20 珠海大横琴科技发展有限公司 基于深度学习网络和均值漂移的船只自动跟踪方法及系统
CN108198209B (zh) * 2017-12-22 2020-05-01 天津理工大学 在发生遮挡和尺度变化情况下行人跟踪方法
CN108875588B (zh) * 2018-05-25 2022-04-15 武汉大学 基于深度学习的跨摄像头行人检测跟踪方法
US10970856B2 (en) * 2018-12-27 2021-04-06 Baidu Usa Llc Joint learning of geometry and motion with three-dimensional holistic understanding
CN110321937B (zh) * 2019-06-18 2022-05-17 哈尔滨工程大学 一种Faster-RCNN结合卡尔曼滤波的运动人体跟踪方法
CN110660083B (zh) * 2019-09-27 2022-12-23 国网江苏省电力工程咨询有限公司 一种结合视频场景特征感知的多目标跟踪方法
CN111008997B (zh) * 2019-12-18 2023-04-07 南京莱斯电子设备有限公司 一种车辆检测与跟踪一体化方法
CN111508002B (zh) * 2020-04-20 2020-12-25 北京理工大学 一种小型低飞目标视觉检测跟踪系统及其方法
CN112084914B (zh) * 2020-08-31 2024-04-26 的卢技术有限公司 一种融合空间运动和表观特征学习的多目标跟踪方法
CN112308881B (zh) * 2020-11-02 2023-08-15 西安电子科技大学 一种基于遥感图像的舰船多目标跟踪方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110197502A (zh) * 2019-06-06 2019-09-03 山东工商学院 一种基于身份再识别的多目标跟踪方法及系统
CN110490901A (zh) * 2019-07-15 2019-11-22 武汉大学 抗姿态变化的行人检测跟踪方法
CN111476826A (zh) * 2020-04-10 2020-07-31 电子科技大学 一种基于ssd目标检测的多目标车辆跟踪方法
CN111652909A (zh) * 2020-04-21 2020-09-11 南京理工大学 一种基于深度哈希特征的行人多目标追踪方法
CN113012203A (zh) * 2021-04-15 2021-06-22 南京莱斯电子设备有限公司 一种复杂背景下高精度多目标跟踪方法

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115272420A (zh) * 2022-09-28 2022-11-01 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) 一种长时目标跟踪方法、系统及存储介质
CN115984846B (zh) * 2023-02-06 2023-10-10 山东省人工智能研究院 一种基于深度学习的高分辨率图像中小目标的智能识别方法
CN115984846A (zh) * 2023-02-06 2023-04-18 山东省人工智能研究院 一种基于深度学习的高分辨率图像中小目标的智能识别方法
CN116452791A (zh) * 2023-03-27 2023-07-18 广州市斯睿特智能科技有限公司 多相机点位的缺陷区域定位方法、系统、装置及存储介质
CN116452791B (zh) * 2023-03-27 2024-03-22 广州市斯睿特智能科技有限公司 多相机点位的缺陷区域定位方法、系统、装置及存储介质
CN116343125A (zh) * 2023-03-30 2023-06-27 北京国泰星云科技有限公司 一种基于计算机视觉的集装箱箱底锁头检测方法
CN116343125B (zh) * 2023-03-30 2024-04-02 北京国泰星云科技有限公司 一种基于计算机视觉的集装箱箱底锁头检测方法
CN116129332A (zh) * 2023-04-12 2023-05-16 武汉理工大学 多船舶目标的跟踪识别方法、装置、电子设备及存储介质
US12008801B1 (en) 2023-04-12 2024-06-11 Wuhan University Of Technology Tracking and identification method, device, electronic device, and storage medium for multiple vessel targets
CN116681660B (zh) * 2023-05-18 2024-04-19 中国长江三峡集团有限公司 一种目标对象缺陷检测方法、装置、电子设备及存储介质
CN116681660A (zh) * 2023-05-18 2023-09-01 中国长江三峡集团有限公司 一种目标对象缺陷检测方法、装置、电子设备及存储介质
CN116721132B (zh) * 2023-06-20 2023-11-24 中国农业大学 一种工厂化养殖的鱼类多目标跟踪方法、系统及设备
CN116721132A (zh) * 2023-06-20 2023-09-08 中国农业大学 一种工厂化养殖的鱼类多目标跟踪方法、系统及设备
CN116758119A (zh) * 2023-06-27 2023-09-15 重庆比特数图科技有限公司 基于运动补偿、联动的多目标循环检测跟踪方法及系统
CN116758119B (zh) * 2023-06-27 2024-04-19 重庆比特数图科技有限公司 基于运动补偿、联动的多目标循环检测跟踪方法及系统
CN116563348A (zh) * 2023-07-06 2023-08-08 中国科学院国家空间科学中心 基于双特征模板的红外弱小目标多模态跟踪方法及系统
CN116563348B (zh) * 2023-07-06 2023-11-14 中国科学院国家空间科学中心 基于双特征模板的红外弱小目标多模态跟踪方法及系统
CN116596958B (zh) * 2023-07-18 2023-10-10 四川迪晟新达类脑智能技术有限公司 一种基于在线样本增广的目标跟踪方法及装置
CN116596958A (zh) * 2023-07-18 2023-08-15 四川迪晟新达类脑智能技术有限公司 一种基于在线样本增广的目标跟踪方法及装置
CN117214881A (zh) * 2023-07-21 2023-12-12 哈尔滨工程大学 一种复杂场景下基于Transformer网络的多目标跟踪方法
CN116993779A (zh) * 2023-08-03 2023-11-03 重庆大学 一种适于监控视频下的车辆目标跟踪方法
CN116993779B (zh) * 2023-08-03 2024-05-14 重庆大学 一种适于监控视频下的车辆目标跟踪方法
CN116758110B (zh) * 2023-08-15 2023-11-17 中国科学技术大学 复杂运动场景下的鲁棒多目标跟踪方法
CN116758110A (zh) * 2023-08-15 2023-09-15 中国科学技术大学 复杂运动场景下的鲁棒多目标跟踪方法
CN117437261B (zh) * 2023-10-08 2024-05-31 南京威翔科技有限公司 一种适用于边缘端远距离目标的跟踪方法
CN117455955B (zh) * 2023-12-14 2024-03-08 武汉纺织大学 一种基于无人机视角下的行人多目标跟踪方法
CN117455955A (zh) * 2023-12-14 2024-01-26 武汉纺织大学 一种基于无人机视角下的行人多目标跟踪方法
CN117746304A (zh) * 2024-02-21 2024-03-22 浪潮软件科技有限公司 基于计算机视觉的冰箱食材识别定位方法及系统
CN117746304B (zh) * 2024-02-21 2024-05-14 浪潮软件科技有限公司 基于计算机视觉的冰箱食材识别定位方法及系统
CN117809054A (zh) * 2024-02-29 2024-04-02 南京邮电大学 一种基于特征解耦融合网络的多目标跟踪方法
CN117809054B (zh) * 2024-02-29 2024-05-10 南京邮电大学 一种基于特征解耦融合网络的多目标跟踪方法
CN117853759B (zh) * 2024-03-08 2024-05-10 山东海润数聚科技有限公司 一种多目标跟踪方法、系统、设备和存储介质
CN117853759A (zh) * 2024-03-08 2024-04-09 山东海润数聚科技有限公司 一种多目标跟踪方法、系统、设备和存储介质

Also Published As

Publication number Publication date
CN113012203B (zh) 2023-10-20
CN113012203A (zh) 2021-06-22

Similar Documents

Publication Publication Date Title
WO2022217840A1 (zh) 一种复杂背景下高精度多目标跟踪方法
CN111627045B (zh) 单镜头下的多行人在线跟踪方法、装置、设备及存储介质
CN109543641B (zh) 一种实时视频的多目标去重方法、终端设备及存储介质
CN110660082A (zh) 一种基于图卷积与轨迹卷积网络学习的目标跟踪方法
CN111476817A (zh) 一种基于yolov3的多目标行人检测跟踪方法
Patil et al. Msednet: multi-scale deep saliency learning for moving object detection
CN113674328A (zh) 一种多目标车辆跟踪方法
Jellal et al. LS-ELAS: Line segment based efficient large scale stereo matching
CN111402303A (zh) 一种基于kfstrcf的目标跟踪架构
Jiao et al. Magicvo: End-to-end monocular visual odometry through deep bi-directional recurrent convolutional neural network
Yang et al. Increaco: incrementally learned automatic check-out with photorealistic exemplar augmentation
Akok et al. Robust object tracking by interleaving variable rate color particle filtering and deep learning
Liu et al. Mean shift fusion color histogram algorithm for nonrigid complex target tracking in sports video
Wang et al. Pmds-slam: Probability mesh enhanced semantic slam in dynamic environments
CN108346158B (zh) 基于主块数据关联的多目标跟踪方法及系统
Dhassi et al. Visual tracking based on adaptive mean shift multiple appearance models
Abdelali et al. Object tracking in video via particle filter
Maia et al. Visual object tracking by an evolutionary self-organizing neural network
Xiao et al. Research on scale adaptive particle filter tracker with feature integration
Ren et al. An improved ORB-SLAM2 algorithm based on image information entropy
CN116580066B (zh) 一种低帧率场景下的行人目标跟踪方法及可读存储介质
Ren et al. A face tracking method in videos based on convolutional neural networks
Jemilda et al. Tracking Moving Objects in Video.
Rai et al. Pearson's correlation and background subtraction (BGS) based approach for object's motion detection in infrared video frame sequences
Guo et al. Fast Visual Tracking using Memory Gradient Pursuit Algorithm.

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21936709

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21936709

Country of ref document: EP

Kind code of ref document: A1