TW202341072A

TW202341072A - A high-speed data association method for multi-object tracking

Info

Publication number: TW202341072A
Application number: TW111113024A
Authority: TW
Inventors: 蔡奇謚
Original assignee: 淡江大學學校財團法人淡江大學
Priority date: 2022-04-06
Filing date: 2022-04-06
Publication date: 2023-10-16
Also published as: TWI790957B

Abstract

The present invention discloses a high-speed data association method for multi-object tracking, an improved data association method for real-time multi-object tracking systems. The method uses a simple filtering operation to delete impossible matches based on the metric distances between objects in the previous frame and the current frame, then fuses the remaining distance information by linear weighted summation into the cost matrix for Hungarian matching, and performs Hungarian matching only twice. The method effectively improves the processing speed of multi-object tracking, in particular enabling real-time operation on embedded platforms with limited computing resources.

Description

A high-speed data association method for multi-target tracking

The present invention relates to a data association method for a tracking system, and in particular to a high-speed data association method for multi-target tracking.

Multi-Object Tracking (MOT) is one of the most challenging tasks in computer vision. Among existing multi-target tracking methods, Tracking By Detection (TBD) has become the mainstream architecture: an object detection model first detects all targets in each frame, and data association is then performed on the per-frame detection results. For example, SORT and DeepSORT use the existing Faster-RCNN as the object detection model to achieve robust multi-target tracking performance.

TBD treats multi-target tracking as a data association problem whose purpose is to match detection results across the frames of a video sequence. Among existing methods, SORT uses Faster-RCNN to detect the category and position of the targets in the current frame, then uses a Kalman filter to predict the current-frame position of each target that was successfully tracked in the previous frame. It then calculates the IoU distance (Intersection over Union Distance) between the targets of the two frames, that is, the overlap rate of their bounding boxes, uses these IoU distances as the cost matrix, and performs data association through Hungarian matching. In SORT, the detection model largely determines the tracking accuracy of the multi-target tracking system; changing the detection model can raise the tracking accuracy by 18.9%.

Although SORT achieves good tracking accuracy in MOT, a large number of identity switches occur during tracking, mainly because its association metric is accurate only when the state-estimation uncertainty is low. DeepSORT was proposed to address this problem by adding an appearance model to the SORT architecture. In other words, to improve data association, DeepSORT divides multi-target tracking into two steps, object detection and appearance extraction, and the architecture is therefore called Two-Step TBD. However, Two-Step TBD performs poorly in processing speed: both object detection and appearance extraction require heavy computation, and appearance extraction recomputes features from the image, which causes a degree of repeated calculation. For this reason the One-Shot TBD architecture has received increasing attention in recent years. Its core idea is to integrate the appearance extraction model into the object detection model, forming a single multi-target tracking model that performs object detection and appearance extraction at the same time, so that most of the computation is shared and the computing time is reduced.

As object detection technology continues to advance, the most direct benefit to multi-target tracking models is steadily rising tracking accuracy, but this comes at the cost of growing model size and processing time. The JDE architecture was proposed to address this problem: whereas most other multi-target tracking methods are built on the two-stage RCNN family of object detection models, JDE uses the one-stage YOLOv3 and is designed according to the One-Shot TBD architecture.

However, even on a desktop computer with an efficient computing architecture, the above methods reach only near-real-time performance and cannot achieve true real-time processing. Specifically, according to a report by IPVM (IP Video Market), the average frame rate of real-time vision systems in industrial applications is between 11 and 20 FPS. None of the existing methods meets this requirement, and they are even less suitable for embedded platforms with limited computing resources.

In view of this, developing a lightweight MOT method for embedded systems with limited computing resources, while maintaining adequate tracking accuracy and a processing speed above 10 FPS, remains an urgent problem. Based on the One-Shot TBD architecture, the present invention proposes a lightweight real-time multi-target tracking system that uses the lightweight network MobileNet-SSDv2 as the object detection model together with a high-speed data association method for multi-target tracking. The method replaces the Kalman filter with a Simple Filtering operation: based on the metric distances (for example, IoU distance and cosine distance) between the targets in the previous frame and the current frame, it deletes impossible matches, fuses the remaining distance information by linear weighted summation into the cost matrix for Hungarian matching, and performs Hungarian matching only twice. This yields a lightweight multi-target tracking system and removes the limitation that existing methods, because of their heavy computation, cannot run in real time on embedded platforms with limited computing resources.

The high-speed data association method for multi-target tracking of the present invention addresses difficulties that the prior art cannot overcome:
(1) It provides a low-complexity data association method that can be combined with existing TBD multi-target tracking techniques as the final-stage data association module.
(2) The proposed data association method is fast enough to run in real time on embedded platforms with limited computing resources; experimental results show a processing speed of 12 FPS on an embedded platform.
(3) The proposed data association method maintains the tracking accuracy and robustness of the multi-target tracking system; experimental results on the MOT16 dataset show a tracking accuracy of 58.3% MOTA and a tracking robustness of 48.0% IDF1.
(4) The proposed data association method can be applied to monitoring systems based on TBD multi-target tracking, such as visual environment perception systems for self-driving vehicles, traffic monitoring systems, and security monitoring systems.

The present invention provides a high-speed data association method for multi-target tracking, comprising the following steps:
(a) input the information of M detected objects in the current frame and of N trackers from the previous frame;
(b) calculate the feature metric distance between the feature vector of the i-th detected object and the feature vector of the j-th tracker;
(c) calculate the bounding box metric distance between the bounding box of the i-th detected object and the bounding box of the j-th tracker;
(d) apply the simple filtering operation to the feature metric distance and the bounding box metric distance to obtain, respectively, the filtered feature metric and the filtered bounding box metric, where i = 1 to M and j = 1 to N;
(e) compute a weighted sum of the filtered feature metric and the filtered bounding box metric to generate a cost matrix C of size M × N;
(f) apply a first linear assignment algorithm to the cost matrix C to find the set of best matching pairs between the current detected objects and the previous trackers, the set of unmatched detected objects, and the set of unmatched trackers;
(g) for each pair in the first set of best matching pairs, update the corresponding tracker with the information of the current detected object, place the tracker in the active tracking pool, and reset the tracker's unmatched-frame counter to 0;
(h) for each previous tracker in the first set of unmatched trackers, delete the tracker if its number of consecutive unmatched frames exceeds K; otherwise place the tracker in the inactive tracking pool and update its unmatched-frame counter;
(i) following steps (a) to (e), compute a cost matrix C1 of size M1 × N1 between the remaining M1 unmatched detected objects and all N1 trackers in the inactive tracking pool;
(j) apply a second linear assignment algorithm to the cost matrix C1 to find the set of best matching pairs between the currently unmatched detected objects and the inactive trackers, the set of unmatched detected objects, and the set of unmatched trackers;
(k) for each pair in the second set of best matching pairs, update the corresponding inactive tracker with the information of the currently unmatched detected object, place the tracker in the active tracking pool, and reset the tracker's unmatched-frame counter to 0;
(l) for each inactive tracker in the second set of unmatched trackers, delete the inactive tracker if its number of consecutive unmatched frames exceeds K; otherwise place it in the inactive tracking pool and update its unmatched-frame counter;
(m) for each detected object still unmatched after the second matching, create a new tracker to track that object and place the tracker in the inactive tracking pool;
(n) output the results of all trackers in the active tracking pool.

In one embodiment, in step (a), each input item of detected object information and tracker information includes at least one bounding box and at least one feature vector.

In one embodiment, in step (b), the feature metric distance is the cosine distance, obtained from the feature vector of the detected object and the feature vector of the tracker by formula (1):

Formula (1): $d^{E}_{ij} = 1 - \dfrac{e_i \cdot e_j}{\lVert e_i \rVert \, \lVert e_j \rVert}$

In formula (1), $d^{E}_{ij}$ is the feature metric distance, $e_i$ and $e_j$ are the feature vectors of the i-th detected object and the j-th tracker, and the symbol $\cdot$ denotes the inner product of two vectors.
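The cosine distance of step (b) can be illustrated with a minimal sketch. The function names, the plain-list vector representation, and the handling of zero vectors are assumptions made for illustration and are not taken from the patent.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(angle) between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: a zero vector carries no appearance information,
        # so it is treated as orthogonal to everything (distance 1).
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def feature_distance_matrix(det_features, trk_features):
    """Build the M x N matrix of distances (rows: detections, columns: trackers)."""
    return [[cosine_distance(d, t) for t in trk_features] for d in det_features]
```

Identical directions give a distance of 0, orthogonal vectors give 1, and opposite vectors give 2, so the metric ranges over [0, 2].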

In one embodiment, in step (c), the bounding box metric distance is the IoU distance, obtained from the bounding box of the detected object and the bounding box of the tracker by formula (2):

Formula (2): $d^{l}_{ij} = 1 - \mathrm{IoU}(BBox_i, BBox_j)$

In formula (2), $d^{l}_{ij}$ is the bounding box metric distance and the IoU function is defined as:

$\mathrm{IoU}(BBox_i, BBox_j) = \dfrac{\mathrm{area}(BBox_i \cap BBox_j)}{\mathrm{area}(BBox_i \cup BBox_j)}$

where $BBox_i$ and $BBox_j$ are the bounding boxes of the i-th detected object in the current frame and the j-th tracked object in the previous frame, the function area(A) computes the area of the input set A, and the symbols $\cap$ and $\cup$ denote the intersection and union of two sets.
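As a minimal sketch of the IoU distance of step (c), the following assumes axis-aligned boxes stored as `(x1, y1, x2, y2)` corner tuples; that box layout and the function names are illustrative assumptions, not something the patent specifies.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0.0 else 0.0


def iou_distance(box_a, box_b):
    """IoU distance: 0 for identical boxes, 1 for boxes that do not overlap."""
    return 1.0 - iou(box_a, box_b)
```

Because the IoU lies in [0, 1], the IoU distance also lies in [0, 1], so a threshold of 1.0 on this metric filters nothing out.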

In one embodiment, in step (d), given a threshold t, the simple filtering operation filters the feature metric distance and the bounding box metric distance through formula (3) and formula (4):

Formula (3): (the expression defining SF is not preserved in this text)

In formula (3), SF denotes the simple filtering operation.

Formula (4): $\tilde{d}^{E}_{ij} = \mathrm{SF}(d^{E}_{ij}, t^{E})$, $\tilde{d}^{l}_{ij} = \mathrm{SF}(d^{l}_{ij}, t^{l})$

In formula (4), $d^{E}_{ij}$ and $d^{l}_{ij}$ are the feature metric distance and the bounding box metric distance, $\tilde{d}^{E}_{ij}$ and $\tilde{d}^{l}_{ij}$ are the corresponding filtered metrics, $t^{E}$ is the threshold for the filtered feature metric, and $t^{l}$ is the threshold for the filtered bounding box metric.

In one embodiment, in step (e), given a weight value w, the weighted sum operation fuses the filtered feature metric and the filtered bounding box metric element by element through formula (5) to generate a two-dimensional matrix of size M × N:

Formula (5): $C_{ij} = w \, \tilde{d}^{E}_{ij} + (1 - w) \, \tilde{d}^{l}_{ij}$

In formula (5), $C_{ij}$ is the cost value at position (i, j) of the cost matrix, $\tilde{d}^{E}_{ij}$ and $\tilde{d}^{l}_{ij}$ are the filtered feature metric and the filtered bounding box metric, and w is the weight parameter used to fuse the two metrics.

In one embodiment, in step (f), the first linear assignment algorithm is the Hungarian algorithm.

In one embodiment, in step (g), updating the corresponding tracker with the information of the current detected object includes storing the bounding box of the current detected object in the tracker and computing a weighted sum of the feature vector of the current detected object and the feature vector of the tracker.
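The update in step (g) can be sketched as below. The dictionary layout of a tracker, the field names, and the blending weight `alpha` (with its default of 0.9) are hypothetical; the patent states only that the bounding box is stored, the two feature vectors are combined by a weighted sum, and the unmatched-frame counter is reset to 0.

```python
def update_tracker(tracker, detection, alpha=0.9):
    """Refresh a matched tracker from its detection (sketch of step (g)).

    alpha is an assumed weight on the tracker's existing feature vector;
    the patent does not give a value.
    """
    tracker["bbox"] = detection["bbox"]
    # Weighted sum of the stored feature and the newly detected feature.
    tracker["feature"] = [
        alpha * old + (1.0 - alpha) * new
        for old, new in zip(tracker["feature"], detection["feature"])
    ]
    tracker["misses"] = 0      # reset the unmatched-frame counter
    tracker["active"] = True   # the tracker belongs to the active pool
    return tracker
```

A weight close to 1 keeps the stored appearance stable across frames, while a smaller weight lets it follow the most recent detection more closely.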

In one embodiment, in step (h), K is any positive integer greater than 0, and updating the tracker's unmatched-frame counter means adding 1 to the counter value.

In one embodiment, in step (i), M1 ≤ M.

In one embodiment, in step (j), the second linear assignment algorithm is the Hungarian algorithm.

In one embodiment, in step (k), updating the corresponding inactive tracker with the information of the currently unmatched detected object includes storing the bounding box of the currently unmatched detected object in the inactive tracker, computing a weighted sum of the feature vector of the currently unmatched detected object and the feature vector of the inactive tracker, and updating the status of the inactive tracker to active.

In one embodiment, in step (l), K is any positive integer greater than 0.

In one embodiment, in step (m), creating a new tracker to track the object includes storing the bounding box and feature vector of the currently unmatched detected object in the new tracker, setting the status of the new tracker to inactive, and setting its unmatched-frame counter to 0.

The present invention further provides a high-speed data association method for multi-target tracking, comprising the following steps:
(a) input the information of M detected objects in the current frame and of N trackers from the previous frame;
(b) calculate the feature metric distance between the feature vector of the i-th detected object and the feature vector of the j-th tracker;
(c) calculate the bounding box metric distance between the bounding box of the i-th detected object and the bounding box of the j-th tracker;
(d) apply the simple filtering operation to the feature metric distance and the bounding box metric distance to obtain, respectively, the filtered feature metric and the filtered bounding box metric, where i = 1 to M and j = 1 to N;
(e) compute a weighted sum of the filtered feature metric and the filtered bounding box metric to generate a cost matrix C of size M × N;
(f) apply a first linear assignment algorithm to the cost matrix C to find the set of best matching pairs between the current detected objects and the previous trackers, the set of unmatched detected objects, and the set of unmatched trackers;
(g) for each pair in the first set of best matching pairs, update the corresponding tracker with the information of the current detected object, place the tracker in the active tracking pool, and reset the tracker's unmatched-frame counter to 0; then use a motion predictor to predict the next-frame position of the bounding box of each tracker in the active tracking pool and update the position of each tracker's bounding box;
(h) for each previous tracker in the first set of unmatched trackers, delete the tracker if its number of consecutive unmatched frames exceeds K; otherwise place the tracker in the inactive tracking pool and update its unmatched-frame counter; then use the motion predictor to predict the next-frame position of the bounding box of each tracker in the inactive tracking pool and update the position of each tracker's bounding box;
(i) following steps (a) to (e), compute a cost matrix C1 of size M1 × N1 between the remaining M1 unmatched detected objects and all N1 trackers in the inactive tracking pool;
(j) apply a second linear assignment algorithm to the cost matrix C1 to find the set of best matching pairs between the currently unmatched detected objects and the inactive trackers, the set of unmatched detected objects, and the set of unmatched trackers;
(k) for each pair in the second set of best matching pairs, update the corresponding inactive tracker with the information of the currently unmatched detected object, place the tracker in the active tracking pool, and reset the tracker's unmatched-frame counter to 0;
(l) for each inactive tracker in the second set of unmatched trackers, delete the inactive tracker if its number of consecutive unmatched frames exceeds K; otherwise place it in the inactive tracking pool and update its unmatched-frame counter;
(m) for each detected object still unmatched after the second matching, create a new tracker to track that object and place the tracker in the inactive tracking pool;
(n) output the results of all trackers in the active tracking pool.

In one embodiment, in steps (g) and (h), the motion predictor is a Kalman filter, an information filter, or another system state estimator.

To help the examiner understand the technical features, content, and advantages of the present invention and the effects it achieves, the invention is described in detail below by way of embodiments with reference to the accompanying drawings and attachments. The drawings are intended only for illustration and to support the description; they do not necessarily reflect the true proportions or precise configuration of an actual implementation, so the proportions and layout of the drawings should not be used to interpret or limit the scope of the invention in actual implementation.

Refer to Figure 2, an architecture diagram of the lightweight multi-target tracking system of the present invention. The model is based on the existing One-Shot TBD architecture and is divided into a lightweight MobileNet-JDE model and a post-processing module. The present invention uses MobileNet-SSDv2 as the basis of the proposed MOT model and uses IR (Inverted Residual) modules to implement a lightweight predictor (Mobile Predictor), in which the classification, regression, and embedding branches predict the category, position, and appearance information of each object, respectively.

The post-processing module includes non-maximum suppression (NMS) and data association. NMS removes detections with low confidence scores and high overlap rates from the detections output by the MOT model; data association then matches the remaining detections with the current tracking information. In data association, the present invention uses the simple filtering operation: based on the metric distances (for example, IoU distance and cosine distance) between the targets in the previous frame and the current frame, it deletes impossible matches and then fuses the remaining distance information by linear weighted summation into the cost matrix for Hungarian matching.

Conventional data association methods are based on the Kalman filter and run the Hungarian sample matching (or linear assignment) algorithm three times to improve tracking accuracy (see Figure 1), which greatly increases the computational cost. Refer to Figure 3, a flow chart of one high-speed data association procedure of the present invention. To increase processing speed while maintaining tracking accuracy, the present invention performs the sample matching operation only twice and uses the simple filtering operation in place of the Kalman filter.

The high-speed data association method of the present invention is explained further below.

First, the information of a plurality of detected objects in the current frame and of a plurality of trackers from the previous frame is input.

Next, a feature metric distance $d^{E}_{ij}$ is calculated between the feature vector $e_i$ of the i-th detected object and the feature vector $e_j$ of the j-th tracker. In one embodiment, this feature metric distance is the cosine distance, defined by formula (1):

Formula (1): $d^{E}_{ij} = 1 - \dfrac{e_i \cdot e_j}{\lVert e_i \rVert \, \lVert e_j \rVert}$

In formula (1), the symbol $\cdot$ denotes the inner product of two vectors.

A bounding box metric distance $d^{l}_{ij}$ is then calculated between the bounding box of the i-th detected object and the bounding box of the j-th tracker. In one embodiment, this bounding box metric distance is the IoU distance, defined by formula (2) below.

Let $(BBox_i, e_i)$ and $(BBox_j, e_j)$ denote the bounding box and embedded feature vector of the i-th detected object in the current frame and of the j-th tracked object in the previous frame, respectively:

Formula (2): $d^{l}_{ij} = 1 - \mathrm{IoU}(BBox_i, BBox_j)$

where the IoU function is defined as:

$\mathrm{IoU}(BBox_i, BBox_j) = \dfrac{\mathrm{area}(BBox_i \cap BBox_j)}{\mathrm{area}(BBox_i \cup BBox_j)}$

Here the function area(A) computes the area of the input set A, and the symbols $\cap$ and $\cup$ denote the intersection and union of two sets.

To speed up sample matching, the simple filtering operation of the present invention aims to discard sample matching pairs with large metric values as early as possible. The present invention therefore applies a threshold operation to accelerate this filtering process, defined by formula (3):

Formula (3): (the expression defining SF is not preserved in this text)

The threshold t is set separately for each metric. In one embodiment, $t^{E}$ denotes the threshold for the filtered feature metric and $t^{l}$ denotes the threshold for the filtered bounding box metric, so that formula (4) holds:

Formula (4): $\tilde{d}^{E}_{ij} = \mathrm{SF}(d^{E}_{ij}, t^{E})$ and $\tilde{d}^{l}_{ij} = \mathrm{SF}(d^{l}_{ij}, t^{l})$

where $d^{E}_{ij}$ and $d^{l}_{ij}$ are the feature metric distance and the bounding box metric distance, and $\tilde{d}^{E}_{ij}$ and $\tilde{d}^{l}_{ij}$ are the corresponding filtered metrics.

This operation efficiently identifies poor matching pairs among the total of i × j candidate pairs.

Next, an i-by-j cost matrix is created from the weighted sum of the two filtered metrics given by formula (4), so that formula (5) holds:

Formula (5): $C_{ij} = w \, \tilde{d}^{E}_{ij} + (1 - w) \, \tilde{d}^{l}_{ij}$

In formula (5), $C_{ij}$ is the cost value at position (i, j) of the cost matrix, $\tilde{d}^{E}_{ij}$ and $\tilde{d}^{l}_{ij}$ are the filtered feature metric and the filtered bounding box metric, and w is the weight parameter used to fuse the two metrics.
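The filtering and fusion steps can be sketched as follows. Because the expression for SF in formula (3) is not available in this text, the sketch assumes one plausible form: a distance at or below its threshold is kept unchanged, and a distance above it is replaced by a large finite sentinel `BIG`. That behaviour, the sentinel value, and all function names are assumptions, not the patent's definition.

```python
BIG = 1.0e6  # assumed sentinel marking a filtered-out (impossible) pair


def simple_filter(distance, threshold):
    """Assumed form of SF: keep small distances, flag large ones with BIG."""
    return distance if distance <= threshold else BIG


def fused_cost_matrix(feat_dist, box_dist, t_e, t_l, w):
    """Fuse two M x N distance matrices as in formula (5).

    feat_dist and box_dist are lists of rows; t_e and t_l are the two
    thresholds; w weights the filtered feature metric.
    """
    cost = []
    for row_e, row_l in zip(feat_dist, box_dist):
        cost.append([
            w * simple_filter(d_e, t_e) + (1.0 - w) * simple_filter(d_l, t_l)
            for d_e, d_l in zip(row_e, row_l)
        ])
    return cost
```

A finite sentinel is used rather than infinity so that a weight of exactly 0 or 1 cannot produce `0 * inf`, which would be NaN. With the first-pass setting (0.8, 0.5, 0.8) given in the embodiment, a pair whose cosine distance exceeds 0.8 or whose IoU distance exceeds 0.5 receives a cost far larger than any legitimate pair.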

Finally, the Hungarian algorithm is applied to the cost matrix to find the set of best matching pairs between the current detections and the previous trackers.
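The assignment step returns three sets: matched pairs, unmatched detections, and unmatched trackers. The sketch below is not the Hungarian algorithm; it finds the same minimum-cost assignment by brute-force enumeration, which is practical only for very small matrices, and is used here purely to keep the example dependency-free. The `max_cost` gate that rejects pairs above a cost limit is an assumption about how filtered pairs would be discarded. In practice a polynomial-time solver such as SciPy's `scipy.optimize.linear_sum_assignment` could take the place of the enumeration.

```python
from itertools import permutations


def assign(cost, max_cost):
    """Minimum-cost assignment by exhaustive search (stand-in for Hungarian).

    cost is a list of M rows with N columns (detections x trackers).
    Returns (matches, unmatched_detections, unmatched_trackers).
    """
    m = len(cost)
    n = len(cost[0]) if m else 0
    if m == 0 or n == 0:
        return [], list(range(m)), list(range(n))
    # Enumerate over the smaller side so every row gets a distinct column.
    transposed = m > n
    c = [list(col) for col in zip(*cost)] if transposed else cost
    rows, cols = len(c), len(c[0])
    best_total, best_perm = None, None
    for perm in permutations(range(cols), rows):
        total = sum(c[r][perm[r]] for r in range(rows))
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    pairs = [(r, best_perm[r]) for r in range(rows)]
    if transposed:
        pairs = [(j, i) for i, j in pairs]  # back to (detection, tracker)
    # Assumed gate: discard pairs whose cost marks them as filtered out.
    matches = sorted((i, j) for i, j in pairs if cost[i][j] <= max_cost)
    matched_d = {i for i, _ in matches}
    matched_t = {j for _, j in matches}
    unmatched_d = [i for i in range(m) if i not in matched_d]
    unmatched_t = [j for j in range(n) if j not in matched_t]
    return matches, unmatched_d, unmatched_t
```

For example, with three detections and two trackers, `assign([[0.1, 0.9], [0.8, 0.2], [0.5, 0.6]], max_cost=0.7)` pairs detection 0 with tracker 0 and detection 1 with tracker 1, and leaves detection 2 unmatched.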

Note in particular that the present invention performs the sample matching process twice with different parameter settings. The first matching determines the set of best matching pairs between the current detections and all trackers in the active tracking pool; the second matching determines the set of best matching pairs between the currently unmatched detections and all trackers in the inactive tracking pool.

In one embodiment, the parameters of the first matching are set to ($t^{E}$, $t^{l}$, w) = (0.8, 0.5, 0.8).

In one embodiment, the parameters of the second matching are set to ($t^{E}$, $t^{l}$, w) = (0.8, 1.0, 1.0).

All detections in the current frame that remain unmatched are used to create new trackers, which are initialized in the inactive tracking pool. Conversely, inactive trackers that have not been reactivated for more than 30 frames are deleted.
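The tracker lifecycle described above (ageing, deletion, and creation) can be sketched as follows. The dictionary fields, the function names, and the way identifiers are issued are hypothetical; the check-then-increment order follows steps (h) and (l), and the default of 30 frames follows the embodiment.

```python
MAX_MISSES = 30  # K: the embodiment deletes trackers unmatched for more than 30 frames


def age_unmatched(trackers, max_misses=MAX_MISSES):
    """Return the unmatched trackers that survive this frame.

    A tracker whose counter already exceeds max_misses is deleted;
    otherwise it is marked inactive and its counter is increased by 1.
    """
    survivors = []
    for trk in trackers:
        if trk["misses"] > max_misses:
            continue  # deleted: not carried over to the inactive pool
        trk["active"] = False
        trk["misses"] += 1
        survivors.append(trk)
    return survivors


def spawn_trackers(detections, next_id):
    """Create one inactive tracker per leftover detection (step (m))."""
    created = []
    for det in detections:
        created.append({
            "id": next_id,                    # hypothetical identifier scheme
            "bbox": det["bbox"],
            "feature": list(det["feature"]),
            "misses": 0,
            "active": False,                  # new trackers start inactive
        })
        next_id += 1
    return created, next_id
```

Starting new trackers in the inactive pool means a detection must be matched again in a later frame before its tracker appears in the output, since only the active pool is reported.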

Refer to Figure 4, a flow chart of another high-speed data association procedure of the present invention. The high-speed data association method of the present invention may also use a motion predictor, for example to predict the next-frame position of the bounding box of each tracker in the active tracking pool, or to predict the next-frame position of the bounding box of each tracker in the inactive tracking pool.

The type of motion predictor is not limited; it may be, for example, a Kalman filter, an information filter, or another system state estimator. In one embodiment, the motion predictor is a Kalman filter.

Table 1 below shows the performance of the MOT model of the present invention (MobileNetV2) on a desktop computer, compared with a conventional multi-object tracker based on VGG-SSD; the VGG-SSD information comes from pages 21-37 of the European Conference on Computer Vision, Amsterdam, the Netherlands, December 2016. As Table 1 shows, when the simple filtering operation is used in place of the Kalman filter, the processing speed rises substantially to 50.5 FPS, and the tracking performance improves by 7.3 percentage points in MOTA and 3.1 percentage points in IDF1. Because the simple filtering operation does not predict the motion of the tracked targets from the previous frame, the number of identity switches (IDSW) in the tracking results increases noticeably, from 2564 to 3358; in exchange, the processing speed reaches its best level. This advantage helps improve the real-time processing performance of MOT models running on embedded systems.

[Table 1]

| Backbone network model | Data association | MOTA (↑) | IDF1 (↑) | FP (↓) | FN (↓) | IDSW (↓) | FPS* (↑) |
|---|---|---|---|---|---|---|---|
| VGG-SSD | Kalman filter | 51.0% | 44.9% | 8951 | 77862 | 2564 | 34.3 |
| MobileNetV2 | Simple filtering | 58.3% | 48.0% | 9420 | 63270 | 3358 | 50.5 |
| Performance change compared with VGG-SSD | Simple filtering in place of the Kalman filter | +7.3% | +3.1% | +469 | -14592 | +794 | +16.2 |

*FPS: number of frames per second processed by the entire MOT system, including model inference and post-processing.

Table 2 below compares the computational efficiency of the present invention with that of existing methods. As Table 2 shows, the data association method based on simple filtering greatly increases the processing speed without increasing memory usage or the number of parameters. Moreover, the more robust the backbone network model of the multi-object tracker, the smaller the loss in tracking performance caused by the simple filtering data association method.

[Table 2]

| Method | Backbone network model | New anchor box design | Data association | FLOPs | Memory usage | Number of parameters | Model size | FPS on Desktop | FPS on Xavier |
|---|---|---|---|---|---|---|---|---|---|
| VGG-SSD | VGG-SSD, no FPN | No | Kalman filter | 183.3 G | 2.0 GB | 36.0 M | 144.1 MB | 34.3 | 5.2 |
| JDE-1088 | DarkNet53, Conv FPN | No | Kalman filter | 271.9 G | 2.8 GB | 73.1 M | 292.6 MB | 1835 | 2.9 |
| HarDNet512-KF | HarDNet, Mobile FPN | Yes | Kalman filter | 117.2 G | 2.3 GB | 43.9 M | 163.0 MB | 32.7 | 4.0 |
| HarDNet512-SF | HarDNet, Mobile FPN | Yes | Simple filtering | | | | | 40.2 | 4.7 |
| MobileNet512-KF | MobileNetV2, Mobile FPN | Yes | Kalman filter | 18.6 G | 2.0 GB | 19.4 M | 78.2 MB | 41.2 | 10.6 |
| MobileNet512-SF | MobileNetV2, Mobile FPN | Yes | Simple filtering | | | | | 50.5 | 12.6 |
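To make the speed comparison in Table 2 concrete, the short sketch below computes the relative FPS gain of each simple-filtering (SF) row over the matching Kalman-filter (KF) row. The FPS values are copied from Table 2; the dictionary layout and the function name `speedup_percent` are illustrative only.

```python
# (FPS with Kalman filter, FPS with simple filtering), copied from Table 2.
FPS_PAIRS = {
    ("HarDNet512", "Desktop"): (32.7, 40.2),
    ("HarDNet512", "Xavier"): (4.0, 4.7),
    ("MobileNet512", "Desktop"): (41.2, 50.5),
    ("MobileNet512", "Xavier"): (10.6, 12.6),
}


def speedup_percent(fps_kf, fps_sf):
    """Relative FPS gain of simple filtering over the Kalman filter, in percent."""
    return 100.0 * (fps_sf - fps_kf) / fps_kf


if __name__ == "__main__":
    for (model, platform), (kf, sf) in FPS_PAIRS.items():
        gain = speedup_percent(kf, sf)
        print(f"{model} on {platform}: {kf} -> {sf} FPS (+{gain:.1f}%)")
```

With these figures the gain is roughly 17 to 23 percent in every row, on both the desktop computer and the Xavier platform.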

In summary, the present invention proposes a real-time lightweight MOT method based on MobileNet to effectively increase MOT processing speed. The proposed tracking method consists of a lightweight MOT model and a post-processing module. In the post-processing module, a simple filtering operation replaces the Kalman filter used in conventional data association processing in order to speed up processing. Experimental results show that the proposed MOT method reaches processing speeds of 50.5 frames per second (FPS) on a desktop computer and 12.6 FPS on an embedded platform. In addition, the proposed method provides tracking performance that is competitive with existing MOT methods. These advantages make the method of the present invention suitable for many applications that run on embedded platforms, such as visual surveillance, visual tracking control of mobile robots, and human-computer interaction.

The above describes only some embodiments of the present invention and does not limit the patent scope of the present invention. Those with ordinary knowledge in the art may make modifications and improvements without departing from the concept of the present invention, and all of these fall within the scope of protection of the present invention.

None.

FIG. 1 is a flow chart of the data association operation of the prior art; FIG. 2 is an architecture diagram of the lightweight multi-target tracking system of the present invention; FIG. 3 is a flow chart of one high-speed data association operation of the present invention; FIG. 4 is a flow chart of another high-speed data association operation of the present invention.

Claims

A high-speed data association method for multi-target tracking, including the following steps:
(a) inputting the information of M detected objects in the current frame and the information of N trackers in the previous frame;
(b) calculating a feature metric distance between the feature vector of the i-th detected object and the feature vector of the j-th tracker;
(c) calculating a bounding box metric distance between the bounding box of the i-th detected object and the bounding box of the j-th tracker;
(d) applying a simple filtering operation to the feature metric distance and to the bounding box metric distance to obtain a filtered feature metric and a filtered bounding box metric, respectively, where i = 1 to M and j = 1 to N;
(e) performing a weighted sum operation on the filtered feature metric and the filtered bounding box metric to generate a cost matrix C of size M × N;
(f) applying a first linear assignment algorithm to the cost matrix C to find, between the current detected objects and the previous trackers, a first best-matching pair set, a first unmatched detected object set, and a first unmatched tracker set;
(g) according to the current detected object information in the first best-matching pair set, updating the corresponding tracker information, placing the tracker in the active tracking pool, and clearing the unmatched frame counter of the tracker so that the counter value is 0;
(h) for each previous tracker in the first unmatched tracker set, deleting the tracker if its number of consecutive unmatched frames exceeds K frames, and otherwise placing the tracker in the inactive tracking pool and updating the unmatched frame counter of the tracker;
(i) according to steps (a) to (e), performing the calculation for the remaining M1 unmatched detected objects and all N1 trackers in the inactive tracking pool to generate a cost matrix C1 of size M1 × N1;
(j) applying a second linear assignment algorithm to the cost matrix C1 to find, between the current unmatched detected objects and the inactive trackers, a second best-matching pair set, a second unmatched detected object set, and a second unmatched tracker set;
(k) according to the current unmatched detected object information in the second best-matching pair set, updating the corresponding inactive tracker information, placing the tracker in the active tracking pool, and clearing the unmatched frame counter of the tracker so that the counter value is 0;
(l) for each inactive tracker in the second unmatched tracker set, deleting the inactive tracker if its number of consecutive unmatched frames exceeds K frames, and otherwise placing the inactive tracker in the inactive tracking pool and updating the unmatched frame counter of the inactive tracker;
(m) for each item of detected object information in the second unmatched detected object set, creating a new tracker responsible for tracking that object, and placing the new tracker in the inactive tracking pool;
(n) outputting the results of all trackers in the active tracking pool.

The high-speed data association method for multi-target tracking as described in claim 1, wherein in step (a), each input item of detected object information and of tracker information includes at least one piece of bounding box information and at least one piece of feature vector information.

The high-speed data association method for multi-target tracking as described in claim 1, wherein in step (b), the feature metric distance $d^{E}_{ij}$ is the cosine distance, obtained from the feature vector $f_i$ of the detected object and the feature vector $f_j$ of the tracker by applying the following formula (1); Formula (1): $d^{E}_{ij} = 1 - \dfrac{\langle f_i, f_j \rangle}{\lVert f_i \rVert \, \lVert f_j \rVert}$ In formula (1), the symbol $\langle \cdot, \cdot \rangle$ represents the inner product operator between two vectors.
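For illustration, the cosine distance named in this claim can be computed as in the sketch below. The function name `cosine_distance` is hypothetical, the vectors are plain Python lists, and returning 1.0 for an all-zero vector is an assumption made so that the example never divides by zero.

```python
import math


def cosine_distance(f_det, f_trk):
    """Cosine distance: 1 - <f_det, f_trk> / (|f_det| * |f_trk|)."""
    dot = sum(a * b for a, b in zip(f_det, f_trk))
    norm = math.sqrt(sum(a * a for a in f_det)) * math.sqrt(sum(b * b for b in f_trk))
    if norm == 0.0:
        # Assumption: a zero vector carries no appearance information,
        # so it is treated as orthogonal to everything.
        return 1.0
    return 1.0 - dot / norm
```

The value is 0 for vectors pointing in the same direction, 1 for orthogonal vectors, and 2 for opposite vectors.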

The high-speed data association method for multi-target tracking as described in claim 1, wherein in step (c), the bounding box metric distance $d^{l}_{ij}$ is the IoU distance (Intersection over Union distance), obtained from the bounding box of the detected object and the bounding box of the tracker by applying the following formula (2); Formula (2): $d^{l}_{ij} = 1 - \mathrm{IoU}(BBox_i, BBox_j)$ In formula (2), the IoU function is defined as $\mathrm{IoU}(A, B) = \dfrac{\mathrm{area}(A \cap B)}{\mathrm{area}(A \cup B)}$, where $BBox_i$ and $BBox_j$ represent the bounding box of the i-th detected object in the current frame and the bounding box of the j-th tracker in the previous frame; the function area(A) calculates the area of the input set A; and the symbols $\cap$ and $\cup$ represent the intersection and union operators of two sets.
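A minimal sketch of the IoU distance for axis-aligned boxes follows. The corner format `(x1, y1, x2, y2)` and the function name `iou_distance` are assumptions for the example; the claim itself does not prescribe a box encoding.

```python
def iou_distance(box_a, box_b):
    """Return 1 - IoU for two boxes given as (x1, y1, x2, y2) corners."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    if union <= 0.0:
        # Assumption: degenerate (zero-area) boxes are treated as non-overlapping.
        return 1.0
    return 1.0 - inter / union
```

The distance is 0 for identical boxes and 1 for boxes that do not overlap at all.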

The high-speed data association method for multi-target tracking as described in claim 1, wherein in step (d), the simple filtering operation is, given a threshold t, to filter the feature metric distance $d^{E}_{ij}$ and the bounding box metric distance $d^{l}_{ij}$ through the following formula (3) and formula (4): Formula (3): In formula (3), SF represents the simple filtering operation; Formula (4): $\tilde{d}^{E}_{ij} = \mathrm{SF}(d^{E}_{ij}, t^{E})$, $\tilde{d}^{l}_{ij} = \mathrm{SF}(d^{l}_{ij}, t^{l})$ In formula (4), $t^{E}$ represents the threshold for filtering the feature metric, and $t^{l}$ represents the threshold for filtering the bounding box metric.
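The expression of formula (3) is not legible in this text, so the sketch below shows only one plausible reading of a threshold-based filter: a distance that does not exceed the threshold is kept, and any larger distance is replaced by a large sentinel cost so that the pair is effectively ruled out. The sentinel value `REJECT`, the "less than or equal" comparison, and both function names are assumptions for illustration and should not be read as the claimed definition of SF.

```python
REJECT = 1e5  # hypothetical sentinel cost marking an impossible match


def simple_filter(d, t, reject=REJECT):
    """Assumed gating rule: keep d when d <= t, otherwise return the sentinel."""
    return d if d <= t else reject


def filter_matrix(dist, t, reject=REJECT):
    """Apply the assumed gating rule to every entry of an M x N distance matrix."""
    return [[simple_filter(d, t, reject) for d in row] for row in dist]
```

Under this reading, the thresholds quoted in the embodiments would be passed as `t`, once for the feature metric distances and once for the bounding box metric distances.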

The high-speed data association method for multi-target tracking as described in claim 1, wherein in step (e), the weighted sum operation is, given a weight value w, to fuse the filtered feature metric $\tilde{d}^{E}_{ij}$ and the filtered bounding box metric $\tilde{d}^{l}_{ij}$ element by element through the following formula (5), generating the two-dimensional cost matrix of size M × N: Formula (5): $C_{ij} = w\,\tilde{d}^{E}_{ij} + (1 - w)\,\tilde{d}^{l}_{ij}$ In formula (5), $C_{ij}$ represents the cost value at position (i, j) in the cost matrix, and w represents the weight parameter used to fuse the two metrics.
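The weighted fusion of the two filtered metrics can be sketched as follows. The function name `fuse_costs` is hypothetical, and the sketch assumes that w multiplies the feature term and (1 - w) multiplies the bounding box term, following the order in which the two metrics are named in the claim.

```python
def fuse_costs(feat, box, w):
    """Element-wise weighted sum of two M x N matrices.

    feat: filtered feature metric, box: filtered bounding box metric,
    w: weight given to the feature term (assumed convention).
    """
    if len(feat) != len(box) or any(len(fr) != len(br) for fr, br in zip(feat, box)):
        raise ValueError("feat and box must have the same shape")
    return [[w * f + (1.0 - w) * b for f, b in zip(fr, br)]
            for fr, br in zip(feat, box)]
```

Under this convention, a weight of w = 0.8 lets both metrics contribute to the cost, while w = 1.0 makes the cost depend on the feature metric alone.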

The high-speed data association method for multi-target tracking as described in claim 1, wherein in step (f), the first linear assignment algorithm is the Hungarian algorithm.
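As a sketch of the assignment step, the block below gives a compact pure-Python version of the Hungarian (Kuhn-Munkres) algorithm in its potentials-based O(n³) form, followed by a helper that splits the result into matched pairs, unmatched detections, and unmatched trackers. The function names and the `max_cost` gate used to reject expensive pairs are assumptions for the example; a library routine such as SciPy's `scipy.optimize.linear_sum_assignment` could be used in place of `hungarian`.

```python
def hungarian(cost):
    """Return minimum-cost (row, col) pairs for a rectangular cost matrix."""
    if not cost or not cost[0]:
        return []
    n, m = len(cost), len(cost[0])
    if n > m:
        # The core routine needs rows <= cols, so solve the transpose instead.
        transposed = [list(col) for col in zip(*cost)]
        return sorted((r, c) for c, r in hungarian(transposed))
    inf = float("inf")
    u = [0.0] * (n + 1)    # row potentials
    v = [0.0] * (m + 1)    # column potentials
    p = [0] * (m + 1)      # p[j] = row assigned to column j (0 = free)
    way = [0] * (m + 1)    # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:     # flip assignments along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sorted((p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0)


def associate(cost, max_cost):
    """Split an assignment into matches, unmatched rows, and unmatched columns.

    Rows stand for detections and columns for trackers; pairs whose cost
    exceeds the hypothetical gate `max_cost` are treated as unmatched.
    """
    n_rows = len(cost)
    n_cols = len(cost[0]) if cost else 0
    matches = [(i, j) for i, j in hungarian(cost) if cost[i][j] <= max_cost]
    matched_rows = {i for i, _ in matches}
    matched_cols = {j for _, j in matches}
    unmatched_dets = [i for i in range(n_rows) if i not in matched_rows]
    unmatched_trks = [j for j in range(n_cols) if j not in matched_cols]
    return matches, unmatched_dets, unmatched_trks
```

The three lists returned by `associate` correspond to the best-matching pair set, the unmatched detected object set, and the unmatched tracker set named in step (f).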

The high-speed data association method for multi-target tracking as described in claim 1, wherein in step (g), updating the corresponding tracker information according to the current detected object information includes storing the bounding box of the current detected object in the tracker, and calculating the weighted sum of the feature vector of the current detected object and the feature vector of the tracker.

The high-speed data association method for multi-target tracking as described in claim 1, wherein in step (h), the K value is any positive integer greater than 0, and updating the unmatched frame counter of the tracker means adding 1 to the counter value.

The high-speed data association method for multi-target tracking as described in claim 1, wherein in step (i), M1 ≤ M is satisfied.

The high-speed data association method for multi-target tracking as described in claim 1, wherein in step (j), the second linear assignment algorithm is the Hungarian algorithm.

The high-speed data association method for multi-target tracking as described in claim 1, wherein in step (k), updating the corresponding inactive tracker information according to the current unmatched detected object information includes storing the bounding box of the current unmatched detected object in the inactive tracker, calculating the weighted sum of the feature vector of the current unmatched detected object and the feature vector of the inactive tracker, and updating the state of the inactive tracker to active.

The high-speed data association method for multi-target tracking as described in claim 1, wherein in step (l), the K value is any positive integer greater than 0.

The high-speed data association method for multi-target tracking as described in claim 1, wherein in step (m), creating a new tracker responsible for tracking the object includes storing the bounding box and the feature vector of the current unmatched detected object in the new tracker, setting the state of the new tracker to inactive, and setting its unmatched frame counter value to 0.

A high-speed data association method for multi-target tracking, including the following steps:
(a) inputting the information of M detected objects in the current frame and the information of N trackers in the previous frame;
(b) calculating a feature metric distance between the feature vector of the i-th detected object and the feature vector of the j-th tracker;
(c) calculating a bounding box metric distance between the bounding box of the i-th detected object and the bounding box of the j-th tracker;
(d) applying a simple filtering operation to the feature metric distance and to the bounding box metric distance to obtain a filtered feature metric and a filtered bounding box metric, respectively, where i = 1 to M and j = 1 to N;
(e) performing a weighted sum operation on the filtered feature metric and the filtered bounding box metric to generate a cost matrix C of size M × N;
(f) applying a first linear assignment algorithm to the cost matrix C to find, between the current detected objects and the previous trackers, a first best-matching pair set, a first unmatched detected object set, and a first unmatched tracker set;
(g) according to the current detected object information in the first best-matching pair set, updating the corresponding tracker information, placing the tracker in the active tracking pool, and clearing the unmatched frame counter of the tracker so that the counter value is 0; and predicting, via a motion predictor, the position in the next frame of the bounding box of each tracker in the active tracking pool, and updating the position of the bounding box of each tracker;
(h) for each previous tracker in the first unmatched tracker set, deleting the tracker if its number of consecutive unmatched frames exceeds K frames, and otherwise placing the tracker in the inactive tracking pool and updating the unmatched frame counter of the tracker; and predicting, via a motion predictor, the position in the next frame of the bounding box of each tracker in the inactive tracking pool, and updating the position of the bounding box of each tracker;
(i) according to steps (a) to (e), performing the calculation for the remaining M1 unmatched detected objects and all N1 trackers in the inactive tracking pool to generate a cost matrix C1 of size M1 × N1;
(j) applying a second linear assignment algorithm to the cost matrix C1 to find, between the current unmatched detected objects and the inactive trackers, a second best-matching pair set, a second unmatched detected object set, and a second unmatched tracker set;
(k) according to the current unmatched detected object information in the second best-matching pair set, updating the corresponding inactive tracker information, placing the tracker in the active tracking pool, and clearing the unmatched frame counter of the tracker so that the counter value is 0;
(l) for each inactive tracker in the second unmatched tracker set, deleting the inactive tracker if its number of consecutive unmatched frames exceeds K frames, and otherwise placing the inactive tracker in the inactive tracking pool and updating the unmatched frame counter of the inactive tracker;
(m) for each item of detected object information in the second unmatched detected object set, creating a new tracker responsible for tracking that object, and placing the new tracker in the inactive tracking pool;
(n) outputting the results of all trackers in the active tracking pool.

The high-speed data association method for multi-target tracking as described in claim 15, wherein in step (g) and step (h), the motion predictor is a Kalman filter, an information filter, or another system state estimator.